I've seen the Fisher-Rao metric introduced via the following argument (see Sidhu and Kok 2019):
There is a natural pairing between the simplex and its dual space of classical random variables: $\langle A,p\rangle=\sum_i A_i p^i$, where $A$ is the random variable and $p$ the probability distribution (I'll assume discrete distributions here). Moreover, we can naturally define a scalar product between random variables as $$\langle A,B\rangle = \sum_j A_j B_j p^j.$$
This suggests introducing the metric $h^{jk}=\delta^{jk} p^j$ (no sum over $j$), which reproduces the scalar product above as $\langle A,B\rangle=\sum_{j,k} h^{jk}A_j B_k$. If we now take the inverse of this metric, that is, the metric on the space dual to the random variables, we get $h_{jk} = \delta_{jk}/p^j,$ which corresponds to the line element $$\mathrm ds_{\rm FR}^2 = \sum_j \frac{\mathrm dp^j \, \mathrm dp^j}{p^j}.$$
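To make the inversion step explicit, here is a short check. It is only a sketch and assumes $p^j>0$ for every $j$, so that $h^{jk}$ is invertible; the choice $A_j=\mathrm d\log p^j$ in the second line is my own illustration, not something taken from the reference.

$$\begin{aligned}
\sum_k h^{jk}h_{kl} &= \sum_k \delta^{jk}p^j\,\frac{\delta_{kl}}{p^k} = \delta^j_{\ l},\\
\mathrm ds_{\rm FR}^2 = \sum_j \frac{\mathrm dp^j\,\mathrm dp^j}{p^j} &= \sum_j p^j\left(\mathrm d\log p^j\right)^2 = \langle A,A\rangle \quad\text{with } A_j=\mathrm d\log p^j .
\end{aligned}$$

So the line element has the same form as the scalar product $\langle A,A\rangle$ evaluated on the "random variable" $A_j=\mathrm d\log p^j=\mathrm dp^j/p^j$.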
I'm a bit confused by this argument. Thinking of random variables as dual elements of the probability distributions is fine. However, I don't understand the metric that is introduced: a metric should be defined on tangent vectors of the manifold we are considering. So are the random variables playing the role of tangent vectors here?
But when we take the dual metric $h_{jk}$, it is supposed to act on tangent vectors to the manifold of probability distributions. The duals of random variables, however, are probability distributions, not displacements in the simplex, so this metric would seem to define an inner product between probability distributions. In summary: does the Fisher-Rao metric define an inner product directly on probability distributions, or on tangent vectors/vector fields on the manifold of probability distributions?