In section 5.1 of Amari's book, when discussing statistical manifolds, the author states that, given a manifold whose points are probability distributions, one can identify the tangent vectors $\boldsymbol e_i$ with the score functions, that is, $$\boldsymbol e_i \approx \partial_i \log p(x,\boldsymbol\xi).$$ Here $\boldsymbol\xi$ is the parameter labelling points of the manifold, $\partial_i$ denotes the derivative with respect to $\xi^i$, and $x$ labels the possible outcomes. I understand why tangent vectors should be functions of $x$: in this context points are probability distributions, that is, functions $x\mapsto p(x,\boldsymbol\xi)$. However, I don't quite understand where this particular expression for the tangent vectors comes from. Naively, at least for discrete distributions, I would have guessed that tangent vectors simply have the form $\partial_i p(x,\boldsymbol\xi)$, without the extra factor of $1/p(x,\boldsymbol\xi)$ that arises from the log derivative.
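To be explicit about how the two candidates are related (this is just the chain rule, not something taken from the book): $$\partial_i \log p(x,\boldsymbol\xi) = \frac{\partial_i p(x,\boldsymbol\xi)}{p(x,\boldsymbol\xi)},$$ so the score function is my naive guess rescaled pointwise in $x$ by $1/p(x,\boldsymbol\xi)$. In the discrete case, differentiating the normalization condition $\sum_x p(x,\boldsymbol\xi)=1$ gives $$\sum_x \partial_i p(x,\boldsymbol\xi) = 0 \quad\Longleftrightarrow\quad \mathbb{E}[\partial_i \log p(x,\boldsymbol\xi)] = 0,$$ so both expressions carry the same information wherever $p(x,\boldsymbol\xi)>0$.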
This expression is compatible with the Fisher information metric, introduced shortly thereafter as $$\langle\boldsymbol e_i,\boldsymbol e_j\rangle = \mathbb{E}[\partial_i \log p(x,\boldsymbol \xi)\,\partial_j \log p(x,\boldsymbol\xi)],$$ but if I understand the context correctly, this seems a bit backwards: the metric should come from the expression for the tangent vectors, not the other way around.
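To make the comparison concrete, here is a minimal sketch on a toy example of my own (not from the book): a distribution on three outcomes with the assumed parametrization $p = (\xi^1, \xi^2, 1-\xi^1-\xi^2)$. The function names are hypothetical. It computes the score functions as $\partial_i p / p$ and the metric as the expectation of their products.

```python
def probs(xi):
    """Toy family (an assumption for illustration): p = (xi1, xi2, 1 - xi1 - xi2)."""
    xi1, xi2 = xi
    return [xi1, xi2, 1.0 - xi1 - xi2]

# Naive tangent vectors d_i p(x, xi) in this parametrization.
# Rows are the index i, columns are the outcome x; they do not depend on xi here.
DP = [[1.0, 0.0, -1.0],
      [0.0, 1.0, -1.0]]

def scores(xi):
    """Score functions d_i log p(x, xi) = d_i p(x, xi) / p(x, xi)."""
    p = probs(xi)
    return [[DP[i][x] / p[x] for x in range(3)] for i in range(2)]

def score_means(xi):
    """E[d_i log p], which should vanish because sum_x p = 1."""
    p = probs(xi)
    s = scores(xi)
    return [sum(p[x] * s[i][x] for x in range(3)) for i in range(2)]

def fisher_metric(xi):
    """g_ij = E[d_i log p * d_j log p] = sum_x d_i p * d_j p / p."""
    p = probs(xi)
    s = scores(xi)
    return [[sum(p[x] * s[i][x] * s[j][x] for x in range(3))
             for j in range(2)] for i in range(2)]

xi = (0.2, 0.5)
g = fisher_metric(xi)
means = score_means(xi)
print("Fisher metric:", g)
print("Score means:", means)
```

In this example the metric can equally be written as $\sum_x \partial_i p\,\partial_j p / p$, that is, as a weighted inner product of the naive vectors $\partial_i p$, which is part of why I am unsure which expression is the more fundamental one.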
One possible resolution is that the author means that one can choose a chart for the manifold with respect to which tangent vectors take this particular form. This doesn't seem to be stated explicitly in the text, however. Is this correct, and if so, how can one see it explicitly?