
Suppose I have $N$ samples $x^1, \ldots, x^N$ which were drawn iid from an unknown density $P(x)$.

Suppose I am interested in estimating the vector-valued function $g(x) = \nabla \log P (x)$.

One approach to this could be the following: for $h$ in some (sufficiently smooth, regular, etc.) function class $\mathcal{H}$, define the functional

\begin{align}
Q(h) &= \int P(x) \, | h(x) - g(x) |^2 \, dx \\
&= \int P(x) \, | h(x) - \nabla \log P(x) |^2 \, dx \\
&= \int \left\{ P(x) | h(x)|^2 - 2 \langle \nabla P(x), h(x) \rangle + P(x) | \nabla \log P(x) |^2 \right\} dx \\
&= \int \left\{ P(x) | h(x)|^2 + 2 P(x) \, \text{div}_x h(x) \right\} dx + c \\
&= \int P(x) \left\{ | h(x)|^2 + 2 \, \text{div}_x h(x) \right\} dx + c.
\end{align}
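(The key step from the third to the fourth line is integration by parts, assuming $P(x) h(x) \to 0$ as $|x| \to \infty$:

$$\int \langle \nabla P(x), h(x) \rangle \, dx = -\int P(x) \, \text{div}_x h(x) \, dx,$$

which removes the dependence on $\nabla P$; the remaining $g$-dependent term is absorbed into the constant $c$.)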

If $g \in \mathcal{H}$, it is clear that $Q$ is minimised at $h = g$, and so it is putatively a sensible objective.

Now, replace the integral against $P$ by an empirical approximation based on our samples, i.e.

\begin{align} \hat{Q}^N(h) &= \frac{1}{N} \sum_{i = 1}^N \left\{ | h(x^i)|^2 + 2\left( \text{div}_x h\right) (x^i) \right\}. \end{align}

This is then a convex function of $h$, and it should be possible to minimise efficiently, particularly if $\mathcal{H}$ is a vector space. In practice, one should probably also regularise $h$, but that is a separate issue in some ways.
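To illustrate, here is a minimal sketch in one dimension with the affine class $\mathcal{H} = \{h(x) = ax + b\}$ (my own choice of example, not from the question). In this case $\hat{Q}^N(h) = \frac{1}{N}\sum_i (a x^i + b)^2 + 2a$ is a quadratic in $(a, b)$, and setting its gradient to zero gives the closed-form minimiser $a = -1/\widehat{\text{var}}(x)$, $b = -a \, \bar{x}$, which recovers the Gaussian score $-(x - \mu)/\sigma^2$ when the data are Gaussian:

```python
import random

def fit_affine_score(xs):
    """Minimise the empirical objective
        (1/N) * sum_i (a*x_i + b)**2 + 2*a
    over affine h(x) = a*x + b (in 1-d, div h = a).
    Zero-gradient conditions give b = -a*mean and a = -1/var."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    a = -1.0 / var
    b = -a * mean
    return a, b

random.seed(0)
# Samples from N(1, 2^2); true score is -(x - 1)/4, i.e. a = -0.25, b = 0.25.
samples = [random.gauss(1.0, 2.0) for _ in range(100_000)]
a, b = fit_affine_score(samples)
print(a, b)
```

For richer (e.g. linear-in-parameters) classes the objective stays quadratic in the parameters and admits the same kind of closed-form solution; for nonlinear classes one would minimise it numerically.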

My question is: is this a known approach for estimating $\nabla \log P(x)$ ("score estimation")? If so, can someone point me in the direction of a suitable reference?


1 Answer


This technique is described in:

https://arxiv.org/abs/1905.07088

"Sliced Score Matching: A Scalable Approach to Density and Score Estimation" by Yang Song, Sahaj Garg, Jiaxin Shi, and Stefano Ermon

