
Wikipedia says

... consider the map $f:p_{\theta }\mapsto p_{T\,\mid\, \theta }$ which takes each distribution on model parameter $\theta$ to its induced distribution on statistic $T$. The statistic $T$ is said to be complete when $f$ is surjective, and sufficient when $f$ is injective.

(emphasis mine)

Is this claim true? I.e., does $f:p_{\theta }\mapsto p_{T\,\mid\, \theta }$ being injective imply that the statistic $T$ is sufficient?

My research so far:

I think Wikipedia is incorrect, and I can show it by counterexample; i.e., I can give an example where $f:p_{\theta }\mapsto p_{T\,\mid\, \theta }$ is injective but $T$ is not a sufficient statistic.

Consider this conditional probability distribution of the sample $X$ given the parameter $\theta$, i.e. $p_{X\mid\theta}$:

(table 1)

|       | $\theta_1$ | $\theta_2$ |
|-------|-----------|-----------|
| $x_1$ | $0.1$     | $0.2$     |
| $x_2$ | $0.2$     | $0.2$     |
| $x_3$ | $0.3$     | $0.3$     |
| $x_4$ | $0.4$     | $0.3$     |

and here is the map from the sample $X$ to the statistic $T$, meaning that the statistic evaluated at sample $x_i$ (column 1) takes the value $t_j$ (column 2), i.e. $T(x_i)=t_j$:

(table 2)

| sample | statistic |
|--------|-----------|
| $x_1$  | $t_1$     |
| $x_2$  | $t_1$     |
| $x_3$  | $t_2$     |
| $x_4$  | $t_2$     |

which leads to the following conditional probability distribution of the statistic $T$ given the parameter $\theta$, i.e. $p_{T\mid\theta}$:

(table 3)

|       | $\theta_1$ | $\theta_2$ |
|-------|-----------|-----------|
| $t_1$ | $0.3$     | $0.4$     |
| $t_2$ | $0.7$     | $0.6$     |

In this case $f:p_{\theta }\mapsto p_{T\,\mid\, \theta }$ is injective (the two columns of table 3 differ, so $p_{T\mid\theta_1}$ and $p_{T\mid\theta_2}$ are distinct), but the statistic $T$ is not sufficient, since for a given value of $T$ the conditional probability of $X$ still depends on $\theta$ (this can be deduced from tables 1 and 2: e.g. $\Pr(X=x_1\mid T=t_1,\theta_1)=0.1/0.3=1/3$ while $\Pr(X=x_1\mid T=t_1,\theta_2)=0.2/0.4=1/2$).
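
For concreteness, here is a small Python sketch that recomputes table 3 and the conditional distribution of $X$ given $T$ from tables 1 and 2 (the dictionary names and layout are just my own transcription of the tables):

```python
# Transcription of tables 1 and 2, and a quick numerical check of the claim.
p_x_given_theta = {                     # table 1: p(x | theta)
    "theta1": {"x1": 0.1, "x2": 0.2, "x3": 0.3, "x4": 0.4},
    "theta2": {"x1": 0.2, "x2": 0.2, "x3": 0.3, "x4": 0.3},
}
T = {"x1": "t1", "x2": "t1", "x3": "t2", "x4": "t2"}   # table 2: T(x)

# p(T = t | theta), i.e. table 3
p_t_given_theta = {
    th: {t: sum(p for x, p in px.items() if T[x] == t) for t in set(T.values())}
    for th, px in p_x_given_theta.items()
}
print(p_t_given_theta)
# theta1 -> {t1: 0.3, t2: 0.7}, theta2 -> {t1: 0.4, t2: 0.6}  (up to float rounding)
# The two rows differ, so the two values of theta induce distinct distributions of T.

# P(X = x | T = T(x), theta): sufficiency would require this not to depend on theta.
cond = {
    th: {x: px[x] / p_t_given_theta[th][T[x]] for x in px}
    for th, px in p_x_given_theta.items()
}
print(cond["theta1"]["x1"], cond["theta2"]["x1"])   # 1/3 vs 1/2, so T is not sufficient
```

The distinct rows of `p_t_given_theta` are the injectivity of $f$ on this two-point parameter set, while the $1/3$ vs $1/2$ comparison is exactly the failure of sufficiency described above.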

Comments:

  • It's because you're not using MathJax to construct it, which you really should do for all your math-related stuff. – jbowman, May 20 at 21:08
  • I cannot figure out what your table means: it doesn't appear to specify any kind of distribution. Could you explain? – whuber, May 20 at 21:31
  • Could you please explain what your table showing a "map of samples to statistic" means? Because a statistic is, by definition, a numerical function of the sample and no such function is in evidence (what are the "$t_i$"?), it takes considerable guessing to read this post. You could further clarify it by using $\TeX$ markup to render the mathematical symbols as intended. – whuber, Jun 6 at 11:23
  • @whuber I use the definition of a sufficient statistic given in Casella: a statistic $T(X)$ is a sufficient statistic for $\theta$ if the conditional distribution of the sample $X$ given the value of $T(X)$ does not depend on $\theta$, i.e. $P(X=x\mid T(X)=T(x);\theta=\theta_1)=P(X=x\mid T(X)=T(x);\theta=\theta_2)$. But we can see in my example that $P(X=x_1\mid T(X)=t_1;\theta=\theta_1)=0.1/(0.1+0.2)=1/3$, which is not equal to $P(X=x_1\mid T(X)=t_1;\theta=\theta_2)=0.2/(0.2+0.2)=1/2$. – Shreyans, Jun 26 at 22:13
  • @Shreyans, it seems to me that you are right and Wikipedia is wrong. I really don't understand their discussion of priors in the earlier paragraph, which is out of scope. The characterization they give for sufficiency seems instead to match identifiability. But I wonder if part of the issue here is that they have butchered the truth: I vaguely recall there being a relationship between sufficiency, completeness, and some mapping being injective/surjective. Isn't that in the Casella book? I'll try to recall all I've forgotten on this topic. Maybe we should rewrite the page when we are done here. – Jun 27 at 9:27

1 Answer


As your Table 1 shows, there are two distinct distributions of $X$, one for each value of $\theta$ (the model is identifiable). But there are four distinct distributions of $X$ conditional on $T$ and $\theta$:

| $X$   | $\Pr(X=x_i\mid t_1,\theta_1)$ | $\Pr(X=x_i\mid t_1,\theta_2)$ | $\Pr(X=x_i\mid t_2,\theta_1)$ | $\Pr(X=x_i\mid t_2,\theta_2)$ |
|-------|-------------------------------|-------------------------------|-------------------------------|-------------------------------|
| $x_1$ | $\frac{1}{3}$ | $\frac{1}{2}$ | $0$ | $0$ |
| $x_2$ | $\frac{2}{3}$ | $\frac{1}{2}$ | $0$ | $0$ |
| $x_3$ | $0$ | $0$ | $\frac{3}{7}$ | $\frac{1}{2}$ |
| $x_4$ | $0$ | $0$ | $\frac{4}{7}$ | $\frac{1}{2}$ |

An injective mapping is one-to-one: were $T$ sufficient, there would be only two distinct distributions in this table, since then $\Pr(X=x_i\mid t_j,\theta_1) = \Pr(X=x_i\mid t_j,\theta_2)$ for $j=1,2$.
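
If it helps to check the arithmetic, here is a minimal Python sketch (my own transcription of Table 1, using exact fractions so the $\frac{3}{7}$ and $\frac{4}{7}$ come out cleanly) that reproduces the four columns above:

```python
from fractions import Fraction as F

# Table 1 from the question, written with exact fractions
p = {
    ("x1", "theta1"): F(1, 10), ("x1", "theta2"): F(2, 10),
    ("x2", "theta1"): F(2, 10), ("x2", "theta2"): F(2, 10),
    ("x3", "theta1"): F(3, 10), ("x3", "theta2"): F(3, 10),
    ("x4", "theta1"): F(4, 10), ("x4", "theta2"): F(3, 10),
}
T = {"x1": "t1", "x2": "t1", "x3": "t2", "x4": "t2"}   # table 2: T(x)

# Pr(X = x | T = t, theta) = Pr(X = x, T = t | theta) / Pr(T = t | theta)
for t in ("t1", "t2"):
    for th in ("theta1", "theta2"):
        norm = sum(p[x, th] for x in T if T[x] == t)
        col = {x: (p[x, th] / norm if T[x] == t else F(0)) for x in T}
        print(t, th, col)
# Four distinct columns appear (1/3 vs 1/2, 3/7 vs 1/2, ...), confirming that the
# conditional distribution of X given T depends on theta.
```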

Comment:

  • I agree with your conclusion, but I am confused about a lot of things. Why does the number of distributions on $\theta$ matter? I can think of situations where there are 4 distinct distributions on $\theta$ as well as 4 that are conditional on $T$ and $\theta$, yet the statistic is not sufficient. Same confusion about identifiability: why does it matter? Also, what is the point of saying "An injective mapping is one-to-one"? It is just the definition of an injective function; I don't understand how it relates to the sentence you wrote after it. – Shreyans, Jun 27 at 19:24
