49 events
when what by license comment
Jul 17 at 13:10 comment added Shreyans Yes, indeed restrictive, and probably useless ^_^. I do not understand the rest of your comment, though. What would only include a discrete variable? I think the above relationship holds for continuous functions/distributions too. I'm also not following the second part: in the case of a discrete distribution, the count of each value of the variable is the same as having the entire sample, which is trivially sufficient. But it's not clear how it is minimal sufficient. I'm also not sure which criterion is no longer respected.
Jul 17 at 12:52 comment added Guillaume Dehaene Ah, right. I think the claim that "surjectivity implies completeness" is correct, but it is also a very restrictive result. That would only include a discrete variable, and would correspond to the categorical variable over all possible values. In that case, the minimal sufficient statistic for multiple IID variables from this model is the count of each category, and that no longer respects the criteria (despite being complete).
Jul 17 at 12:40 comment added Shreyans I have kind of messed up the sequence of the proof, and tbh I think we can ask and answer this in a separate question, but a summary of my claim: $f:p_{\theta }\mapsto p_{T\,\mid\, \theta }$ is surjective and the distribution is identifiable $\implies$ $f:\theta \mapsto p_{T\,\mid\, \theta }$ is surjective $\implies$ $p_{T\,\mid\, \theta }$ spans the whole space of functions of $T$ $\implies$ the statistic $T$ is complete for parameter $\theta$
Jul 17 at 12:37 comment added Shreyans Thirdly, to link this to the OG function from the question we are commenting on, if $f:p_{\theta }\mapsto p_{T\,\mid\, \theta }$ is surjective and the distribution is identifiable, then $f:\theta \mapsto p_{T\,\mid\, \theta }$ is also surjective.
Jul 17 at 12:28 comment added Shreyans [contd.] Secondly, if $f : \theta\mapsto p_{T\,\mid\,\theta}$ is surjective then $p_{T\,\mid\,\theta}$ necessarily spans the whole space of functions of $T$. I don't have a mathematical proof for this yet, but intuitively $p_{T\,\mid\,\theta}$ must contain the basis set of the function space of $T$, because it contains all functions that are 1 for a single value of $T$ and 0 for all other values of $T$ (standard basis).
Jul 17 at 12:28 comment added Shreyans 4. I still think there is a link between surjectivity and completeness: I claim $f : \theta\mapsto p_{T\,\mid\,\theta}$ being surjective implies statistic $T$ being complete. Firstly, a statistic $T$ is complete if and only if the functions $p_{T\,\mid\,\theta}$ for varying $\theta$ span the whole space of functions of $T$ (proof). [contd.]
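For a statistic with finitely many values and a finite parameter set, the spanning criterion in the comments above reduces to a rank condition: $E_\theta[g(T)] = 0$ for all $\theta$ forces $g = 0$ exactly when the rows $p_{T\,\mid\,\theta}$ have rank equal to the number of values of $T$. Below is a minimal Python sketch under those assumptions (and assuming every value of $T$ has positive probability under at least one $\theta$). The function names `rank` and `is_complete` and the example numbers are hypothetical and are not taken from the question.

```python
from fractions import Fraction as F


def rank(rows):
    """Rank of a matrix of Fractions, computed by Gaussian elimination."""
    m = [list(r) for r in rows]
    ncols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for c in range(ncols):
        # Find a row at or below r with a non-zero entry in column c.
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # Eliminate column c from all rows below the pivot row.
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def is_complete(dist_T):
    """dist_T holds one row per theta: the pmf of T over its k values.

    T is complete iff the only g with sum_t g(t) * p(t | theta) = 0 for
    every theta is g = 0, i.e. iff the rows span R^k (rank == k).
    """
    k = len(dist_T[0])
    return rank(dist_T) == k


# Hypothetical example: T takes two values, theta takes two values.
example = [
    [F(3, 10), F(7, 10)],  # pmf of T under theta_1 (assumed)
    [F(4, 10), F(6, 10)],  # pmf of T under theta_2 (assumed)
]
print(is_complete(example))               # two independent rows, k = 2
print(is_complete([[F(1, 2), F(1, 2)]]))  # a single row cannot span R^2
```

With two values of $T$ and two linearly independent rows, the check reports completeness even though the two rows are far from exhausting all distributions on $T$, which is consistent with the remark that completeness does not require $f : \theta\mapsto p_{T\,\mid\,\theta}$ to be surjective.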
Jul 17 at 12:13 comment added Shreyans @GuillaumeDehaene 1. No, I intentionally used surjective. 2. By target ensemble, do you mean the codomain of the surjective function? I want $f : \theta\mapsto p_{T\,\mid\,\theta}$ to be surjective, so we should be able to produce every possible function $p_{T\,\mid\,\theta}$ by varying $\theta$. 3. Actually, in my example $f : \theta\mapsto p_{T\,\mid\,\theta}$ is not surjective but the statistic $T$ is complete for $\theta$. This means I was indeed partially incorrect: completeness of the statistic $T$ does not imply that $f : \theta\mapsto p_{T\,\mid\,\theta}$ is surjective.
Jul 15 at 11:41 comment added Guillaume Dehaene I don't think your three points are true. 1. Have you misused surjective instead of injective? 2. If you meant to use surjective, what is the target ensemble we are trying to produce as our image? 3. In your example, don't we have the surjectivity without completeness? 4. I think I'm starting to recover the point about surjectivity and injectivity and the key properties. I don't think there is actually a link to completeness. In conclusion, double-check your stuff and let me know. I'll post an answer with the link to surjectivity and injectivity and sufficiency
Jul 15 at 6:29 comment added Shreyans By identifiable I mean $f : \theta\mapsto p_\theta$ is one-to-one. But strictly speaking $f : \theta\mapsto p_\theta$ being surjective is enough for the second point above.
Jul 15 at 6:25 comment added Shreyans Secondly, if the distribution is identifiable, and $f : p_\theta\mapsto p_{T\,\mid\,\theta}$ is surjective, then $f : \theta\mapsto p_{T\,\mid\,\theta}$ is also surjective, hence the statistic is complete (follows from the first point). That is, an identifiable distribution together with $f : p_\theta\mapsto p_{T\,\mid\,\theta}$ being surjective implies that the statistic is complete. But the converse is not necessarily true.
Jul 15 at 6:25 comment added Shreyans @GuillaumeDehaene I thought some more about the relationship between completeness and surjectivity, and have come to the following conclusions: Firstly, the statistic being complete $\leftrightarrow$ $f : \theta\mapsto p_{T\,\mid\,\theta}$ being surjective. For this, the geometric proof here is intuitive, although slightly incorrect, as they tried to prove it for a different $f$. [contd.]
Jul 15 at 5:55 comment added Shreyans @GuillaumeDehaene No sir, we should rather be thanking you! OK, let's meet midway and call it a joint effort :)
Jun 29 at 22:02 comment added Guillaume Dehaene I've edited the Wikipedia page (and, feeling inspired, gone on an editing spree...). Thank you for your persistence in pointing this issue out. Wikipedia is now more accurate, thanks to you.
Jun 28 at 7:51 comment added Guillaume Dehaene I've quickly gone through Casella-Berger (it's here: mybiostats.wordpress.com/wp-content/uploads/2015/03/… ) and there's no mention of bijection in there. I've got no clue what mistake led to that Wikipedia page.
Jun 28 at 7:34 comment added Guillaume Dehaene Good job finding that other thread. I'll also edit that once I figure out exactly what I want to write. But I really can't shake the feeling that there is some relationship between sufficiency, completeness and some form of injectivity / surjectivity somewhere, just not there... I taught this only 5 years ago, but I've forgotten all of it, given that it is quite useless ^_^
Jun 27 at 19:09 comment added Shreyans @GuillaumeDehaene Thank you for confirming my suspicions. There are indeed some theorems in Casella, but I actually stopped reading it as I got busy with my job. Maybe now is the time to start again. Anyway, what I do know is that the converse of my question is true. You can find a proof here; it's not perfect, but it's correct. The same answer also has some incorrect proofs; you can read the comments for details. And yes, we should definitely edit Wikipedia, and while we are at it, also the linked answer.
Jun 27 at 12:03 comment added Guillaume Dehaene I've thought it through over my lunch break. In your example, the minimal sufficient statistic is $x_1 \rightarrow s_1$, $x_2 \text{ or } x_3 \rightarrow s_2$ and $x_4 \rightarrow s_3$. It is sufficient by the factorization criterion, and it is complete. Your function $T$ is indeed not sufficient, as you observed. That Wikipedia page is a complete mess. I'll try to edit it in the future.
Jun 27 at 9:27 comment added Guillaume Dehaene @Shreyans, it seems to me that you are right and Wikipedia is wrong. I really don't understand their discussion of priors in the earlier paragraph, which is out of scope. The characterization they give for sufficiency seems to match identifiability instead? But I wonder if part of the issue here is that they have butchered the truth. I vaguely recall there being a relationship between sufficiency, completeness and some mapping being injective / surjective. Isn't that in the Casella book? I'll try to recall all I've forgotten on this topic. Maybe we should rewrite the page when we are done here.
Jun 27 at 2:59 history edited User1865345 CC BY-SA 4.0
edited body
Jun 27 at 2:41 history edited User1865345 CC BY-SA 4.0
added 2 characters in body
Jun 27 at 2:36 history edited User1865345 CC BY-SA 4.0
added 2 characters in body
Jun 27 at 2:30 history edited User1865345 CC BY-SA 4.0
added 65 characters in body; edited title
Jun 26 at 22:13 comment added Shreyans @whuber I use this definition of sufficient statistic as given in Casella: A statistic $T(X)$ is a sufficient statistic for $\theta$ if the conditional distribution of the sample $X$ given the value of $T(X)$ does not depend on $\theta$, i.e. $P(X=x|T(X)=T(x);\theta=\theta_1)=P(X=x|T(X)=T(x);\theta=\theta_2)$. But we can see in my example that $P(X=x_1|T(X)=t_1;\theta=\theta_1)=0.1/(0.1+0.2)=1/3$, which is not equal to $P(X=x_1|T(X)=t_1;\theta=\theta_2)=0.2/(0.2+0.2)=1/2$.
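The arithmetic in the comment above can be checked mechanically. The following is a minimal Python sketch and not the table from the question: the probabilities for `x1` and for one other sample assumed to share `t1` (called `x2` here) are chosen to reproduce the 0.1/(0.1+0.2) and 0.2/(0.2+0.2) ratios quoted in the comment. All remaining entries, the sample names, the map `T`, and the function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical pmf P(X = x; theta). Only the x1/x2 entries are tied to the
# numbers quoted in the comment; x3 and x4 are made up to sum to 1.
pmf = {
    "theta1": {"x1": F(1, 10), "x2": F(2, 10), "x3": F(3, 10), "x4": F(4, 10)},
    "theta2": {"x1": F(2, 10), "x2": F(2, 10), "x3": F(3, 10), "x4": F(3, 10)},
}

# Hypothetical statistic: x1 and x2 share t1, x3 and x4 share t2.
T = {"x1": "t1", "x2": "t1", "x3": "t2", "x4": "t2"}


def conditional(pmf_theta, stat):
    """Return P(X = x | T(X) = T(x)) for each x whose T-value has mass > 0."""
    totals = {}
    for x, p in pmf_theta.items():
        totals[stat[x]] = totals.get(stat[x], 0) + p
    return {x: p / totals[stat[x]]
            for x, p in pmf_theta.items() if totals[stat[x]] > 0}


def is_sufficient(pmf_table, stat):
    """Casella-style check: the conditional law of X given T must agree
    across every pair of theta values wherever both are defined."""
    conds = [conditional(p, stat) for p in pmf_table.values()]
    return all(a[x] == b[x]
               for a, b in combinations(conds, 2)
               for x in a.keys() & b.keys())


print(conditional(pmf["theta1"], T)["x1"])   # 0.1 / (0.1 + 0.2)
print(conditional(pmf["theta2"], T)["x1"])   # 0.2 / (0.2 + 0.2)
print(is_sufficient(pmf, T))                 # conditionals differ across theta
print(is_sufficient(pmf, {x: x for x in T})) # the full sample as statistic
```

Exact fractions are used instead of floats so that the equality test between conditional distributions is not affected by rounding.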
Jun 26 at 21:29 history reopened User1865345
whuber
Jun 26 at 21:29 comment added whuber It would be interesting to see your demonstration that this statistic is not sufficient (because it is!).
Jun 26 at 19:06 comment added Shreyans @whuber Also I've clarified the confusion about the map in table 2 after your comment, as well as used $\TeX$ markup for the variables. Would you also like me to change the tables to $\TeX$?
Jun 26 at 19:04 comment added Shreyans @whuber I appreciate the time and effort you have spent trying to help me. You are right, the map is bijective, hence $f:p_{\theta }\mapsto p_{T|\theta }$ is injective. But the statistic is not sufficient (I can give details if it's not clear). Hence Wikipedia is incorrect, right? I'm pretty sure about the conclusion, so strictly speaking I don't need to ask this question anymore. It would still be nice to reopen it, given the time both of us have spent on it :)
Jun 23 at 16:44 comment added whuber The map in your example is one-to-one (a bijection), because each of the two possible values of $\theta$ determines a distinct distribution of $T$ as shown in Table 3.
Jun 23 at 15:41 history left closed in review whuber Original close reason(s) were not resolved
S Jun 23 at 9:40 review Reopen votes
Jun 23 at 15:41
S Jun 23 at 9:40 history edited Shreyans CC BY-SA 4.0
formatting Added to review
Jun 6 at 11:23 history left closed in review User1865345
whuber
Original close reason(s) were not resolved
Jun 6 at 11:23 comment added whuber Could you please explain what your table showing a "map of samples to statistic" means? Because a statistic is, by definition, a numerical function of the sample and no such function is in evidence (what are the "$t_i$"?), it takes considerable guessing to read this post. You could further clarify it by using $\TeX$ markup to render the mathematical symbols as intended.
Jun 6 at 10:23 comment added Shreyans The question has been closed recently. In response, I have edited the question in hopes of making it clearer. I would appreciate it if a mod pointed out what additional details can be added. Not to toot my own horn, but 3 upvotes with 77 views in 16 days looks to be a reasonably good performance for a question. My point is that it is not at all obvious which part is unclear.
S Jun 5 at 8:22 review Reopen votes
Jun 6 at 11:23
S Jun 5 at 8:22 history edited Shreyans CC BY-SA 4.0
tried to make the question concise and clear Added to review
Jun 4 at 2:21 history closed kjetil b halvorsen Needs details or clarity
May 21 at 21:15 history edited Shreyans CC BY-SA 4.0
found proof
May 20 at 22:18 history edited Shreyans CC BY-SA 4.0
edited tags
May 20 at 22:13 comment added Shreyans @whuber I cleaned up the tables, and changed some names to lower-case
May 20 at 22:12 history edited Shreyans CC BY-SA 4.0
edited tags
May 20 at 22:07 history edited Shreyans CC BY-SA 4.0
edited tags
May 20 at 21:56 history edited Shreyans
edited tags
May 20 at 21:31 comment added whuber I cannot figure out what your table means: it doesn't appear to specify any kind of distribution. Could you explain?
May 20 at 21:11 history edited Shreyans CC BY-SA 4.0
added 1 character in body
May 20 at 21:08 comment added jbowman It's because you're not using MathJax to construct it, which you really should do for all your math-related stuff.
May 20 at 21:08 history edited Shreyans CC BY-SA 4.0
added 8 characters in body
May 20 at 21:06 comment added Shreyans I'm not sure why the table looks proper when I review it or try to edit it, but looks incorrect when I finally post the question.
May 20 at 21:03 history asked Shreyans CC BY-SA 4.0