Let $(X_1,X_2,\cdots,X_n)$ be a random sample drawn from the $\mathcal U(0,\theta)$ distribution. It is a common exercise to prove that the maximum order statistic $X_{(n)}$ is sufficient for $\theta$ as an application of the Fisher–Neyman Factorization theorem. However, I am trying to prove this fact directly from the definition of a sufficient statistic.
Indeed, for some discrete distributions we can prove sufficiency of a given statistic from the definition without using the Factorization theorem. But for continuous distributions like this one, it does not seem that straightforward.
The joint density of the sample $\mathbf X=(X_1,\cdots,X_n)$ is given by
\begin{align} f_{\theta}(\mathbf x)&=\prod_{i=1}^n\frac{1}{\theta}\mathbf1_{0<x_i<\theta} \\&=\frac{1}{\theta^n}\mathbf1_{0<x_{(1)}}\,\mathbf1_{x_{(n)}<\theta} \end{align}
It is clear from the Factorization theorem that $T(\mathbf X)=X_{(n)}$ is sufficient for $\theta$.
But from the definition of sufficiency, I have to show that the conditional distribution of $\mathbf X\mid T$ does not depend on $\theta$. I don't think I can say the following:
\begin{align} f_{\mathbf X\mid T}(\mathbf x\mid t)f_T(t)&=f_{T\mid\mathbf X}(t\mid\mathbf x)f_{\theta}(\mathbf x) \\\implies f_{\mathbf X\mid T}(\mathbf x\mid t)&=\frac{f_{\theta}(\mathbf x)}{f_T(t)}f_{T\mid\mathbf X}(t\mid\mathbf x) \end{align}
We know that the density of $T$ is $$f_T(t)=\frac{n\,t^{n-1}}{\theta^n}\mathbf1_{0<t<\theta}$$
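(For completeness, this density follows by differentiating the CDF of the maximum: since the $X_i$ are iid,
\begin{align} F_T(t)&=P(X_{(n)}\le t)=\prod_{i=1}^nP(X_i\le t)=\left(\frac{t}{\theta}\right)^n,\qquad 0<t<\theta, \\ f_T(t)&=F_T'(t)=\frac{n\,t^{n-1}}{\theta^n}. \end{align})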
But I don't know what $f_{T\mid\mathbf X}(\cdot)$ is, because $T\mid\mathbf X$ is degenerate: given $\mathbf X=\mathbf x$, $T$ is a point mass at $x_{(n)}$, so it has no density.
If I knew the joint distribution of $(\mathbf X,T)$, then maybe I could write $$f_{\mathbf X\mid T}(\mathbf x\mid t)=\frac{f_{\mathbf X,T}(\mathbf x,t)}{f_T(t)}$$ but since $T$ is a function of $\mathbf X$, that joint distribution is singular and has no density either.
I also tried working with the conditional distribution function $$P\left[X_1\le x_1,\cdots,X_n\le x_n\mid T=t\right]=\lim_{\varepsilon\to0}\frac{P\left[X_1\le x_1,\cdots,X_n\le x_n, t-\varepsilon\le T\le t+\varepsilon\right]}{P(t-\varepsilon\le T\le t+\varepsilon)}$$
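Here is my sketch (not a verified proof, and ignoring the null event that some $x_j$ equals $t$) of how that limit could be evaluated in this example. Since $T\le s$ iff every $X_i\le s$, we have $\{X_i\le x_i\ \forall i,\ T\le s\}=\{X_i\le\min(x_i,s)\ \forall i\}$, so the numerator is
$$\prod_{i=1}^n\frac{\min(x_i,t+\varepsilon)}{\theta}-\prod_{i=1}^n\frac{\min(x_i,t-\varepsilon)}{\theta}\approx\frac{2\varepsilon}{\theta^n}\sum_{j=1}^n\mathbf1_{x_j>t}\prod_{i\ne j}\min(x_i,t),$$
using the derivative of $s\mapsto\prod_i\min(x_i,s)$ at $s=t$, while the denominator is $F_T(t+\varepsilon)-F_T(t-\varepsilon)\approx 2\varepsilon\,\dfrac{n\,t^{n-1}}{\theta^n}$. Taking the ratio, the $\theta$'s cancel:
$$P\left[X_1\le x_1,\cdots,X_n\le x_n\mid T=t\right]=\frac1n\sum_{j=1}^n\mathbf1_{x_j>t}\prod_{i\ne j}\frac{\min(x_i,t)}{t},$$
which is free of $\theta$: each index is equally likely to be the maximum, and the remaining coordinates behave like iid $\mathcal U(0,t)$ draws.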
I have gone through this related post but could not come up with an answer. Is it also true that $\mathbf X\mid T$ has a mixed distribution?
I also have this equivalent definition of sufficiency at hand, which says that if $T$ is sufficient for $\theta$, then for any other statistic $T'$, the conditional distribution of $T'\mid T$ also does not depend on $\theta$. Maybe for a suitable choice of $T'$ I can prove the required fact, but I prefer to do it from the first definition. Any hints will be great.
It looks like what I am looking for is basically a proof of the Factorization theorem for continuous distributions, which I did find in Hogg and Craig's Mathematical Statistics.
Here is an extract from Theory of Point Estimation by Lehmann-Casella (2nd edition) that gives a hint of a probabilistic argument for sufficiency of $T=X_{(n)}$:
Let $X_1,\cdots,X_n$ be independently distributed according to the uniform distribution $U(0,\theta)$. Let $T$ be the largest of the $n$ $X$'s, and consider the conditional distribution of the remaining $n-1$ $X$'s given $t$. Thinking of the $n$ variables as $n$ points on the real line, it is intuitively obvious and not difficult to see formally (Problem 6.2) that the remaining $n-1$ points (after the largest is fixed at $t$) behave like $n-1$ points selected at random from the interval $(0,t)$. Since this conditional distribution is independent of $\theta$, $T$ is sufficient. Given only $T=t$, it is obvious how to reconstruct the original sample: Select $n-1$ points at random on $(0,t)$.
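As a quick Monte Carlo sanity check (not a proof) of this claim, note that if, conditionally on $T=t$, the other $n-1$ points are iid $\mathcal U(0,t)$, then the ratios $X_i/T$ (maximum excluded) should be iid $\mathcal U(0,1)$, whatever $\theta$ is. The function name below is my own, not from any reference:

```python
# Check that, conditionally on the maximum T = X_(n), the remaining
# n-1 sample points behave like iid U(0, T) draws: the ratios X_i / T
# (max excluded) should look iid U(0,1) for any value of theta.
import random

def scaled_remainder(theta, n, trials, seed):
    """Return the ratios X_i / X_(n) (max excluded) over many samples."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        xs = [rng.uniform(0, theta) for _ in range(n)]
        t = max(xs)
        ratios.extend(x / t for x in xs if x != t)
    return ratios

def mean(v):
    return sum(v) / len(v)

n, trials = 5, 50_000
for seed, theta in enumerate((1.0, 7.0)):   # two very different thetas
    r = scaled_remainder(theta, n, trials, seed)
    m = mean(r)
    var = mean([(x - m) ** 2 for x in r])
    # U(0,1) has mean 1/2 and variance 1/12 ~ 0.0833
    print(f"theta={theta}: mean={m:.4f}, var={var:.4f}")
```

For both values of $\theta$ the empirical mean and variance of the ratios come out close to $1/2$ and $1/12$, consistent with the extract's description.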
Problem 6.2 says
Let $X_1,\cdots,X_n$ be iid according to a distribution $F$ and probability density $f$. Show that the conditional distribution, given $X_{(i)}=a$, of the $i-1$ values to the left of $a$ and the $n-i$ values to the right of $a$ is that of $i-1$ variables distributed independently according to the probability density $f(x)/F(a)$ and $n-i$ variables distributed independently with density $f(x)/[1-F(a)]$, respectively, with the two sets being (conditionally) independent of each other.
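A small simulation (again a sanity check, not the formal proof asked for) of the right-hand half of Problem 6.2, using $F=\mathrm{Exp}(1)$, where the claim reduces to the memoryless property: given $X_{(i)}=a$, the $n-i$ values to the right should have density $f(x)/[1-F(a)]=e^{-(x-a)}$ for $x>a$, so the excesses $x-a$ should be iid $\mathrm{Exp}(1)$ regardless of $a$. The helper name is my own:

```python
# Sanity check of Problem 6.2 for F = Exp(1): the excesses x - a of the
# values above the i-th order statistic a should be iid Exp(1),
# by the memoryless property of the exponential distribution.
import random

def excesses_above_order_stat(n, i, trials, seed):
    """Collect x - a for the n-i values above the i-th order statistic a."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        xs = sorted(rng.expovariate(1.0) for _ in range(n))
        a = xs[i - 1]                 # i-th order statistic (1-indexed)
        out.extend(x - a for x in xs[i:])
    return out

exc = excesses_above_order_stat(n=5, i=3, trials=100_000, seed=0)
m = sum(exc) / len(exc)
print(f"mean excess = {m:.4f}  (Exp(1) predicts 1.0)")
```

The empirical mean excess comes out near $1$, the mean of $\mathrm{Exp}(1)$, matching the claimed conditional density.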
So for a 'formal' proof of the sufficiency of $T$ without applying the Factorization theorem, do I have to prove the theorem itself for this particular problem, or is there another option, as highlighted in the extract above?