Let $(X_1,X_2,\cdots,X_n)$ be a random sample drawn from the $\mathcal U(0,\theta)$ distribution. It is a common exercise to prove that the maximum order statistic $X_{(n)}$ is sufficient for $\theta$ as an application of the Fisher–Neyman Factorization theorem. However, I am trying to prove this fact directly from the definition of a sufficient statistic.
Indeed, for some discrete distributions we can prove sufficiency of a given statistic from the definition without using the Factorization theorem. But for continuous distributions like this one, it does not seem that straightforward.
The joint density of the sample $\mathbf X=(X_1,\cdots,X_n)$ is given by
\begin{align} f_{\theta}(\mathbf x)&=\prod_{i=1}^n\frac{1}{\theta}\mathbf1_{0<x_i<\theta} \\&=\frac{1}{\theta^n}\mathbf1_{0<x_{(1)}}\,\mathbf1_{x_{(n)}<\theta} \end{align}
It is clear from the Factorization theorem that $T(\mathbf X)=X_{(n)}$ is sufficient for $\theta$.
But from the definition of sufficiency, I have to show that the conditional distribution of $\mathbf X\mid T$ does not depend on $\theta$. I don't think I can say the following:
\begin{align} f_{\mathbf X\mid T}(\mathbf x\mid t)f_T(t)&=f_{T\mid\mathbf X}(t\mid\mathbf x)f_{\theta}(\mathbf x) \\\implies f_{\mathbf X\mid T}(\mathbf x\mid t)&=\frac{f_{\theta}(\mathbf x)}{f_T(t)}f_{T\mid\mathbf X}(t\mid\mathbf x) \end{align}
We know that the density of $T$ is $$f_T(t)=\frac{n\,t^{n-1}}{\theta^n}\mathbf1_{0<t<\theta}$$
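(For completeness, this density follows by differentiating the CDF of the maximum: since the $X_i$ are iid,
\begin{align} F_T(t)&=P(X_{(n)}\le t)=\prod_{i=1}^nP(X_i\le t)=\left(\frac{t}{\theta}\right)^n,\qquad 0<t<\theta, \\ f_T(t)&=F_T'(t)=\frac{n\,t^{n-1}}{\theta^n}. \end{align})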
But I don't know what $f_{T\mid\mathbf X}(\cdot)$ is, because $T\mid\mathbf X$ is degenerate: given $\mathbf X=\mathbf x$, $T$ is a point mass at $x_{(n)}$, so it has no density.
If I knew the joint distribution of $(\mathbf X,T)$, then maybe I could write $$f_{\mathbf X\mid T}(\mathbf x\mid t)=\frac{f_{\mathbf X,T}(\mathbf x,t)}{f_T(t)}$$ but since $T$ is a function of $\mathbf X$, that joint distribution is singular and has no density either.
I also tried working with the conditional distribution function $$P\left[X_1\le x_1,\cdots,X_n\le x_n\mid T=t\right]=\lim_{\varepsilon\to0}\frac{P\left[X_1\le x_1,\cdots,X_n\le x_n, t-\varepsilon\le T\le t+\varepsilon\right]}{P(t-\varepsilon\le T\le t+\varepsilon)}$$
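Here is my sketch (not a verified proof, and ignoring the null event that some $x_j$ equals $t$) of how that limit could be evaluated in this example. Since $T\le s$ iff every $X_i\le s$, we have $\{X_i\le x_i\ \forall i,\ T\le s\}=\{X_i\le\min(x_i,s)\ \forall i\}$, so the numerator is
$$\prod_{i=1}^n\frac{\min(x_i,t+\varepsilon)}{\theta}-\prod_{i=1}^n\frac{\min(x_i,t-\varepsilon)}{\theta}\approx\frac{2\varepsilon}{\theta^n}\sum_{j=1}^n\mathbf1_{x_j>t}\prod_{i\ne j}\min(x_i,t),$$
using the derivative of $s\mapsto\prod_i\min(x_i,s)$ at $s=t$, while the denominator is $F_T(t+\varepsilon)-F_T(t-\varepsilon)\approx 2\varepsilon\,\dfrac{n\,t^{n-1}}{\theta^n}$. Taking the ratio, the $\theta$'s cancel:
$$P\left[X_1\le x_1,\cdots,X_n\le x_n\mid T=t\right]=\frac1n\sum_{j=1}^n\mathbf1_{x_j>t}\prod_{i\ne j}\frac{\min(x_i,t)}{t},$$
which is free of $\theta$: each index is equally likely to be the maximum, and the remaining coordinates behave like iid $\mathcal U(0,t)$ draws.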
I have gone through this related post but could not come up with an answer. Is it also true that $\mathbf X\mid T$ has a mixed distribution?
I also have this equivalent definition of sufficiency at hand, which says that if $T$ is sufficient for $\theta$, then for any other statistic $T'$, the conditional distribution of $T'\mid T$ also does not depend on $\theta$. Maybe for a suitable choice of $T'$ I can prove the required fact, but I prefer to do it from the first definition. Any hints will be great.
It looks like what I am looking for is basically a proof of the Factorization theorem for continuous distributions, which I did find in Hogg and Craig's Mathematical Statistics.
Here is an extract from Theory of Point Estimation by Lehmann-Casella (2nd edition) that gives a hint of a probabilistic argument for sufficiency of $T=X_{(n)}$:
Let $X_1,\cdots,X_n$ be independently distributed according to the uniform distribution $U(0,\theta)$. Let $T$ be the largest of the $n$ $X$'s, and consider the conditional distribution of the remaining $n-1$ $X$'s given $t$. Thinking of the $n$ variables as $n$ points on the real line, it is intuitively obvious and not difficult to see formally (Problem 6.2) that the remaining $n-1$ points (after the largest is fixed at $t$) behave like $n-1$ points selected at random from the interval $(0,t)$. Since this conditional distribution is independent of $\theta$, $T$ is sufficient. Given only $T=t$, it is obvious how to reconstruct the original sample: Select $n-1$ points at random on $(0,t)$.
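As a quick Monte Carlo sanity check (not a proof) of this claim, note that if, conditionally on $T=t$, the other $n-1$ points are iid $\mathcal U(0,t)$, then the ratios $X_i/T$ (maximum excluded) should be iid $\mathcal U(0,1)$, whatever $\theta$ is. The function name below is my own, not from any reference:

```python
# Check that, conditionally on the maximum T = X_(n), the remaining
# n-1 sample points behave like iid U(0, T) draws: the ratios X_i / T
# (max excluded) should look iid U(0,1) for any value of theta.
import random

def scaled_remainder(theta, n, trials, seed):
    """Return the ratios X_i / X_(n) (max excluded) over many samples."""
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        xs = [rng.uniform(0, theta) for _ in range(n)]
        t = max(xs)
        ratios.extend(x / t for x in xs if x != t)
    return ratios

def mean(v):
    return sum(v) / len(v)

n, trials = 5, 50_000
for seed, theta in enumerate((1.0, 7.0)):   # two very different thetas
    r = scaled_remainder(theta, n, trials, seed)
    m = mean(r)
    var = mean([(x - m) ** 2 for x in r])
    # U(0,1) has mean 1/2 and variance 1/12 ~ 0.0833
    print(f"theta={theta}: mean={m:.4f}, var={var:.4f}")
```

For both values of $\theta$ the empirical mean and variance of the ratios come out close to $1/2$ and $1/12$, consistent with the extract's description.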
Problem 6.2 says
Let $X_1,\cdots,X_n$ be iid according to a distribution $F$ and probability density $f$. Show that the conditional distribution, given $X_{(i)}=a$, of the $i-1$ values to the left of $a$ and the $n-i$ values to the right of $a$ is that of $i-1$ variables distributed independently according to the probability density $f(x)/F(a)$ and $n-i$ variables distributed independently with density $f(x)/[1-F(a)]$, respectively, with the two sets being (conditionally) independent of each other.
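A small simulation (again a sanity check, not the formal proof asked for) of the right-hand half of Problem 6.2, using $F=\mathrm{Exp}(1)$, where the claim reduces to the memoryless property: given $X_{(i)}=a$, the $n-i$ values to the right should have density $f(x)/[1-F(a)]=e^{-(x-a)}$ for $x>a$, so the excesses $x-a$ should be iid $\mathrm{Exp}(1)$ regardless of $a$. The helper name is my own:

```python
# Sanity check of Problem 6.2 for F = Exp(1): the excesses x - a of the
# values above the i-th order statistic a should be iid Exp(1),
# by the memoryless property of the exponential distribution.
import random

def excesses_above_order_stat(n, i, trials, seed):
    """Collect x - a for the n-i values above the i-th order statistic a."""
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        xs = sorted(rng.expovariate(1.0) for _ in range(n))
        a = xs[i - 1]                 # i-th order statistic (1-indexed)
        out.extend(x - a for x in xs[i:])
    return out

exc = excesses_above_order_stat(n=5, i=3, trials=100_000, seed=0)
m = sum(exc) / len(exc)
print(f"mean excess = {m:.4f}  (Exp(1) predicts 1.0)")
```

The empirical mean excess comes out near $1$, the mean of $\mathrm{Exp}(1)$, matching the claimed conditional density.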
So for a 'formal' proof of the sufficiency of $T$ without applying the Factorization theorem, do I have to prove the theorem itself for this particular problem, or is there another option, as highlighted in the extract above?