
I would like to fit a distribution $f(\cdot;\theta)$ to a sample $\{x_1,\dots,x_n\}$, obtaining a maximum likelihood estimate $\hat{\theta}$. I know that a random variable $X \sim f(\cdot;\theta)$ can be generated in two stages: first draw $Y$ from a distribution with p.d.f. $g(\cdot;\theta)$, then draw $X$ from a distribution with p.d.f. $h(\cdot;Y)$.
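For concreteness, here is a minimal sketch of this two-stage mechanism under one hypothetical choice of $g$ and $h$ (not my actual model): $g$ a Gamma$(\theta,1)$ density and $h(\cdot;y)$ a Poisson$(y)$ p.m.f., so that the marginal $f$ is negative binomial.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical concrete model, for illustration only:
#   Y ~ Gamma(shape=theta, rate=1)   (the p.d.f. g)
#   X | Y = y ~ Poisson(y)           (the p.m.f. h)
# The marginal distribution of X is then negative binomial.
theta_true = 3.0
n = 200

y = rng.gamma(shape=theta_true, scale=1.0, size=n)  # latent draws from g
x = rng.poisson(lam=y)                              # observed draws from h(.; y)
```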

Is the maximum likelihood estimate $\hat{\theta}$ obtained by maximizing $$ \prod_{i=1}^n f(x_i; \theta) $$ over $\theta$ equal to the maximum likelihood estimate $\hat{\theta}$ obtained by maximizing $$ \prod_{i=1}^n g(y_i; \theta) h(x_i; y_i) $$ over $(\theta, y_1, \dots, y_n)$?


1 Answer


If you don't observe $y_i$, you have this density for $X_i{:}$ $$ \int_{\large\mathscr Y} h(x_i;y) g(y;\theta) \, dy $$ (where $\mathscr Y$ is the space of possible $y$ values).

Therefore the likelihood function is $$ L(\theta) = \prod_{i=1}^n \int_{\large\mathscr Y} h(x_i;y) g(y;\theta) \, dy. $$
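Under the hypothetical Gamma-Poisson model sketched in the question, this integrated likelihood can be maximized numerically. A rough sketch (the integral via `quad`, a bounded scalar search over $\theta$; `marginal_loglik` and `theta_hat_marginal` are illustrative names, and `x` is the simulated sample from the question's sketch):

```python
from scipy import integrate, optimize, stats
import numpy as np

def marginal_loglik(theta):
    # sum_i log of  integral_0^inf h(x_i; y) g(y; theta) dy,
    # with g a Gamma(theta, 1) p.d.f. and h(.; y) a Poisson(y) p.m.f.
    total = 0.0
    for xi in x:
        val, _ = integrate.quad(
            lambda y: stats.poisson.pmf(xi, y) * stats.gamma.pdf(y, a=theta),
            0, np.inf)
        total += np.log(val)
    return total

res = optimize.minimize_scalar(lambda t: -marginal_loglik(t),
                               bounds=(1.0, 20.0), method="bounded")
theta_hat_marginal = res.x  # MLE of theta from the integrated likelihood
```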

If you did observe $y_i$ for $i=1,\ldots,n$, then the MLE for $\theta$ would not depend on $x_1,\ldots,x_n$ at all. In that case, you would have $$ L(\theta) = \prod_{i=1}^n g(y_i; \theta) h(x_i; y_i), $$ so $$ \ell(\theta) = \log L(\theta) = \sum_{i=1}^n \Big( \log g(y_i;\theta) + \log h(x_i; y_i) \Big) $$ and the $\log h$ term would vanish when you differentiate with respect to $\theta.$ Either way, maximizing $\prod_{i=1}^n g(y_i;\theta)\, h(x_i;y_i)$ jointly over $(\theta, y_1, \dots, y_n)$ is not the same as maximizing the integrated likelihood $L(\theta)$ above, so in general the two estimates do not coincide.
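To see the difference concretely, here is a sketch of the joint maximization over $(\theta, y_1, \dots, y_n)$, again under the hypothetical Gamma-Poisson model (all names illustrative; `x` is the simulated sample from earlier):

```python
from scipy import optimize, stats
import numpy as np

def joint_negloglik(params):
    # params = (theta, y_1, ..., y_n): the latent y_i are treated as
    # free parameters instead of being integrated out
    theta, y = params[0], params[1:]
    return -(np.sum(stats.gamma.logpdf(y, a=theta))
             + np.sum(stats.poisson.logpmf(x, mu=y)))

start = np.concatenate([[2.0], np.maximum(x, 0.5)])  # crude starting point
res = optimize.minimize(joint_negloglik, start, method="L-BFGS-B",
                        bounds=[(1e-6, None)] * (len(x) + 1))
theta_hat_joint = res.x[0]
```

Comparing `theta_hat_joint` with `theta_hat_marginal` from the previous sketch typically shows a discrepancy, which is the point above: the joint maximizer treats each latent $y_i$ as a free parameter rather than averaging over it.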

