
Let $X_1, X_2, \dots , X_n$ be independent, normally distributed observations with known variance $\sigma^2$ and means given by $\mu_i = \mu + \epsilon_i$, where $\epsilon_i$ is white noise, i.e., $E[\epsilon_i] = 0$, $E[\epsilon_i^2] = \sigma_{\epsilon}^2$, and the $\epsilon_i$ are identically and symmetrically distributed and independent of one another.

We can see that the maximum likelihood estimator for $\mu$ remains consistent. Write the log-likelihood function

$$ \sum_{i=1}^n \left[ \log \frac{1}{\sqrt{2 \pi \sigma^2}} - \frac{(X_i - \mu - \epsilon_i)^2}{2 \sigma^2} \right]$$

Taking its derivative with respect to $\mu$ and setting the result equal to zero, one obtains

$$ \hat{\mu} = \frac{1}{n} \sum_{i=1}^n X_i - \frac{1}{n} \sum_{i=1}^n \epsilon_i.$$

By the strong law of large numbers, $\frac{1}{n}\sum_{i=1}^n X_i \rightarrow \mu$ and $\frac{1}{n}\sum_{i=1}^n \epsilon_i \rightarrow 0$ almost surely. So from the linearity of limits we obtain $\hat{\mu} \rightarrow \mu$ almost surely, i.e., consistency.
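
(To double-check this numerically, here is a minimal simulation sketch; the values of $\mu$, $\sigma$, $\sigma_\epsilon$ and the sample sizes are arbitrary illustrative choices. Both the sample mean and the estimator above settle near $\mu$ as $n$ grows.)

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, sigma_eps = 2.0, 1.0, 0.5

for n in (10, 1_000, 100_000):
    eps = rng.normal(0.0, sigma_eps, size=n)  # white noise on the means
    x = rng.normal(mu + eps, sigma)           # X_i ~ N(mu + eps_i, sigma^2)
    mu_hat = np.mean(x - eps)                 # estimator from the derivation above
    print(n, x.mean(), mu_hat)                # both approach mu as n grows
```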

Is this also true for the variance? That is, if $X_1, X_2, \dots , X_n$ were independent, normally distributed observations with known mean $\mu$ and variances given by $\sigma_i^2 = \sigma^2 + \epsilon_i$, would the maximum likelihood estimator for $\sigma^2$ still be consistent?

  • Based on your description of $\epsilon_i$, it is possible that $-\epsilon_i > \sigma^2$. Then what happens?
    – user158565
    Commented May 1, 2019 at 19:54
  • @user158565 Right, one would need to adjust the distribution of the noise to accommodate that. You are correct, it is badly specified as written.
    – Monolite
    Commented May 1, 2019 at 20:03

1 Answer


No. If we look at the model-implied covariance matrix of $X_1,\ldots,X_n$ for the example case of $n=3$,

$$\begin{bmatrix} \sigma^2+\sigma_\epsilon^2 & 0 & 0 \\\\ 0 & \sigma^2+\sigma_\epsilon^2 & 0 \\\\ 0 & 0 &\sigma^2+\sigma_\epsilon^2 \end{bmatrix}$$

we see that your model is not identified. Model identification is defined here: https://en.wikipedia.org/wiki/Maximum_likelihood_estimation#Consistency. Essentially, if two different parameter values imply the same distribution, your model is not identified. In your case, infinitely many parameter combinations imply the same distribution: all parameters for which $\sigma^2+\sigma_\epsilon^2=c$ for some constant $c$ describe the same distribution, namely the one where all variances equal $c$. So, as an example, $\sigma^2=1,\ \sigma_\epsilon^2=1$ describes the same distribution as $\sigma^2=0,\ \sigma_\epsilon^2=2$.
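
To make this concrete, here is a small numerical sketch (the data, seed, and use of scipy are illustrative choices of mine): the two parameter pairs above yield exactly the same log-likelihood on any sample, because both imply $X_i \sim N(\mu, 2)$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
mu = 0.0
x = rng.normal(mu, np.sqrt(2.0), size=50)  # data with total variance c = 2

def loglik(sigma2, sigma_eps2):
    # The likelihood depends on the parameters only through c = sigma2 + sigma_eps2.
    c = sigma2 + sigma_eps2
    return norm.logpdf(x, loc=mu, scale=np.sqrt(c)).sum()

print(loglik(1.0, 1.0))  # sigma^2 = 1, sigma_eps^2 = 1
print(loglik(0.0, 2.0))  # sigma^2 = 0, sigma_eps^2 = 2 -> identical value
```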

Now consider how this influences maximum likelihood estimation. The likelihood of any parameter combination depends only on the $c$ it implies. Suppose $\hat{c}$ is the maximum likelihood estimate of $c$. Every parameter pair satisfying $\sigma^2+\sigma_\epsilon^2=\hat{c}$ attains this maximum, and there are infinitely many of them. Thus, a unique maximum likelihood point estimate does not exist and therefore cannot converge to the true value: the estimator is not consistent.
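
And a sketch of the resulting flat likelihood ridge (again with illustrative values): the profile likelihood in $c$ is maximized at $\hat{c} = \frac{1}{n}\sum_{i=1}^n (X_i - \mu)^2$, but every split of $\hat{c}$ between $\sigma^2$ and $\sigma_\epsilon^2$ attains that same maximum value.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
mu, n = 0.0, 200
x = rng.normal(mu, np.sqrt(2.0), size=n)

c_hat = np.mean((x - mu) ** 2)                 # MLE of c = sigma^2 + sigma_eps^2
for s2 in (0.0, c_hat / 4, c_hat / 2, c_hat):  # different splits of c_hat
    s_eps2 = c_hat - s2
    ll = norm.logpdf(x, loc=mu, scale=np.sqrt(s2 + s_eps2)).sum()
    print(f"sigma^2={s2:.3f}, sigma_eps^2={s_eps2:.3f}, loglik={ll:.6f}")
```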

  • Thank you for this answer! Could you expand a bit on the identifiability of the model and its relation to maximum likelihood? My background is not purely statistical.
    – Monolite
    Commented May 1, 2019 at 19:17
  • I extended my answer. Commented May 2, 2019 at 15:33
