
I was reading Little's (1988) paper entitled A Test of Missing Completely at Random for Multivariate Data With Missing Values and Li's (2013) paper entitled Little's test of missing completely at random. Both are excellent papers, but I am still confused about the difference between missing completely at random (MCAR) and missing at random (MAR).

Let $\mathbf{R}$ be the matrix of missingness indicators, whose $i$th row $\mathbf{r}_i$ records which components of the vector $\mathbf{y}_i$ are observed: $r_{ij} = 1$ means element $y_{ij}$ is observed, and $r_{ij} = 0$ means it is missing. Also, let $\mathbf{Y}_{obs}, \mathbf{Y}_{miss}$ represent the observed and missing values of a data matrix $\mathbf{Y}$. Lastly, let $\mathbf{X}$ represent a matrix of covariate values.

Li's (2013) paper defines missing at random to be $$\Pr[\mathbf{R}|\mathbf{Y}_{miss},\mathbf{Y}_{obs}, \mathbf{X}] = \Pr[\mathbf{R}|\mathbf{Y}_{obs}, \mathbf{X}],$$ which means that $\mathbf{R}$ is independent of $\mathbf{Y}_{miss}$ but can still depend on $\mathbf{Y}_{obs}$. In other words, the distribution of the missingness indicators depends only on the observed data. She also defines missing completely at random to be $$\Pr[\mathbf{R}|\mathbf{Y}_{miss},\mathbf{Y}_{obs}, \mathbf{X}] = \Pr[\mathbf{R}],$$ which means $\mathbf{R}$ is independent of $\mathbf{Y}_{miss}, \mathbf{Y}_{obs},$ and $\mathbf{X}$. In other words, the distribution of the missingness indicators doesn't depend on anything and is completely random.

I just have a series of questions about this that I can't seem to find answers to.

  • What's the difference between $\mathbf{R}$ and $\mathbf{Y}_{obs}, \mathbf{Y}_{miss}$?
  • Is missing data the data that's literally missing (non-responses) or is it data that we don't have access to?
  • The assumption of MCAR is clearly stronger than MAR. How is it testable but MAR isn't?

1 Answer


I am only learning about this from reading Li's 2013 paper, so take this with a grain of salt. But my understanding is:

$\mathbf Y$ is an $n \times p$ matrix of random variables. Then, possibly depending on $\mathbf Y$ in some way and possibly not, $\mathbf R$ is an $n\times p$ matrix of $\{0,1\}$-valued random variables. Finally, $\mathbf Y_o$ and $\mathbf Y_m$ are random variables entirely determined by $\mathbf Y$ and $\mathbf R$: $\mathbf Y_o$ is all the entries of $\mathbf Y$ for which the corresponding entry of $\mathbf R$ is $1$, and $\mathbf Y_m$ is all the entries of $\mathbf Y$ for which the corresponding entry of $\mathbf R$ is $0$.

As a result, $\mathbf Y_m$ consists of actual data that we can never know. I think it's helpful to think of our probability space as including all the data above, but to restrict our algorithms to working only with $\mathbf R$ and $\mathbf Y_o$.
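
To make the bookkeeping concrete, here is a minimal numpy sketch of that setup (my own illustration, not something from either paper; the sample size, dimensions, and the $0.8$ observation probability are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 1000, 3
Y = rng.uniform(-1, 1, size=(n, p))    # the complete data, which we never fully see
R = rng.binomial(1, 0.8, size=(n, p))  # missingness indicators; MCAR by construction here

# Y_obs and Y_miss are entirely determined by Y and R
Y_obs = np.where(R == 1, Y, np.nan)    # what the analyst actually gets to work with
Y_miss = np.where(R == 0, Y, np.nan)   # exists "in the world" but is never available

print(Y_obs[:5])
```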

As for why it's possible to test MCAR but not MAR: because MCAR is a stronger assumption, it's possible to encounter data that's unlikely under MCAR (and therefore lets us reject MCAR) but not very unlikely under MAR. Some concrete examples:

  1. Maybe all entries of $\mathbf Y$ are uniformly sampled from $[-1,1]$, but all negative values are missing. This violates both MCAR and MAR. It's not distinguishable in principle from the alternate hypothesis where all entries of $\mathbf Y$ are uniformly sampled from $[0,1]$ and half of them are missing at random.
  2. Suppose $\mathbf Y$ is $n \times 2$, and its entries are uniformly sampled from $[-1,1]$; for every pair $(\mathbf y_{i1}, \mathbf y_{i2})$, we observe $\mathbf y_{i2}$ iff $\mathbf y_{i1}>0$. This violates MCAR but not MAR. Little's test will almost certainly reject MCAR under this scenario, since the distribution of $\mathbf y_{i1}$ will be very different in the two possible missing-value patterns: "observed-observed" and "observed-missing". (A rough simulation of this scenario is sketched below.)
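
Here is a rough simulation of the second scenario. Instead of Little's full likelihood-ratio statistic, it uses a simple two-sample $t$-test comparing $\mathbf y_{i1}$ across the two missing-value patterns, which captures the same idea; the sample size and seed are arbitrary:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

n = 2000
y1 = rng.uniform(-1, 1, size=n)
y2 = rng.uniform(-1, 1, size=n)

# Scenario 2: y2 is observed iff y1 > 0, so missingness depends only on the observed y1
observed2 = y1 > 0

# Compare the distribution of the always-observed y1 across the two missingness patterns.
# Under MCAR these two groups should look alike; here they are wildly different.
t, pval = stats.ttest_ind(y1[observed2], y1[~observed2])
print(f"t = {t:.1f}, p = {pval:.2e}")  # p is essentially 0, so MCAR is rejected
```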

Note that "testing for MCAR" does not mean that we'll be able to detect all possible scenarios that violate MCAR. If it did, it would indeed be weird that a stronger assumption is testable when a weaker assumption is not. Instead, "testing for MCAR" means we can compute a $p$-value which:

  • Is unlikely to be very small (has at most a $0.05$ chance of being $<0.05$, and so on) if MCAR holds (see the simulation below);
  • Will be very small under many plausible scenarios where MCAR is violated.

The second bullet point is necessarily vague, and this is where we discuss the power of different tests in different situations.
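
As a sanity check on the first bullet point, here is a small Monte Carlo experiment using the same hypothetical group-comparison test as above (again, not Little's actual statistic): when the missingness really is completely at random, the $p$-value falls below $0.05$ only about 5% of the time.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def group_comparison_pvalue(n=500):
    """p-value from comparing y1 across the two missingness patterns of y2,
    when y2 is missing completely at random."""
    y1 = rng.uniform(-1, 1, size=n)
    observed2 = rng.binomial(1, 0.5, size=n).astype(bool)  # MCAR missingness for y2
    return stats.ttest_ind(y1[observed2], y1[~observed2]).pvalue

pvals = np.array([group_comparison_pvalue() for _ in range(2000)])
print((pvals < 0.05).mean())  # close to 0.05: the test rarely rejects when MCAR holds
```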

  • Amazing answer, thank you so much! – JerBear, Oct 24, 2022 at 18:29
