I was reading Little's (1988) paper entitled A Test of Missing Completely at Random for Multivariate Data With Missing Values and Li's (2013) paper entitled Little's test of missing completely at random. Both are excellent papers, but I am simply confused about missing completely at random (MCAR) and missing at random (MAR).
Let $\mathbf{R}$ be the matrix formed by stacking the indicator vectors $\mathbf{r}_i$, where $\mathbf{r}_i$ records which components of $\mathbf{y}_i$ are observed: $r_{ij} = 1$ means element $y_{ij}$ is observed, and $r_{ij} = 0$ means it is missing. Also, let $\mathbf{Y}_{obs}$ and $\mathbf{Y}_{miss}$ denote the observed and missing entries of a data matrix $\mathbf{Y}$. Lastly, let $\mathbf{X}$ be a matrix of covariate values.
Li's (2013) paper defines missing at random to be $$\Pr[\mathbf{R}|\mathbf{Y}_{miss},\mathbf{Y}_{obs}, \mathbf{X}] = \Pr[\mathbf{R}|\mathbf{Y}_{obs}, \mathbf{X}],$$ which means that $\mathbf{R}$ is independent of $\mathbf{Y}_{miss}$ but can still depend on $\mathbf{Y}_{obs}$. In other words, the distribution of the missingness indicators depends only on the observed data. Li also defines missing completely at random to be $$\Pr[\mathbf{R}|\mathbf{Y}_{miss},\mathbf{Y}_{obs}, \mathbf{X}] = \Pr[\mathbf{R}],$$ which means $\mathbf{R}$ is independent of $\mathbf{Y}_{miss}, \mathbf{Y}_{obs},$ and $\mathbf{X}$. In other words, the distribution of the missingness indicators doesn't depend on anything and is completely random.
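To make sure I understand the two definitions, here is a small simulation I wrote (not from either paper; the variable names and the logistic missingness mechanism are my own choices). Under MCAR, each $y_{2}$ is deleted with a fixed probability; under MAR, the probability that $y_2$ is missing depends only on the always-observed $y_1$, never on $y_2$ itself. The observed-case mean of $y_2$ stays roughly unbiased under MCAR but becomes biased under MAR, since $y_2$ is correlated with the $y_1$ that drives the missingness:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Complete bivariate data: y1 is always observed, y2 may go missing.
y1 = rng.normal(size=n)
y2 = 0.8 * y1 + rng.normal(size=n)

# MCAR: each y2 entry is missing with probability 0.3,
# independently of y1 and y2 (Pr[R] depends on nothing).
r2_mcar = rng.random(n) > 0.3  # True = observed

# MAR: Pr[y2 missing] is a logistic function of the observed y1 only,
# and never of the (possibly missing) y2 itself.
p_miss = 1 / (1 + np.exp(-y1))
r2_mar = rng.random(n) > p_miss  # True = observed

# Compare the full-sample mean of y2 with the observed-case means.
print(round(y2.mean(), 2))           # full sample, near 0
print(round(y2[r2_mcar].mean(), 2))  # MCAR: still near 0
print(round(y2[r2_mar].mean(), 2))   # MAR: biased downward
```

Large positive $y_1$ makes $y_2$ more likely to be missing, so the observed $y_2$ values over-represent negative $y_1$ and their mean is pulled below zero, while the MCAR mean matches the full sample up to noise.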
I just have a series of questions about this that I can't seem to find answers to.
- What's the difference between $\mathbf{R}$ and $\mathbf{Y}_{obs}, \mathbf{Y}_{miss}$?
- Is "missing data" the data that's literally missing (e.g., non-responses), or is it data that we simply don't have access to?
- The assumption of MCAR is clearly stronger than MAR. How is it testable but MAR isn't?