
In a comment reply to Frank Harrell's answer here, about how to interpret an interaction, he says this should be done with:

double difference contrasts and a series of single differences at different levels of interacting factor.

What exactly does this mean, and can someone please show an example of doing it?


2 Answers


There are several examples in the help file for the contrast function in the R rms package here.

An expansion of this answer is here.

  • Single difference: a slope (the change in $y$ when $x$ goes from $a$ to $a+1$)
  • Double difference (interaction, when effects are linear): the difference between two slopes (see the code sketch after this list)
  • Double difference (interaction between two binary predictors): the effect of increasing $x$ by 1 unit when the other variable $= 1$, minus the effect of increasing $x$ by 1 unit when the other variable $= 0$
  • Double difference for a single predictor, to assess linearity: the difference in $y$ when $x = 2$ vs. $x = 1$, minus the difference in $y$ when $x = 1$ vs. $x = 0$
  • Triple difference: assesses the degree of nonlinearity in an interaction, or the magnitude of a third-order interaction
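
As a concrete illustration, here is a minimal sketch of single and double differences using rms::contrast on simulated data. The variable names (age, sex), the data-generating model, and the chosen age values are made up for this example, not taken from the original answer:

    library(rms)

    # Simulated data (purely illustrative)
    set.seed(1)
    d <- data.frame(age = rnorm(200, 50, 10),
                    sex = factor(sample(c("female", "male"), 200, replace = TRUE)))
    d$y <- 0.2 * d$age + 2 * (d$sex == "male") +
           0.1 * d$age * (d$sex == "male") + rnorm(200)

    dd <- datadist(d); options(datadist = "dd")
    f <- ols(y ~ age * sex, data = d)

    # Single differences: effect of sex at two representative ages
    contrast(f, list(sex = "male",   age = c(40, 60)),
                list(sex = "female", age = c(40, 60)))

    # Double difference: (sex effect at age 60) - (sex effect at age 40)
    contrast(f, list(sex = "male",   age = 60),
                list(sex = "female", age = 60),
                list(sex = "male",   age = 40),
                list(sex = "female", age = 40))

The four-list form of contrast computes $(a - b) - (a_2 - b_2)$, which is the double difference; if the interaction is absent, this contrast is near zero.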

The quoted comment is about interpreting coefficients in a model with interaction terms. An interaction effect is one type of nonlinear effect. The most common interaction terms are two-way, between two predictors, but there can also be interactions among three or more predictors; these more complex multi-way interaction effects can be decomposed into several layers of two-way interactions. It is best to illustrate with plots of predicted values, but regression equations conceptually clarify the model specification.

With predicted $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2$, where both $x_1$ and $x_2$ are binary, we can show the effect of each variable using the predicted response in different scenarios.

  • When $x_1 = 0$ and $x_2 = 0$, $y = \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 0 + \beta_3 \cdot 0 \cdot 0 = \beta_0$
  • When $x_1 = 0$ and $x_2 = 1$, $y = \beta_0 + \beta_1 \cdot 0 + \beta_2 \cdot 1 + \beta_3 \cdot 0 \cdot 1 = \beta_0 + \beta_2$
  • When $x_1 = 1$ and $x_2 = 0$, $y = \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 0 + \beta_3 \cdot 1 \cdot 0 = \beta_0 + \beta_1$
  • When $x_1 = 1$ and $x_2 = 1$, $y = \beta_0 + \beta_1 \cdot 1 + \beta_2 \cdot 1 + \beta_3 \cdot 1 \cdot 1 = \beta_0 + \beta_1 + \beta_2 + \beta_3$

Rearranging the above equations gives $\beta_3 = [(\beta_0 + \beta_1 + \beta_2 + \beta_3) - (\beta_0 + \beta_1)] - [(\beta_0 + \beta_2) - \beta_0]$. This is what "double difference contrasts" refers to, a term not commonly used or well known elsewhere. It removes the effect of $x_2$ at the reference level of $x_1$, $y|_{x_1 = 0, x_2 = 1} - y|_{x_1 = 0, x_2 = 0}$, from the effect of $x_2$ at the comparison level of $x_1$, $y|_{x_1 = 1, x_2 = 1} - y|_{x_1 = 1, x_2 = 0}$.

This specification is exactly what the difference-in-differences approach does for causal inference in natural experiments, where $x_1$ denotes the temporal stage ($x_1 = 0$ before the experiment and $x_1 = 1$ after) and $x_2$ denotes group membership ($x_2 = 0$ for the control group and $x_2 = 1$ for the treatment group). The interaction term allows a special effect in the post-experiment treatment group, where $x_1 = 1$ and $x_2 = 1$, that the temporal change $\beta_1$ and the group difference $\beta_2$ alone cannot explain. In a bar or line plot of mean $y$ over $x_1$ grouped by $x_2$, nonparallel lines suggest an interaction effect between $x_1$ and $x_2$.

If $\beta_3$ is nonsignificant, the model reduces to one with only main effects and no interaction term: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + 0 \cdot x_1 x_2 = \beta_0 + \beta_1 x_1 + \beta_2 x_2$. This says that the control and treatment groups differ by $\beta_2$ before the experiment and both experience the same temporal increment $\beta_1$. In a plot, lines of the mean response over one predictor, grouped by the other, should be nearly parallel.
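A small base-R simulation (the data-generating values are made up for illustration) shows that in a saturated $2 \times 2$ model the fitted interaction coefficient equals exactly this double difference of cell means:

    # Simulated difference-in-differences data
    set.seed(42)
    n  <- 1000
    x1 <- rbinom(n, 1, 0.5)           # time: 0 = before, 1 = after
    x2 <- rbinom(n, 1, 0.5)           # group: 0 = control, 1 = treatment
    y  <- 1 + 0.5 * x1 + 0.3 * x2 + 0.8 * x1 * x2 + rnorm(n)

    fit <- lm(y ~ x1 * x2)
    coef(fit)["x1:x2"]                # estimate of beta_3

    # Double difference of cell means reproduces the same number
    m <- tapply(y, list(x1, x2), mean)
    (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])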

If $x_2$ is continuous, we can still use the regression equation to illustrate the effect of both $x_1$ and $x_2$ on $y$.

  • When $x_1 = 0$, $y = \beta_0 + \beta_1 \cdot 0 + \beta_2 x_2 + \beta_3 \cdot 0 \cdot x_2 = \beta_0 + \beta_2 x_2$.
  • When $x_1 = 1$, $y = \beta_0 + \beta_1 \cdot 1 + \beta_2 x_2 + \beta_3 \cdot 1 \cdot x_2 = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) x_2$.

Therefore, the groups $x_1 = 0$ and $x_1 = 1$ differ in both intercept and slope. At $x_2 = 0$, the two groups differ by $\beta_1$, equal to the vertical distance between the y-axis intercepts. At $x_2 = 1$, the two groups differ by $\beta_1 + \beta_3$; at $x_2 = 2$, by $\beta_1 + 2\beta_3$; at $x_2 = 10$, by $\beta_1 + 10\beta_3$; and so on. Hence a series of distinct group differences by $x_1$ at different points of $x_2$ shows the interaction effect. This is what "a series of single differences at different levels of interacting factor" suggests. In a line plot of mean $y$ over $x_2$ grouped by $x_1$, if the two lines are not parallel, there is an interaction. If there is no interaction, $\beta_3 = 0$: group differences by $x_1$ at all points of $x_2$ are constant at $\beta_1$, and the slope with respect to $x_2$ is the same in both $x_1$ groups.
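A short base-R sketch of this series of single differences, again on simulated data with made-up coefficient values and an arbitrary grid of $x_2$ points:

    # Continuous x2: group difference by x1 at several values of x2
    set.seed(7)
    n  <- 500
    x1 <- rbinom(n, 1, 0.5)
    x2 <- runif(n, 0, 10)
    y  <- 1 + 0.5 * x1 + 0.3 * x2 + 0.2 * x1 * x2 + rnorm(n)
    fit <- lm(y ~ x1 * x2)

    b <- coef(fit)
    x2_levels <- c(0, 1, 2, 10)
    # Difference between x1 = 1 and x1 = 0 at each x2: beta_1 + beta_3 * x2
    data.frame(x2 = x2_levels,
               group_diff = b["x1"] + b["x1:x2"] * x2_levels)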

In summary, we interpret interaction effects by writing out the regression equation and plugging in representative values. Effects in linear models are easy to comprehend even without such effort, as coefficients directly show marginal effects. In generalized linear models, however, this approach is especially useful because the link function is nonlinear, so coefficients do not directly measure the marginal effects of predictors on the mean response.
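For instance, here is a sketch of the plug-in approach for a logistic model (the data and the grid of representative values are assumptions for illustration): the coefficient on $x_1$ is a log odds ratio, while the effect on the probability scale varies with $x_2$, so we compare predicted probabilities at representative values.

    # Logistic model with an interaction: compare predicted probabilities
    set.seed(3)
    n  <- 2000
    x1 <- rbinom(n, 1, 0.5)
    x2 <- runif(n, 0, 10)
    p  <- plogis(-2 + 0.5 * x1 + 0.2 * x2 + 0.3 * x1 * x2)
    yb <- rbinom(n, 1, p)
    gfit <- glm(yb ~ x1 * x2, family = binomial)

    nd <- expand.grid(x1 = c(0, 1), x2 = c(1, 5, 9))
    nd$prob <- predict(gfit, newdata = nd, type = "response")
    nd  # risk differences between x1 groups vary with x2 on the probability scale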

