7
$\begingroup$

I am getting strongly conflicting results from two ways of analyzing some simple data. I measure two binary variables for each participant, AestheticOnly and ChoiceVA. I want to know whether AestheticOnly depends on ChoiceVA and whether this relationship differs between two experiments. Here are my participant counts:

Experiment 1
                 AestheticOnly
                 0   1  All
ChoiceVA A      35   6   41
         V      20  13   33
         All    55  19   74

Experiment 2
                 AestheticOnly
                 0   1  All
ChoiceVA A      12  10   22
         V      31  11   42
         All    43  21   64

I run a logistic regression where AestheticOnly is modelled by ChoiceVA, Experiment, and the interaction:

> mod <- glm( AestheticOnly ~ ChoiceVA*Experiment, data = d, family=binomial)
> summary(mod)

Call:
glm(formula = AestheticOnly ~ ChoiceVA * Experiment, family = binomial, 
    data = d)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-1.1010  -0.7793  -0.5625   1.2557   1.9605  

Coefficients:
                     Estimate Std. Error z value Pr(>|z|)    
(Intercept)           -3.3449     0.9820  -3.406 0.000659 ***
ChoiceVAV              3.5194     1.2630   2.787 0.005327 ** 
Experiment             1.5813     0.6153   2.570 0.010170 *  
ChoiceVAV:Experiment  -2.1866     0.7929  -2.758 0.005820 ** 
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 166.16  on 137  degrees of freedom
Residual deviance: 157.01  on 134  degrees of freedom
AIC: 165.01

Number of Fisher Scoring iterations: 4

Apparently all terms are significant, but this doesn't make sense to me. For example, I would expect the main effect of Experiment to be equivalent to a Fisher's exact test comparing 55 and 19 with 43 and 21 (the bottom rows of the two tables). That test is clearly not significant (p = .452). So why does the regression model give such a different result? Any help much appreciated.
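For reference, the Fisher comparison described above can be reproduced directly from the marginal counts. This is only a minimal sketch: the matrix name `margins` is made up for illustration, and the counts are typed in by hand from the "All" rows of the two tables.

```r
# Marginal AestheticOnly counts per experiment (the "All" rows of the tables).
margins <- matrix(c(55, 19,
                    43, 21),
                  nrow = 2, byrow = TRUE,
                  dimnames = list(Experiment = c("1", "2"),
                                  AestheticOnly = c("0", "1")))

# Fisher's exact test on the collapsed 2x2 table.
ft <- fisher.test(margins)
ft$p.value
```

This test ignores ChoiceVA entirely, whereas the regression above conditions on it.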

$\endgroup$
9
  • 5
    $\begingroup$ Where do you perceive a contradiction? The two tests don't test the same hypothesis. $\endgroup$
    – Roland
    Commented Oct 1, 2014 at 10:57
  • $\begingroup$ I believe including Experiment as a main effect in the regression model tests whether it has an effect on the response variable AestheticOnly. Likewise, a Fisher's exact test comparing the pattern of AestheticOnly responses between the experiments is asking the same question: does AestheticOnly depend on Experiment. That's my understanding, please correct me if I'm wrong. $\endgroup$
    – Amorphia
    Commented Oct 1, 2014 at 11:12
  • 2
    $\begingroup$ You didn't just include Experiment as a main effect in the regression model. You also included ChoiceVAV and the interaction. $\endgroup$
    – Roland
    Commented Oct 1, 2014 at 11:16
  • $\begingroup$ ... & therefore, the way you've coded the predictors, your "main effect" compares 35 & 6 with 12 & 10 (the top rows of each table, where ChoiceVA is at its reference level, A) $\endgroup$ Commented Oct 1, 2014 at 11:30
  • $\begingroup$ Ah, thank you. I tried including only the two main effects and then the p-values come out as I expect. Evidently I don't properly understand what it means to include a factor's main effect in a model also containing interactions with the factor. Is there a way to include the interaction in the model but also test what I think of as the main effect (i.e. the bottom line of the tables rather than the top)? $\endgroup$
    – Amorphia
    Commented Oct 1, 2014 at 11:47

1 Answer

2
$\begingroup$

I think the problem is that you are trying to answer two different questions with the same model. The Fisher test is a test of a crude (marginal) odds ratio, which collapses the table over the other factor. The logistic model does not test the crude OR. We do not want to conduct a crude or stratified analysis if the odds ratio is not homogeneous across the experimental strata. The output from your saturated logistic model can be used to check this.

Because you have coded Experiment as numeric, the output is difficult to interpret. The ChoiceVAV coefficient is the effect extrapolated to an Experiment value of 0, which does not occur in your data. We have to use post-estimation to recover the meaningful, stratum-specific results for each experiment from the saturated logistic model.
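One way to avoid the extrapolation to Experiment = 0 is to refit with Experiment as a factor. The following is a minimal sketch, not the original analysis: the data frame `d` is rebuilt from the published counts (the original `d` is not shown), and the names `counts`, `ExperimentF` and `mod_f` are made up for illustration.

```r
# Cell counts from the two tables in the question, one row per cell.
counts <- data.frame(
  ExperimentF   = factor(rep(c(1, 2), each = 4)),
  ChoiceVA      = factor(rep(rep(c("A", "V"), each = 2), times = 2)),
  AestheticOnly = rep(c(0, 1), times = 4),
  n             = c(35, 6, 20, 13, 12, 10, 31, 11)
)

# Expand to one row per participant (assumed layout of the original data).
d <- counts[rep(seq_len(nrow(counts)), counts$n),
            c("ExperimentF", "ChoiceVA", "AestheticOnly")]

# Same saturated model, but Experiment enters as a two-level factor.
mod_f <- glm(AestheticOnly ~ ChoiceVA * ExperimentF,
             data = d, family = binomial)
b <- coef(mod_f)

# ChoiceVAV is now the log OR (V vs A) within experiment 1;
# adding the interaction gives the log OR within experiment 2.
log_or_exp1 <- unname(b["ChoiceVAV"])
log_or_exp2 <- unname(b["ChoiceVAV"] + b["ChoiceVAV:ExperimentF2"])
c(exp1 = log_or_exp1, exp2 = log_or_exp2)
```

With this coding, every coefficient refers to a cell that actually exists in the data. The fit itself is identical to the numeric coding, because both models are saturated.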

To get the log OR for experiment one, add the ChoiceVAV and ChoiceVAV:Experiment coefficients: 3.52 - 2.19 = 1.33 = log(35*13 /(20*6)), the last expression being the stratum-specific log OR for the first experiment.

The second experiment has stratum-specific log OR 3.52 - 2*2.19 = -0.86 ≈ log(12*11 /(10*31)). The two log ORs are of opposite sign, so homogeneity of the OR is violated. The crude OR is not exactly, but is close to, a weighted average of these stratum-specific ORs. You can plainly see that the log ORs average out to around 0, and the crude OR bears this out.
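The arithmetic for both strata can be checked against the raw counts. This is a minimal sketch using the coefficients as printed in the question's summary output; the variable names are made up for illustration.

```r
# Coefficients copied from the summary output (Experiment coded as numeric 1/2).
b_choice <- 3.5194    # ChoiceVAV
b_inter  <- -2.1866   # ChoiceVAV:Experiment

# Log OR of AestheticOnly for V vs A, evaluated at Experiment = 1 and 2.
model_log_or <- c(exp1 = b_choice + 1 * b_inter,
                  exp2 = b_choice + 2 * b_inter)

# The same quantities computed directly from the cell counts.
table_log_or <- c(exp1 = log((13 / 20) / (6 / 35)),
                  exp2 = log((11 / 31) / (10 / 12)))

round(rbind(model_log_or, table_log_or), 3)
```

The two rows should agree up to the rounding of the printed coefficients, with a positive log OR in experiment one and a negative one in experiment two.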

In summary, the two experiments produce strongly conflicting findings and should not be combined into a single crude analysis. This is the danger of focusing too much on statistical significance: a non-significant Fisher's exact test (FET) is exactly what you would expect when the stratum-specific effects point in opposite directions.

$\endgroup$
2
  • $\begingroup$ Thanks! It's a common problem for me that I find the assumptions necessary for logistic models to be valid harder to understand than the assumptions for straight-up linear models, where you can check almost everything by eye-balling residual plots. To be honest I hardly understand logistic regression assumptions at all. Do you know of a good on-line resource explaining them? $\endgroup$
    – Amorphia
    Commented Jan 31, 2018 at 22:06
  • $\begingroup$ @Amorphia A text which I find nearly overkill, but adequate to bring almost anyone up to expert level with logistic modeling is Kleinbaum and Klein "Logistic Regression", not freely available however. The ATS UCLA site has some detailed tutorials. $\endgroup$
    – AdamO
    Commented Jan 31, 2018 at 22:15
