$\begingroup$

I have two different groups of participants ("g" and "b") answering the same set of questions. Group "g" answered questions in the same order. Group "b" answered questions in a randomized order. The answers are binary Y/N (0/1).

This is my dataset:

             |   group   | question1 | question2 | question3 | ... | questionN | 
             |-----------|-----------|-----------|-----------| ... |-----------|
participant1 |     g     |     1     |     0     |     1     | ... |     1     | 
participant2 |     g     |     0     |     1     |     0     | ... |     0     | 
participant3 |     g     |     0     |     1     |     0     | ... |     0     | 
...          |    ...    |    ...    |    ...    |    ...    | ... |    ...    | 
participant8 |     b     |     1     |     0     |     1     | ... |     1     | 
participant9 |     b     |     0     |     0     |     1     | ... |     0     | 
participantN |     b     |     0     |     1     |     0     | ... |     0     | 

What is the best test to perform to understand if there is a statistical difference in responses between the groups ("g" and "b")? In other words, what test should be used to understand if the order of questions has an impact on the responses?

MANOVA seems not to be suitable due to the lack of multivariate normality (binary data Y/N 0/1 inherently follows a Bernoulli distribution). How about the Kruskal-Wallis test or Fisher's Exact test?

I'm looking for the best non-parametric test for multivariate binary data.

Any suggestions would be highly appreciated!

$\endgroup$
  • $\begingroup$ How many questions are there in total? $\endgroup$
    – CaroZ
    Commented Oct 18, 2023 at 19:08
  • $\begingroup$ There are 8 questions in total. $\endgroup$
    – Roland D
    Commented Oct 18, 2023 at 19:20
  • $\begingroup$ From your question it is not completely clear: you want to know whether the groups differ in what, exactly? $\endgroup$
    – CaroZ
    Commented Oct 18, 2023 at 20:11
  • $\begingroup$ Great question, thanks! I'll edit my original post. I didn't mention that the questions for group b were randomized. I want to show that the order of the questions does not influence the responses. $\endgroup$
    – Roland D
    Commented Oct 18, 2023 at 20:28
  • $\begingroup$ I wonder if it would be legitimate to run an interaction regression, modeling the binary outcome as a function of the group, the question, and the group-question interaction. I have played with this idea in other contexts and have gotten good simulation results, though my simulations might not generalize beyond the particular cases I’ve used. $\endgroup$
    – Dave
    Commented Oct 18, 2023 at 20:50
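
The interaction regression suggested in the last comment could be sketched roughly as follows. This is only an illustration under stated assumptions: the data are simulated (40 participants per group, 8 questions, answers drawn at random), the names `long`, `fit_null` and `fit_full` are hypothetical, and a plain GLM treats the answers of one participant as independent, which may not hold for real data. A mixed model or GEE would be one way to account for that, but is not shown here.

```r
# Simulated stand-in for the real data (assumption: 40 participants per group).
set.seed(2)
n_per_group <- 40
n_questions <- 8
n_participants <- 2 * n_per_group

# Long format: one row per participant-question pair.
long <- expand.grid(
  question = factor(paste0("question", seq_len(n_questions))),
  participant = seq_len(n_participants)
)
long$group <- factor(ifelse(long$participant <= n_per_group, "g", "b"))
long$answer <- rbinom(nrow(long), size = 1, prob = 0.5)

# Null model: the answer depends on the question only.
fit_null <- glm(answer ~ question, family = binomial, data = long)

# Full model: group main effect plus group-by-question interaction.
fit_full <- glm(answer ~ group * question, family = binomial, data = long)

# Likelihood-ratio test of all group-related terms at once.
lrt <- anova(fit_null, fit_full, test = "Chisq")
print(lrt)
```

A small p-value in the second row of `lrt` would suggest that the group (and hence the question order) is associated with the answers to at least one question. Note that the independence caveat above can make this p-value too optimistic.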

1 Answer

$\begingroup$

One option is to fit one binomial GLM (logistic regression) per question. In R, it would look something like `glm(question_n ~ group, family = binomial)`, where `question_n` stands for the column holding the answers to one question. Each model tells you whether the answer to that particular question (the response variable) depends on the group the participant belongs to, i.e. whether the answers differ when the order of the questions is randomized. However, this approach is not multivariate: it examines each question separately rather than all of them jointly.
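
A minimal sketch of the per-question approach, assuming the data sit in a data frame with a `group` column and columns `question1` to `question8`. The data here are simulated (40 participants per group is an assumption) and the names `dat`, `p_values` and `p_adjusted` are hypothetical. The Holm adjustment at the end is an addition of this sketch, since fitting eight separate models means running eight tests.

```r
# Simulated stand-in for the real data: 8 binary questions, two groups.
set.seed(1)
n_per_group <- 40
n_questions <- 8
dat <- data.frame(group = factor(rep(c("g", "b"), each = n_per_group)))
for (j in seq_len(n_questions)) {
  dat[[paste0("question", j)]] <- rbinom(2 * n_per_group, size = 1, prob = 0.5)
}

# Fit one logistic regression per question and keep the Wald p-value
# of the group coefficient (row 2 of the coefficient table).
question_cols <- paste0("question", seq_len(n_questions))
p_values <- sapply(question_cols, function(q) {
  fit <- glm(reformulate("group", response = q), family = binomial, data = dat)
  coef(summary(fit))[2, "Pr(>|z|)"]
})

# Adjust for running several tests (Holm's method).
p_adjusted <- p.adjust(p_values, method = "holm")
print(data.frame(p_value = p_values, p_holm = p_adjusted))
```

Each row of the printed table corresponds to one question. With real data, `dat` would be replaced by the actual data frame.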

$\endgroup$
