$\begingroup$

Can the Mann-Whitney test be used to analyze two different Likert scale questions answered by the same users before and after an experiment? For example, question 1 is asked before the experiment: "Do you like reading?" (scale from 1 to 5). Question 2 is asked after the experiment: "Now, after this experiment, do you like reading?" (scale from 1 to 5). I want to show that users did not like reading before the experiment but started liking it afterwards, and I want to test whether that change is statistically significant.

$\endgroup$
  • $\begingroup$ Did you record the data in such a way that you can match the before value to the after value for each user? $\endgroup$ Commented Aug 14, 2022 at 0:40
  • $\begingroup$ @SalMangiafico Yes I did $\endgroup$
    – kaki no
    Commented Aug 16, 2022 at 5:15
  • $\begingroup$ To clarify, the question appears to be about the analysis of single Likert-type items, not scale results composed of several questions. $\endgroup$ Commented Aug 16, 2022 at 11:24
  • $\begingroup$ @SalMangiafico Can you clarify what the difference is, please? $\endgroup$
    – kaki no
    Commented Aug 16, 2022 at 11:28
  • $\begingroup$ The word "scale" is ambiguous. It is sometimes used to describe the responses for a single Likert-type item. More usually, though, it describes the outcome of several items summed or averaged into a single result, as is commonly done in psychology or sociology. For example, the Dissociative Experiences Scale II (which I'm not endorsing) doesn't use Likert-type questions, but does combine several questions into a single result (traumadissociation.com/des). In statistical analysis, the results for items and scales are often handled differently. $\endgroup$ Commented Aug 16, 2022 at 11:40

1 Answer

$\begingroup$

The values will be paired (on user), not independent, so no, you would not normally consider the Wilcoxon-Mann-Whitney.

You would use a paired analysis (e.g. paired t-test, Wilcoxon signed rank test, sign test). One important consideration is the precise form of your hypotheses, especially which specific kinds of alternatives you want to draw conclusions about (for example, whether you care about saying something about population means).
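As a rough illustration of one of the paired options above, here is a minimal sketch of an exact one-sided sign test using only the Python standard library. The `before` and `after` lists are made-up values for illustration (not data from the question), and `sign_test_greater` is a hypothetical helper name. The sign test uses only the direction of each user's change, so it relies on the ordering of the response categories and nothing more.

```python
from math import comb

# Hypothetical paired responses (1-5), one pair per user.
# These numbers are invented for illustration only.
before = [2, 1, 3, 2, 2, 4, 1, 3, 2, 2]
after = [4, 3, 3, 4, 5, 4, 2, 2, 4, 3]


def sign_test_greater(before, after):
    """Exact one-sided sign test; H1: 'after' tends to be higher than 'before'."""
    if len(before) != len(after):
        raise ValueError("before and after must have one value per user")
    diffs = [a - b for b, a in zip(before, after)]
    n_pos = sum(d > 0 for d in diffs)  # users who moved up
    n_neg = sum(d < 0 for d in diffs)  # users who moved down
    n = n_pos + n_neg                  # unchanged users (ties) are dropped
    # Under H0 each non-tied user is equally likely to move up or down,
    # so n_pos ~ Binomial(n, 0.5); the p-value is P(X >= n_pos).
    p_value = sum(comb(n, k) for k in range(n_pos, n + 1)) / 2 ** n
    return n_pos, n_neg, p_value


n_pos, n_neg, p_value = sign_test_greater(before, after)
print(f"up: {n_pos}, down: {n_neg}, one-sided p = {p_value:.4f}")
```

With real data you would replace the two lists with the matched before/after answers, keeping the users in the same order in both.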

If your inclination toward the Wilcoxon-Mann-Whitney was because you did not want to assume an interval scale on the Likert items, one thing to beware of is that the Wilcoxon signed rank test requires you to be able to take pair differences (and only then to rank their absolute values). For the pair-difference operation to make sense (for example, to treat "5"-"3" the same as "4"-"2" and "3"-"1", giving them all the same value $2$, and similarly for the other possible pair differences), you would need at least an interval scale, since otherwise you have no basis to assert that the gaps you assign the same value are actually about the same size. This isn't necessarily critical (it's sometimes possible to argue for doing it on some other basis), but if we're going to worry about the ordinal-vs-interval issue (as it seems you were doing implicitly in your question), then it's best to at least be consistent in what we are going to assume.
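To make the pair-difference step concrete, here is a minimal sketch of the signed rank statistic computed by hand in standard-library Python. The data are invented for illustration, and `midranks` and `signed_rank_greater` are hypothetical helper names. Zero differences are dropped and tied absolute differences share an average rank, which is one common convention and an assumption here. The p-value comes from enumerating every assignment of signs to the ranks, so this is only practical for small samples.

```python
from itertools import product

# Hypothetical paired responses (1-5), one pair per user; illustration only.
before = [2, 1, 3, 2, 2, 4, 1, 3, 2, 2]
after = [4, 3, 3, 4, 5, 4, 2, 2, 4, 3]


def midranks(values):
    """Ranks 1..n, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


def signed_rank_greater(before, after):
    """W+ and an exact one-sided p-value; H1: 'after' tends to be higher."""
    # This subtraction is the interval-scale assumption: a move from
    # 3 to 5 is treated as the same size as a move from 1 to 3.
    diffs = [a - b for b, a in zip(before, after) if a != b]
    ranks = midranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under H0 each rank is equally likely to carry a + or - sign,
    # so count the sign patterns whose positive-rank sum is >= W+.
    as_extreme = sum(
        1
        for signs in product((0, 1), repeat=len(ranks))
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus
    )
    return w_plus, as_extreme / 2 ** len(ranks)


w_plus, p_value = signed_rank_greater(before, after)
print(f"W+ = {w_plus}, one-sided p = {p_value:.4f}")
```

The comment on the `diffs` line marks exactly where the ordinal-vs-interval question enters: everything after that line depends on those differences being comparable across users.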

$\endgroup$
  • $\begingroup$ Thank you for your answer @Glen_b. My hypothesis is that the answers to the 1st and 2nd questions are different. I want to conclude that the users changed their minds. Is it enough to say that the mean is not the same? $\endgroup$
    – kaki no
    Commented Aug 16, 2022 at 5:21
  • $\begingroup$ when comparing two distributions, "different" could mean all manner of things (more polarized, for example, where a number of 4's become 5's and 2's become 1's). Not all of them correspond to a change in mean; so you need to be clear about what sort of differences you're interested in. Further, to compute a mean difference you are again asserting that "5"-"3" is the same as "4"-"2" and "3"-"1", etc. This might not be problematic but you should be aware that it's there. $\endgroup$
    – Glen_b
    Commented Aug 16, 2022 at 6:33
  • $\begingroup$ Thank you for your answer. The difference I want to test is whether people started to like reading. Before they did not like reading and after they started liking reading. $\endgroup$
    – kaki no
    Commented Aug 16, 2022 at 9:43
  • $\begingroup$ To be clearer: I can compare the means or the percentages of agree and disagree; however, I also want to test statistical significance. $\endgroup$
    – kaki no
    Commented Aug 16, 2022 at 9:44
  • $\begingroup$ It looks like you are interested in a shift to higher-ranking answers. There are a number of tests that might be used. You could use a signed rank test as discussed in my answer. You could use a paired t test (looking for a shift specifically in the mean). You could use a sign test. A variety of other tests might also be used. $\endgroup$
    – Glen_b
    Commented Aug 16, 2022 at 16:40
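
The point made in the comments that "different" need not mean a change in mean can be seen with a small made-up example. In the sketch below (invented numbers, standard-library Python), some 2's become 1's and some 4's become 5's, so the answers become more polarized while the mean stays where it was.

```python
from collections import Counter
from statistics import mean, pvariance

# Invented responses for six users, listed in the same user order.
before = [2, 2, 2, 4, 4, 4]
after = [1, 1, 2, 5, 5, 4]  # two 2's became 1's, two 4's became 5's

mean_before, mean_after = mean(before), mean(after)
var_before, var_after = pvariance(before), pvariance(after)

print("means:", mean_before, mean_after)      # identical
print("variances:", var_before, var_after)    # spread increased
print("before counts:", sorted(Counter(before).items()))
print("after counts:", sorted(Counter(after).items()))
```

A test aimed at a shift in the mean would have nothing to detect here, even though four of the six users changed their answer, which is why it helps to state the kind of difference of interest before choosing a test.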
