In "Qualitative Descriptors of Strength of Association and Effect Size" (*Journal of Social Service Research*, 1996), James A. Rosenthal indirectly cites Fleiss (1994):
> Fleiss (1994) contends that the odds ratio is the preferred measure of effect size for dichotomous variables. Unlike the phi coefficient, it is not affected by the proportions in the sample that comprise the categories of the independent variable (Fleiss).
Rosenthal then gives the example of two hypothetical interventions tested on young people to prevent delinquent offenses, on two occasions with different sample sizes. Here are two tables visualizing the situations he describes (he does not use tables in the paper; he simply describes the numbers inline):
Table 1

|                | did not commit delinquent offense | committed delinquent offense |
|----------------|----------------------------------:|-----------------------------:|
| intervention A | 90 | 10 |
| intervention B | 50 | 50 |
Table 2

|                | did not commit delinquent offense | committed delinquent offense |
|----------------|----------------------------------:|-----------------------------:|
| intervention A | 180 | 20 |
| intervention B | 10  | 10 |
He explains that the phi coefficient may lead to an incorrect conclusion: the difference in effectiveness between interventions A and B might look larger in the first table than in the second, since the $\phi$ coefficient is 0.436 vs. 0.335. But if you look at the row-wise percentages, that conclusion becomes highly disputable. The problem does not arise with the odds ratio, which is the same (9) in both tables.
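Rosenthal's numbers are easy to verify directly from the standard formulas for $\phi$ and the odds ratio on a 2 × 2 table (a minimal sketch; the function names are mine):

```python
from math import sqrt

def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Table 1: intervention A (90, 10) vs. intervention B (50, 50)
print(round(phi(90, 10, 50, 50), 3))    # 0.436
print(odds_ratio(90, 10, 50, 50))       # 9.0

# Table 2: intervention A (180, 20) vs. intervention B (10, 10)
print(round(phi(180, 20, 10, 10), 3))   # 0.335
print(odds_ratio(180, 20, 10, 10))      # 9.0
```

The odds ratio is identical in both tables because the event odds within each row (9:1 for A, 1:1 for B, and 9:1 vs. 1:1 again) are unchanged; only the group sizes differ, which is exactly what $\phi$ is sensitive to.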
In *Effect Sizes for Research*, Robert J. Grissom and John J. Kim, though they raise some criticisms of the odds ratio as well, also argue that (p. 250):
> A phi arising from another study of the same two dichotomous variables, but using a sampling method other than naturalistic sampling, would not be comparable to a phi based on naturalistic sampling; that is, the value of phi can vary across studies using different sampling methods to study the same pair of dichotomous variables.
They also go on to explain the limitations of $\phi$ with respect to its attainable minimum and maximum values (besides Grissom and Kim's book, searching for "phi maximum value" in a specialized search engine should return a couple of relevant papers; otherwise, see the Davenport and El-Sanhurry paper mentioned in the references below).
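The point about attainable values can be demonstrated empirically without quoting the closed-form phi/phimax result: fix the marginal totals of a 2 × 2 table and enumerate every table consistent with them. A brute-force sketch (function names are my own):

```python
from math import sqrt

def phi(a, b, c, d):
    """Phi coefficient for a 2x2 table [[a, b], [c, d]]."""
    den = sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / den if den else 0.0

def phi_max(row1_total, row2_total, col1_total):
    """Largest phi attainable under fixed marginals, found by
    enumerating every cell count a consistent with those totals."""
    best = float("-inf")
    for a in range(min(row1_total, col1_total) + 1):
        b = row1_total - a
        c = col1_total - a
        d = row2_total - c
        if b < 0 or c < 0 or d < 0:
            continue
        best = max(best, phi(a, b, c, d))
    return best

# With equal marginals, phi can reach 1 ...
print(phi_max(50, 50, 50))              # 1.0
# ... but with the marginals of Table 2 (rows 200 and 20,
# first column 190) it cannot:
print(round(phi_max(200, 20, 190), 3))  # 0.796
```

So even a "perfect" association, given Table 2's marginals, tops out well below 1, which is one of the comparability problems Grissom and Kim describe.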
Grissom and Kim's recommendation for choosing an effect size for 2 × 2 tables can be found on p. 281 of their book (bold is mine):
> In the case of naturalistic sampling, in which a given number of participants is categorized with respect to two truly dichotomous variables in a 2 × 2 table, possibly appropriate measures of effect size in the population are **the phi coefficient, relative risk, and the odds ratio** [...]
>
> When participants have been randomly assigned into two treatment groups that are to be classified into a 2 × 2 table, appropriate measures of effect size are the population **risk difference, relative risk, and (possibly) odds ratio**.
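For completeness, the three measures recommended for the randomized-assignment case are all simple functions of the cell counts. A short sketch on Rosenthal's Table 1 (function name is mine):

```python
def risk_measures(a, b, c, d):
    """Risk difference, relative risk, and odds ratio for a 2x2 table
    [[a, b], [c, d]] whose columns are (no event, event)."""
    p1 = b / (a + b)   # event rate in group 1
    p2 = d / (c + d)   # event rate in group 2
    return p2 - p1, p2 / p1, (a * d) / (b * c)

# Table 1: delinquency rates are 10% under A and 50% under B
rd, rr, odds = risk_measures(90, 10, 50, 50)
print(round(rd, 3), round(rr, 3), round(odds, 3))  # 0.4 5.0 9.0
```

Note that the three measures answer different questions (absolute difference in rates, ratio of rates, ratio of odds), which is why Grissom and Kim hedge on which is "appropriate" for a given study.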
References
Davenport, E. C., & El-Sanhurry, N. A. (1991). Phi/phimax: Review and synthesis. Educational and Psychological Measurement, 51(4), 821–828. https://doi.org/10.1177/001316449105100403
Fleiss, J. L. (1994). Measures of effect size for categorical data. In H. Cooper & L. V. Hedges (Eds.), The handbook of research synthesis (pp. 245–260). Russell Sage Foundation.
Grissom, R. J., & Kim, J. J. (2012). Effect sizes for research: Univariate and multivariate applications (2nd ed.). Routledge.
Rosenthal, J. A. (1996). Qualitative descriptors of strength of association and effect size. Journal of Social Service Research, 21(4), 37–59. https://doi.org/10.1300/J079v21n04_02