1
$\begingroup$

Say, for the sake of argument, I have a categorical variable called race with three levels: white, black, and Asian. I create two dummy variables, White and Asian (White is 1 if you are white and 0 otherwise; Asian follows the same logic). Black is therefore the base level, also known as the reference. The dependent variable is weekly income, and I have more than 50 predictors, two of which are these dummies. Say I get an effect size of 100 for White. That means that, controlling for the other variables, a white person earns a hundred dollars more per week than a black person. But what my audience really wants to know is the difference in income, controlling for the other variables, between being white and not being white, not between white and black or any other reference group.

I have heard that this can be done, perhaps by comparing to a grand mean (effect coding?), but I know very little about this.

$\endgroup$
1
  • $\begingroup$ Could you just create a dummy variable with two levels, white and non-white, and include only that variable in your model? $\endgroup$
    – Lachlan
    Commented Mar 26, 2021 at 5:52

1 Answer

1
$\begingroup$

You could change the encoding, for instance by defining two dummies this way:

         asian  black  white
dummy_1    1      1      0
dummy_2    0      1      0

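(There are many other ways.) dummy_1 separates white from non-white, and dummy_2 compares within the non-white group. Used on its own, dummy_1 compares the white average with the non-white weighted average, with weights taken as the sample proportions. If dummy_2 is kept in the model exactly as coded here, the dummy_1 coefficient instead measures Asian versus white.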
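A minimal sketch of what the dummy_1 coding measures, assuming made-up incomes and no other covariates (the `income` values and the `ols_slope` helper are hypothetical, for illustration only). It checks that the slope on dummy_1 alone equals the pooled non-white mean minus the white mean, and computes the Asian-minus-white gap that the dummy_1 coefficient would pick up once dummy_2 is added.

```python
from statistics import mean

# Hypothetical weekly incomes per group (invented numbers, no covariates).
income = {
    "white": [900, 1000, 1100],
    "asian": [800, 900],
    "black": [600, 700, 800, 900],
}

def ols_slope(x, y):
    """Slope of a least-squares fit of y on a single regressor x plus intercept."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

# One row per person; dummy_1 is 1 for asian and black, 0 for white.
rows = [(group, value) for group, values in income.items() for value in values]
dummy_1 = [0 if group == "white" else 1 for group, _ in rows]
y = [value for _, value in rows]

# Coefficient on dummy_1 when it is the only race variable in the model.
slope_alone = ols_slope(dummy_1, y)

# Pooling the non-white rows weights each group by its sample share.
non_white = income["asian"] + income["black"]
pooled_gap = mean(non_white) - mean(income["white"])

# With dummy_2 added, three groups get three parameters, so the fit reproduces
# the group means and the dummy_1 coefficient is the asian-minus-white gap.
asian_gap = mean(income["asian"]) - mean(income["white"])

print(slope_alone, pooled_gap, asian_gap)
```

With 50 or more other predictors the coefficients become adjusted differences rather than raw mean differences, but the way the coding splits the comparison is the same.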

If you want some other weighting, you can use a custom contrast.
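As a rough sketch of such a custom contrast, assuming a model fitted with black as the reference level: the white coefficient minus a weighted average of the other levels' coefficients, where the reference level counts as 0. The function name `white_vs_rest`, the Asian coefficient of 40, and the 0.8/0.2 shares are all hypothetical; only the 100 comes from the question.

```python
def white_vs_rest(coefs, weights, target="white"):
    """Target coefficient minus a weighted average of the other levels.

    coefs   : level -> coefficient, with the reference level entered as 0.0
    weights : level -> weight for every non-target level (must sum to 1)
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return coefs[target] - sum(w * coefs[level] for level, w in weights.items())

# 100 is the white-vs-black effect from the question; 40 for asian is invented.
coefs = {"black": 0.0, "white": 100.0, "asian": 40.0}

# Equal weight on the two non-white groups.
equal = white_vs_rest(coefs, {"black": 0.5, "asian": 0.5})

# Weights set to hypothetical population shares instead of sample shares.
by_share = white_vs_rest(coefs, {"black": 0.8, "asian": 0.2})

print(equal, by_share)
```

A standard error for the contrast would also need the estimated covariance matrix of the coefficients, which this sketch leaves out.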

EDIT: You say (in a comment):

"I believe the group of managers want to see all the levels in our model not just two"

But all the levels are in the model! It is as simple as understanding that $a+b = a+b +0\cdot c$: the "omitted" level is there, but with a coefficient of zero. So maybe this is only a problem of reporting, and it might be a good idea to present the results with these implicit zeros made explicit. For some examples (and R code) see

and
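Independently of those examples, here is a small Python sketch of reporting the implicit zero explicitly. The helper `with_reference` and the Asian coefficient of 40 are hypothetical; 100 is the white effect from the question. The last step re-expresses the same coefficients as deviations from their unweighted average, which is the kind of comparison effect (deviation) coding reports.

```python
def with_reference(coefs, reference):
    """Return level -> coefficient with the omitted reference level shown as 0."""
    full = {reference: 0.0}
    full.update(coefs)
    return full

# Coefficients as a fitted model would report them: black is omitted.
fitted = {"white": 100.0, "asian": 40.0}  # 40 is an invented value

# Same model, with the reference level listed alongside the others.
table = with_reference(fitted, "black")

# Deviations from the unweighted average of the three level coefficients.
average = sum(table.values()) / len(table)
centered = {level: value - average for level, value in table.items()}

for level in table:
    print(f"{level:<6} {table[level]:8.1f} {centered[level]:8.1f}")
```

Both columns describe the same fitted model; only the baseline each level is compared against changes.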

$\endgroup$
1
  • 1
    $\begingroup$ Thanks. I believe the group of managers want to see all the levels in our model not just two (although that is an interesting idea). Maybe they don't, I can find out. The variables were created by the federal government, but for this purpose maybe we can ignore that. $\endgroup$
    – user54285
    Commented Mar 26, 2021 at 15:34
