$\begingroup$

I am new to Dirichlet regression. I am trying to understand why the model output can differ between two R packages, and how to interpret the slopes and intercepts that a Dirichlet regression reports.

From the article by Douma and Weedon (here), I understand that in Dirichlet regression one proportion is used as a reference. For example, with y1, y2, and y3, one of the three would be the reference level.

I was initially using the brms R package, but I wanted to compare its output with that of the DirichletReg R package. The two packages can give similar output.

However, the DirichletReg package can also report a slope and an intercept for every proportion (with three proportions you get three intercepts and three sets of slopes), which might be useful. Most people are not familiar with Dirichlet regression, and I guess they would like to see, for example, how y1, y2, and y3 are each affected by the covariates. For that, you need to specify model = "common" (which is the default in the function DirichReg()). In the original article, the author wrote: 'The parametrization of model is determined by model which can be "common" (i.e., all α parameters are modeled independently) or "alternative" (i.e., means and precision are modeled).' But I am not sure I fully understand his section 2.2.1, or how to implement it in R.
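
For reference, here is how I currently understand the two parametrizations for $C$ components and observation $i$. This is a sketch in my own notation ($x_i$, $z_i$, $\beta_c$, $\gamma_c$, $\delta$, and the base component $b$ are labels I chose, not necessarily those of the article), so please correct me if it is wrong:

$$
\begin{aligned}
\text{common:} \quad & \log \alpha_{ic} = x_i^\top \beta_c, \qquad c = 1, \dots, C \\
\text{alternative:} \quad & \mu_{ic} = \frac{\exp(x_i^\top \gamma_c)}{\sum_{d=1}^{C} \exp(x_i^\top \gamma_d)}, \quad \gamma_b = 0, \qquad \log \phi_i = z_i^\top \delta \\
\text{link between them:} \quad & \alpha_{ic} = \mu_{ic}\,\phi_i, \qquad \mu_{ic} = \frac{\alpha_{ic}}{\sum_{d=1}^{C} \alpha_{id}}, \qquad \phi_i = \sum_{d=1}^{C} \alpha_{id}
\end{aligned}
$$

If this is right, then under the common parametrization $\log(\mu_{ic} / \mu_{ib}) = x_i^\top (\beta_c - \beta_b)$. A slope in the alternative (mean) model would then play the role of a difference between two common slopes rather than a common slope itself, which might be why the signs do not have to agree between the two approaches.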

Based on this, I would like to know:

  • Which approach is used more often: the one with a slope and intercept for each proportion (y1, y2, and y3), or the one that reports only y2 and y3 (because y1 is used as the reference)?
  • How should I interpret the coefficients from the common vs. the alternative approach? The significance stays the same (e.g., my covariate x is not significant in either approach), but the slopes differ: a slope can be positive in one case (e.g., common) and negative in the other (e.g., alternative). What does this mean?
  • More broadly, how should we interpret the coefficients? In the article by Douma and Weedon, it seems that you first determine the significance of your covariates (e.g., check the p-values), but to determine whether an effect is positive or negative you need to make predictions. Am I right? (Also, the interpretation I have found so far concerns the "common" approach.)
  • Lastly, how can I get coefficient values from brms for all the proportions included in the response (i.e., a slope and an intercept even for the proportion used as the reference, as with DirichReg(model = "common"))?

Thank you!

Here is a short script showing the different approaches:

library(brms)
library(rstan)
library(dplyr)
bind <- function(...) cbind(...)

set.seed(123)  # make the simulated data reproducible
N <- 20
df <- data.frame(
  y1 = rbinom(N, 10, 0.5), y2 = rbinom(N, 10, 0.7), 
  y3 = rbinom(N, 10, 0.9), x = rnorm(N), x2 = rnorm(N)
) %>%
  mutate(
    size = y1 + y2 + y3,
    y1 = y1 / size,
    y2 = y2 / size,
    y3 = y3 / size
  )
df$y <- with(df, cbind(y1, y2, y3))

make_stancode(bind(y1, y2, y3) ~ x+x2, df, dirichlet())

fit <- brm(bind(y1, y2, y3) ~ x+x2, df, dirichlet())
summary(fit)

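# Same model with DirichReg() from the DirichletReg package
library(DirichletReg)
Edat <- DR_data(df[, c('y1', 'y2', 'y3')])
# Alternative parametrization: means (relative to a reference component) and precision
out.freq.diri <- DirichReg(Edat ~ x + x2, df, model = 'alternative')
summary(out.freq.diri)
# Common parametrization (the default): intercept and slopes for each of the three y
out.freq.diri.option2 <- DirichReg(Edat ~ x + x2, df)
summary(out.freq.diri.option2)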
$\endgroup$
