$\begingroup$

I'd like to estimate "cost" from some covariates using a weighted gamma model fit with svyglm. The weights sum to 1, and the data frame df has about 10,000 rows, with columns including cost and the covariates listed in the model below. Why does predict give only values in the range 5-11, when half of the cost data is between \$500 and \$5000? Does this have something to do with the weights? How can I get the output of predict back on the original scale (\$)?

Here is my survey design:

des <- svydesign(
  ids = ~0,
  weights = ~weights,
  data = df
)

And the svyglm model:

model <- svyglm(
  cost ~ gender + agegrp + flag + race + risk,
  design = des,
  rescale = TRUE,
  family = Gamma(link = "log")
)

A summary of the predictions gives very low values (5-11).

summary(predict.glm(recent, temp2[,c("gender", "agegrp", "flag", "race", "risk")]), na.rm=TRUE)
Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
  5.219   7.013   7.842   7.943   8.754  10.782      63 

A summary of the cost variable shows that the interquartile range is about \$500-\$5000, with some very large values as well.

 summary(df$cost)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
      0     534    1553    6798    4547 2689484 

A summary of the weighted cost (weights times cost) shows that those values are even smaller than the predictions, apart from some of the largest values in the dataset.

summary(df$weights * df$cost)
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
  0.00000   0.01364   0.05918   0.58572   0.26013 231.06067 
$\endgroup$

1 Answer

$\begingroup$

By default, predict returns values on the "link" scale (see ?survey:::predict.svyglm). Because you defined the model with Gamma(link="log"), each prediction is the logarithm of the mean cost, so values of 5-11 correspond to roughly \$150 to \$60,000 after exponentiating.

If you set type = "response", predict returns values on the scale of the response (dollars), which is what you are looking for in this example.
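To make the difference between the two scales concrete, here is a minimal sketch on simulated data. The data frame sim, its columns risk, cost and w, and all numeric values are assumptions made up for illustration; they are not the data from the question. The first part uses base glm, whose predict method also defaults to the link scale; the second part applies the same type argument to a svyglm fit and only runs if the survey package is installed.

```r
# Simulated data; every name and value here is an illustrative assumption.
set.seed(1)
n <- 500
sim <- data.frame(risk = runif(n))
mu <- exp(6 + 2 * sim$risk)                      # true mean cost in dollars
sim$cost <- rgamma(n, shape = 2, rate = 2 / mu)  # gamma-distributed cost with mean mu
sim$w <- runif(n, 0.5, 1.5)                      # hypothetical sampling weights

# Gamma regression with a log link, as in the question.
fit <- glm(cost ~ risk, family = Gamma(link = "log"), data = sim)

new <- data.frame(risk = c(0.1, 0.5, 0.9))
pred_link <- predict(fit, newdata = new)                     # default type = "link": log of the mean
pred_resp <- predict(fit, newdata = new, type = "response")  # mean on the dollar scale

# With a log link, exponentiating the link-scale prediction gives the response scale.
print(all.equal(exp(pred_link), pred_resp))

# The same argument on a survey-weighted fit, if the survey package is available.
if (requireNamespace("survey", quietly = TRUE)) {
  des_sim <- survey::svydesign(ids = ~0, weights = ~w, data = sim)
  sfit <- survey::svyglm(cost ~ risk, design = des_sim,
                         family = Gamma(link = "log"))
  spred <- predict(sfit, newdata = new, type = "response")
  print(coef(spred))  # point predictions; survey::SE(spred) gives standard errors
}
```

Applied to the question, this would mean calling predict on the fitted model with type = "response" rather than relying on the default.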

$\endgroup$
