$\begingroup$

I came across this question in my exam. It showed a table whose columns are different model selection methods (OLS, Lasso, Forward_Size1, Forward_Size2) and whose rows are the intercept and the predictors. I was asked whether I could deduce if Lasso's RSS (residual sum of squares) is smaller than, larger than, or equal to that of Forward_Size2. One option was to say that it cannot be deduced. But a hint in the question told me to look at the step before Forward_Size2, namely Forward_Size1, so I thought it must be possible?

This is the table I was shown. The numbers are different, but their relative sizes are about the same, e.g. the Lasso intercept is larger than the OLS intercept. Another distinction is that Lasso selected a different predictor than forward selection of size 2, despite both models having the same size.

          | OLS  | Lasso | Forward_Size1 | Forward_Size2
intercept | 0.42 | 0.61  | 0.39          | 0.25
pred1     | 1.50 | 0.70  | 0.00          | NA
pred2     | 0.60 | 0.00  | 0.00          | 0.62
pred3     | 2.00 | 1.56  | 1.80          | 1.10

I answered that Lasso's RSS is smaller, because Lasso considers the joint contribution of all predictors, unlike forward selection, which only considers the partial contribution at each step and never re-evaluates the previously selected variables. This is the reason why we see that Forward_Size2 chose pred2 instead of pred1.

I didn't get to mention how Forward_Size2 differs from Forward_Size1, because I couldn't deduce any information from the step from Size1 to Size2.
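
One way to check this kind of comparison numerically is sketched below. It is only a sketch under assumptions: the data are synthetic (the exam data are not available), the penalty `alpha=0.5` is arbitrary, and `ols_on_subset`, `forward_selection` and `lasso_cd` are hypothetical helper names written for this illustration, with the Lasso fitted by a plain coordinate-descent loop. The sketch computes the RSS of the Lasso fit, of OLS refitted on the Lasso's selected predictors, and of forward selection of size 2. Two facts hold by construction: for a fixed set of predictors, OLS gives the smallest possible RSS, and the second forward step picks the predictor that gives the smallest RSS once the first predictor is fixed.

```python
import numpy as np

def rss(X, y, intercept, coef):
    """Residual sum of squares for a linear fit."""
    resid = y - intercept - X @ coef
    return float(resid @ resid)

def ols_on_subset(X, y, subset):
    """OLS with intercept on the columns in `subset`; other coefficients stay 0."""
    A = np.column_stack([np.ones(len(y)), X[:, subset]])
    sol, *_ = np.linalg.lstsq(A, y, rcond=None)
    coef = np.zeros(X.shape[1])
    coef[subset] = sol[1:]
    return sol[0], coef

def forward_selection(X, y, size):
    """Greedy forward selection: at each step add the predictor that minimizes RSS."""
    chosen = []
    for _ in range(size):
        remaining = [j for j in range(X.shape[1]) if j not in chosen]
        best = min(remaining,
                   key=lambda j: rss(X, y, *ols_on_subset(X, y, chosen + [j])))
        chosen.append(best)
    return chosen

def lasso_cd(X, y, alpha, n_iter=500):
    """Lasso by coordinate descent, minimizing (1/2n)*RSS + alpha*||coef||_1."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean           # centering handles the intercept
    coef = np.zeros(p)
    z = (Xc ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            partial_resid = yc - Xc @ coef + Xc[:, j] * coef[j]
            rho = Xc[:, j] @ partial_resid / n
            coef[j] = np.sign(rho) * max(abs(rho) - alpha, 0.0) / z[j]  # soft threshold
    return y_mean - x_mean @ coef, coef

# Synthetic data (an assumption, not the exam data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 0.4 + 1.5 * X[:, 0] + 0.6 * X[:, 1] + 2.0 * X[:, 2] + rng.normal(size=200)

fs2 = forward_selection(X, y, 2)
rss_fs2 = rss(X, y, *ols_on_subset(X, y, fs2))

b0, b = lasso_cd(X, y, alpha=0.5)             # alpha chosen arbitrarily
support = [j for j in range(X.shape[1]) if b[j] != 0]
rss_lasso = rss(X, y, b0, b)
rss_ols_support = rss(X, y, *ols_on_subset(X, y, support))

print("forward size 2 picks", fs2, "RSS:", rss_fs2)
print("lasso support", support, "RSS:", rss_lasso)
print("OLS refit on lasso support, RSS:", rss_ols_support)
```

If the Lasso support has size 2 and contains the predictor chosen at Forward_Size1, the two facts chain together: Lasso RSS ≥ OLS RSS on that support ≥ Forward_Size2 RSS. Which predictors get selected on the synthetic data need not match the table.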

$\endgroup$
  • $\begingroup$ What does NA mean, and why isn't it 0.00? $\endgroup$
    – runr
    Commented Apr 30 at 18:56
  • $\begingroup$ The LASSO solution induces less shrinkage than both forward selections, which suggests a smaller RSS. Geometrically, the "FS2" solution lies within the "LASSO" boundary $\max \beta < 1.56$ $\endgroup$
    – runr
    Commented Apr 30 at 19:22
