
I am trying to implement an algorithm that solves a linear regression problem with the following objective function (LASSO):

$$\min_\beta \frac{1}{2}||y-X\beta||_2^2 + \lambda ||\beta||_1$$

for various values of $\lambda$, subject to several constraints that are added or changed from time to time. $y$ and $X$ are my training data, which have been standardized to mean 0 and normalized to unit $l_2$-norm. For all the regression problems that I solve (remember, constraints are added from time to time), I want to calculate an out-of-sample $R^2$ on a validation set in order to compare the models. The validation set has also been standardized, but using the mean of the training set, and it was not normalized.

When I calculate the $R^2$ in the following way I receive values greater than 1:

$$R^2= \frac{\sum_{i=1}^n(\hat{y}_i-\bar{y})^2}{\sum_{i=1}^n (y_i - \bar{y})^2}$$

Since the training set was standardized to have mean 0, and the training-set mean is used in the calculation of $R^2$, the term above simplifies to:

$$R^2= \frac{\sum_{i=1}^n \hat{y}_i^2}{\sum_{i=1}^n y_i^2}$$

All my $R^2$ values are higher than 1 (about 1.5 to 1.6). Even if I use the same calculation on the training set, the value exceeds 1 (note that in the case of the training set the denominator equals 1, since the training set was normalized to have unit $l_2$-norm).

I sense that something is going utterly wrong here, yet I have not managed to find the mistake. I thought that maybe this standard calculation of $R^2$ does not work for my LASSO objective function. If that is the case, what would be the correct way to calculate $R^2$ here?
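For concreteness, here is a minimal numpy sketch of the preprocessing I describe above. The data are just random placeholders and the variable names are illustrative, not my actual setup:

```python
import numpy as np

# Placeholder data standing in for my actual training/validation sets
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(50, 3)), rng.normal(size=50)
X_val, y_val = rng.normal(size=(20, 3)), rng.normal(size=20)

# Center using the *training* mean only
y_mean = y_train.mean()
y_train_c = y_train - y_mean
y_val_c = y_val - y_mean  # validation set centered with the training mean

# Normalize the training response to unit l2-norm;
# the validation set is deliberately left unnormalized
y_train_c = y_train_c / np.linalg.norm(y_train_c)
# (the columns of X_train are treated the same way)
```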

  • Possible duplicate of stats.stackexchange.com/questions/246347/… ?
    – utobi
    Commented Nov 26, 2016 at 18:50
  • Apparently, the user there did have a similar question. Nevertheless, there was no answer on how to calculate $R^2$ for penalized models.
    – YukiJ
    Commented Nov 27, 2016 at 10:17

2 Answers


Your mistake doesn't come from centering the data at zero, but from the general formula for $R^2$, which isn't the one you wrote. Using your notation, we have several quantities:

  • $SS_{tot} = \sum_i (y_i-\bar{y})^2$ total sum of squares
  • $SS_{reg} = \sum_i (\hat{y}_i-\bar{y})^2$ explained sum of squares
  • $SS_{res} = \sum_i (y_i-\hat{y}_i)^2$ residual sum of squares

Now the general formula is $R^2 = 1- \frac{SS_{res}}{SS_{tot}}$: one minus the ratio of the unexplained variance to the total variance of the data.

When $SS_{res} + SS_{reg} = SS_{tot}$, the general formula is equivalent to the one you wrote, $R^2 = \frac{SS_{reg}}{SS_{tot}}$, which can be read as the ratio of the explained variance to the total variance.

The condition $SS_{res} + SS_{reg} = SS_{tot}$ holds, for instance, in unregularized linear regression with an intercept, but it generally fails under the LASSO penalty.
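To see the difference concretely, here is a small sketch on synthetic data (using numpy and scikit-learn's `LinearRegression` and `Lasso`; the data and penalty strength are arbitrary choices for illustration) that checks the decomposition for both fits and computes $R^2$ with the general formula:

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

# Synthetic data: a sparse linear signal plus noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ np.array([1.0, -2.0, 0.0, 0.0, 3.0]) + rng.normal(size=100)

def ss_terms(y, y_hat):
    """Return (SS_tot, SS_reg, SS_res) as defined above."""
    ss_tot = np.sum((y - y.mean()) ** 2)
    ss_reg = np.sum((y_hat - y.mean()) ** 2)
    ss_res = np.sum((y - y_hat) ** 2)
    return ss_tot, ss_reg, ss_res

# OLS with an intercept: SS_res + SS_reg == SS_tot, both formulas agree
y_hat_ols = LinearRegression().fit(X, y).predict(X)
tot_o, reg_o, res_o = ss_terms(y, y_hat_ols)

# LASSO: the decomposition generally fails, only 1 - SS_res/SS_tot is safe
y_hat_lasso = Lasso(alpha=0.5).fit(X, y).predict(X)
tot_l, reg_l, res_l = ss_terms(y, y_hat_lasso)

print(np.isclose(res_o + reg_o, tot_o))  # holds for OLS with an intercept
print(np.isclose(res_l + reg_l, tot_l))  # need not hold for LASSO
print(1 - res_l / tot_l)                 # general formula, <= 1 in-sample
```

Since $SS_{res} \ge 0$, the general formula can never exceed 1 in-sample, whereas $SS_{reg}/SS_{tot}$ can once the decomposition breaks down.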

  • That's what I thought. Are there any suggestions on how to calculate $R^2$ in a regularized model?
    – YukiJ
    Commented Nov 27, 2016 at 7:57
  • What's the problem with using the first (more general) formula?
    – etal
    Commented Nov 27, 2016 at 15:44
  • The one you suggested cannot be used without an intercept, as far as I am aware. So I wonder what I should do when my model does not allow a constant term (i.e. it always passes through the origin).
    – YukiJ
    Commented Nov 28, 2016 at 11:12
  • I don't understand why you cannot compute $SS_{res}$ and $SS_{tot}$, and thus $R^2$, even if your model passes through the origin. You only need the original points and your predictions.
    – etal
    Commented Nov 28, 2016 at 17:08
  • Sure, but if it becomes negative, that means you're better off always predicting the average (so 0) as your approximation. The formula implicitly assumes your model is at least as good as a constant prediction, which is the simplest thing you could do.
    – etal
    Commented Nov 29, 2016 at 14:00

This is an interesting question; see this and this for two related posts. As far as I understand from the literature, and judging from the answers/comments to the posts cited above, the calculation and interpretation of the coefficient of determination, and the calculation of standard errors, in penalized estimation approaches are currently open problems.

So my current answer to your question is: just give up on using $R^2$ and adjusted $R^2$ in LASSO-type problems. Perhaps goodness-of-fit tests are a viable alternative to $R^2$.

