
I am currently plotting a continuum fitted to observed data. I need to check the effectiveness of the fitted continuum with the reduced chi-square method. Ideally, should the reduced chi-square value increase or decrease as the degree of the polynomial fit increases? For example, if I fit polynomials of degree $n = 1$ up to $n = 10$, should I expect a decrease in the reduced chi-square values with increasing $n$?


2 Answers


The chi-squared should always decrease as you increase the complexity of the model (in this case by increasing the number of polynomial terms). The reduced chi-squared may decrease or increase, because the reduced chi-squared is the chi-squared divided by the number of degrees of freedom, $\chi^2_\nu = \chi^2/\nu$ with $\nu = N - k$ for $N$ data points and $k$ fitted parameters, and each added term reduces $\nu$.
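As a minimal sketch of this behaviour (the data set, noise level `sigma`, and true model below are illustrative assumptions, not anything from the question):

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
sigma = 0.1                                   # known measurement uncertainty (assumed)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, sigma, x.size)  # illustrative "truth"

for n in range(1, 11):                        # polynomial degree n = 1 .. 10
    model = np.polyval(np.polyfit(x, y, n), x)
    chi2 = np.sum(((y - model) / sigma) ** 2)  # should not increase as n grows
    nu = x.size - (n + 1)                      # dof: N points minus n+1 coefficients
    print(f"n={n:2d}  chi2={chi2:8.2f}  chi2/nu={chi2 / nu:6.3f}")
```

The raw `chi2` column is non-increasing, while `chi2/nu` typically flattens out (or ticks up) once the extra terms stop buying real improvement.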

Whether the additional complexity is warranted can be checked by comparing the change in the chi-squared against the change in the number of degrees of freedom of the fit. For example, you could use the Bayesian Information Criterion (BIC), but generally speaking, if the reduced chi-squared does not decrease then you are probably over-fitting the data.
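For Gaussian errors with known uncertainties, $-2\ln\mathcal{L} = \chi^2 + \text{const}$, so a BIC comparison reduces to the following sketch (the chi-squared values in the example call are made up for illustration):

```python
import numpy as np

def bic(chi2, k, n_data):
    # For Gaussian errors with known sigma, -2 ln L = chi^2 + const,
    # so up to a model-independent constant: BIC = chi^2 + k ln N.
    return chi2 + k * np.log(n_data)

# Illustrative numbers: adding one parameter (k: 3 -> 4) must cut chi^2
# by more than ln(50) ~ 3.9 to lower the BIC for N = 50 points.
print(bic(42.0, 3, 50), bic(41.0, 4, 50))  # prefer the model with the lower BIC
```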

  • Another way to put the latter part of the final sentence is that the point at which the reduced chi-squared stops decreasing significantly relative to the previous iteration (the fit with one fewer parameter) is the point at which one might well be overfitting. Commented Jan 13, 2023 at 11:59
  • @DavidHammen it can be more complicated. For example the true function could be odd or even, in which case adding even or odd polynomial terms won't change the chi-squared and the reduced chi-squared will increase. But adding a further term could then see a significant decrease. – ProfRob Commented Jan 13, 2023 at 12:42
  • That is correct. Sometimes it's better to take a stepwise approach: look at each model parameter individually, add the one that results in the best improvement to the model, and repeat until there are no parameters left or the improvement is negligible. An alternative is to start with all model parameters in the first round and then eliminate them one by one based on some measure of statistical (in)significance, stopping when removing any one of the remaining parameters would cost a statistically significant amount of fit quality. Commented Jan 13, 2023 at 12:55

Should I expect a decrease in the reduced chi-square values with increasing n?

You should see a decrease up to the point where you start overfitting -- assuming that the correct model really is some kind of polynomial. If it's not a polynomial, adding ever more terms might, by luck, capture ever more of the unmodeled variation.

Suppose one has captured data at $n>4$ points along some domain of interest, each with a bit of measurement error, and suppose the true model is a quadratic. The regression will improve with each step in going from a constant to a line to a quadratic to a cubic, so the chi-squared value will decrease with each step (but not by much for the last step, from a quadratic to a cubic). The reduced chi-squared value will decrease nicely on moving from a constant to a line to a quadratic, but it won't decrease by much, or it might even increase, on that last step from a quadratic to a cubic. That's a sign of overfitting (and also of using the wrong model).

Suppose one has lots of measurements (e.g., $n=1000$) and the measurement noise is not particularly small. Going from a constant to a line to a quadratic still results in a nice stepwise decrease in the reduced chi-square value. Here the step to a cubic might also result in a decrease because the addition of the cubic term might happen to capture some of the measurement noise and because dividing by 996 is not all that different from dividing by 997.
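A quick simulation of both scenarios (a sketch; the quadratic truth, noise level, and sample sizes are illustrative choices, not anything from the question):

```python
import numpy as np

rng = np.random.default_rng(1)

def reduced_chi2_by_degree(n_points, sigma=0.5, max_deg=3):
    """Reduced chi-squared of polynomial fits (degree 0..max_deg) to noisy quadratic data."""
    x = np.linspace(-1.0, 1.0, n_points)
    y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(0.0, sigma, n_points)  # quadratic "truth"
    result = {}
    for deg in range(max_deg + 1):
        model = np.polyval(np.polyfit(x, y, deg), x)
        chi2 = np.sum(((y - model) / sigma) ** 2)
        result[deg] = chi2 / (n_points - (deg + 1))   # divide by degrees of freedom
    return result

print(reduced_chi2_by_degree(20))     # few points: the step to a cubic is flat or worse
print(reduced_chi2_by_degree(1000))   # many points: the cubic step may still edge down
```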

