Questions tagged [stepwise-regression]
Stepwise regression (often called forward or backward regression) involves fitting a regression model and adding or removing predictors based on $t$ statistics, $R^2$, or information criteria to arrive in a *stepwise* manner at a final model. This tag can also be used for forward selection, backward elimination, and best-subsets variable selection strategies.
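As a rough illustration of the procedure this tag describes, here is a minimal sketch of forward selection by AIC. It assumes a Gaussian linear model fitted by ordinary least squares and uses synthetic data; the helper names `ols_aic` and `forward_select` are hypothetical and do not come from any library. It illustrates the mechanics only and is not a recommended modelling workflow.

```python
import numpy as np

def ols_aic(X, y):
    """AIC (up to an additive constant) of a Gaussian OLS fit of y on X."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + 2 * k

def forward_select(X, y):
    """Greedy forward selection: at each step add the predictor that
    lowers AIC the most, and stop when no addition lowers it."""
    n, p = X.shape
    intercept = np.ones((n, 1))
    selected = []
    best_aic = ols_aic(intercept, y)  # start from the intercept-only model
    while len(selected) < p:
        candidates = [j for j in range(p) if j not in selected]
        scores = {
            j: ols_aic(np.hstack([intercept, X[:, selected + [j]]]), y)
            for j in candidates
        }
        j_best = min(scores, key=scores.get)
        if scores[j_best] >= best_aic:
            break  # no candidate improves AIC
        selected.append(j_best)
        best_aic = scores[j_best]
    return selected, best_aic

# Synthetic data (an assumption for illustration): only columns 0 and 3 matter.
rng = np.random.default_rng(0)
n, p = 200, 6
X = rng.normal(size=(n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 3] + rng.normal(size=n)

selected, aic = forward_select(X, y)
print("selected columns:", selected, "AIC:", round(aic, 2))
```

Backward elimination follows the same pattern, starting from the full model and dropping the predictor whose removal lowers the criterion most. Pure-noise columns can still be selected by chance, which is one of the issues several questions under this tag discuss.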
333
questions
0
votes
0
answers
40
views
Selecting degrees of freedom in stepwise regression (stepAIC function in R)
Context: I have data available on water quality in a number of catchments, for example the concentration of zinc (Zn). For each catchment, I also have a range of characteristics (n=16), such as the ...
1
vote
0
answers
57
views
Least-bad stepwise procedure for a simulation that shows issues with stepwise regression
I am well aware of the issues that stepwise regression causes. I want to demonstrate some of them via simulation in a particular situation.
I am thinking of a regression where I have some categorical ...
0
votes
0
answers
8
views
Can you deduce if a lasso model has a smaller/larger/equal RSS to a forward selection model?
I came across this question in my exam. It gives a table whose columns are the different model selection methods: OLS, Lasso, Forward_Size1, ForwardSize2. The rows are the predictors, ...
0
votes
0
answers
31
views
Can you infer that non-significant variables in full model won't be chosen by stepwise regression methods?
I recently encountered this question twice on my exam. If you fit a full additive MLR model, can you infer that the insignificant predictors (p-value > 0.05 from lm output) will not be chosen ...
0
votes
0
answers
20
views
Stepwise regression: when will bidirectional selection give different results?
There is forward selection and backward elimination, and in both cases we can not only add or subtract variables, but also do both. My question is under which circumstances would the ...
0
votes
2
answers
58
views
Intuition for stepwise selection of predictors in linear regression
I am using a tool in genetics that works very similarly to stepwise linear regression (it's called GCTA-COJO, for those interested). Essentially, the starting situation looks like this:
You have 1000s of ...
1
vote
1
answer
45
views
Significance test of an increase in adjusted R-squared between two models
I found two papers that report the statistical significance, with p-values, of an increase in adjusted R-squared between two models, to show the improvement after adding a few ...
0
votes
0
answers
103
views
Interpreting coefficients in Linear regression with categorical variables and one hot encoding (drop first)
I am doing multiple linear regression where my independent variables are a mix of categorical and numerical variables. Obviously I need to one-hot-encode the categorical variables, and I need to "...
2
votes
1
answer
44
views
Feature selection in a traditional regression model for experimental data
I have experimental data (a total of 96) with 10 predictors and 2 response variables. I want to fit a traditional multiple linear regression model to them in R. My aim is to build clearly ...
1
vote
0
answers
22
views
Automated Code for Logistic Regression [closed]
My Y variable (output) is binary (0 or 1). I have 10 input variables in total: 3 of them are scaled variables, and 2 of them are ordinal and are therefore written with C( ). Rather than running the ...
2
votes
1
answer
293
views
How are missing values handled in a stepwise model process?
Suppose I have 100 observations among 4 variables, a Y with no missing values, then X1, X2, X3 that each have 10 (distinct) missing values, so that the complete case analysis has only N=70 ...
1
vote
1
answer
850
views
The reason for different regression results between "enter" and "stepwise" methods
One of my colleagues and I conducted a regression analysis in SPSS. There is a significant difference between the results we obtained using the "enter" and "stepwise" methods. All ...
1
vote
0
answers
82
views
Using step() and car::vif(): order matters?
When fitting linear models and coming up with a plausible one, AIC and VIF are often used. However, I notice that the order in which the methods are used makes a difference to the final model.
Should ...
1
vote
0
answers
44
views
What relevance do p-values have in a multiple regression of all 50 states (a.k.a. population/census data)?
I am running a regression analysis of the U.S. (specifically, the population represented in the U.S. Congress by voting members, so census data for the 50 states would constitute population data). One ...
1
vote
0
answers
42
views
Magnitude of Type I error inflation related to error noise after model selection
I am investigating Type I error inflation for the one-sided test $$H_0:\beta_2=0$$ after one step of forward stepwise regression for the following model: $$Y=X_1+\beta_2X_2+\epsilon$$ where $\epsilon \...