
Questions tagged [random-forest]

Random forest is a machine-learning method based on combining the outputs of many decision trees.

0 votes
0 answers
21 views

How does Random Forest handle missing values in scikit-learn? [duplicate]

What technique does the Random Forest Regressor from scikit-learn use to handle missing values? At first I thought that a Random Forest regressor was able to natively handle missing values during training ...
Maxime Charrière
3 votes
1 answer
35 views

Can I use simulated data only for testing a Random Forest regression already trained on real data?

I am working in Python on a Random Forest regression to predict a target variable. I have trained and tested it on real data, obtaining satisfactory results. Now, I would like to ...
Ismaela Avellino
0 votes
0 answers
15 views

Feature selection for logistic regression and random forest (using Orange - no coding)

I’m using Orange to create a prediction model for the Indian liver patient dataset (binary target variable – either has or does not have liver disease – with 580 instances and 10 features). I’m using ...
Jess
  • 21
1 vote
1 answer
18 views

Why does RandomForestRegressor.score() return the coefficient of determination? [duplicate]

In scikit-learn's method RandomForestRegressor.score(X, y), the coefficient of determination $R^2$ is returned as a metric of the ...
Maxime Charrière
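
For readers of the score() question above: scikit-learn documents the regressor score as $R^2 = 1 - u/v$, where $u$ is the residual sum of squares and $v$ is the total sum of squares. The following is a minimal pure-Python sketch of that formula, not scikit-learn's own code; the function name `r_squared` and the toy values are assumptions made for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: squared error of always predicting the mean.
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Toy values, invented for this example.
y_true = [3.0, 5.0, 7.0, 9.0]
perfect = r_squared(y_true, y_true)       # predictions equal the targets
baseline = r_squared(y_true, [6.0] * 4)   # always predict the mean of y_true
worse = r_squared(y_true, [0.0] * 4)      # worse than predicting the mean
```

Under this definition, a perfect fit gives 1, predicting the mean of the targets gives 0, and a model that does worse than the mean gives a negative value, so the score is not a squared correlation in general.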
0 votes
0 answers
20 views

K-fold cross-validation in a regression model

How can I use K-fold CV to evaluate my regression model's performance, calculating R2, MAE and MSE on the train set, to make the model more robust? The code below refers to the tuned model and I'm ...
Vinicius Maia
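
As a rough illustration of the K-fold question above, here is a minimal pure-Python sketch of fold-wise evaluation with R2, MAE and MSE. It is not the asker's code: `k_fold_indices`, `fit_mean` and `fold_metrics` are hypothetical helpers, the mean predictor is only a stand-in for a fitted random forest, and the data are invented. Each fold's metrics are computed on the held-out part, not on the part used for fitting.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) for k contiguous, near-equal folds (no shuffling)."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size


def fit_mean(y_train):
    """Stand-in model (placeholder for a random forest): predicts the training mean."""
    mean = sum(y_train) / len(y_train)
    return lambda n: [mean] * n


def fold_metrics(y_true, y_pred):
    """Return R2, MAE and MSE for one held-out fold."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_y = sum(y_true) / n
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    # R2 is undefined when the held-out targets are all identical.
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")
    return {"r2": r2, "mae": mae, "mse": mse}


# Invented target values; a real dataset would also carry a feature matrix.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
results = []
for train_idx, test_idx in k_fold_indices(len(y), k=3):
    predict = fit_mean([y[i] for i in train_idx])
    y_test = [y[i] for i in test_idx]
    results.append(fold_metrics(y_test, predict(len(y_test))))

# Average each metric over the folds to get the cross-validated estimate.
mean_mae = sum(r["mae"] for r in results) / len(results)
mean_mse = sum(r["mse"] for r in results) / len(results)
```

The folds here are contiguous for simplicity; with real data the rows would normally be shuffled first. Cross-validation of this kind gives a steadier estimate of performance, but it does not by itself change the fitted model.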
1 vote
1 answer
25 views

How to train a model and tune its hyperparameters

As I am new to machine learning, and learning it myself, pardon me if I ask a silly question. My question is: What is the correct approach to building a model for, say, random forest and tuning ...
NEERAJ YADAV
4 votes
2 answers
66 views

If R2 is not appropriate for non-linear ML algorithms such as Random Forests, can a Pearson or Spearman correlation be used as a performance metric?

$R^2$ is not appropriate for non-linear models, such as Random Forest (RF) models. https://arxiv.org/pdf/1611.03063 Is R-squared truly an invalid metric for non-linear models? https://...
JElder
  • 1,037
1 vote
2 answers
51 views

Is duplicating a dataset a form of augmentation?

For a very small dataset, the random forest regressor model overfits heavily. I have removed extraneous data and applied scaling and feature selection, but the overfitting is still there. The oversampling ...
Erfan Mollai
1 vote
0 answers
16 views

Fixed-effect trained model inspection in mixed-effects random forest (MERF)

I have run a Mixed-Effects Random Forest (MERF) using the Python merf module (see the example use there, and also the blog post). I have read the above and also Hajjem et al.'s paper, to get an idea of ...
Emma Wiik
1 vote
0 answers
30 views

Significant performance drop between train and validation set

I am trying both LightGBM and RandomForest for a classification task, and I observe the same problem. I am using various hyperparameters to prevent overfitting, such as max_depth, num_trees (keeping it small for ...
Baron Yugovich
0 votes
0 answers
15 views

Enforcing symmetries for "bag-of-vector" data in XGBoost or random forest - geodata example for illustration

I'll give a concrete toy problem, then give some comments on what sorts of abstractions I care about. Toy problem: Each person $i$ in my dataset has a phone, and every once in a while the phone will ...
user1557414
1 vote
0 answers
25 views

What is the difference between model\$pred and model\$finalModel\$votes in a random forest model trained by caret?

I trained a random forest model as below: ...
Robin
  • 119
4 votes
2 answers
533 views

Why is my randomForest model in R overfitting?

I am trying to train a Random Forest model in R for sentiment analysis. The model works with a tf-idf matrix and learns from it how to classify a review as positive or negative. Positive ones are ...
Anisa
  • 43
0 votes
0 answers
19 views

How do I interpret a Random Forest Survival C-index value relative to the Requested performance error?

I'm doing a random forest survival analysis for a school project and I'm confused about the C-index output that I'm getting relative to the Requested performance error. Shouldn't my C-index get higher ...
Jake S
  • 41
2 votes
0 answers
67 views

Causal forests for causal interaction effects between two treatment factors

I'm analyzing survey experiment data with a $2 \times 2$ factorial design, where each factor is randomly assigned with equal probability. I'd also like to know the heterogeneous effect of the ...
Jin
  • 21
