Questions tagged [scikit-learn]
A machine-learning library for Python. Use this tag for any on-topic question that (a) involves scikit-learn either as a critical part of the question or expected answer, and (b) is not just about how to use scikit-learn.
1,807
questions
1
vote
0
answers
22
views
I have a dataset with 18 biomarker features and a target variable. I want to find the features that have the biggest impact on the target
I have some disease biomarker datasets that contain 18 biomarker readings from different samples and a target variable that shows the presence or absence of disease (features are both categorical and ...
0
votes
0
answers
21
views
How does Random Forest handle missing values in scikit-learn? [duplicate]
What technique does the Random Forest Regressor from scikit-learn use to handle missing values?
At first I thought that a Random Forest regressor was able to natively handle missing values during training ...
0
votes
0
answers
7
views
Why the different default parameters for scikit-learn gradient boosting classifiers? (GradientBoostingClassifier and HistGradientBoostingClassifier)
Why do gradient boosting classifiers (GradientBoostingClassifier) and histogram-based gradient boosting classifiers (HistGradientBoostingClassifier) have significantly different default hyperparameter ...
0
votes
0
answers
18
views
scikit-learn CCA: x_loadings_ attribute
I'm doing a canonical correlation analysis using scikit-learn's CCA. After doing the usual steps and calling ca.x_loadings_, I see that I get values bigger than 1. ...
2
votes
1
answer
46
views
Meaning/interpretation of intercept_ in partial least squares
After using the sklearn library for Partial Least Squares, I am unsure how to interpret the "intercept" of the model.
As you can see in the code that follows, and its corresponding ...
0
votes
0
answers
32
views
How to handle data normalization when a logarithmic scale is required?
Let's say we wished to build a Regressor (e.g. a Support Vector Regressor) to predict the price of an asset, within a given time span from now on.
However, what if the historical data we have ...
1
vote
0
answers
26
views
What are the best options for imputing a time series that is missing many days? [closed]
I have many months of temperature data recorded roughly every ten minutes. Except it has gaps. If the gap is an hour or so, I can linearly interpolate, but if the gap is a few days this obviously ...
0
votes
1
answer
20
views
How does KNNImputer store the fitted values of the training set?
If someone here is familiar with the KNNImputer implementation in scikit-learn, I would be eager to learn this from them.
When you fit an Imputer transformer on your ...
0
votes
0
answers
14
views
GridSearchCV performs worse than baseline
I'm working on a binary classification problem using scikit-learn. One of the models I've tested is KNeighborsClassifier, for ...
4
votes
2
answers
89
views
Finding the corners of noisy polygons
I have some polygons that look for example like this:
If I zoom in very close on one side, you can see the noise.
The data is a list of x coordinates and a corresponding list of y coordinates.
I ...
2
votes
1
answer
74
views
Is my understanding of, and approach to, nested cross-validation and final model tuning correct?
I am training an SVM on limited training data with unbalanced classes.
Here are the things that I want to do:
1.) I want to make a statement of the generalizability ...
1
vote
1
answer
38
views
Reason for high MSE and negative R-squared value
I am getting a really high MSE and a negative R-squared value.
Dataset: https://docs.google.com/spreadsheets/d/1moTZS_LgOn6d74NC44i9lVcWchj-abVx/edit?usp=sharing&ouid=100514649347129021200&rtpof=...
2
votes
1
answer
32
views
How to interpret the results of a classifier when the train/test method gives much better results than the cross-validated one?
I need your help to understand a situation where using a train and test set produces perfect results (in terms of accuracy, precision, and recall) but when cross-validation is used, the accuracy on ...
1
vote
0
answers
60
views
An error occurred when using xgboost as a classifier for hiclass [closed]
Below is my example of using the xgboost classifier with hiclass. My question is specifically about the hiClass Python package for hierarchical classification. I would like to model the ...
6
votes
1
answer
43
views
What is happening behind the scenes when we use CalibratedClassifierCV without prefit?
From what I understood by reading sklearn Probability Calibration, when we run CalibratedClassifierCV we will fit "a regressor (called a calibrator) that maps the output of the classifier (as ...