I was testing out all of the sklearn regressors:

[compose.TransformedTargetRegressor(), AdaBoostRegressor(), BaggingRegressor(), ExtraTreesRegressor(), GradientBoostingRegressor(), RandomForestRegressor(), HistGradientBoostingRegressor(), LinearRegression(), Ridge(), RidgeCV(), SGDRegressor(), ARDRegression(), BayesianRidge(), HuberRegressor(), RANSACRegressor(), TheilSenRegressor(), PoissonRegressor(), TweedieRegressor(), PassiveAggressiveRegressor(), KNeighborsRegressor(), MLPRegressor(), svm.LinearSVR(), svm.NuSVR(), svm.SVR(), tree.DecisionTreeRegressor(), tree.ExtraTreeRegressor(), xgb.XGBRegressor(), xgb.XGBRFRegressor()]

on the iris dataset, and I'm confused why MLPRegressor isn't working. I'm predicting the sepal length given the other 3 features, and every single regressor with default hyperparameters has a test-data MAE of 0.25 to 0.34, except for MLPRegressor, which has an MAE of 1.0! I've tried things like scaling and hyperparameter tuning, but MLPRegressor is always wildly inaccurate.

EDIT: After comparing eschibli's code to mine, I figured out that the problem was my scaler. I was using this code:

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
scaler.fit(X)  # fitted on the full feature matrix
X = scaler.transform(X)

Why is scaling the iris dataset making the MAE much worse?

1 Answer

While the default hyperparameters are wildly inappropriate for such a problem, I was able to obtain an MAE of 0.32 on the first run with them (varying the random seed produced values from 0.29 to 0.55 over five tries). I expect that choosing a (much!) smaller hidden layer, scaling the data, and/or tweaking the regularization parameter would produce much better and more consistent results.

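from sklearn.datasets import load_iris
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

iris = load_iris()
X = iris.data[:, 1:]  # sepal length is the first feature
y = iris.data[:, 0]

X_train, X_test, y_train, y_test = train_test_split(X, y)
model = MLPRegressor()  # default hyperparameters
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print(mean_absolute_error(y_test, y_pred))
# > 0.3256431821425728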
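
The adjustments suggested above (a smaller hidden layer, scaled inputs, a different regularization strength) might look roughly like the sketch below. The specific values for `hidden_layer_sizes`, `alpha`, `max_iter`, and `random_state` are illustrative assumptions, not tuned results. Putting the scaler in a pipeline means it is fitted on the training split only, and `max_iter` is raised on the assumption that the default of 200 iterations may stop before the optimizer has converged.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = load_iris()
X = iris.data[:, 1:]  # sepal width, petal length, petal width
y = iris.data[:, 0]   # sepal length is the target

# Fixed seed only so that the run is repeatable (an assumption for illustration).
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# The pipeline fits the scaler on the training data only,
# then applies the same transform at prediction time.
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(
        hidden_layer_sizes=(8,),  # much smaller than the default (100,)
        alpha=1e-3,               # L2 regularization strength, illustrative value
        max_iter=5000,            # default is 200, which may be too few
        random_state=0,
    ),
)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
print(mae)
```

Repeating this with several values of `random_state` gives a sense of how consistent the score is from run to run.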

Perhaps you could share the rest of your code?
