I'm kind of new to machine learning and I am using MLPRegressor. I split my data with

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

Then I create and fit the model, and run 10-fold cross-validation on the test set.

nn = MLPRegressor(hidden_layer_sizes=(100, 100), activation='relu',
                     solver='lbfgs', max_iter=500)

nn.fit(X_train, y_train)

TrainScore = nn.score(X_train, y_train)

kfold = KFold(n_splits=10, shuffle=True, random_state=0)
# Run the cross-validation once and reuse the scores
cv_scores = cross_val_score(nn, X_test, y_test, cv=kfold)
print("Cross-validation scores:\t{} ".format(cv_scores))
av_cross_val_score = np.mean(cv_scores)
print("The average cross validation score is: {}".format(av_cross_val_score))

The problem is that the test scores I receive are very negative (-4256). What could possibly be wrong?


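To keep the convention consistent, sklearn treats every score as something to maximize, whether it is classification accuracy or a regression error such as MSE. Error metrics are therefore negated (for example scoring='neg_mean_squared_error'), so a more positive number is better and a more negative number is worse; a less negative MSE is preferred. Note that when no scoring argument is passed, as in your code, cross_val_score uses the estimator's own score method, which for MLPRegressor is R^2. R^2 is at most 1 and can be arbitrarily negative when the model predicts worse than simply returning the mean of y.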
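To make the "higher is better" convention concrete, here is a small sketch using made-up numbers (the arrays below are purely illustrative, not your data). It shows that R^2, the default score of a regressor, can be hugely negative for bad predictions, and that MSE has to be negated to fit the same convention.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical targets and two sets of predictions, for illustration only
y_true = np.array([1.0, 2.0, 3.0, 4.0])
y_good = np.array([1.1, 1.9, 3.2, 3.8])      # close to the targets
y_bad = np.array([40.0, -30.0, 25.0, -50.0])  # far worse than predicting the mean

# R^2 is what regressor.score() and the default cross_val_score report
r2_good = r2_score(y_true, y_good)
r2_bad = r2_score(y_true, y_bad)

# MSE is an error (lower is better), so sklearn scorers use its negation
neg_mse_bad = -mean_squared_error(y_true, y_bad)

print("R^2 (good predictions):", r2_good)
print("R^2 (bad predictions):", r2_bad)
print("Negated MSE (bad predictions):", neg_mse_bad)
```

If you would rather see negated MSE from cross-validation, passing scoring='neg_mean_squared_error' to cross_val_score should do it.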

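As for why it is so negative in your case, there are broadly two possibilities: overfitting or underfitting. Keep in mind that cross_val_score refits a fresh copy of the model on every fold, so in your code the network fitted with nn.fit(X_train, y_train) is never the one being scored, and each fold trains on only 90% of X_test, which is itself just 20% of your data. There are plenty of resources out there to help you from this point forward.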
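One rough way to tell the two cases apart is to compare the training score with the cross-validation score computed on the training split, and keep the test set for a single final check. The sketch below is only an illustration under a few assumptions of mine: the data comes from make_regression as a stand-in for yours, the network is smaller so it runs quickly, and I added a StandardScaler because MLPs are sensitive to feature scaling (that step is not in your code).

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import KFold, cross_val_score, train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption); replace with your own X and y
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Scaling inside a pipeline is my addition; it is refit on each training fold
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(20, 20), activation='relu',
                 solver='lbfgs', max_iter=500, random_state=0),
)

# Cross-validate on the training split, not on the test split
kfold = KFold(n_splits=10, shuffle=True, random_state=0)
cv_scores = cross_val_score(model, X_train, y_train, cv=kfold)

# Fit once on all training data, then score the held-out test set once
model.fit(X_train, y_train)
train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)

print("Train R^2:", train_score)
print("Mean CV R^2:", np.mean(cv_scores))
print("Test R^2:", test_score)
```

A high training score with a much lower cross-validation score points towards overfitting, while low scores on both point towards underfitting.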

