
Similar to this question about MLPClassifier, I suspect the answer is 'no' but I will ask it anyway.

Is it possible to change the activation function of the output layer in an MLPRegressor neural network in scikit-learn?

I would like to use it for function approximation. I.e.

y = f(x)

where x is a vector of no more than 10 variables and y is a single continuous variable.

So I would like to change the output activation to linear or tanh. Right now it looks like sigmoid.

If not, I fail to see how you can use scikit-learn for anything other than classification, which would be a shame.

Yes, I realise I could use TensorFlow or PyTorch, but my application is so basic that I think scikit-learn would be a perfect fit (pardon the pun).

Is it possible to build a more customized network with MultiLayerPerceptron or perhaps from individual layers (sknn.mlp)?

UPDATE:

In the documentation for MultiLayerPerceptron it does say:

For output layers, you can use the following layer types: Linear or Softmax.

But then further down it says:

When using the multi-layer perceptron, you should initialize a Regressor or a Classifier directly.

And there is no example of how to instantiate a MultiLayerPerceptron object.


2 Answers


I tried to inject a modified initialization, which allows you to set the output activation:

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.neural_network._base import ACTIVATIONS, DERIVATIVES

model = MLPRegressor()

def inplace_capped_output(X):
    """Compute a capped linear function in place.

    Parameters
    ----------
    X : {array-like, sparse matrix}, shape (n_samples, n_features)
        The input data.
    """
    np.clip(X, -40, 40, out=X)  # out=X keeps the operation in place


# register the new activation under a name the forward pass can look up
ACTIVATIONS["custom"] = inplace_capped_output

# wrap _initialize so the fitted model uses the custom output activation
model._old_initialize = model._initialize

def _initialize(self, y, layer_units, dtype):
    self._old_initialize(y, layer_units, dtype)
    self.out_activation_ = "custom"

model._initialize = _initialize.__get__(model)

The binding of the modified initialize function follows this post.
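As a quick illustration (my own toy data, not part of the original snippet), fitting the patched model should leave out_activation_ set to "custom"; keep in mind that, as explained next, the output-layer gradient during training is still the one scikit-learn uses for an identity output until backpropagation is also adapted:

import numpy as np

X, y = np.random.rand(200, 5), np.random.rand(200)  # toy data: 5 features, one continuous target
model.fit(X, y)
print(model.out_activation_)  # should print 'custom'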

You need to do something similar for DERIVATIVES, which is needed by the backpropagation algorithm.
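A minimal sketch of that, assuming the entries in DERIVATIVES follow the same convention as the built-in ones (each receives the already-activated values Z together with the back-propagated delta and modifies delta in place):

import numpy as np
from sklearn.neural_network._base import DERIVATIVES  # already imported above

def inplace_capped_output_derivative(Z, delta):
    """Multiply delta in place by the derivative of the capped linear activation.

    Z holds the already-activated (clipped) values, so the slope is 1 inside
    the cap and 0 wherever the output sits at the +/-40 limits.
    """
    delta[np.abs(Z) >= 40] = 0

DERIVATIVES["custom"] = inplace_capped_output_derivative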

Be aware that you also need to adapt https://github.com/scikit-learn/scikit-learn/blob/2beed55847ee70d363bdbfe14ee4401438fba057/sklearn/neural_network/_multilayer_perceptron.py#L274, because deltas[last] = activations[-1] - y no longer holds if you provide an arbitrary custom output activation.
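To spell out why: for squared loss with a general output activation g, the chain rule gives

deltas[last] = (g(z_last) - y) * g'(z_last)

which collapses to activations[-1] - y only when g is the identity. For the capped activation above, g' is 1 inside the cap and 0 where the output is clipped, so the adapted line would have to fold that derivative into the delta (for instance by applying inplace_capped_output_derivative to deltas[last] after computing activations[-1] - y).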

  • Thanks for sharing this. I'm just trying it out. Getting the following error: TypeError: _initialize() missing 1 required positional argument: 'dtype' when I call model.fit(...) on this line in _fit: self._initialize(y, layer_units)
    – Bill
    Commented Jun 22, 2021 at 18:41
  • The signature of the initialize function depends on the version of sklearn. In the newest version there is the dtype argument: github.com/scikit-learn/scikit-learn/blob/2beed5584/sklearn/… Probably you run on an older version. You can fix the problem with your version by removing the two mentions of dtype in the above snippet.
    – Ggjj11
    Commented Jun 22, 2021 at 18:52
  • That was the problem. I upgraded from 0.22.2 to 0.24.2 and now it works perfectly. Here is my demo code which shows how to change the output layer activation function to 'tanh' using this method. Thanks!
    – Bill
    Commented Jun 22, 2021 at 19:31

I see in the code for the MLPRegressor that the final activation comes from a general initialisation function in the parent class, BaseMultiLayerPerceptron, and the logic for what you want is shown around Line 271.

# Output for regression
if not is_classifier(self):
    self.out_activation_ = 'identity'
# Output for multi class
...

Then during a forward pass this self.out_activation_ is called (defined here):

# For the last layer
output_activation = ACTIVATIONS[self.out_activation_]
activations[i + 1] = output_activation(activations[i + 1])

That ominous-looking variable ACTIVATIONS is simply a dictionary, with the keys being the names you can choose as a parameter in your MLP, each mapping to an actual function. Here is the dictionary:

ACTIVATIONS = {'identity': identity, 'tanh': tanh, 'logistic': logistic,
               'relu': relu, 'softmax': softmax}
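As a quick sanity check of which entry a regressor actually uses (my own toy example; note that out_activation_ is only created once fit() has run, which is why it is missing right after instantiation):

import numpy as np
from sklearn.neural_network import MLPRegressor

reg = MLPRegressor(hidden_layer_sizes=(10,), max_iter=500)
reg.fit(np.random.rand(50, 2), np.random.rand(50))
print(reg.out_activation_)  # 'identity' for a regressor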

With all of this information, you might be able to come up with a few ways of putting in your custom function. Off the top of my head, I can't see a quick way to simply provide a function. You could for example:

  1. define your function where all the other activation functions are defined
  2. add it to that ACTIVATIONS dictionary
  3. make self.out_activation_ equal to the name of your custom function (or even add a new parameter to MLPRegressor); see the sketch after this list
  4. cross your fingers it doesn't break something somewhere else
  5. run it and solve the inevitable small adaptations that will be necessary in a few places
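Steps 2 and 3 can even be tried from user code, without editing the library source. Below is a rough sketch of that idea for recent scikit-learn versions, where the functions in ACTIVATIONS modify their argument in place; the name "double_relu" and the whole function are my own illustration, not part of the sklearn API, and swapping out_activation_ after fitting only changes predict(), not how the network was trained:

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.neural_network._base import ACTIVATIONS

def inplace_double_relu(X):
    """Hypothetical custom output activation: 2 * max(0, x), applied in place."""
    np.maximum(X, 0, out=X)
    X *= 2

ACTIVATIONS["double_relu"] = inplace_double_relu  # step 2: register it

reg = MLPRegressor(hidden_layer_sizes=(20,), max_iter=500)
reg.fit(np.random.rand(200, 3), np.random.rand(200))  # trains with the default 'identity' output

reg.out_activation_ = "double_relu"  # step 3: point the regressor at the custom entry
print(reg.predict(np.random.rand(5, 3)))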

I'm afraid I have never looked at the source code of that library before, so I cannot give more nuanced advice. Perhaps there is a beautifully elegant way to do it that we have both overlooked.

  • Thanks very much for this. If I understand this correctly, the default activation for an MLPRegressor is set to 'identity'. But the MLPRegressor object has no attribute out_activation_, so I guess it isn't exposed as an attribute. Identity (linear activation) may be fine for what I need. Is there an easy way to confirm what activation it is?
    – Bill
    Commented Apr 27, 2018 at 15:59
  • I don't know if it's relevant, but the last change to this module in Aug 2017 was to line 271, to use this is_classifier method from ..base import. Is out_activation_ not being created for some reason?
    – Bill
    Commented Apr 27, 2018 at 16:21
  • I raised an issue in scikit-learn.
    – Bill
    Commented Apr 27, 2018 at 16:38
  • It is the norm to have a linear activation for the last layer. This basically means no activation, as a linear transform doesn't really do anything but scale your output. Keras has a linear activation at the end; it just multiplies by 1 - described here. Here is a related question and here is a related blog post.
    – n1k31t4
    Commented Apr 27, 2018 at 16:38
  • It seems they don't have it straight after instantiation. Perhaps it is buried somewhere or is only created after you have done further setup with the MLPRegressor object. Afraid I don't know anymore!
    – n1k31t4
    Commented Apr 27, 2018 at 17:06
