
I'm trying to teach myself about neural networks. I've been reading through the highly cited "Efficient BackProp" paper, and it has brought me to this question:

Since stochastic learning converges significantly faster than batch learning but has a higher level of noise (the noise prevents it from settling exactly at the minimum), should one perform stochastic learning until the network reaches a plateau/minimum and then use batch learning to finish the process?

When I say batch learning, I mean treating the entire training set as a single batch (compute the sum of the derivatives with respect to the weights over all samples, then update the network).
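To make the distinction concrete, here is a minimal sketch of both update rules on a toy least-squares problem (the model, data, and learning rate are just placeholders, not anything from the paper):

```python
import numpy as np

# Toy linear least-squares problem: loss = 0.5 * ||X w - y||^2
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
y = X @ rng.normal(size=5)
w = np.zeros(5)
lr = 0.01

def grad(w, xb, yb):
    # Sum of the per-sample gradients over whatever samples are passed in
    return xb.T @ (xb @ w - yb)

# Stochastic learning: update the weights after every single sample
for i in range(len(X)):
    w -= lr * grad(w, X[i:i+1], y[i:i+1])

# Batch learning (as defined above): one update from the entire training set
w -= lr * grad(w, X, y)
```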


1 Answer


Yes, you can perform stochastic learning followed by batch learning in neural networks. In fact, the very same paper discusses it:

Another method to remove noise is to use “mini-batches”, that is, start with a small batch size and increase the size as training proceeds. Møller discusses one method for doing this [25] and Orr [31] discusses this for linear problems. However, deciding the rate at which to increase the batch size and which inputs to include in the small batches is as difficult as determining the proper learning rate. Effectively the size of the learning rate in stochastic learning corresponds to the respective size of the mini batch.
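To illustrate the idea in that quote, here is a rough sketch of a training loop that grows the mini-batch size over epochs, so the updates drift from nearly stochastic toward full-batch. The doubling schedule, model, and learning rate are hypothetical, not the schemes of Møller or Orr:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = X @ rng.normal(size=10)
w = np.zeros(10)
lr = 0.001
n = len(X)

def grad(w, xb, yb):
    # Summed squared-error gradient over the current mini-batch
    return xb.T @ (xb @ w - yb)

for epoch in range(11):
    # Hypothetical schedule: double the batch size each epoch,
    # capped at the full training set (pure batch learning at the end)
    batch_size = min(2 ** epoch, n)
    perm = rng.permutation(n)
    for start in range(0, n, batch_size):
        idx = perm[start:start + batch_size]
        w -= lr * grad(w, X[idx], y[idx])
```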

But keep in mind that the paper was written in 1998, when GPUs were not commonly used to train neural networks. On GPUs, mini-batch training is much cheaper relative to stochastic updates than it is on CPUs. (See {1} for one of the first papers on using GPUs for neural networks.)

FYI: Tradeoff batch size vs. number of iterations to train a neural network


