
When training a neural network with backpropagation, I have often seen that the data is processed in batches. So instead of computing and applying a gradient update for each individual training sample, the average gradient is calculated over multiple samples, and that average is used for the update.
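
To make the mechanism concrete, here is a minimal sketch of the kind of update I mean, using a linear model with a squared-error loss in plain NumPy (the model, batch size, and learning rate are just placeholders, not anything specific):

```python
import numpy as np

def batch_gradient_step(w, X_batch, y_batch, lr=0.01):
    """One mini-batch update: average the per-sample gradients, then update once."""
    preds = X_batch @ w                         # predictions for every sample in the batch
    errors = preds - y_batch                    # per-sample errors
    grad = X_batch.T @ errors / len(y_batch)    # gradient averaged over the batch
    return w - lr * grad

# Toy data: 8 samples with 3 features, processed as two batches of 4
rng = np.random.default_rng(0)
X, y = rng.normal(size=(8, 3)), rng.normal(size=8)
w = np.zeros(3)
w = batch_gradient_step(w, X[:4], y[:4])        # one weight update per batch...
w = batch_gradient_step(w, X[4:], y[4:])        # ...instead of one per sample
```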

What is the reason for this? Is it because training is faster when you update the weights less frequently? Or is it because averaging over multiple samples avoids overfitting to the individual samples? If the latter is true, then why not train on all the samples at once, rather than dividing them into batches at all?

Thanks!


1 Answer


The latter is true: averaging over multiple samples keeps each update from being pulled around too much by any single sample. It would be nice to train on the whole dataset at once, but it is usually far too large for that to be technically feasible (it simply does not fit in memory), at least in image-analysis applications.
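
As a rough sketch of what batching looks like in practice (plain NumPy; `grad_fn` here is just a placeholder for whatever routine returns the average gradient of a batch, e.g. one backpropagation pass):

```python
import numpy as np

def minibatch_epoch(w, X, y, grad_fn, batch_size=32, lr=0.01):
    """One pass over the data as many small updates instead of a single full-batch one.
    Only `batch_size` samples need to be loaded and backpropagated at a time."""
    idx = np.random.permutation(len(y))          # shuffle so batches are not ordered
    for start in range(0, len(y), batch_size):
        batch = idx[start:start + batch_size]
        w = w - lr * grad_fn(w, X[batch], y[batch])
    return w
```

The memory cost is set by `batch_size`, not by the size of the full dataset, which is why batching is used even when a full-dataset gradient would otherwise be perfectly reasonable.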
