I'm trying to teach myself about neural networks. I've been reading through the "Efficient BackProp" paper, which is highly cited, and it has brought me to this question:
Since stochastic learning converges significantly faster than batch learning but is noisier (the noise prevents it from settling precisely into the minimum), should one perform stochastic learning until the network reaches a plateau/minimum, then switch to batch learning to finish the process?
When I say batch learning, I mean treating the entire training set as a single batch: compute the gradient of the error with respect to the weights, summed over all samples, and then perform a single update of the network.
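For concreteness, here's a rough sketch of the schedule I'm imagining, using plain NumPy and linear regression as a stand-in for a real network. The learning rate, epoch counts, and the plateau threshold are arbitrary placeholders, not values from the paper:

```python
import numpy as np

# Toy problem: fit w to y = X @ w_true + noise with squared error.
rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)

def loss(w):
    return 0.5 * np.mean((X @ w - y) ** 2)

w = np.zeros(d)
lr = 0.01

# Phase 1: stochastic learning -- one weight update per sample,
# run until the per-epoch loss improvement drops below a threshold
# (a crude plateau test).
prev = loss(w)
for epoch in range(100):
    for i in rng.permutation(n):
        grad_i = (X[i] @ w - y[i]) * X[i]  # gradient for a single sample
        w -= lr * grad_i
    cur = loss(w)
    if prev - cur < 1e-4:  # plateau reached; switch to batch learning
        break
    prev = cur

# Phase 2: batch learning -- one update per pass, using the gradient
# computed over the whole training set.
for step in range(500):
    grad = X.T @ (X @ w - y) / n  # full-batch gradient of the mean squared error
    w -= lr * grad

print("final loss:", loss(w))
```

The plateau test here is just epoch-over-epoch loss improvement; I'm not sure that's the right switching criterion, which is part of what I'm asking.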