In simple neural network backpropagation, we normally run one round of forward and backward propagation in every iteration. Let's assume we have one training example of arbitrary dimension and some initial weights. Using forward propagation, we calculate the predicted output. This predicted output is then used to calculate the total error, which is then backpropagated to recalculate the weights. After recalculating the weights for all the layers, we update the weights for all the layers at once. That is, we do not update the weights of one layer and then move on to the next; instead, we first recalculate the weights of all layers (layer by layer) and then update them all at once. We can conclude: "Recalculate the weights layer by layer, and then update all layers with the recalculated weights at once." Does this make sense? Is this the right way to update weights using backpropagation?
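To make the single-example case concrete, here is a minimal NumPy sketch of what I mean. The two-layer shape, the sigmoid hidden layer, the squared-error loss, the learning rate, and every function name (`forward`, `backward`, `train_step`, `loss`) are my own assumptions for illustration, not taken from any particular library. The point is only that both gradients are computed from the old weights before either layer is changed.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, W2):
    h = sigmoid(W1 @ x)   # hidden activations
    y_hat = W2 @ h        # linear output layer
    return h, y_hat

def loss(x, y, W1, W2):
    _, y_hat = forward(x, W1, W2)
    return 0.5 * float(np.sum((y_hat - y) ** 2))

def backward(x, y, W1, W2):
    """Gradients of 0.5 * ||y_hat - y||^2 with respect to W1 and W2."""
    h, y_hat = forward(x, W1, W2)
    delta2 = y_hat - y                       # error signal at the output
    grad_W2 = np.outer(delta2, h)
    delta1 = (W2.T @ delta2) * h * (1 - h)   # uses the OLD W2, not an updated one
    grad_W1 = np.outer(delta1, x)
    return grad_W1, grad_W2

def train_step(x, y, W1, W2, lr=0.1):
    # 1) compute the gradients for every layer, layer by layer
    grad_W1, grad_W2 = backward(x, y, W1, W2)
    # 2) only then update all layers at once
    return W1 - lr * grad_W1, W2 - lr * grad_W2

# Hypothetical toy data: one example with 3 inputs, 4 hidden units, 1 output.
rng = np.random.default_rng(0)
x = rng.normal(size=3)
y = np.array([1.0])
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(1, 4))

W1_new, W2_new = train_step(x, y, W1, W2)
```

If `W2` were updated before `delta1` is computed, the gradient for `W1` would no longer be the gradient of the loss at the current weights, which is why I think the update has to happen at the end.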
Now let's assume I have "m" examples instead of just one. In that case, each of these small gradient steps is taken after one backpropagation pass over all "m" examples.
I am confused about whether, with "m" examples, backpropagation works on the examples one by one. That is, it first takes the first example and updates the weights, then takes the second example and calculates the weights again, then the third example, and so on. Only at the end, once it has run over all the examples, does it take a single step towards the optimum point. If that is the case, is there any relation between the weights calculated for one example and the weights calculated for another, given that BP recalculates the weights for each example in sequence?
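Here is a sketch of one possible reading of the "m" examples case, assuming plain batch gradient descent. Under that assumption, the per-example gradients are all computed with the same, unchanged weights, then averaged, and only then is a single update applied. The network shape, the averaging by `m`, and the names `example_gradients` and `batch_step` are hypothetical choices of mine.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def example_gradients(x, y, W1, W2):
    """Gradients of the squared error for ONE example at fixed weights."""
    h = sigmoid(W1 @ x)
    delta2 = W2 @ h - y
    delta1 = (W2.T @ delta2) * h * (1 - h)
    return np.outer(delta1, x), np.outer(delta2, h)

def batch_step(X, Y, W1, W2, lr=0.1):
    m = X.shape[0]
    sum_W1 = np.zeros_like(W1)
    sum_W2 = np.zeros_like(W2)
    for x, y in zip(X, Y):
        # W1 and W2 are NOT modified inside this loop
        g1, g2 = example_gradients(x, y, W1, W2)
        sum_W1 += g1
        sum_W2 += g2
    # one single step, taken after all m examples have been seen
    return W1 - lr * sum_W1 / m, W2 - lr * sum_W2 / m

# Hypothetical toy data: m = 5 examples, 3 inputs, 4 hidden units, 1 output.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
Y = rng.normal(size=(5, 1))
W1 = rng.normal(size=(4, 3))
W2 = rng.normal(size=(1, 4))

W1_new, W2_new = batch_step(X, Y, W1, W2)
```

If this reading is right, the examples only interact through the sum of their gradients, so the order in which they are visited should not change the result. If the weights were instead updated inside the loop, that would be a different procedure (per-example, or stochastic, updates), and the order would matter.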