
I'm looking at this pytorch starter tutorial:

https://pytorch.org/tutorials/beginner/blitz/neural_networks_tutorial.html#sphx-glr-beginner-blitz-neural-networks-tutorial-py

the zero_grad() function is used to zero the gradients, which suggests that training runs in mini-batches. Is this a correct assumption? If so, where is the batch size defined?

I found the following for nn.conv2d:

For example, nn.Conv2d will take in a 4D Tensor of nSamples x nChannels x Height x Width.

In that case, is nSamples the batch size?
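For example, this quick check of my own (not from the tutorial) seems to confirm it:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3)
x = torch.randn(8, 3, 32, 32)  # nSamples=8, nChannels=3, Height=32, Width=32
print(conv(x).size())          # torch.Size([8, 16, 30, 30]) -- batch dim preserved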

But how do you specify the batch size for an nn.Linear layer? Do you decide what your mini-batches are when you load the data?

I am making a few assumptions here that may be totally incorrect; please correct me if I'm wrong. Thank you!

2 Answers


You predefine the batch_size in the DataLoader. For a linear layer you do not specify a batch size at all; you only specify the number of features coming from the previous layer and the number of features you want after the linear operation.
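To make the first point concrete, here is a minimal sketch (my own example with made-up random data, not from the tutorial) showing where batch_size is set:

import torch
from torch.utils.data import DataLoader, TensorDataset

# A toy dataset: 1000 samples with 20 features each, plus binary labels
dataset = TensorDataset(torch.randn(1000, 20), torch.randint(0, 2, (1000,)))

# The batch size is chosen here, when loading the data -- not in any layer
loader = DataLoader(dataset, batch_size=32, shuffle=True)

inputs, labels = next(iter(loader))
print(inputs.size())  # torch.Size([32, 20]) -- the first dimension is the batch size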

Here is a code sample adapted from the PyTorch docs:

import torch
import torch.nn as nn

m = nn.Linear(20, 30)         # in_features=20, out_features=30
input = torch.randn(128, 20)  # 128 here is the batch size
output = m(input)
print(output.size())          # torch.Size([128, 30])
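The same Linear(20, 30) layer would accept an input of any batch size, since its parameters depend only on the 20 input features and 30 output features.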

As Ryan said, you don't have to specify the batch size in Linear layers.

Here I'd like to add a few details to clarify further.

Let's first consider the equation of a linear layer:

Y = X Wᵀ + b

where X is a tensor with size batch_size × in_feats_dim, W is a weight matrix with size out_feats_dim × in_feats_dim, b is a bias vector with size out_feats_dim, and the output Y has size batch_size × out_feats_dim.

By now you can probably see that the parameters, W and b, are independent of your batch size.
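To see this concretely, here is a small sketch of my own:

import torch
import torch.nn as nn

layer = nn.Linear(20, 30)
print(layer.weight.size())  # torch.Size([30, 20]) -- out_feats_dim x in_feats_dim
print(layer.bias.size())    # torch.Size([30])     -- out_feats_dim

# The very same layer handles any batch size
for batch_size in (1, 16, 128):
    output = layer(torch.randn(batch_size, 20))
    print(output.size())    # (1, 30), then (16, 30), then (128, 30)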

You can check the implementation of torch.nn.functional.linear in the PyTorch source (around lines 1000 to 1002). It matches what we discussed above.
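In essence, it computes something equivalent to this (a simplified sketch of my own, not the actual PyTorch source):

import torch

def linear(input, weight, bias=None):
    # input: (batch_size, in_feats_dim), weight: (out_feats_dim, in_feats_dim)
    output = input.matmul(weight.t())
    if bias is not None:
        output = output + bias
    return output

Note that the batch size only appears in the input, never in the parameters.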
