How to train pytorch model with numpy data and batch size?

Question

I am learning the basics of PyTorch and wanted to create a simple 4-layer neural network with dropout to train on the Iris dataset for classification. After referring to many tutorials, I wrote this code.

import pandas as pd
from sklearn.datasets import load_iris
import torch
from torch.autograd import Variable

epochs=300
batch_size=20
lr=0.01

#loading data as numpy array
data = load_iris()
X=data.data
y=pd.get_dummies(data.target).values

#convert to tensor
X= Variable(torch.from_numpy(X), requires_grad=False)
y=Variable(torch.from_numpy(y), requires_grad=False)
print(X.size(),y.size())

#neural net model
model = torch.nn.Sequential(
    torch.nn.Linear(4, 10),
    torch.nn.ReLU(),
    torch.nn.Dropout(),
    torch.nn.Linear(10, 5),
    torch.nn.ReLU(),
    torch.nn.Dropout(),
    torch.nn.Linear(5, 3),
    torch.nn.Softmax()
)

print(model)

# Loss and Optimizer
optimizer = torch.optim.Adam(model.parameters(), lr=lr)  
loss_func = torch.nn.CrossEntropyLoss()  

for i in range(epochs):
    # Forward pass
    y_pred = model(X)

    # Compute and print loss.
    loss = loss_func(y_pred, y)
    print(i, loss.data[0])

    # Before the backward pass, use the optimizer object to zero all of the
    # gradients for the variables it will update (which are the learnable weights
    # of the model)
    optimizer.zero_grad()

    # Backward pass
    loss.backward()

    # Calling the step function on an Optimizer makes an update to its parameters
    optimizer.step()

There are currently two problems I am facing.

I want to set a batch size of 20. How should I do this?
At this step, y_pred = model(X), it shows this error:

Error

 TypeError: addmm_ received an invalid combination of arguments - got (int, int, torch.DoubleTensor, torch.FloatTensor), but expected one of:
 * (torch.DoubleTensor mat1, torch.DoubleTensor mat2)
 * (torch.SparseDoubleTensor mat1, torch.DoubleTensor mat2)
 * (float beta, torch.DoubleTensor mat1, torch.DoubleTensor mat2)
 * (float alpha, torch.DoubleTensor mat1, torch.DoubleTensor mat2)
 * (float beta, torch.SparseDoubleTensor mat1, torch.DoubleTensor mat2)
 * (float alpha, torch.SparseDoubleTensor mat1, torch.DoubleTensor mat2)
 * (float beta, float alpha, torch.DoubleTensor mat1, torch.DoubleTensor mat2)
      didn't match because some of the arguments have invalid types: (int, int, torch.DoubleTensor, !torch.FloatTensor!)
 * (float beta, float alpha, torch.SparseDoubleTensor mat1, torch.DoubleTensor mat2)
      didn't match because some of the arguments have invalid types: (int, int, !torch.DoubleTensor!, !torch.FloatTensor!)

Sorry I could not solve this problem. If you have a solution please post it — Eka, Commented Nov 17, 2017 at 5:02

jdhao · Accepted Answer · 2017-11-17 15:15:09Z

I want to set a batch size of 20. How should I do this?

For data processing and loading, PyTorch provides two classes. The first is Dataset, which represents your dataset: it provides the interface for fetching a single sample from the whole dataset by its index.

But Dataset alone is not enough: for a large dataset, we need batch processing. So PyTorch provides a second class, DataLoader, which generates batches from a Dataset given the batch size and other parameters.

For your specific case, I think you should try TensorDataset, then use a DataLoader with the batch size set to 20. Look through the official PyTorch examples to get a sense of how to do it.
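A minimal sketch of the TensorDataset plus DataLoader approach described above, not the answerer's own code. It assumes a recent PyTorch release where plain tensors can be used without Variable. The random arrays are stand-ins with the same shape as Iris (in the question they would be data.data and data.target), and the model is a shortened, hypothetical version of the one in the question.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Stand-in arrays shaped like Iris: 150 samples, 4 features, 3 classes.
# In the question these would come from load_iris().
rng = np.random.default_rng(0)
X_np = rng.random((150, 4))             # float64, like load_iris().data
y_np = rng.integers(0, 3, size=150)     # integer class labels 0..2

X = torch.from_numpy(X_np).float()      # float32, to match the model parameters
y = torch.from_numpy(y_np).long()       # class indices for CrossEntropyLoss

# TensorDataset pairs each row of X with its label;
# DataLoader groups those pairs into batches of 20.
dataset = TensorDataset(X, y)
loader = DataLoader(dataset, batch_size=20, shuffle=True)

# Shortened model for illustration; it outputs raw logits (no Softmax layer).
model = torch.nn.Sequential(
    torch.nn.Linear(4, 10),
    torch.nn.ReLU(),
    torch.nn.Linear(10, 3),
)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_func = torch.nn.CrossEntropyLoss()

batch_sizes = []
for epoch in range(2):
    for X_batch, y_batch in loader:
        optimizer.zero_grad()
        loss = loss_func(model(X_batch), y_batch)
        loss.backward()
        optimizer.step()
        if epoch == 0:
            batch_sizes.append(X_batch.shape[0])

print(batch_sizes)
```

With 150 samples and a batch size of 20, each epoch yields seven full batches and one final batch of 10. Passing drop_last=True to DataLoader would discard that smaller final batch.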

At this step, y_pred = model(X), it shows this error

The error message is pretty informative. Your input X to the model has type DoubleTensor, but your model parameters have type FloatTensor. In PyTorch, you cannot perform this operation on tensors of different types. What you should do is replace the line

X= Variable(torch.from_numpy(X), requires_grad=False)

with

X= Variable(torch.from_numpy(X).float(), requires_grad=False)

Now X has type FloatTensor, and the error message should disappear.
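The type mismatch can be checked directly by inspecting dtypes. The following is a small sketch, assuming a recent PyTorch release (0.4 or later, where Variable is no longer required) and the default float32 parameter type; the zero-filled array is only a stand-in for the Iris features.

```python
import numpy as np
import torch

X_np = np.zeros((150, 4))              # NumPy arrays default to float64
X_double = torch.from_numpy(X_np)      # stays float64 (DoubleTensor)
X_float = X_double.float()             # cast to float32 (FloatTensor)

layer = torch.nn.Linear(4, 10)         # parameters are float32 by default
print(X_double.dtype, X_float.dtype, layer.weight.dtype)

out = layer(X_float)                   # works: input and weights are both float32
```

Passing X_double to the layer instead would raise a dtype mismatch error, which is the situation in the question.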

Also, as a gentle reminder, there is plenty of material on the Internet that covers your question and is sufficient to solve your problem. You should try hard to solve it by yourself.

Now it's showing this error: TypeError: FloatClassNLLCriterion_updateOutput received an invalid combination of. If it's OK, can you share the fully solved code? For the past two months I have been stuck here and have not been able to gain any knowledge of PyTorch past this point. Sometimes we get stuck in places where we can't seem to find a solution, and I am currently in this position. Any help will be appreciated. — Eka, Commented Nov 17, 2017 at 16:09
Please update your code, then I will find what is wrong with it. I have tried running it myself and it works fine. The loss decreases as training goes on, see here — jdhao, Commented Nov 18, 2017 at 5:30
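The second error in the comment above most likely comes from the targets rather than the inputs, though the thread does not confirm this. The question one-hot encodes y with pd.get_dummies, while torch.nn.CrossEntropyLoss expects one integer class index per sample (a LongTensor). The sketch below shows two ways to get such targets; the labels and logits are made-up stand-ins.

```python
import numpy as np
import torch

labels = np.array([0, 2, 1, 2])                # hypothetical class labels
one_hot = np.eye(3, dtype=np.int64)[labels]    # one-hot rows, similar to pd.get_dummies output

# Option 1: skip the one-hot encoding and pass the labels directly.
y_direct = torch.from_numpy(labels).long()

# Option 2: recover class indices from an existing one-hot array.
y_from_one_hot = torch.from_numpy(one_hot).argmax(dim=1)

logits = torch.randn(4, 3)                     # stand-in for model output without a Softmax layer
loss = torch.nn.CrossEntropyLoss()(logits, y_direct)
print(loss.item())
```

CrossEntropyLoss applies log-softmax internally, so the final torch.nn.Softmax() layer in the question's model should be dropped when using this loss.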

Aechlys · Accepted Answer · 2017-09-12 08:43:10Z

Probably same issue: Pytorch: Convert FloatTensor into DoubleTensor

In short: when converting from NumPy, the values are stored in a DoubleTensor, while the model (and therefore the optimizer) works with FloatTensor parameters by default. You have to change one of them.
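Changing the other side is also possible. As a small sketch, assuming a recent PyTorch release, the model's parameters can be cast to float64 with .double() so that they match a tensor created straight from NumPy; the layer sizes here are arbitrary.

```python
import numpy as np
import torch

X = torch.from_numpy(np.ones((5, 4)))     # float64 tensor straight from NumPy

model = torch.nn.Linear(4, 3).double()    # cast the parameters to float64 instead of casting X
out = model(X)
print(out.dtype)
```

Casting the input to float32 is usually the more common choice, since float32 uses half the memory of float64.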
