I am very confused by how PyTorch deals with one-hot vectors. In this tutorial, the neural network will generate a one-hot vector as its output. As far as I understand, the schematic structure of the neural network in the tutorial should look like this:
However, the labels are not in one-hot vector format. I get the following sizes:
print(labels.size())
print(outputs.size())
output>>> torch.Size([4])
output>>> torch.Size([4, 10])
Miraculously, when they pass the outputs and labels to criterion=CrossEntropyLoss(), there's no error at all.
loss = criterion(outputs, labels) # How come it has no error?
My hypothesis:
Maybe PyTorch automatically converts the labels to one-hot vector form. So, I tried converting the labels to one-hot vectors before passing them to the loss function.
def to_one_hot_vector(num_class, label):
    b = np.zeros((label.shape[0], num_class))
    b[np.arange(label.shape[0]), label] = 1
    return b
labels_one_hot = to_one_hot_vector(10,labels)
labels_one_hot = torch.Tensor(labels_one_hot)
labels_one_hot = labels_one_hot.type(torch.LongTensor)
loss = criterion(outputs, labels_one_hot) # Now it gives me error
However, I got the following error:
RuntimeError: multi-target not supported at /opt/pytorch/pytorch/aten/src/THCUNN/generic/ClassNLLCriterion.cu:15
So, are one-hot vectors not supported in PyTorch? How does PyTorch calculate the cross entropy for the two tensors outputs = [1,0,0],[0,0,1] and labels = [0,2]? It doesn't make sense to me at all at the moment.
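To make the last question concrete, here is a minimal sketch in plain Python (no PyTorch) of how I would expect the calculation to go. It assumes each label is read as a class index rather than a one-hot row, that the outputs are raw scores passed through a log-softmax, and that the per-sample losses are averaged over the batch. The function name `cross_entropy_from_indices` is hypothetical and only used for illustration.

```python
import math

def cross_entropy_from_indices(outputs, labels):
    """Mean cross entropy where each label is a class index, not a one-hot row."""
    losses = []
    for scores, target in zip(outputs, labels):
        # Log of the softmax normaliser: log(sum_j exp(score_j))
        log_norm = math.log(sum(math.exp(s) for s in scores))
        # Negative log-softmax probability of the target class
        losses.append(log_norm - scores[target])
    # Average over the batch (assumed reduction)
    return sum(losses) / len(losses)

# The two tensors from the question, written as plain lists
outputs = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
labels = [0, 2]

loss = cross_entropy_from_indices(outputs, labels)
print(loss)  # approximately 0.5514
```

Under these assumptions, the label only selects which entry of each output row enters the loss, so a one-hot encoding of the label would carry no extra information.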