$\begingroup$

First of all, I am new to the whole thing, so sorry if this is a dumb question.

I'm currently training a Transformer model for a sequence classification task using CrossEntropyLoss. My input tensor has the shape (batch_size, classes, seq_len) and my target tensor has the shape (batch_size, seq_len).

ChatGPT advised me to do the following:

yHatReshaped = yHat.view(-1, 512)
yReshaped = y.view(-1)
error = lossFunction(yHatReshaped, yReshaped)

Is that correct, and is it the best way to handle a sequence? The documentation also confuses me, since it gives the input shape as (N, C, d1, d2, ..., dK) for the K-dimensional loss. Is my sequence basically d1? I don't understand the whole thing.
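
To make the shapes concrete, here is a minimal sketch of how I read the documentation. It assumes `N = batch_size`, `C = classes` and `d1 = seq_len`, with integer class indices as targets. The sizes and variable names are made up for illustration and are not from my actual model. It compares passing the tensors to `nn.CrossEntropyLoss` directly against flattening them first, where the class dimension has to be moved to the end before reshaping.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
batch_size, num_classes, seq_len = 4, 512, 10

y_hat = torch.randn(batch_size, num_classes, seq_len)      # (N, C, d1) logits
y = torch.randint(0, num_classes, (batch_size, seq_len))   # (N, d1) class indices

loss_fn = nn.CrossEntropyLoss()

# Option 1: pass the tensors as they are; seq_len plays the role of d1.
loss_direct = loss_fn(y_hat, y)

# Option 2: flatten to (N * d1, C). The class dimension must be last
# before reshaping, so that each row holds the logits of one position.
y_hat_flat = y_hat.permute(0, 2, 1).reshape(-1, num_classes)
y_flat = y.reshape(-1)
loss_flat = loss_fn(y_hat_flat, y_flat)

# For comparison: view() without the permute runs, but each row then mixes
# values from different classes and positions, so the loss generally differs.
loss_view_only = loss_fn(y_hat.view(-1, num_classes), y.view(-1))

print(loss_direct.item(), loss_flat.item(), loss_view_only.item())
```

If this reading is right, the first two values should agree up to floating-point error, and the reshape from the snippet above would only be safe if the tensor were laid out as (batch_size, seq_len, classes).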

Thanks in advance for your help!

$\endgroup$
