All Questions tagged with neural-network · loss-function
126 questions
0 votes · 0 answers · 9 views
How to handle sequences with CrossEntropyLoss
First of all, I am new to the whole thing, so sorry if this is super dumb.
I'm currently training a Transformer model for a sequence classification task using CrossEntropyLoss. My input tensor has the ...
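The excerpt is cut off, but a common fix for sequence inputs is to flatten every token into its own classification row. Here is a numpy sketch of the per-token cross-entropy that PyTorch's `nn.CrossEntropyLoss` computes under that reshaping; the shapes below (batch 2, sequence length 4, 3 classes) are hypothetical:

```python
import numpy as np

# Hypothetical shapes: batch of 2 sequences, 4 time steps, 3 classes.
logits = np.random.default_rng(0).normal(size=(2, 4, 3))
targets = np.array([[0, 1, 2, 1],
                    [2, 0, 0, 1]])

# Flatten so every token becomes one classification example:
# (batch, seq, classes) -> (batch*seq, classes), targets -> (batch*seq,)
flat_logits = logits.reshape(-1, logits.shape[-1])
flat_targets = targets.reshape(-1)

# Numerically stable log-softmax, then pick the log-probability of each
# token's target class and average (mean reduction, as in PyTorch).
z = flat_logits - flat_logits.max(axis=1, keepdims=True)
log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
loss = -log_probs[np.arange(flat_targets.size), flat_targets].mean()
```

In PyTorch the equivalent is `criterion(logits.view(-1, num_classes), targets.view(-1))`, or keeping the batch dimension and permuting the logits to `(batch, classes, seq)`.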
1 vote · 1 answer · 50 views
Does using a different optimizer change the loss landscape?
I plot the landscape using this code, and I notice the landscape shape has changed a lot. My understanding is that the optimizer does not change the loss landscape. But now I'm confused whether it's just ...
0 votes · 0 answers · 14 views
How to combine a classification dataset with a pair-wise comparison dataset
Let's say I'm trying to train a neural network that predicts a single output [0.0, 1.0] value that correlates to photo realism which I can use either in a classification setting or for ranking. I have ...
0 votes · 1 answer · 37 views
My custom neural network is converging but my Keras model is not
In most cases it is probably the other way round, but...
I have implemented a basic MLP neural network structure with backpropagation. My data is just a shifted quadratic function with 100 samples. I ...
0 votes · 0 answers · 138 views
Custom Loss Function Returns Graph Execution Error: Can not squeeze dim[0], expected a dimension of 1, got 32
I have built a loss function which adds time and frequency weighted averages and variances to the MSE:
...
0 votes · 0 answers · 170 views
Training loss is much higher than validation loss
I am trying to train a neural network with 2 hidden layers to perform multi-class classification of 3 classes. There is a huge imbalance in the classes, with the distribution being around ...
2 votes · 1 answer · 150 views
What is the benefit of the exponential function inside softmax?
I know that softmax is:
$$ \operatorname{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}} $$
This is an $\mathbb{R}^n \to \mathbb{R}^n$ function, and the elements of the output add up to 1. I understand that the ...
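The exponential's role can be seen directly in a few lines: it maps every real score to a positive number (so the normalized outputs form a valid distribution) and it makes the function shift-invariant, which is also what allows the standard max-subtraction stability trick. A minimal numpy sketch:

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability; the result is unchanged
    # because exp(x - c) / sum(exp(x - c)) == exp(x) / sum(exp(x)).
    z = np.exp(x - np.max(x))
    return z / z.sum()

scores = np.array([1.0, 2.0, -3.0])
p = softmax(scores)
# exp maps every real score (even negative ones) to a positive number,
# so p is a valid probability distribution that preserves the ranking
# of the scores, and shifting all inputs by a constant leaves p unchanged.
```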
0 votes · 0 answers · 51 views
Train neural network to predict multiple distributions
I aim to train a neural network to predict 2 distributions (10 quantiles, i.e. deciles) at 5 time points. So my y is of shape:
...
0 votes · 0 answers · 48 views
The cost function gets stuck at 120 epochs
I built a neural network in C++ to recognize handwritten digits using the MNIST dataset, without any pre-existing neural network libraries. My network has 784 input neurons (the pixels of the image), 100 ...
0 votes · 0 answers · 94 views
Why is backpropagation done in every epoch when the loss is always a scalar?
I understand that the backpropagation algorithm calculates the derivative of the loss with respect to all the parameters in the neural network. My question is: this derivative is constant, right? Because the ...
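The short answer is that the gradient is a function of the current weights, so it must be re-evaluated after every update even though the loss itself is a scalar. A pure-Python sketch with a toy loss L(w) = (w·x − y)², whose gradient dL/dw = 2x(wx − y) changes as w moves (my own example, not from the question):

```python
# Toy one-parameter model: prediction = w * x, squared-error loss.
x, y = 2.0, 6.0
w = 0.0
grads = []
for _ in range(3):
    # Gradient of (w*x - y)^2 with respect to w, evaluated at the
    # CURRENT w -- it shrinks toward 0 as w approaches y / x = 3.
    grad = 2 * x * (w * x - y)
    grads.append(grad)
    w -= 0.1 * grad  # gradient-descent update changes w, hence the next grad
```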
1 vote · 2 answers · 3k views
Training and validation loss are almost the same (perfect fit?)
I am developing an ANN from scratch which classifies MNIST digits.
These are the curves I get using only one hidden layer composed of 100 neurons activated by ...
0 votes · 1 answer · 23 views
Binary crossentropy loss
When we have a binary classification problem, we use a sigmoid activation function in the output layer + a binary crossentropy loss. We also need to one-hot encode the target variable. This is a binary ...
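The excerpt is cut off, but note that with a single sigmoid output no one-hot encoding is needed: the unit outputs P(class 1), P(class 0) is implied as 1 − p, and the targets stay plain 0/1 labels. A numpy sketch of this setup (my own illustration, with made-up logits and labels):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip to avoid log(0); this mirrors what most frameworks do internally.
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred))

# One sigmoid unit per example; targets are plain 0/1 labels, not one-hot.
logits = np.array([2.0, -1.0, 0.5])
labels = np.array([1.0, 0.0, 1.0])
loss = binary_crossentropy(labels, sigmoid(logits))
```

One-hot targets only become necessary with a 2-unit softmax output and categorical cross-entropy, which is mathematically equivalent to the sigmoid formulation.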
0 votes · 1 answer · 73 views
How do I know that my weight optimizer has found the best weights?
I am new to deep learning and my understanding of how optimizers work might be slightly off. Also, sorry for the third-grader quality of the images.
For example, if we have a simple task, our loss-to-weight ...
1 vote · 3 answers · 156 views
How to learn steep functions using a neural network?
I am trying to use a neural network to learn the below function. In total, I have 25 features and 19 outputs. The above image shows the distribution of two features with respect to one of the outputs....
0 votes · 1 answer · 303 views
Training deep neural networks with ReLU output layer for verification
Most algorithms for verification of deep neural networks require ReLU activation functions in each layer (e.g., Reluplex).
I have a binary classification task with classes 0 and 1. The main problem I ...