
All Questions

1 vote · 1 answer · 38 views

How does seeing training batches only once influence the generalization of a neural network?

I am referring to this question/scenario: Train neural network with unlimited training data, but unfortunately I cannot comment. As I am not seeing any training batch multiple times, I would guess that ...
asked by ZenDen
0 votes · 1 answer · 30 views

Learning the gradient descent stepsize with RL [closed]

Problem statement: I've been working on a project to accelerate the convergence of gradient descent using reinforcement learning (RL). I want to learn a policy that can map the current state of ...
asked by CodeGuy
0 votes · 0 answers · 20 views

Feeding more data to a neural network

I watched a video on Tesla's FSD where the drive was really smooth but required one intervention: when the traffic light changed to green, the car wouldn't go because it looked like the light was ...
asked by Noale
1 vote · 2 answers · 226 views

Gradient Descent: Is the magnitude in Gradient Vectors arbitrary?

I am only just getting familiar with gradient descent through learning logistic regression. I understand that the directional component of the gradient vector is correct information derived from the slope ...
asked by MrHunda
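The question above asks whether the magnitude of a gradient vector is arbitrary. A minimal numpy sketch (the two-parameter quadratic loss is a made-up example, not from the question) separating the direction, which the local slope fixes, from the step length, which the learning rate rescales:

```python
import numpy as np

# Hypothetical two-parameter quadratic loss L(w) = w1^2 + 10*w2^2.
def grad(w):
    return np.array([2.0 * w[0], 20.0 * w[1]])

w = np.array([3.0, 1.0])
g = grad(w)                        # [6.0, 20.0]: the magnitude reflects the local slope
direction = g / np.linalg.norm(g)  # unit vector: this part the slope fixes exactly

# The update length is learning rate times magnitude, so it is not
# arbitrary, but it is uniformly rescaled by the chosen lr:
lr = 0.01
step = -lr * g                     # [-0.06, -0.2]
print(g, direction, step)
```

Note that the steeper `w2` axis gets a proportionally larger step; methods that discard the magnitude (e.g. sign-based updates) deliberately throw that information away.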
0 votes · 0 answers · 21 views

How to decide a state for Deep Q-learning for production line scheduling

There is a production floor with W workstations and N jobs, each with M operations (different processing times per operation). A job is completed only if its M operations are completed. The objective is to ...
asked by ArchanaR
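For the scheduling question above, one *possible* state encoding (all names here are illustrative assumptions, not from the question) is a fixed-length vector concatenating, per job, the number of pending operations and, per workstation, the time until it becomes free:

```python
import numpy as np

def encode_state(remaining_ops, time_until_free):
    """One candidate DQN state for N jobs and W workstations:
    a (N + W)-vector of normalised pending-op counts and busy times."""
    remaining_ops = np.asarray(remaining_ops, dtype=np.float32)
    time_until_free = np.asarray(time_until_free, dtype=np.float32)
    # Normalise each part so both live roughly in [0, 1].
    ops = remaining_ops / max(remaining_ops.max(), 1.0)
    busy = time_until_free / max(time_until_free.max(), 1.0)
    return np.concatenate([ops, busy])

# N = 3 jobs, W = 2 workstations -> a length-5 state vector.
state = encode_state(remaining_ops=[3, 1, 0], time_until_free=[5.0, 0.0])
print(state.shape)
```

The key design constraint is only that the vector has fixed length and captures what the policy needs to rank the next dispatch decision; richer encodings (per-operation processing times, queue contents) follow the same pattern.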
2 votes · 0 answers · 104 views

Can I find the input that maximises the output of a Neural Network?

So I trained a 2-layer neural network for a regression problem that takes $D$ features $(x_1,...,x_D)$ and outputs a real value $y$. With the model already trained (weights optimised, fixed), can I ...
asked by puradrogasincortar
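One standard answer to the question above is gradient *ascent* on the input while holding the weights fixed. A numpy sketch with a tiny made-up 2-layer network (in practice you would load the trained model and use autodiff instead of the hand-written gradient):

```python
import numpy as np

# Fixed, pretrained weights (made up here) of y = w2 @ tanh(W1 @ x + b1) + b2.
rng = np.random.default_rng(0)
D, H = 4, 8
W1, b1 = rng.normal(size=(H, D)), rng.normal(size=H)
w2, b2 = rng.normal(size=H), 0.0

def forward(x):
    return w2 @ np.tanh(W1 @ x + b1) + b2

def grad_x(x):
    # dy/dx by the chain rule; only the input is treated as a variable.
    h = np.tanh(W1 @ x + b1)
    return W1.T @ (w2 * (1.0 - h ** 2))

x = np.zeros(D)
y0 = forward(x)
for _ in range(200):
    x += 0.1 * grad_x(x)   # ascend: we want to *maximise* the output
print(y0, forward(x))
```

Since the problem is non-convex in the input, this finds a local maximiser; running from several random starting inputs is the usual mitigation.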
0 votes · 1 answer · 23 views

Binary crossentropy loss

When we have a binary classification problem, we use a sigmoid activation function in the output layer plus a binary cross-entropy loss. We also need to one-hot encode the target variable. This is a binary ...
asked by John adams
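On the question above: with a sigmoid output and binary cross-entropy, the target is a single scalar in {0, 1}, so no one-hot encoding is needed. A two-unit softmax with categorical cross-entropy and one-hot targets gives the identical loss; this sketch checks the equivalence numerically on a made-up logit:

```python
import numpy as np

def bce(y, p):
    """Binary cross-entropy for a scalar target y in {0, 1} and probability p."""
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

z = 0.7                           # made-up logit for the positive class
p = 1.0 / (1.0 + np.exp(-z))      # sigmoid
y = 1                             # scalar binary target, no one-hot needed

# Equivalent two-class softmax view: logits (z, 0), one-hot target (1, 0).
logits = np.array([z, 0.0])
soft = np.exp(logits) / np.exp(logits).sum()
cce = -np.log(soft[0])

print(bce(y, p), cce)             # identical values
```

This is why frameworks' binary cross-entropy losses accept a single probability per sample rather than a two-column one-hot target.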
0 votes · 1 answer · 73 views

How do I know that my optimizer has found the best weights?

I am new to deep learning and my understanding of how optimizers work might be slightly off. Also, sorry for the third-grader quality of the images. For example, if we have a simple task, our loss-to-weight ...
asked by Neriko
0 votes · 1 answer · 19 views

Change parameterization to eliminate weight constraints in neural networks

I am wondering if it makes sense to use a parameterization to eliminate simple weight inequalities; for example, if the weights should satisfy $w\geq 0$, one could train $\exp w$ over the unconstrained set ...
asked by Philipp123
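The reparameterization asked about above, written out on a toy loss (the loss $L(w) = (w-2)^2$ is a made-up example, not from the question): set $w = \exp(v)$, then do plain gradient descent on the unconstrained $v$, and $w$ stays positive by construction:

```python
import numpy as np

v = 0.0                        # w = exp(0) = 1 to start
lr = 0.05
for _ in range(500):
    w = np.exp(v)
    dL_dw = 2.0 * (w - 2.0)    # gradient of the toy loss L(w) = (w - 2)^2
    # Chain rule: dL/dv = dL/dw * dw/dv, and dw/dv = exp(v) = w.
    v -= lr * dL_dw * w
w = np.exp(v)
print(w)                       # converges to the constrained minimiser w = 2, always > 0
```

One caveat worth noting: $\exp(v)$ can only approach 0 asymptotically, so this works cleanly when the constrained optimum is interior ($w > 0$); if the optimum sits exactly on the boundary $w = 0$, the parameter $v$ diverges to $-\infty$ instead of stopping.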
0 votes · 0 answers · 25 views

Uncertainties in non-convex optimization problems (neural networks)

How do you treat statistical uncertainties coming from non-convex optimization problems? More specifically, suppose you have a neural network. It is well known that the loss is not convex; the ...
asked by Dave
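One common practical treatment of the uncertainty asked about above is random restarts: rerun the optimisation from many initialisations and report the spread of the final losses. A numpy sketch on a made-up non-convex 1-D loss standing in for a network's loss surface:

```python
import numpy as np

def loss(w):
    # Toy non-convex loss: several local minima from the sine term.
    return np.sin(3 * w) + 0.1 * w ** 2

def grad(w):
    return 3 * np.cos(3 * w) + 0.2 * w

rng = np.random.default_rng(42)
finals = []
for _ in range(50):                 # 50 independent restarts
    w = rng.uniform(-5, 5)          # random initialisation
    for _ in range(300):
        w -= 0.01 * grad(w)         # plain gradient descent
    finals.append(loss(w))
finals = np.array(finals)

# The spread across restarts is one estimate of optimisation-induced uncertainty.
print(finals.mean(), finals.std())
```

Different restarts settle in different basins, so the standard deviation is strictly positive; for real networks the same protocol is run over random seeds (initialisation and data shuffling) rather than a 1-D toy.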
0 votes · 1 answer · 138 views

Implementing a Randomized Neural Network using TensorFlow?

I want to implement a Randomised Neural Network (alt. Neural Network with Random Weights, NNRW) in Keras, based on the following paper (https://arxiv.org/pdf/2104.13669.pdf). Essentially the idea is the ...
asked by SwagCakes
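Independent of the Keras specifics, the family of models the question refers to (random, untrained hidden weights with only the readout trained) can be sketched in a few lines of numpy: sample the hidden layer once, keep it fixed, and fit the linear readout in closed form. The target function and sizes here are made-up illustrations:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 2))      # 200 samples, 2 input features
y = np.sin(X[:, 0]) + X[:, 1] ** 2         # made-up smooth target function

# Random, *fixed* hidden layer: 100 tanh features that are never trained.
H = np.tanh(X @ rng.normal(size=(2, 100)) + rng.normal(size=100))

# Only the linear readout is fitted, in closed form via least squares.
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

mse = np.mean((H @ beta - y) ** 2)
print(mse)                                  # small training error
```

In Keras the same structure would be a `Dense` hidden layer with `trainable=False` followed by a trainable linear output layer; the numpy version just makes the "train only the readout" idea explicit.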
0 votes · 0 answers · 26 views

Why does my regression-NN completely fail to predict some points?

I would like to train a NN in order to approximate an unknown function $y = f(x_1,x_2)$. I have a lot of measurements $y = [y_1,\dots,y_K]$ (with $K$ in the range of 10-100 thousand) ...
asked by MttRch
0 votes · 2 answers · 2k views

Is reinforcement learning analogous to stochastic gradient descent?

Not in a strict mathematical formulation sense, but would there be any key overlapping principles between the two optimisation approaches? For example, how does $$\{x_i, y_i, \mathrm{grad}_i \}$$ (...
asked by hH1sG0n3
0 votes · 1 answer · 199 views

Why is my Neural Network having constant loss and always predicting a singular value?

I am trying to make a neural network on a dataset with 257 features and 1 target variable. My code looks like the following: ...
asked by bballboy8
0 votes · 1 answer · 307 views

Memorization in deep neural networks, random vs. properly labelled datasets

From about 19:20, the video here (https://www.youtube.com/watch?v=IHZwWFHWa-w) shows the difference in value of the cost function for randomly labelled data vs. properly labelled data. What do ...
asked by mLstudent33
