
Questions tagged [gradient-descent]

Gradient Descent is an algorithm for finding a (local) minimum of a function. It iteratively calculates the partial derivatives (the gradient) of the function and steps in the direction opposite to the gradient, with a step size proportional to its magnitude. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.
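
As a rough illustration of the update rule described above, here is a minimal sketch in plain Python. The names `gradient_descent`, `grad_f`, `learning_rate` and `n_steps`, as well as the example function f(x, y) = (x - 3)^2 + (y + 1)^2, are assumptions chosen for illustration and do not come from any of the questions listed here.

```python
def gradient_descent(grad, start, learning_rate=0.1, n_steps=100):
    """Repeatedly step against the gradient, scaled by learning_rate."""
    point = list(start)
    for _ in range(n_steps):
        g = grad(point)
        # Move each coordinate opposite to its partial derivative.
        point = [p - learning_rate * g_i for p, g_i in zip(point, g)]
    return point


def grad_f(point):
    """Gradient of the illustrative function f(x, y) = (x - 3)^2 + (y + 1)^2."""
    x, y = point
    return [2.0 * (x - 3.0), 2.0 * (y + 1.0)]


# The illustrative function has its minimum at (3, -1).
minimum = gradient_descent(grad_f, start=[0.0, 0.0])
```

A fixed learning rate is the simplest choice; if it is too large the iterates can diverge, and if it is too small convergence becomes slow.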

0 votes
1 answer
29 views

PyTorch: using a loss that doesn't return a gradient

I'm trying to develop a model that improves the quality of a given audio. For this task I use DAC for the latent space and I run a transformer model to change the value of the latent space to improve ...
Jourdelune
0 votes
0 answers
32 views

Vanishing gradient when training an LSTM with PyTorch

I was training a simple LSTM neural network with PyTorch to predict stock prices, and I am confused that my network won't fit. The loss is exploding and the R² is negative. As the training ...
王一诺
0 votes
0 answers
21 views

Why are actor_gradients calculated as [None, None, None, None]?

I'm trying to train an RL agent with a DDPG policy to solve a Pendulum problem. An issue occurs when the policy attempts to train the parameters with optimizers.Adam.apply_gradients. This is because ...
Joshua Westhoek
1 vote
0 answers
29 views

Calculating the variance of the gradient in the barren plateau problem for a quantum variational circuit

In the paper "Cost function dependent barren plateaus in shallow parametrized quantum circuits", the authors present a warm-up example on page 2 to show the barren plateau phenomenon. In this example, the ...
lang xian
2 votes
1 answer
31 views

torch.unique() alternatives that do not break gradient flow?

In a PyTorch gradient descent algorithm, the function def TShentropy(wf): unique_elements, counts = wf.unique(return_counts=True) entrsum = 0 for x in counts: p = x/len_a #...
2 False
  • 21
0 votes
0 answers
26 views

How do I code gradient descent over a discrete probability function in PyTorch?

I am trying to code a gradient descent algorithm to minimize the Shannon entropy of a convolution between a 1D array X and a smaller 1D array A, where the parameters to optimize for are the entries of ...
2 False
  • 21
1 vote
2 answers
128 views

Minimizing Euclidean Norm with Gradient Descent

I'm trying to find a solution to a system of linear equations by minimizing ∥Ax-b∥^2 with gradient descent in Python. The linear equations are: x - 2y + 3z = -1, 3x + 2y - 5z = 3, 2x - 5y + 2z = 0. The ...
Orhan94
  • 11
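
For readers skimming this entry, one way the setup in the excerpt above could be written is sketched below. It is not taken from the question or its answers: the helper names (`residual`, `gradient`, `solve`), the fixed step size of 0.005 and the iteration count are assumptions, and plain Python lists are used only to keep the sketch dependency-free. It relies on the standard identity that the gradient of ∥Ax-b∥^2 is 2·Aᵀ(Ax-b).

```python
# Coefficients of the three equations quoted in the excerpt.
A = [[1.0, -2.0, 3.0],
     [3.0, 2.0, -5.0],
     [2.0, -5.0, 2.0]]
b = [-1.0, 3.0, 0.0]


def residual(A, x, b):
    """Return Ax - b as a list."""
    return [sum(a * x_j for a, x_j in zip(row, x)) - b_i
            for row, b_i in zip(A, b)]


def gradient(A, x, b):
    """Gradient of ||Ax - b||^2, i.e. 2 * A^T (Ax - b)."""
    r = residual(A, x, b)
    return [2.0 * sum(A[i][j] * r[i] for i in range(len(A)))
            for j in range(len(x))]


def solve(A, b, step=0.005, n_steps=5000):
    """Plain gradient descent from the zero vector (assumed settings)."""
    x = [0.0] * len(A[0])
    for _ in range(n_steps):
        g = gradient(A, x, b)
        x = [x_j - step * g_j for x_j, g_j in zip(x, g)]
    return x


solution = solve(A, b)
```

The step size has to stay below the reciprocal of the largest eigenvalue of AᵀA for the iteration to converge; 0.005 was picked conservatively for this particular matrix and would need adjusting for other systems.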
2 votes
1 answer
36 views

Cost Function Increases, Then Stops Growing

I understand the zig-zag nature of the cost function when applying gradient descent, but what bothers me is that the cost started out at a low 300 only to increase to 1600 in the end. The cost ...
Miguel Angel
0 votes
0 answers
30 views

Is the given code for gradient descent updating the parameters sequentially or simultaneously?

I'm new to machine learning and I have been learning the gradient descent algorithm. I believe this code uses simultaneous update, even though it looks like sequential update. Since the values of partial ...
Mayank Gupta
0 votes
0 answers
17 views

Gradient accumulation loss computation

Suppose we have data [b,s,dim]. I recently noticed that CrossEntropyLoss (1) computes the average over all tokens (b * s) in a batch instead of (2) computing it on each sentence and then computing the ...
pythonHua
0 votes
0 answers
14 views

Differentiating a tensor after using the where function

I implemented the following function def t_asy(self, data, beta: float): power = 1 + (beta * torch.linspace(0, 1, data.shape[-1], device=data.device)) * data.sqrt() ...
Yedidya kfir
  • 1,659
0 votes
0 answers
26 views

Gradient Descent Logistic Regression and Covariate Scaling

I'm trying to understand logistic regression and gradient descent. How hard can it be, right? Well, I used the example from this website mydata <- read.csv("https://stats.idre.ucla.edu/stat/...
Smelton
  • 31
0 votes
0 answers
12 views

Training RL model with TF over all the output vector

I'm training a deep RL model with TensorFlow, but my model doesn't have a single correct action. The output of the network is a vector [x1, x2], and both are actions that need to be optimized. def ...
gustavo lobos astorquiza
0 votes
0 answers
26 views

Why is the matrix transposed when calculating the gradient in a multiple linear regression?

I am taking an online machine learning course and, when talking about multivariable linear regression, they used the following function to calculate the gradient: def gradient(X, Y, w): return 2 * np....
Gabriel Marinho
1 vote
0 answers
31 views

MNIST Image Classification Gradient Descent Neural Network not working

I have two files. PreProcess.java: /* * 4/28/24 * Final */ package Final; import java.io.DataInputStream; import java.io.FileInputStream; import java.io.FileNotFoundException; import java.io....
Mark Agib
