Questions tagged [gradient-descent]
Gradient Descent is an algorithm for finding the minimum of a function. It iteratively computes the gradient (the vector of partial derivatives) of the function and takes steps proportional to the negative of that gradient. One major application of Gradient Descent is fitting a parameterized model to a set of data: the function to be minimized is an error function for the model.
1,472 questions
0 votes · 1 answer · 29 views
PyTorch: using a loss that doesn't return a gradient
I'm trying to develop a model that improves the quality of given audio. For this task I use DAC for the latent space, and I run a transformer model to change the values of the latent space to improve ...
0 votes · 0 answers · 32 views
Vanishing gradients when training an LSTM with PyTorch
I was training a simple LSTM neural network with PyTorch to predict stock prices, and it is confusing to me that my network won't fit: the loss is exploding and the R² is negative. As the training ...
0 votes · 0 answers · 21 views
Why does actor_gradients evaluate to [None, None, None, None]?
I'm trying to train an RL agent with a DDPG policy to solve the Pendulum problem. An issue occurs when the policy attempts to train the parameters with optimizers.Adam.apply_gradients. This is because ...
1 vote · 0 answers · 29 views
Calculating variance of gradient of barren plateau problem in quantum variational circuit
In the paper Cost function dependent barren plateaus in shallow parametrized quantum circuits, the authors exhibit a warm-up example on page 2 to show the barren plateau phenomenon. In this example, the ...
2 votes · 1 answer · 31 views
Torch.unique() alternatives that do not break gradient flow?
In a PyTorch gradient descent algorithm, the function
def TShentropy(wf):
    unique_elements, counts = wf.unique(return_counts=True)
    entrsum = 0
    for x in counts:
        p = x/len_a #...
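Counting exact duplicates is inherently non-differentiable (the counts are integers), so one common workaround is to replace the histogram with a kernel-density estimate of the entropy. A minimal sketch, not the asker's code: `soft_entropy` and the bandwidth `sigma` are hypothetical names, and the Gaussian kernel is one choice among many.

```python
import torch

def soft_entropy(wf, sigma=0.1):
    # Kernel-density (resubstitution) entropy estimate: every op here is
    # differentiable, unlike the integer counts from wf.unique().
    diffs = wf.unsqueeze(0) - wf.unsqueeze(1)        # (N, N) pairwise differences
    kernel = torch.exp(-diffs ** 2 / (2 * sigma ** 2))
    density = kernel.mean(dim=1)                     # kernel density at each sample
    return -torch.log(density).mean()

wf = torch.tensor([0.0, 0.0, 1.0, 1.0], requires_grad=True)
h = soft_entropy(wf)
h.backward()
# wf.grad is now populated, so this loss can drive a gradient descent loop
```

As `sigma` shrinks, the estimate approaches the sharp duplicate-counting entropy, at the cost of increasingly flat (vanishing) gradients between clusters.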
0 votes · 0 answers · 26 views
How do I code Gradient Descent over a discrete Probability Function in Pytorch?
I am trying to code a gradient descent algorithm to minimize the Shannon entropy of a convolution between a 1D array X and a smaller 1D array A, where the parameters to optimize for are the entries of ...
1 vote · 2 answers · 128 views
Minimizing Euclidean Norm with Gradient Descent
I'm trying to find a solution for a system of linear equations using the gradient descent method on ∥Ax-b∥^2 in Python.
The linear equations are:
x - 2y + 3z = - 1
3x + 2y - 5z = 3
2x - 5y + 2z = 0
The ...
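For reference, the objective ∥Ax−b∥² has gradient 2Aᵀ(Ax−b), so plain gradient descent on this particular system can be sketched as below. The step size 0.01 is an assumption; it converges here because it is below 1/λmax(AᵀA), which Gershgorin's theorem bounds by 1/72 for this matrix.

```python
import numpy as np

A = np.array([[1., -2.,  3.],
              [3.,  2., -5.],
              [2., -5.,  2.]])
b = np.array([-1., 3., 0.])

x = np.zeros(3)
lr = 0.01                          # safe: below 1 / lambda_max(A.T @ A)
for _ in range(20000):
    grad = 2 * A.T @ (A @ x - b)   # gradient of ||Ax - b||^2
    x -= lr * grad

print(x)   # ≈ [ 0.2609, -0.0870, -0.4783], the solution of Ax = b
```

Since A is square and nonsingular here, the minimizer of ∥Ax−b∥² is exactly the solution of Ax = b, which is why the result matches `np.linalg.solve(A, b)`.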
2 votes · 1 answer · 36 views
Cost Function Increases, Then Stops Growing
I understand the zig-zag nature of the cost function when applying gradient descent, but what bothers me is that the cost started out at a low 300 only to increase to 1600 in the end.
The cost ...
0 votes · 0 answers · 30 views
Is the given code for gradient descent updating the parameters sequentially or simultaneously?
I'm new to machine learning and I have been learning the gradient descent algorithm. I believe this code uses simultaneous update, even though it looks like sequential update. Since the values of partial ...
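The distinction is easiest to see in a toy sketch (the cost J and its partials below are made up for illustration): a simultaneous update evaluates every partial derivative at the old parameter values before any parameter changes, while a sequential update lets later gradients see already-updated parameters.

```python
# Toy cost J(w, b) = w**2 + w*b, with partials dJ/dw = 2*w + b and dJ/db = w
def dJ_dw(w, b):
    return 2 * w + b

def dJ_db(w, b):
    return w

def step_simultaneous(w, b, alpha):
    # Both partials are evaluated at the OLD (w, b) before either changes.
    tmp_w = w - alpha * dJ_dw(w, b)
    tmp_b = b - alpha * dJ_db(w, b)
    return tmp_w, tmp_b

def step_sequential(w, b, alpha):
    # b's partial sees the already-updated w: this is no longer vanilla
    # gradient descent (it is closer to coordinate descent).
    w = w - alpha * dJ_dw(w, b)
    b = b - alpha * dJ_db(w, b)
    return w, b

print(step_simultaneous(1.0, 1.0, 0.1))  # both parts use the old w
print(step_sequential(1.0, 1.0, 0.1))    # second part uses the new w
```

Starting from (w, b) = (1, 1) with α = 0.1, the two rules already disagree on b after one step, which is the quickest way to tell which variant a given piece of code implements.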
0 votes · 0 answers · 17 views
Computing the loss with gradient accumulation
Suppose we have data of shape [b, s, dim]. I recently noticed that CrossEntropyLoss (1) computes the average over all tokens (b * s) in a batch instead of (2) computing it per sentence and then computing the ...
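The two reductions agree only when every sentence contributes the same number of tokens; otherwise token-averaging weights long sentences more heavily. A small numeric sketch with made-up per-token losses:

```python
import numpy as np

# Hypothetical per-token losses for two sentences of different lengths
s1 = np.array([1.0, 5.0])               # 2 tokens, mean 3.0
s2 = np.array([2.0, 2.0, 2.0, 2.0])     # 4 tokens, mean 2.0

token_avg = np.concatenate([s1, s2]).mean()   # (1) every token weighted equally
sentence_avg = (s1.mean() + s2.mean()) / 2    # (2) every sentence weighted equally

print(token_avg)     # 2.333... -- the longer sentence dominates
print(sentence_avg)  # 2.5
```

The same mismatch appears with gradient accumulation: averaging each micro-batch's loss and then averaging those losses reproduces (1) only when all micro-batches contain the same number of tokens.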
0 votes · 0 answers · 14 views
Differentiating a tensor after using the where function
I implemented the following function
def t_asy(self, data, beta: float):
    power = 1 + (beta * torch.linspace(0, 1, data.shape[-1], device=data.device)) * data.sqrt()
    ...
0 votes · 0 answers · 26 views
Gradient Descent Logistic Regression and Covariate Scaling
I'm trying to understand logistic regression and gradient descent. How hard can it be, right? Well, I used the example from this website
mydata <- read.csv("https://stats.idre.ucla.edu/stat/...
0 votes · 0 answers · 12 views
Training RL model with TF over all the output vector
I'm training a deep RL model with TensorFlow, but my model doesn't have a single correct action. The output of the network is a vector [x1, x2], and both are actions that need to be optimized.
def ...
0 votes · 0 answers · 26 views
Why is the matrix transposed when calculating the gradient in a multiple linear regression?
I am taking an online machine learning course and when talking about multivariable linear regression they used the following function to calculate the gradient:
def gradient(X, Y, w):
    return 2 * np....
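The transpose comes from the chain rule: for J(w) = ∥Xw − Y∥², the gradient is ∇J = 2Xᵀ(Xw − Y). The residual Xw − Y has one entry per sample, and Xᵀ maps it back into parameter space, one entry per weight. A sketch of that formula (the random data is illustrative only), checked against finite differences:

```python
import numpy as np

def gradient(X, Y, w):
    # residual X @ w - Y has one entry per SAMPLE (shape (n,));
    # X.T maps it back to one entry per WEIGHT (shape (d,))
    return 2 * X.T @ (X @ w - Y)

# Finite-difference check on random data
rng = np.random.default_rng(0)
X, Y, w = rng.normal(size=(5, 3)), rng.normal(size=5), rng.normal(size=3)

J = lambda v: np.sum((X @ v - Y) ** 2)
eps = 1e-6
fd = np.array([(J(w + eps * np.eye(3)[i]) - J(w - eps * np.eye(3)[i])) / (2 * eps)
               for i in range(3)])
assert np.allclose(gradient(X, Y, w), fd, atol=1e-4)
```

Without the transpose, `X @ (X @ w - Y)` would not even have the right shape: it mixes an (n, d) matrix with an (n,) vector instead of producing a (d,) gradient.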
1 vote · 0 answers · 31 views
MNIST Image Classification Gradient Descent Neural Network not working
I have two files. PreProcess.java:
/*
* 4/28/24
* Final
*/
package Final;
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io....