Newest 'train' Questions - Cross Validated

0 votes

1 answer

20 views

Using a model to evaluate over or under-priced rental prices for the same apartments used in training

If I have a machine learning model which predicts the rental prices of apartments, can I use the model once complete to analyse the prediction for the same apartments I used to train the model so I ...

AWGIS

83

asked Jul 1 at 9:50

0 votes

0 answers

9 views

Validation accuracy dip and recovery when restarting training

I was fine-tuning this large language model with Stochastic Gradient Descent, and mid-epoch I stopped training and saved the model weights. Then, at a later time, I reloaded the weights and restarted the ...

clam

348

asked May 7 at 13:31

0 votes

1 answer

49 views

Is Gaussian Process Regression more suitable for limited amounts of training data than other methods?

In the field of machine learning for molecular properties, one sometimes has to deal with small amounts of (experimental) training data. Some people have advised me to use Gaussian Process ...

C_Swann22

103

asked Apr 22 at 21:50

3 votes

1 answer

4k views

Understanding the advantages of BF16 vs. FP16 in mixed precision training

Brain float (BF16) and 16-bit floating point (FP16) both require 2 bytes of memory, but in contrast to FP16, BF16 can represent a much larger numerical range, so under-/overflows won't ...

Green绿色

151

asked Jan 29 at 7:48
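
The range-versus-precision trade-off raised in the BF16 vs. FP16 question above can be illustrated with a minimal sketch. It uses only the Python standard library; `to_bf16` and `to_fp16` are hypothetical helper names, and the BF16 conversion here truncates a float32 to its top 16 bits, which is an assumption made for simplicity (real hardware and frameworks typically round to nearest instead).

```python
import struct


def to_bf16(x: float) -> float:
    """Round-trip x through a BF16-like format.

    Assumption: BF16 is emulated by keeping the top 16 bits of the
    float32 encoding (1 sign, 8 exponent, 7 mantissa bits) and
    truncating the rest, rather than rounding to nearest.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]


def to_fp16(x: float) -> float:
    """Round-trip x through IEEE half precision (1 sign, 5 exponent, 10 mantissa bits)."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        # struct refuses to pack values beyond the FP16 range;
        # treat that as overflow to infinity for illustration.
        return float("inf") if x > 0 else float("-inf")


if __name__ == "__main__":
    for value in (70000.0, 1e-8, 1.001):
        print(f"{value!r}: fp16={to_fp16(value)!r}, bf16={to_bf16(value)!r}")
```

Under these assumptions, a value just above the FP16 maximum (65504) overflows in FP16 but survives in BF16 with coarse precision, a tiny value underflows to zero in FP16 but not in BF16, and a value close to 1 keeps more digits in FP16 than in BF16.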

0 votes

1 answer

33 views

Make Predictions with an RNN Using a Multi-dimensional Training Set

I have a 2D matrix TD of training data that is a collection of N non-linear signals that are functions of time (hence the ...

Jonathan Frutschy

103

asked Jan 23 at 0:22

0 votes

1 answer

36 views

Is there a (lower) limit/minimum for learning rate values?

I'm building a model for traffic prediction with ConvLSTM and A3T-GCN cells. Since the input data is highly complex and the model is relatively big, I can only load ...

olenscki

101

asked Jan 19 at 16:37

1 vote

0 answers

14 views

Determining Optimal Data Period / Time Span for Model Training

I'm seeking advice on determining the ideal time span for optimizing a weather forecast strategy using historical data without overfitting/underfitting our model. In pursuit of optimal performance and ...

RezAm

111

asked Nov 13, 2023 at 3:40

2 votes

0 answers

48 views

How was the word2vec model trained?

Let's take the CBOW (continuous bag of words) model as an example. Suppose that there are $c$ context words, each of which is a one-hot encoding vector. So the total number of elements of input ...

J. Doe

66

asked Nov 13, 2023 at 0:23

0 votes

0 answers

24 views

Trained network always predicts zero [duplicate]

I have an encoder model and I'm training it with a dataset of signals with size (500,1). The data set is normalized and then used to train the model but the problem is that after the model is trained, ...

rrSep

1

asked Nov 6, 2023 at 10:30

6 votes

1 answer

91 views

Does training time increase more if I add a layer at the beginning of a neural network or at the end?

Let's consider a fixed NN architecture, dataset and hardware. We add a layer, either at the beginning or at the end of the NN. In which case will the training time increase more? Intuitively, I ...

DeltaIV

18.3k

asked Oct 26, 2023 at 12:09

0 votes

0 answers

8 views

Deep NN with positive partial derivative

Let's assume we are given a FFNN of type $$F: \mathbb{R}^n \times \mathbb{R} \rightarrow \mathbb{R}, \quad (x_1,...,x_{n+1}) \mapsto y$$ We assume the generic architecture (of depth $H$) $$a^{l+1}=\...

NicAG

181

asked Oct 16, 2023 at 13:15

1 vote

0 answers

45 views

Do common implementations of mini-batch gradient descent violate the i.i.d assumption needed for unbiased estimation?

When we perform mini-batch GD, we estimate the true gradient: $$\nabla L = \frac{1}{N} \sum_i \nabla L_i$$ with: $$\nabla_B L = \frac{1}{B} \sum_{i \in B} \nabla L_i$$ where $B$ is the batch size. ...

ado sar

477

asked Oct 11, 2023 at 11:15
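
The two estimators in the mini-batch question above can be compared on a toy case. The sketch below is only an illustration under stated assumptions: the per-sample gradients are made-up scalars, every batch of a given size is assumed to be drawn uniformly without replacement, and the function names are hypothetical. It does not model how any particular library shuffles or batches data, so it does not settle the question about common implementations.

```python
from fractions import Fraction
from itertools import combinations


def full_gradient(grads):
    """Full-data gradient: (1/N) * sum_i grad_i, computed exactly."""
    return Fraction(sum(grads), len(grads))


def batch_gradient(grads, batch):
    """Mini-batch gradient: (1/B) * sum over the indices in the batch."""
    return Fraction(sum(grads[i] for i in batch), len(batch))


def expected_batch_gradient(grads, batch_size):
    """Average of the mini-batch gradient over all equally likely batches.

    Assumption: each index subset of size batch_size is equally probable
    (uniform sampling without replacement within a batch).
    """
    batches = list(combinations(range(len(grads)), batch_size))
    return sum(batch_gradient(grads, b) for b in batches) / len(batches)


# Made-up scalar per-sample gradients, for illustration only.
per_sample_grads = [3, -1, 4, 1, -5, 9]

if __name__ == "__main__":
    print("full gradient:          ", full_gradient(per_sample_grads))
    print("expected batch gradient:", expected_batch_gradient(per_sample_grads, 2))
```

Under the uniform-sampling assumption, each index appears in the same number of batches, so the average over batches equals the full gradient; whether a given training loop actually satisfies that assumption is what the question asks.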

1 vote

1 answer

118 views

Classification Threshold Optimization after GridSearchCV

In my machine learning problem I am using a CNN to classify images. Since my dataset is imbalanced I want to perform classification probability threshold tuning so I can find the optimal balance ...

Throwaway123

11

asked Aug 27, 2023 at 13:46

1 vote

1 answer

61 views

Model complexity and number of examples

Is there a measure for model complexity? For given units of this measure, how many examples do we need to train a network to get the model right and generalize? In essence, what is the relation between ...

Justaperson

121

asked Jun 14, 2023 at 19:06

1 vote

0 answers

15 views

XGBoost Training Logloss dropping but Validation staying steady [duplicate]

I'm currently hyperparameter tuning my model and returning the model with the least amount of error. Before I start the hyperparameter tuning process, I ensure my validation and test data is ...

paddockson

11

asked Jun 5, 2023 at 11:26
