Questions tagged [train]
Training (or estimation) of statistical models or machine learning algorithms.
348
questions
0
votes
1
answer
20
views
Using a model to evaluate over- or under-priced rentals for the same apartments used in training
If I have a machine learning model that predicts the rental prices of apartments, can I use the model, once complete, to analyse the predictions for the same apartments I used to train the model so I ...
0
votes
0
answers
9
views
Validation accuracy dip and recovery when restarting training
I was fine-tuning a large language model with stochastic gradient descent, and mid-epoch I stopped training and saved the model weights. Later, I reloaded the weights and restarted the ...
0
votes
1
answer
49
views
Is Gaussian Process Regression more suitable for limited amounts of training data than other methods?
In the field of machine learning for molecular properties, one sometimes has to deal with small amounts of (experimental) training data. Some people have advised me to use Gaussian Process ...
3
votes
1
answer
4k
views
Understanding the advantages of BF16 vs. FP16 in mixed precision training
Brain float (BF16) and 16-bit floating point (FP16) both require 2 bytes of memory, but in contrast to FP16, BF16 can represent a much larger numerical range, so under-/overflows won't ...
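The range difference mentioned in this excerpt can be checked from the bit layouts alone. The sketch below is an illustration, not part of the question: it assumes the standard layouts (FP16 with 5 exponent and 10 mantissa bits, BF16 with 8 exponent and 7 mantissa bits), and the helper name `float_format_limits` is hypothetical.

```python
def float_format_limits(exponent_bits, mantissa_bits):
    """Return (largest finite value, smallest normal value, machine epsilon)
    for an IEEE-754-style binary format with the given field widths."""
    bias = 2 ** (exponent_bits - 1) - 1
    # Largest finite value: all mantissa bits set, largest non-reserved exponent.
    largest = (2 - 2.0 ** -mantissa_bits) * 2.0 ** bias
    # Smallest positive normal value: exponent field equal to 1.
    smallest_normal = 2.0 ** (1 - bias)
    # Spacing between 1.0 and the next representable value.
    epsilon = 2.0 ** -mantissa_bits
    return largest, smallest_normal, epsilon

# Assumed layouts: FP16 = 1 sign + 5 exponent + 10 mantissa bits,
#                  BF16 = 1 sign + 8 exponent + 7 mantissa bits.
fp16 = float_format_limits(5, 10)
bf16 = float_format_limits(8, 7)

for name, (largest, smallest, eps) in (("FP16", fp16), ("BF16", bf16)):
    print(f"{name}: max={largest:.4g}  min normal={smallest:.4g}  eps={eps:.4g}")
```

Under these assumptions, BF16 trades mantissa bits for exponent bits: its range is far wider than FP16's, while its spacing near 1.0 is coarser.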
0
votes
1
answer
33
views
Make Predictions with an RNN Using a Multi-dimensional Training Set
I have a 2D matrix TD of training data that is a collection of N non-linear signals that are functions of time (hence the ...
0
votes
1
answer
36
views
Is there a (lower) limit/minimum for learning rate values?
I'm building a model for traffic prediction with ConvLSTM and A3T-GCN cells. Since the input data is highly complex and the model is relatively big, I can only load ...
1
vote
0
answers
14
views
Determining Optimal Data Period / Time Span for Model Training
I'm seeking advice on determining the ideal time span for optimizing a weather forecast strategy using historical data without overfitting/underfitting our model. In pursuit of optimal performance and ...
2
votes
0
answers
48
views
How was the word2vec model trained?
Let's take the CBOW (continuous bag of words) model as the example.
Suppose that there are $c$ context words, each of which is a one-hot encoding vector. So the total number of elements of input ...
0
votes
0
answers
24
views
Trained network always predicts zero [duplicate]
I have an encoder model and I'm training it with a dataset of signals with size (500,1). The data set is normalized and then used to train the model but the problem is that after the model is trained, ...
6
votes
1
answer
91
views
Does training time increase more if I add a layer at the beginning of a neural network or at the end?
Let's consider a fixed NN architecture, dataset and hardware. We add a layer, either at the beginning or at the end of the NN. In which case will the training time increase more? Intuitively, I ...
0
votes
0
answers
8
views
Deep NN with positive partial derivative
Let's assume we are given a FFNN of type
$$F: \mathbb{R}^n \times \mathbb{R} \rightarrow \mathbb{R}, \quad (x_1,...,x_{n+1}) \mapsto y$$
We assume the generic architecture (of depth $H$)
$$a^{l+1}=\...
1
vote
0
answers
45
views
Do common implementations of mini-batch gradient descent violate the i.i.d. assumption needed for unbiased estimation?
When we perform mini-batch GD, we estimate the true gradient:
$$\nabla L = \frac{1}{N} \sum_i \nabla L_i$$
with:
$$\nabla_B L = \frac{1}{B} \sum_{i \in B} \nabla L_i$$
where $B$ is the batch size. ...
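As a small illustration of the two quantities in this excerpt, the sketch below computes the full gradient and the mini-batch gradients for a toy problem. Everything in it is an assumption made for the example and not taken from the question: the scalar model `w * x`, the squared loss, the random data, and the shuffle-once-per-epoch batching. It only shows how the two formulas relate in that one setting and does not settle the i.i.d. question.

```python
import random

def per_example_grad(w, x, y):
    # Gradient w.r.t. w of the assumed loss L_i = 0.5 * (w * x - y) ** 2.
    return (w * x - y) * x

def mean_grad(w, data, indices):
    # Average of per-example gradients over the given index set.
    return sum(per_example_grad(w, *data[i]) for i in indices) / len(indices)

random.seed(0)
N, B, w = 12, 4, 0.5  # dataset size, batch size, current parameter (all hypothetical)
data = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(N)]

# Full gradient: (1/N) * sum_i grad L_i
full_grad = mean_grad(w, data, range(N))

# One epoch of disjoint mini-batches drawn by shuffling once (sampling without replacement).
order = list(range(N))
random.shuffle(order)
batches = [order[k:k + B] for k in range(0, N, B)]
batch_grads = [mean_grad(w, data, batch) for batch in batches]

# With equal-size batches that partition the data, the batch gradients average
# back to the full gradient, even though each single batch generally differs from it.
epoch_avg = sum(batch_grads) / len(batch_grads)
print(full_grad, batch_grads, epoch_avg)
```

Note that the equality over an epoch relies on `N` being divisible by `B`; with a smaller final batch, a plain average of batch gradients would weight its examples more heavily.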
1
vote
1
answer
118
views
Classification Threshold Optimization after GridSearchCV
In my machine learning problem, I am using a CNN to classify images.
Since my dataset is imbalanced, I want to tune the classification probability threshold so I can find the optimal balance ...
1
vote
1
answer
61
views
Model complexity and number of examples
Is there a measure for model complexity?
For a given value of this measure, how many examples do we need to train a network so that the model is right and generalizes?
In essence, what is the relation between ...
1
vote
0
answers
15
views
XGBoost Training Logloss dropping but Validation staying steady [duplicate]
I'm currently tuning the hyperparameters of my model and returning the model with the lowest error. Before I start the hyperparameter tuning process, I ensure my validation and test data is ...