Questions tagged [normalization]
Normalization is a technique often applied as part of data preparation for machine learning. The goal of normalization is to change the values of numeric columns in the dataset to use a common scale, without distorting differences in the ranges of values or losing information.
298 questions
0 votes, 0 answers, 16 views
I don't understand why LayerNorm is killing cosine predictions
I have a very plain cosine prediction model:
batch_size = 20
Conv1D(filters=1, kernel=10, padding="same")
RELU
Dense(1)
Tanh
If I add LayerNormalization between 1 and 2 or between 2 and 3, ...
0 votes, 0 answers, 25 views
Is it legit to normalize time series with respect to the x-axis?
I have a data set consisting of multivariate time series, e.g. a batch of my data has the shape (batch_size, timesteps, number_input_features) and I want to train a neural network on it to predict ...
0 votes, 0 answers, 27 views
Negative values when calculating weighted Jaccard similarity
I have a bilateral dataset that includes countries and the weight of their relation. I'd like to calculate the similarity of countries in 1) who their trade partners are and 2) the weight of the ...
0 votes, 0 answers, 5 views
Can principal components changed by a normalization method be used to reconstruct the original data shape with SVD?
I would like to use an algorithm called Harmony to normalize my data.
Harmony takes as input principal components ($PC$), and ...
0 votes, 1 answer, 107 views
Min-Max Scaling more sensitive to outliers than 'Simple Feature Scaling'?
I am confused as to the pros and cons of two different approaches to normalization: Min-Max Scaling, and what the lecturer in the course I am taking refers to as 'Simple Feature Scaling'. The latter ...
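For reference, a minimal sketch of the two scalings being compared, in plain Python. It assumes 'Simple Feature Scaling' means dividing each value by the column maximum, which the truncated excerpt does not confirm. The function names and the sample values are hypothetical.

```python
def min_max_scale(values):
    """Map values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a constant column")
    return [(v - lo) / (hi - lo) for v in values]


def simple_feature_scale(values):
    """Assumed definition: divide every value by the column maximum."""
    hi = max(values)
    if hi == 0:
        raise ValueError("cannot divide by a maximum of zero")
    return [v / hi for v in values]


# Hypothetical data, with and without a single large outlier.
data = [10.0, 12.0, 14.0, 16.0]
with_outlier = data + [100.0]

for name, column in (("clean", data), ("with outlier", with_outlier)):
    print(name, "min-max:", [round(v, 3) for v in min_max_scale(column)])
    print(name, "x / max:", [round(v, 3) for v in simple_feature_scale(column)])
```

Both methods depend on the maximum, so one large value compresses the rest of the column in either case; min-max additionally depends on the minimum.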
0 votes, 2 answers, 118 views
Batch Normalization vs Layer Normalization
In Batch Normalization, the mean and standard deviation are calculated feature-wise and the normalization step is done instance-wise, and in Layer Normalization the mean and standard deviation are calculated ...
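As a rough illustration of which axis each method takes its statistics over, here is a small pure-Python sketch for a 2-D input of shape (batch, features). It is a simplification under stated assumptions: it leaves out the learnable scale and shift parameters and the running statistics used at inference time, and the helper names are hypothetical.

```python
from statistics import fmean, pvariance


def _standardize(values, eps=1e-5):
    """Subtract the mean and divide by sqrt(variance + eps)."""
    mu = fmean(values)
    var = pvariance(values, mu)
    return [(v - mu) / (var + eps) ** 0.5 for v in values]


def batch_norm(x):
    """Statistics per feature (column), computed across the batch."""
    columns = [_standardize(list(col)) for col in zip(*x)]
    return [list(row) for row in zip(*columns)]


def layer_norm(x):
    """Statistics per sample (row), computed across its features."""
    return [_standardize(row) for row in x]


# Hypothetical batch: 3 samples, 3 features.
x = [
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [3.0, 30.0, 300.0],
]
bn = batch_norm(x)  # each column now has mean close to 0
ln = layer_norm(x)  # each row now has mean close to 0
```

With this input, batch normalization makes every column look alike, while layer normalization makes every row look alike, which is the difference in axis the question is asking about.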
1 vote, 1 answer, 58 views
How to normalize the features without the knowledge of the min and max values in online learning?
I am developing an online learning platform where input features are gathered from various sensors. However, these features may have vastly different ranges. For example, displacement values may be ...
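One common way to handle unknown ranges in a streaming setting is to swap min-max scaling for standardization with running statistics. The sketch below uses Welford's online algorithm to track the mean and variance of one feature as values arrive. It is an assumption about what might fit the use case, not something the question states, and the class name is hypothetical.

```python
class RunningStandardizer:
    """Tracks the running mean and variance of one feature (Welford's algorithm)."""

    def __init__(self):
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x):
        """Fold one new observation into the running statistics."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def transform(self, x, eps=1e-8):
        """Standardize x with the statistics seen so far."""
        if self.n < 2:
            return 0.0  # not enough data to estimate a spread yet
        std = (self.m2 / self.n) ** 0.5
        return (x - self.mean) / (std + eps)


# Hypothetical sensor stream.
scaler = RunningStandardizer()
for reading in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    scaler.update(reading)

scaled = scaler.transform(9.0)
```

One instance per feature keeps memory constant regardless of stream length. Early outputs are unstable until enough readings have been seen, so a warm-up period may be needed.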
0 votes, 0 answers, 24 views
Should I log transform the data if the dataset is small?
Suppose I have a dataset with only about 25 rows.
The relevant columns are all integers. Their values range from 1 million to 100 million, and the distributions are all right-skewed.
In this ...
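To make the effect of the transform concrete, here is a tiny sketch that applies a base-10 log to values spanning the range described. The sample values are hypothetical, and the choice of base 10 is only for readability.

```python
import math

# Hypothetical right-skewed integers between 1 million and 100 million.
values = [1_000_000, 3_000_000, 10_000_000, 100_000_000]

# All values are positive, so the log is defined for each of them.
logged = [math.log10(v) for v in values]

raw_ratio = max(values) / min(values)   # spread of the raw data as a ratio
log_span = max(logged) - min(logged)    # spread after the transform
```

The transform turns a hundredfold spread into a span of about two units. Whether that helps with only about 25 rows is the open question the post is asking.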
1 vote, 0 answers, 32 views
Standardizing my target versus not standardizing
I've heard from multiple sources that whether or not I should standardize "depends". Most of the time, people say it doesn't make sense to do so; some say it's better if I standardize ...
0 votes, 0 answers, 26 views
TFRobertaSequenceClassification for Address Normalization task
I have a dataset with two columns: one with faulty addresses and the other with correct addresses. I want to train a model that I can use later to correct all incoming faulty addresses.
I ...
0 votes, 0 answers, 24 views
Data Preprocessing in the Wild?
I am new to ML, NN, and data science as a whole so the following question might sound silly. How can we perform inference when the model is deployed in the wild?
To my understanding, cleaning/...
1 vote, 2 answers, 123 views
Weighting voting classifier (MAE and MSE)
I am trying to optimize the weights of a Voting Regressor problem. To achieve the best score, I am considering both MAE and MSE as parameters, using the following formula:
score = w * MAE + (w-1) * ...
0 votes, 0 answers, 14 views
Comparing multiple multivariate datasets
Take the two datasets below:
default rate  state     age  income   asofdate
10            Texas     55   100,000  202309
14            Texas     35   97,000   202309
18            Texas     55   95,000   202308
22            Texas     35   95,000   202308
8             New York  21   55,000   ...
0 votes, 0 answers, 10 views
How to account for exponential oversampling?
I have a dataset of frequencies from 20 Hz to 20 kHz. The measurement rig that created the dataset doesn't sample evenly across this frequency range. There is a much higher density of sampling happening ...
0 votes, 0 answers, 20 views
Riemannian metric in Layer Normalization
I'm reading a paper about Layer normalization, and I couldn't find any clear explanation for this part:
Q1. Can anyone describe the derivation of the first equation in (8)?
Q2. I cannot understand ...