All Questions
Tagged with bias machine-learning
130
questions
1
vote
0
answers
17
views
Why is the threshold term incorporated into the weight vector in linear classifiers?
In the context of linear classifiers, such as the perceptron or logistic regression, I understand that the decision boundary is defined by a linear combination of input features and weights, plus a ...
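As a minimal sketch of what the question title refers to (not the asker's code, and not taken from any answer): appending a constant 1 to each input lets the threshold be stored as one extra weight, so the score is computed by a single dot product. The names `score`, `score_augmented`, `w_aug` and the numbers are hypothetical, chosen only for illustration.

```python
def score(w, x, b):
    """Linear score w.x + b with an explicit threshold (bias) term b."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def score_augmented(w_aug, x):
    """Same score with the bias folded into the weights.

    The input is extended with a constant 1.0, so the last entry of
    w_aug plays the role of the threshold term.
    """
    x_aug = list(x) + [1.0]
    return sum(wi * xi for wi, xi in zip(w_aug, x_aug))


# Hypothetical values, for illustration only.
w = [2.0, -1.0]
b = 0.5
x = [3.0, 4.0]

w_aug = w + [b]  # bias appended as the last weight

explicit = score(w, x, b)
augmented = score_augmented(w_aug, x)
```

Both calls return the same value, which is why learning rules such as the perceptron update can treat the threshold as just another weight.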
0
votes
1
answer
28
views
Why does increasing model complexity reduce bias over the entire data distribution?
In ML, we often talk about the bias-variance tradeoff, and how increasing model complexity both reduces bias and increases variance. I understand why increasing model complexity reduces bias at first, ...
0
votes
0
answers
31
views
Derivation of bias of LASSO in the orthonormal case
In the following lecture slides by Breheny, P. (2016) titled "Adaptive lasso, MCP, and SCAD" from the High Dimensional Data Analysis course at the University of Iowa, slide 2 presents the ...
1
vote
1
answer
25
views
Using Forecasted Data to Augment Predictions
We have a model that predicts 5-year rent growth. We know that supply for the next two years is at a record high. We know that this record-high supply is going to impact the rent growth ...
0
votes
0
answers
30
views
Test for Look-ahead bias in Time Series Forecasting
I have a general question regarding testing for look-ahead bias. Is there any technical test for look-ahead bias in training data? Especially in the context of time series forecasting e.g. predicting ...
2
votes
2
answers
172
views
Proof of the bias-variance decomposition in Bishop's book
I am trying to rewrite the demonstration given in Bishop's book: Pattern Recognition and Machine Learning (2009)
I reproduce the figure (page 149) in which I am unclear about the step leading from (3....
1
vote
0
answers
18
views
Can I use Shapley values with metadata (i.e. information about observations that I didn't train my model on)?
I'm training a set of models (random forest/XGBoost) for an ordinal regression task. I'm (tentatively) planning to use Shapley values to infer feature performance.
I also have some metadata that my ...
1
vote
1
answer
28
views
Manually adding edge-cases to a text classification model
Suppose I want to get training data for a model that deals with sentiment analysis for text that indicates an affirmative (yes) or negative (no) response, such as
...
1
vote
1
answer
27
views
Prediction biased by closest hyperpoints
I am building a boosted decision trees classification model, where the input variables vary smoothly with time.
The problem is that the predictions will always be biased by the most recent entries. I ...
0
votes
0
answers
252
views
Leave One Subject Out Cross Validation: mean vs median
Assume we have a dataset with n subjects and m labels and train a classifier. To ensure that there is no subject bias in the ...
12
votes
2
answers
1k
views
Is this question based on inspection bias?
This table describes the positive corona tests at all three open ports of country X on two different days, plus the total positives for all people entering the country via those three ports:
...
3
votes
3
answers
2k
views
If we reduce the size of the training dataset, does it decrease bias?
I'm a newbie learning ML, and I have a doubt. We know that we should increase the size of the training dataset, or add more data, to reduce variance (I understand why). Now variance has ...
0
votes
0
answers
75
views
Build a linear regressor with labels on different scales
I just ran into this linear regression problem where the labels are in entirely different ranges: for example, for 25% of the samples the labels are in [0.001, 0.01], then for another 25% of the samples,...
0
votes
0
answers
100
views
Training on a biased dataset when the bias is quantitatively known
I have a machine learning model (a neural network here) which minimizes MSE loss. The model should follow an unbiased distribution. Nevertheless, the training set is biased, but fortunately by a known ...
1
vote
1
answer
181
views
How does SGD training error decrease in subsequent epochs with non-iid samples when it is recommended that samples in subsequent epochs be iid?
I have been reading the Deep Learning book by Ian Goodfellow and on pg. 277, they mention:
It is also crucial that the minibatches be selected randomly.
Computing an unbiased estimate of the expected ...