All Questions

Tagged with
1 vote
0 answers
17 views

Why is the threshold term incorporated into the weight vector in linear classifiers?

In the context of linear classifiers, such as the perceptron or logistic regression, I understand that the decision boundary is defined by a linear combination of input features and weights, plus a ...
Narges Ghanbari
0 votes
1 answer
28 views

Why does increasing model complexity reduce bias over the entire data distribution?

In ML, we often talk about the bias-variance tradeoff, and how increasing model complexity both reduces bias and increases variance. I understand why increasing model complexity reduces bias at first, ...
user35734
  • 406
0 votes
0 answers
31 views

Derivation of bias of LASSO in the orthonormal case

In the following lecture slides by Breheny, P. (2016) titled "Adaptive lasso, MCP, and SCAD" from the High Dimensional Data Analysis course at the University of Iowa, slide 2 presents the ...
Joe94
  • 95
1 vote
1 answer
25 views

Using Forecasted Data to Augment Predictions

We have a model that predicts 5-year rent growth. We know that supply for the next two years is at a record high. We know that this record high supply is going to impact the rent growth ...
Magnolia Capital
0 votes
0 answers
30 views

Test for Look-ahead bias in Time Series Forecasting

I have a general question regarding testing for look-ahead bias. Is there any technical test for look-ahead bias in training data? Especially in the context of time series forecasting e.g. predicting ...
Kingvader Wong
2 votes
2 answers
172 views

Proof of the bias-variance decomposition in Bishop's book

I am trying to rewrite the demonstration given in Bishop's book, Pattern Recognition and Machine Learning (2009). I reproduce the figure (page 149) in which I am unclear about the step leading from (3....
Gianni
  • 153
1 vote
0 answers
18 views

Can I use Shapley values with metadata (i.e. information about observations that I didn't train my model on)?

I'm training a set of models (random forest/XGBoost) for an ordinal regression task. I'm (tentatively) planning to use Shapley values to infer feature performance. I also have some metadata that my ...
Neil
  • 66
1 vote
1 answer
28 views

Manually adding edge-cases to a text classification model

Suppose I want to get training data for a model that deals with sentiment analysis for text that indicates an affirmative (yes) or negative (no) response, such as ...
multiheadedattention
1 vote
1 answer
27 views

Prediction biased by closest hyperpoints

I am building a boosted decision trees classification model, where the input variables vary smoothly with time. The problem is that the predictions will always be biased by the most recent entries. I ...
Helen
  • 299
0 votes
0 answers
252 views

Leave One Subject Out Cross Validation: mean vs median

Assume we have a dataset with n subjects and m labels and train a classifier. To ensure that there is no subject bias in the ...
CLRW97
  • 121
12 votes
2 answers
1k views

Is this question based on inspection bias?

This table describes the positive corona tests in all three open ports of country X, on two different days, plus the total positives for all people entering the country via those three ports: ...
CORy
  • 543
3 votes
3 answers
2k views

If we reduce the size of the training dataset, does it decrease bias?

I'm a newbie learning ML. I have a question: we know that we should normally increase the size of the training dataset, or add more data, to reduce variance (I understand why). Now variance has ...
iamawesome
0 votes
0 answers
75 views

Build a linear regressor with labels on different scales

I just ran into a linear regression problem where the labels fall in entirely different ranges. For example, for 25% of the samples the labels are in [0.001, 0.01], then for another 25% of the samples,...
Upendra01
  • 1,956
0 votes
0 answers
100 views

Training on a biased dataset, when the bias is quantitatively known

I have a machine learning model (a neural network here) which minimizes MSE loss. The model should follow an unbiased distribution. Nevertheless, the training set is biased, but fortunately by a known ...
Daniel Wiczew
1 vote
1 answer
181 views

How does SGD training error decrease in subsequent epochs with non-iid samples when it is recommended that samples in subsequent epochs be iid?

I have been reading the Deep Learning book by Ian Goodfellow and on pg. 277, they mention: It is also crucial that the minibatches be selected randomly. Computing an unbiased estimate of the expected ...
Kunj Mehta