All Questions

2 votes
1 answer
48 views

Why does normalizing data not cause problems in prediction?

In algorithms that perform better with data normalization, or in deep learning problems such as classification, how does normalizing data not bias our algorithm? I mean, in training or even testing, we ...
AliM
  • 131
1 vote
1 answer
18 views

Correction of labelling bias using the labeler identity as a feature

Suppose I have a dataset labeled by multiple analysts. I assume that each analyst has some bias in his labeling. Is there any literature on reducing the bias effect on the general model by using the ...
Gideon Kogan
1 vote
1 answer
28 views

Manually adding edge-cases to a text classification model

Suppose I want to get training data for a model that deals with sentiment analysis for text that indicates an affirmative (yes) or negative (no) response, such as ...
multiheadedattention
0 votes
0 answers
252 views

Leave One Subject Out Cross Validation: mean vs median

Assume we have a dataset with n subjects and m labels and train a classifier. To ensure that there is no subject bias in the ...
CLRW97
  • 121
3 votes
1 answer
909 views

Why do we need separate data for probability calibration?

Why do we need separate data for probability calibration? The scikit-learn documentation says: The samples that are used to fit the calibrator should not be the same samples used to fit the classifier, as ...
Glue
  • 485
0 votes
1 answer
54 views

How to account for known bias in classification data

I apologize for the vagueness beforehand. Here's my experimental setup. I am trying to see if a data point has a property p. For example, in an image classification ...
rivu
  • 424
3 votes
1 answer
159 views

Random forest classifier. Some of my data is overrepresented. Is this an issue?

I am using a random forest classifier to predict plant color in my study species, using a variety of environmental variables. My data comes from citizen scientists and I am worried that the class ...
Rachel
  • 41
0 votes
0 answers
67 views

Bias and variance of an estimator of a model mean

I have a binary classification model and I need to use its output to estimate the means of groups of observations. I have two questions: A. Can I compute the bias and variance of the estimator of ...
mchl_k
  • 111
1 vote
1 answer
148 views

Why does k-means have more bias than spectral clustering and GMM?

I ran into a 2019 entrance exam question as follows: Which of the following algorithms has the higher bias? GMM; GMM (identity covariance matrix); spectral clustering; k-means. The answer mentioned is (...
Lisa Berry
2 votes
1 answer
4k views

Why does increasing K increase bias and reduce variance?

I get confused when it comes to KNN: why exactly does increasing K increase bias and reduce variance? Correct me if I'm wrong. As I understand it, suppose we have a regression problem. If k=1 and our nearest ...
Chukwudi Ogbonna
5 votes
1 answer
2k views

Definition of Bias and Variance in classification problems

I was watching a StatQuest video in which he explained the meaning of bias and variance in regression problems. Correct me if I'm wrong: bias is the sum-of-squares error between the predicted and actual ...
Chukwudi Ogbonna
1 vote
0 answers
54 views

Difference between bias and training error with regard to KNN

I'll ask my question by presenting another question: Which of the following statements regarding the k-nearest neighbors classifier for samples in $\mathcal{X} = \mathbb{R}^d$ is true? (a) The ...
drdisrespect
8 votes
4 answers
316 views

Was Amazon's AI tool, more than human recruiters, biased against women?

A typical example of how bias in data is copied by AI is Amazon's recruiting tool, which was abandoned in 2018. In the various reports it is implicitly (or sometimes explicitly) stated that the AI ...
Sextus Empiricus
1 vote
1 answer
266 views

No need for bias term if data is standardised? Linear classification models

For linear classification models, e.g. the perceptron, the bias term allows the separating hyperplane to move away from the origin. If the data is scattered around zero, does that mean we don't need a bias term?
Egor Epishin
3 votes
0 answers
93 views

Calibrating probabilities of a binary classifier when class prior is unknown

Is it possible to calibrate the probabilities of a binary classifier when the class priors are unknown? In cases where the data is obtained with selection bias (i.e. more positives than negatives in ...
user3542930