Questions tagged [bias]
The difference between the expected value of a parameter estimator and the true value of the parameter. Do NOT use this tag to refer to the [bias-term] / [bias-node] (i.e., the [intercept]).
1,098
questions
2
votes
1
answer
270
views
mlogit + logitr packages fail to recover true estimates of mixed logit random coefficient model
I am running Monte-Carlo simulations on a simple DGP of a mixed logit random coefficient model to check if the mlogit and logitr ...
3
votes
1
answer
72
views
Tossing Until First Heads Outcome, and Repeating, as a Method for Estimating Probability of Heads
Consider the problem of estimating the heads probability $p$ of a coin
by tossing it until the first heads outcome is observed. Say we get $k_1$
tosses, then $U_1 = \frac{1}{k_1}$ is an estimate for $...
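The procedure described in this excerpt can be checked with a small simulation. The sketch below is only an illustration: the excerpt is truncated, so it assumes the individual estimates $U_i = \frac{1}{k_i}$ are averaged over many independent repetitions, and the function names are hypothetical. It compares the simulated mean with the closed form $E[1/K] = -\frac{p \ln p}{1 - p}$ for a geometric $K$, which exceeds $p$ for $0 < p < 1$, so under that assumption the averaged estimator is biased upward.

```python
import math
import random


def tosses_until_first_heads(p, rng):
    """Return the number of tosses needed to observe the first heads."""
    k = 1
    # rng.random() < p counts as heads; keep tossing while we see tails.
    while rng.random() >= p:
        k += 1
    return k


def mean_reciprocal_estimate(p, n_repeats, seed=0):
    """Average U_i = 1 / k_i over n_repeats independent runs (assumed scheme)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_repeats):
        total += 1.0 / tosses_until_first_heads(p, rng)
    return total / n_repeats


def expected_reciprocal(p):
    """Closed form of E[1/K] for K ~ Geometric(p), valid for 0 < p < 1."""
    return -p * math.log(p) / (1.0 - p)


p = 0.5
simulated = mean_reciprocal_estimate(p, n_repeats=200_000)
exact = expected_reciprocal(p)
print(f"true p = {p}, simulated mean of 1/k = {simulated:.4f}, exact E[1/K] = {exact:.4f}")
```

With `p = 0.5` the exact expectation is $\ln 2 \approx 0.693$, so averaging more repetitions does not remove the gap to the true value of 0.5.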
0
votes
1
answer
19
views
Censoring and then re-entering subjects
I am following a group of women during the study period and performing the analysis using a Cox model, comparing non-users against users of the investigated medicine. However, to remove the impact of ...
0
votes
1
answer
55
views
Analysis of the bias resulting from PCA [closed]
Suppose that we generate some dataset from $y = X \beta + \epsilon,$ where $\epsilon$ is some independent error, and the rows of $X$ come from some distribution (unspecified for now). Suppose you run ...
2
votes
0
answers
60
views
Bias and Variance of an Honest Random Forest
I am trying to read the paper Estimation and Inference of Heterogeneous Treatment Effects using Random Forests. In Section 3.1 (Theoretical Background), page 13, paragraph 2, the authors have ...
1
vote
1
answer
43
views
Are missing variables an important factor when considering instrumental variable analysis?
I'm currently reading some papers that deal with the effects of education on health (smoking and obesity). Mostly they use an IV approach (college availability).
However, in several analyses, only a ...
2
votes
2
answers
172
views
Proof of the bias-variance decomposition in Bishop's book
I am trying to rewrite the demonstration given in Bishop's book, Pattern Recognition and Machine Learning (2009).
I reproduce the figure (page 149) in which I am unclear about the step leading from (3....
0
votes
0
answers
9
views
How to improve sample representativeness for longitudinal data collected via an online platform?
I am working with a longitudinal dataset exploring cognitive ageing (e.g., memory performance over time). Participants complete the study annually. Inclusion criteria for this study are 1) UK resident,...
3
votes
1
answer
340
views
Question about Analogy to Statistics
If anyone could help me verify if my analogy is correct, thanks so much!
Here is an analogy:
A population is like a pot of soup.
We stir the pot of soup with the ladle because naturally the contents ...
1
vote
0
answers
18
views
Can I use Shapley values with metadata (i.e. information about observations that I didn't train my model on)?
I'm training a set of models (random forest/XGBoost) for an ordinal regression task. I'm (tentatively) planning to use Shapley values to infer feature performance.
I also have some metadata that my ...
2
votes
1
answer
102
views
How to determine whether a sample from a known population is significantly biased?
I have a large dataset (the population) and a large subset of it (the sample) containing the same, continuous variables. The sample represents more than 90% of the population but is not random -- we ...
2
votes
2
answers
446
views
Conceptually, what is the bias of the standard error of an estimator?
I'm reading Muthén and Muthén (2002) to learn how to use Monte Carlo simulation to estimate statistical power for the coefficients of a model that is linear in its coefficients.
I understand ...
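One common way to make "bias of the standard error" concrete in a Monte Carlo study is to compare the average estimated standard error across replications with the empirical standard deviation of the estimates themselves. The sketch below illustrates that idea for a sample mean of normal data. It is an assumption-laden toy, not necessarily the exact computation in the cited paper, and the function name and parameter values are hypothetical.

```python
import math
import random
import statistics


def se_bias_of_sample_mean(n, n_reps, mu=0.0, sigma=1.0, seed=0):
    """Relative bias of the estimated SE of a sample mean (illustrative)."""
    rng = random.Random(seed)
    means = []
    estimated_ses = []
    for _ in range(n_reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        # Usual plug-in standard error of the mean: s / sqrt(n).
        estimated_ses.append(statistics.stdev(sample) / math.sqrt(n))
    # Empirical SD of the estimates stands in for the "true" standard error.
    empirical_sd = statistics.stdev(means)
    mean_se = statistics.fmean(estimated_ses)
    return (mean_se - empirical_sd) / empirical_sd


rel_bias = se_bias_of_sample_mean(n=30, n_reps=5000)
print(f"relative SE bias: {rel_bias:+.3%}")
```

A value near zero suggests the reported standard errors track the actual sampling variability of the estimator. A clearly negative value would mean the standard errors are too small on average.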
2
votes
1
answer
48
views
Why does normalizing data not cause problems in prediction?
In algorithms that perform better with data normalization, or in deep learning problems such as classification, why does normalizing the data not bias our algorithm? I mean, in training or even testing, we ...
0
votes
1
answer
147
views
How to avoid bias/avoid overfitting when choosing a machine learning model? [closed]
My typical workflow in the past, when creating machine learning models, has been to do the following:
Decide on some candidate model families for the task at hand.
Divide dataset into train and test ...
1
vote
1
answer
18
views
Correction of labelling bias using the labeler identity as a feature
Suppose I have a dataset labeled by multiple analysts.
I assume that each analyst has some bias in their labeling.
Is there any literature on reducing the bias effect on the general model by using the ...