Questions tagged [weighted-data]
Datasets where different pieces of data can have different "weights", i.e. different importance.
205
questions
3
votes
2
answers
291
views
Sandwich variance estimator or bootstrap-based variance for stabilized inverse probability weighting (IPW)
Multiple published papers describe IPW as akin to having a population with multiple copies of the same individuals. Hence, the correlation should be accounted for and corrected using sandwich variance ...
1
vote
1
answer
25
views
Subjective confidence as weights in regression models
I have data where subjects rate a quantity on a certain scale ($y$) but also add their subjective confidence as to how sure they are in their choice ($w$). My initial thought was to add $w$ as weights ...
2
votes
2
answers
157
views
Question on the effect of test reliability on weighting of a test battery
Originally, a test battery had 4 components: two 100 item multiple-choice tests, one oral test, and one essay test. Each component measured different topics. Each of the 4 components was weighted 25%.
...
0
votes
0
answers
28
views
Generate weighted median from activity per timespan
I have a set of data where I have a number of observations over the course of a year per individual. Generally speaking, I want to know the average activeness of the individuals that participated in ...
0
votes
0
answers
23
views
Weighting Data Issue
I am looking at e-cig prevalence within a city. I used surveys to collect data from residents, and I have a query about weighting the data.
I have made the assumption, due to over- and underrepresentation ...
1
vote
0
answers
94
views
Best practice for subsampling training data and weights (in XGBoost)
I am trying to build an XGBoost model in PyCharm and I have a general methods question, even though it relates to my model of choice (XGBoost). Any kind of general comments on the proper statistical ...
0
votes
0
answers
22
views
Using weights in prognostic models of Survival Analysis data
I have a dataset where I'm comparing Survival (overall and cancer-specific survival) between two treatment groups (Surgery vs. Radiation) for prostate cancer. As suggested by Noordzij et al (PMID
...
0
votes
0
answers
102
views
Exponentially Weighted Covariance Matrix with Ledoit Wolf Shrinkage
The Ledoit Wolf paper "Honey, I Shrunk the Sample Covariance Matrix" presents the formulation for the shrinkage intensity parameter estimate in Appendix B.
The formula for a weighted ...
0
votes
0
answers
30
views
Complex survey design with multiple waves
The organization I work for has collected data from individuals in multiple waves. Their goal was to collect 333 individuals in each of 6 different groups (gender × group). If the first wave did not reach 333 ...
1
vote
0
answers
81
views
Power analysis for an IPTW (Inverse Probability of Treatment Weighting) model?
I have a sample of N=1,615, and 319 of those cases (19.75%) received the treatment. The prevalence of the outcome of interest is about 0.09 for the whole sample.
Ultimately, I want to conduct IPTW to ...
0
votes
0
answers
73
views
Do I need to standardise variables when computing an index of interest?
Suppose I have to create an "index of interest" for the products of an e-commerce site. By index of interest I mean a parameter which, based on some variables, tells me which products are the ...
1
vote
0
answers
31
views
Weighted table probabilities without replacement
I'm trying to recall some statistical maths and could do with a little assistance.
I'm dealing with a weighted table, and I want to work out the probability of drawing certain values with no order, ...
0
votes
0
answers
43
views
Bootstrap specifics with weighted outcome estimation
I am using bootstrapping to estimate the standard error of a weighted estimate. The process to calculate the estimate is:
Sample from the population
Calculate a weight for each individual based on ...
0
votes
1
answer
20
views
Determining comparable individual performance within a team sport
I have collated a significant body of data for a local badminton league (> 20,000 individual games)
and while I have the means to extract data on a:
per player
per couple
per team
per club
per season
...
6
votes
3
answers
2k
views
"Weighted Median": statistical properties and connection to the sample variance
Suppose we only have the following information:
A set of sample means: $\bar{x}_1$, $\bar{x}_2$, ..., $\bar{x}_k$
The sample size used to calculate each sample mean: $n_1$, $n_2$, ..., $n_k$
The ...