26
votes
Accepted
Isn't it problematic to look at the data to decide to use a parametric vs. non-parametric test?
Abso-****ing-lutely yes.
Ideally, one would decide on the entire analysis before seeing the first shred of data, by leveraging pilot data (which is not used in the "real" analysis). My ...
20
votes
Isn't it problematic to look at the data to decide to use a parametric vs. non-parametric test?
I strongly agree with Stephan Kolassa, but I do think there is an exception:
Suppose you carefully consider in advance what test or model you want to use. You justify your choices of assumptions and ...
10
votes
Isn't it problematic to look at the data to decide to use a parametric vs. non-parametric test?
Isn't it an example of the forking paths problem?
Yes.
There are a number of posts on this site that address this issue (and related issues of various aspects of model selection, such as deciding whether ...
8
votes
Isn't it problematic to look at the data to decide to use a parametric vs. non-parametric test?
I want to make a distinction that doesn't seem to have come up in previous answers. There is a difference between
examining the results of several complete analyses (estimates, p-values, confidence ...
6
votes
Isn't it problematic to look at the data to decide to use a parametric vs. non-parametric test?
Incredible answers. To summarize some elements of a statistical philosophy that avoids problems of model uncertainty / forking paths:
Pre-specify flexible but powerful methods that are less likely ...
6
votes
Isn't it problematic to look at the data to decide to use a parametric vs. non-parametric test?
Nicely, in some answers and comments there are already links to some stuff I have written.
I will try to give a message in a nutshell here.
Yes, you are right. There is a problem. One nice way to show ...
5
votes
Accepted
sample size in chi-squared test
From what I understand, the Chi-squared is often called non-parametric because it does not assume any distribution on your sample.
However, the test-statistic itself is said to follow (approximately) ...
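The rule of thumb behind this approximation can be checked directly. A minimal sketch, assuming `scipy` is available and using a made-up 2×2 contingency table, showing how to inspect the expected cell counts (the usual guideline asks all of them to be at least about 5) before trusting the asymptotic chi-squared p-value:

```python
# Sketch: chi-squared test of independence plus a check of the usual
# "expected counts >= 5" rule of thumb before trusting the asymptotic
# approximation. The contingency table is made up for illustration.
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[18, 7],
                  [6, 19]])  # hypothetical 2x2 contingency table

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.3f}, p = {p:.4f}, dof = {dof}")

# The chi-squared distribution is only an approximation to the true
# sampling distribution of the statistic; it is usually considered
# adequate when all expected cell counts are at least ~5.
if (expected < 5).any():
    print("Warning: some expected counts < 5; approximation may be poor.")
else:
    print("All expected counts >= 5; approximation looks reasonable.")
```

Note that `chi2_contingency` applies Yates' continuity correction by default for 2×2 tables, which is itself a patch on the same approximation issue.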
4
votes
Isn't it problematic to look at the data to decide to use a parametric vs. non-parametric test?
In the end, what matters is what the data sampled from nature tell us about the questions we have about nature.
When the data supply us with novel questions that we want the same data to answer, ...
3
votes
Non-parametric one-sample mean test for a bounded variable (based on Chebyshev's inequality?)
Here is a maximum likelihood approach.
Suppose you have $n$ observations of $X$ which total $T$, and you want to test the null hypothesis $E[X]=\mu$, with $T/n<\mu$.
Then the maximum-likelihood ...
3
votes
sample size in chi-squared test
As Mathemagician777 explained in a different answer, the size restriction is due to an approximation made for the distribution of the test statistic.
It should be noted, however, that this restriction ...
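When the expected counts are too small for the asymptotic approximation, one alternative is a simulation-based p-value with the margins held fixed under the null, in the spirit of R's `chisq.test(simulate.p.value = TRUE)`. A minimal sketch for a made-up 2×2 table (for a 2×2 table the (0,0) cell determines the rest, so resampling reduces to a hypergeometric draw):

```python
# Sketch: Monte Carlo p-value for the chi-squared statistic when expected
# counts are too small to trust the asymptotic approximation. Tables with
# the same margins are resampled under the null; the data are made up.
import numpy as np

rng = np.random.default_rng(0)
table = np.array([[5, 1],
                  [2, 6]])  # small counts: asymptotic approximation is dubious

def chi2_stat(obs):
    expected = np.outer(obs.sum(axis=1), obs.sum(axis=0)) / obs.sum()
    return ((obs - expected) ** 2 / expected).sum()

observed_stat = chi2_stat(table)

# For a 2x2 table with fixed margins, the (0,0) cell is hypergeometric
# under the null, and it determines the other three cells.
n_sim = 10000
row0, col0, n = table[0].sum(), table[:, 0].sum(), table.sum()
count = 0
for _ in range(n_sim):
    a = rng.hypergeometric(col0, n - col0, row0)  # cell (0,0) under the null
    sim = np.array([[a, row0 - a],
                    [col0 - a, n - row0 - col0 + a]])
    if chi2_stat(sim) >= observed_stat:
        count += 1
p_mc = (count + 1) / (n_sim + 1)
print(f"observed chi2 = {observed_stat:.3f}, Monte Carlo p = {p_mc:.4f}")
```

For a 2×2 table this is essentially Fisher's exact test with the chi-squared statistic as the ordering criterion; the same resampling idea extends to larger tables.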
1
vote
Non-parametric one-sample mean test for a bounded variable (based on Chebyshev's inequality?)
Let $\mathcal B$ be the ensemble of distributions of random variables bounded to $[0, 1]$.
We have $n$ variables $x_1 ... x_n$ IID from some distribution in $\mathcal B$. We want to test the null $E(X)...
1
vote
Accepted
Non-parametric one-sample mean test for a bounded variable (based on Chebyshev's inequality?)
I think for this there are several well known concentration inequalities that can be applied. In particular Hoeffding, Azuma, and McDiarmid. Not that there's any real difference between the bounds one ...
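To make the concentration-inequality idea concrete, here is a minimal sketch (with made-up data) of a conservative two-sided test of $H_0\colon E[X]=\mu$ for $X$ bounded in $[0,1]$, using Hoeffding's inequality $P(|\bar X - \mu| \ge t) \le 2e^{-2nt^2}$; the resulting number is an upper bound on the p-value, so the test is valid but typically conservative:

```python
# Sketch: conservative two-sided test of H0: E[X] = mu for X in [0, 1],
# based on Hoeffding's inequality:
#   P(|Xbar - mu| >= t) <= 2 * exp(-2 * n * t^2).
# The sample below is made up for illustration.
import math
import numpy as np

def hoeffding_p_value(x, mu):
    """Upper bound on the two-sided p-value under H0: E[X] = mu."""
    x = np.asarray(x, dtype=float)
    assert ((0 <= x) & (x <= 1)).all(), "Hoeffding bound requires X in [0, 1]"
    n = len(x)
    t = abs(x.mean() - mu)
    return min(1.0, 2.0 * math.exp(-2.0 * n * t ** 2))

rng = np.random.default_rng(42)
x = rng.beta(8, 2, size=200)      # sample with true mean 0.8
print(hoeffding_p_value(x, 0.8))  # near the true mean: large p-value
print(hoeffding_p_value(x, 0.5))  # far from the true mean: tiny p-value
```

Swapping in a tighter bound (e.g. an empirical-Bernstein inequality, which exploits small sample variance) only changes the body of `hoeffding_p_value`.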
Related Tags
nonparametric × 2126
hypothesis-testing × 308
r × 235
wilcoxon-mann-whitney-test × 176
regression × 170
anova × 154
kruskal-wallis-test × 112
statistical-significance × 110
distributions × 107
t-test × 96
wilcoxon-signed-rank × 94
kernel-smoothing × 93
bootstrap × 85
mathematical-statistics × 68
normal-distribution × 66
repeated-measures × 65
correlation × 63
bayesian × 60
estimation × 59
confidence-interval × 57
self-study × 55
machine-learning × 54
multiple-comparisons × 54
normality-assumption × 52
paired-data × 43