All Questions

32 votes
3 answers
5k views

Why does the Kolmogorov-Smirnov test work?

In reading about the 2-sample KS test, I understand exactly what it is doing but I don't understand why it works. In other words, I can follow all the steps to compute the empirical distribution ...
Darcy
  • 915
23 votes
5 answers
16k views

Test if multidimensional distributions are the same

Let's say I have two or more sample populations of n-dimensional continuous-valued vectors. Is there a nonparametric way to test if these samples are from the same distribution? If so, is there a ...
mbc
  • 331
20 votes
4 answers
126k views

How do I perform a regression on non-normal data which remain non-normal when transformed?

I've got some data (158 cases) which was derived from a Likert scale answer to 21 questionnaire items. I really want/need to perform a regression analysis to see which items on the questionnaire ...
rachel S
  • 201
19 votes
3 answers
2k views

Statistical test for two distributions where only 5-number summary is known?

I have two distributions where only the 5-number summary (minimum, 1st quartile, median, 3rd quartile, maximum) and sample size are known. Contrary to the question here, not all data points are ...
bonifaz
  • 1,095
14 votes
3 answers
3k views

How to scale violin plots for comparisons?

I'm trying to draw violin plots and wondering if there is an accepted best practice for scaling them across groups. Here are three options I've tried using the R ...
xan
  • 8,958
11 votes
4 answers
3k views

Why is Kernel Density Estimation still nonparametric with parametrized kernel?

I am new to kernel density estimation (KDE), but I want to learn about it to help me calculate probabilities of outcomes in sequencing data. I watched this https://www.youtube.com/watch?v=QSNN0no4dSI ...
Galen
  • 9,412
10 votes
2 answers
3k views

Looking for a robust, distribution-free/nonparametric distance between multivariate samples

There are many distance functions for distributions out there, but I'm having a hard time wading through them all to find one that is "distribution-free", or "nonparametric", by which I mean only ...
kjo
  • 1,967
9 votes
2 answers
6k views

Is there a quantitative way to compare the distribution shape of different samples?

I am conducting some research which involves visually/graphically observing the differences between the shapes of the distributions of different samples. I would like to automate this process (at ...
Homunculus Reticulli
8 votes
1 answer
2k views

What practical application is there for the Asymptotic Mean Integrated Squared Error in kernel density estimation?

Introduction For some time now I have been struggling to understand how theoretical results can be applied in practice. Fortunately in most cases the link between theory and practice is not hard to ...
Dennis Jaheruddin
8 votes
2 answers
2k views

The exact distribution of Wilcoxon rank-sum statistic U

The distribution of the rank-sum statistic U is assumed to be normal for a large number of samples being considered. What is the exact distribution? I want to compare and sometimes fuse results from ...
highBandWidth
8 votes
1 answer
166 views

How to test that a set of distributions are located in a given order?

I have a set of population distributions; I obtained them empirically, computing histograms from very large populations (about 1 million per distribution). The population distributions might not have ...
Dr Fabio Gori
7 votes
2 answers
2k views

Probability that randomly chosen value from one distribution is greater than randomly chosen value from another distribution

Say I have $n$ values sampled from each of two distributions, $A$ and $B$. That is, I have a sample $A_1, A_2, \dots, A_n$ and a sample $B_1, B_2, \dots, B_n$. How would I go about finding $P\left(A_i>...
Jake Fisher
7 votes
2 answers
3k views

How to choose a kernel for KDE

There are a lot of kernels available for a univariate KDE. R uses normal by default, but the efficacy discussion seems to support the use of Epanechnikov. What should influence kernel choice for ...
Simon Kuang
  • 2,121
6 votes
1 answer
1k views

How to understand the definition of empirical distribution function

I am reading All of Nonparametric Statistics by Larry Wasserman. On page 12, he defines the empirical distribution function as: The empirical distribution function $\hat{F}_n$ is the CDF that ...
Deep North
  • 4,776
6 votes
1 answer
2k views

Plotting non-parametric (E)CDF confidence envelopes for comparison

I have previously asked about a way to test whether two samples are drawn from the same distribution (Non-parametric test if two samples are drawn from the same distribution). I was very glad to learn ...
Luke Gorrie
