All Questions

0 votes
0 answers
40 views

Estimate the likelihood of two continuous samples of unknown distribution

Consider two samples from continuous, unknown distributions, $$X : \{x_1, x_2, \dots, x_n\}$$ and $$Y : \{y_1, y_2, \dots, y_n\}$$, both of which can be treated as time series with $n > 8000$. I need to estimate the likelihood ...
0 votes
0 answers
11 views

How to compare peak location and tail length of two different distributions?

I have the distributions of the fraction of people in each income bracket in a town in 1990 and 2020. The total sample size is the same in both, and assume that the incomes have been adjusted to ...
11 votes
4 answers
3k views

Why is Kernel Density Estimation still nonparametric with parametrized kernel?

I am new to kernel density estimation (KDE), but I want to learn about it to help me calculate probabilities of outcomes in sequencing data. I watched this https://www.youtube.com/watch?v=QSNN0no4dSI ...
32 votes
3 answers
5k views

Why does the Kolmogorov-Smirnov test work?

In reading about the 2-sample KS test, I understand exactly what it is doing but I don't understand why it works. In other words, I can follow all the steps to compute the empirical distribution ...
6 votes
1 answer
2k views

Plotting non-parametric (E)CDF confidence envelopes for comparison

I have previously asked about a way to test whether two samples are drawn from the same distribution (Non-parametric test if two samples are drawn from the same distribution). I was very glad to learn ...
1 vote
1 answer
72 views

Does taking the ratio of Empirical Distributions (histogram bins) show their differences?

Background: I have two empirical distributions, both derived from social media data. The first represents a broad sample of ~4.8 million posts and the number of followers each post author has. The ...
0 votes
0 answers
150 views

Comparing the output distribution of two ML models

Consider a regression task (e.g. predicting house prices) with a given train and test sets. We start with constructing a linear regression model, in which we assume $y_i=X^T\beta+\epsilon$ with $E[\...
5 votes
2 answers
226 views

Calculation of a nonparametric equal-tailed (central) tolerance interval for an unknown continuous distribution

Assume we have a sample of size $n$ from an unspecified continuous distribution $F(\cdot)$. We wish to construct a tolerance interval to contain $(100\,\beta)\%$ of the population with a pre-specified ...
1 vote
0 answers
50 views

The hunt for a 'nice' flexible distribution [duplicate]

Background: Suppose I have data $\mathcal{D}_1, \cdots, \mathcal{D}_n$ with each $\mathcal{D}_i$ containing $m$ observations $X_{i1}, \cdots, X_{im}$; these observations are of unknown distribution, ...
2 votes
2 answers
115 views

Distribution and variable analysis

I am doing a statistical test (the program used is SPSS). On the basis of distribution and sample size, I have to choose the correct variable analysis. I also have to justify every decision. I have two ...
4 votes
3 answers
547 views

Nonparametric Order Statistics - Does this Exist?

I was reading about order statistics on Wikipedia [retrieved 29 June 2022]: Apparently, if we have a sample with $k$ elements (e.g., $x_1, x_2, ..., x_k$) and assume a probability distribution for ...
7 votes
2 answers
2k views

Probability that randomly chosen value from one distribution is greater than randomly chosen value from another distribution

Say I have $n$ values sampled from two distributions, $A$ and $B$. That is, I have a sample $A_1, A_2, \dots, A_n$ and a sample $B_1, B_2, \dots, B_n$. How would I go about finding $P\left(A_i>...
1 vote
0 answers
261 views

Non-parametric test for unimodality of 2-d dataset

I'm looking for a method to determine if a 2-d dataset forms a single cluster or is split into two/multiple clusters - and what's the likelihood or p-value for this. Isolated data should not be ...
3 votes
1 answer
110 views

Approximation of distribution that has a positive atom by well-known parametric distributions

I have a variable (say, $y$) in my dataset for which a lot of observations are clustered around some point $c$ and after the point $c$ the distribution looks continuous. I imagine that would be the ...
0 votes
0 answers
65 views

Comparing averages of non normal distributions

I want to compare the daily average revenue of a promotion period (7 days) of a business with the daily average of the rest of the year. So, sample 1 has 7 data points, whereas sample 2 has 300 data ...
