All Questions

0 votes
0 answers
11 views

How to compare peak location and tail length of two different distributions?

I have the distributions of the fraction of people in each income bracket in a town in 1990 and 2020. The total sample size is the same in both, and assume that the incomes have been adjusted to ...
SNIreaPER's user avatar
0 votes
0 answers
40 views

Estimate the likelihood of two continuous samples of unknown distribution

Consider two continuous, unknown distributions $X : \{x_1, x_2, \ldots, x_n\}$ and $Y : \{y_1, y_2, \ldots, y_n\}$; both can be tagged as time series with $n > 8000$. I need to estimate the likelihood ...
joueswant's user avatar
1 vote
1 answer
72 views

Does taking the ratio of Empirical Distributions (histogram bins) show their differences?

Background: I have two empirical distributions, both derived from social media data. The first represents a broad sample of ~4.8 million posts and the number of followers each post author has. The ...
Connor's user avatar
  • 655
0 votes
0 answers
150 views

Comparing the output distribution of two ML models

Consider a regression task (e.g. predicting house prices) with given train and test sets. We start by constructing a linear regression model, in which we assume $y_i=X^T\beta+\epsilon$ with $E[\...
Spätzle's user avatar
  • 4,012
1 vote
0 answers
50 views

The hunt for a 'nice' flexible distribution [duplicate]

Background: Suppose I have data $\mathcal{D}_1, \cdots, \mathcal{D}_n$ with each $\mathcal{D}_i$ containing $m$ observations $X_{i1}, \cdots, X_{im}$; these observations are of unknown distribution, ...
Tom Chen's user avatar
  • 621
5 votes
2 answers
226 views

Calculation of a nonparametric equal-tailed (central) tolerance interval for an unknown continuous distribution

Assume we have a sample of size $n$ from an unspecified continuous distribution $F(\cdot)$. We wish to construct a tolerance interval to contain $(100\,\beta)\%$ of the population with a pre-specified ...
COOLSerdash's user avatar
  • 30.9k
2 votes
2 answers
115 views

Distribution and variable analysis

I am doing a statistical test (program used is SPSS). On the basis of distribution and sample size, I have to choose the correct variable analysis. I also have to justify every decision. I have two ...
Chester's user avatar
  • 21
0 votes
0 answers
65 views

Comparing averages of non normal distributions

I want to compare the daily average revenue of a promotion period (7 days) of a business with the daily average of the rest of the year. So, sample 1 has 7 data points, whereas sample 2 has 300 data ...
andstat's user avatar
2 votes
0 answers
69 views

When is the central limit theorem not applied?

I am trying to compare two matched samples. In total I have a sample of 34 people. Each patient receives two treatments, a C1 treatment and a C2 treatment. So each patient will be compared to himself ...
Seydou GORO's user avatar
3 votes
3 answers
149 views

Problem with a single outlier, non-normal data, and unequal sample distributions

I want to compare two independent groups on a Likert-like item. To explain, the dependent variable is structured so that a 1 = <1 units, 2 = 1-<2, 3 = 2-<3, all the way up to option 7 = ...
Amy's user avatar
  • 31
1 vote
1 answer
43 views

Which pair of two distributions are more similar?

Suppose I have two pairs of distributions: distributions A and B in Pair 1, distribution C and D in Pair 2. There are non-parametric tests to determine if there is evidence to say that the ...
David Young's user avatar
4 votes
3 answers
547 views

Nonparametric Order Statistics - Does this Exist?

I was reading about order statistics on Wikipedia [retrieved 29 June 2022]: Apparently, if we have a sample with $k$ elements (e.g., $x_1, x_2, ..., x_k$) and assume a probability distribution for ...
stats_noob's user avatar
1 vote
1 answer
136 views

Deciding the Number of Clusters : Standard Methods vs. Non-Parametric Methods

I was watching this video over here (https://www.youtube.com/watch?v=UBiaLq5V7mE) that discussed a Non-Parametric based Bayesian approach for deciding the number of clusters in a dataset. Essentially, ...
stats_noob's user avatar
-1 votes
1 answer
184 views

Independent Sample T-test or Mann-Whitney U test?

I am a very young stats learner, and I need help understanding the justification of a test choice. I have a sample of 39 participants (20 females and 19 males) measured on task performance, and I ...
marth's user avatar
  • 1
0 votes
1 answer
132 views

Compare nonparametric distributions

I generated distributions of travel times of commuters using transportation simulation tools (for different scenarios). The distributions are attached below. I wish to statistically compare each pair ...
SiH's user avatar
  • 141
0 votes
0 answers
37 views

Show that an event is improbable for exponential families iff it's improbable for all absolutely continuous distributions

Since all the exponential families are absolutely continuous, the if part is trivial. However, I could not prove the only if part. My idea is to prove it by contradiction, i.e. given an event $A$ such that $...
Martund's user avatar
  • 545
0 votes
1 answer
131 views

How to find the type of my distribution?

I checked the normality of my data on SPSS and one of the variables is not normally distributed. I have the mean, standard deviation, skewness, kurtosis , min and max values of my distribution. But I ...
6 votes
2 answers
1k views

What is a medcouple?

What is a medcouple? I understand that it is the median of a couple of data points but it is not clear to me what these pairs of data actually are. E.g. https://wis.kuleuven.be/stat/robust/papers/2008/...
Maths12's user avatar
  • 579
2 votes
1 answer
170 views

How to estimate the probability that a single value follows the same distribution as a set of values

This question is a possible duplicate of this one but I would like to go a bit further. I have a set of values $X=x_1, x_2, \cdots, x_n$ that are iid estimates of a reference value $y$ given by a ...
Gustave Coste's user avatar
4 votes
1 answer
106 views

Nonparametric tests: how to support the null hypothesis you claim to be testing

Let us assume that we have taken an unbalanced number of independent random samples from 5 different populations, which will be analogous to 5 different locations in this example. Each observation ...
Ryan's user avatar
  • 351
2 votes
0 answers
104 views

Fitting a range of distribution and test for goodness of fit - choose based on p-value or chi-squared values?

I have data for which I want to study the best distribution that fits this data. I am following this blog post to do my experiments. Basically the following things are happening: fit a number of ...
Perl Del Rey's user avatar
0 votes
0 answers
30 views

Sampling distribution from a parametric curve?

I'm not sure if this question is well-founded, given that it seems to be mixing random and deterministic processes, but I'm wondering if there's a meaningful answer. Suppose we have a parametric curve ...
Ben's user avatar
  • 255
2 votes
2 answers
135 views

How to prove a multivariate r.v. does not follow the nonparanormal distribution?

Background: You may find the definition of the nonparanormal distribution in the 2nd paragraph on p. 2296 of this paper. In short, $(X_1, \ldots, X_p)$ is nonparanormal if there exists a set of ...
inmybrain's user avatar
  • 528
0 votes
0 answers
53 views

Finding the right hypothesis test

I have two distributions (Generic and Generic Masked) that I want to compare. I want to show that one is distributed closer to 1 than the other, but don't know which hypothesis to test for this. I ...
Jimmy2027's user avatar
  • 121
3 votes
1 answer
2k views

Test to show one distribution is bigger than another

Here is a MWE of my problem: I measure the size, $S$, of 10 red apples and 32 green apples. $\bar S_\mathrm{red} = 8 \pm 1\,\mathrm{cm}$ and $\bar S_\mathrm{green} = 4 \pm 2\,\mathrm{cm}$. I want ...
Sean Mooney's user avatar
1 vote
0 answers
131 views

What is the resulting distribution of a data set that was originally normally distributed but has been quantized and had all negative values removed?

I am trying to benchmark a seasonal forecasting model and calculate not just the point forecasts but the forecast densities from the model. To do this, I generated a simulated data set in the ...
Akaike's Children's user avatar
1 vote
0 answers
23 views

Ways to make parametric statistics work with real time (often non-normal) data

BACKGROUND: I have been tasked with teaching basic data analysis methods with R to a group of people in a business setting. While my stance is that I am most definitely not at the level where I ...
random_guy's user avatar
2 votes
1 answer
182 views

Binomial distribution for two groups if success rate is not given

Two groups of twelve statisticians are taught two different methods of Statistics. (Assume that a statistician in group one is matched in terms of their Statistics ability with a statistician in group ...
2 votes
0 answers
83 views

Estimating conditional probability distribution from samples

I have three continuous variables, $X$, $Y_1$ and $Y_2$. All three are correlated. For a given value of $X$, the conditional probability distributions of $Y_1$ and $Y_2$ are typically bimodal. I'm ...
ylangylang's user avatar
2 votes
1 answer
144 views

When is a function of two random variables Gumbel?

Consider two random variables $X,Y$. Is there any example in which $X$ and $Y$ have a known parametric distribution such that $f(X,Y)$ is Gumbel with scale $\sigma$ and location $\beta$, for some ...
Star's user avatar
  • 891
2 votes
1 answer
58 views

Non-normal distribution

I have a non-normal distribution (Kilograms ~ Years), so I can't use ANOVA test to reject the null hypothesis (that the tree means are equal). There is a tendency of weight to be 100kg. Is there a way ...
Bruno Silva's user avatar
0 votes
1 answer
99 views

Parametric or non parametric test

I want to compare trends of R&D expenditures before and after a crisis. I was planning to use a paired t-test or a non-parametric alternative. But before that, I tested the data for normality. ...
0 votes
0 answers
22 views

Confused about the statistical tests to choose or any transformation to apply

I am new to Stack Exchange, so please forgive my ignorance of editing. I am confused and stuck about how to proceed further with my data. I have the following data ...
ronizidane's user avatar
0 votes
0 answers
149 views

Determining differences between curves with the K-S test

I'm running an experiment where my treatments have a significant and highly sensitive effect on the distribution of a result. For example, if I alter a certain nutrient in a vegetable, the ...
user3020881's user avatar
32 votes
3 answers
5k views

Why does the Kolmogorov-Smirnov test work?

In reading about the 2-sample KS test, I understand exactly what it is doing but I don't understand why it works. In other words, I can follow all the steps to compute the empirical distribution ...
Darcy's user avatar
  • 915
1 vote
3 answers
197 views

Confused on normality assumption

I know that the sampling distribution of the mean can be assumed to be normal if N>30, but does this have an implication on the "30" itself (the sample data)? I have three different time series with ...
janebanane's user avatar
1 vote
0 answers
41 views

Prove the relation between two distribution functions

I have been given homework in a subject called "Non-Parametric Statistics" and I'm a bit stuck with it. I would be very thankful if you could give me any advice or help, which would lead to a ...
Martin Smith's user avatar
2 votes
0 answers
150 views

Derive probability distributions from i.i.d. Gumbel

I have a question on how to derive (if possible) the following probability distributions. Consider 3 random variables $(X,Y,Z)$ mutually independent and identically distributed. Specifically, $X$ is ...
Star's user avatar
  • 891
1 vote
0 answers
95 views

What among location, scale and shape is Kolmogorov–Smirnov test statistic sensitive to and why? [closed]

I understand that the Kolmogorov–Smirnov test statistic for a given cumulative distribution function $F(x)$ is $D_n = \sup_x |F_n(x) - F(x)|$. However, if I have to rank its sensitivity to location, ...
GeorgeOfTheRF's user avatar
5 votes
4 answers
2k views

Why use parametric test at all if non parametric tests are 'less strict'

I have read from several sources, even in my undergrad courses, that parametric tests require the data to have a certain distribution, for instance normal, whilst non-parametric don't. I have ...
user2552108's user avatar
1 vote
0 answers
692 views

Type of parameter of the chi-squared distribution

Chi-squared distribution $\chi^2(k)$ has parameter $k$. On the one hand, $k$ should be the shape parameter because chi-squared distribution is a special case of Gamma distribution: $\chi^2(k) \equiv ...
Rodvi's user avatar
  • 1,008
0 votes
1 answer
2k views

Nonparametric test suggestion for distribution comparison

I am not sure if this is the appropriate place to ask, but I appreciate any help on this issue. I want to compare the distributions of the results of the two experiments played by the same group of people....
user64066's user avatar
  • 119
1 vote
1 answer
1k views

What if half of your data is not normally distributed? [closed]

My experiment tests different diets (Pk, Hg, BYD & Control) to check insect development and which diets the insects prefer most. For this purpose, I used 3 parameters;...
Faray 's user avatar
  • 11
2 votes
1 answer
738 views

Possible explanations for Imputation before train-test split?

I'm working on a real world data set containing missing information. I understand imputing missing values before data partitioning can lead to leakage of information. I'm using this R package MissMech ...
NoName's user avatar
  • 33
1 vote
0 answers
31 views

Estimating Gamma PDF parameters from data with negative increments

Say we have collected data, and from a physical perspective we know that the collected data should increase positively with time. However the data looks more like this: This data shown in the figure ...
AnarKi's user avatar
  • 565
1 vote
2 answers
173 views

Closeness of 2-parametric discrete distributions when first 2 moments are matching

Let $\mathcal{D}$ be a particular 2-parameter uni-variate discrete distribution family, and let $D(\theta_1, \theta_2) \in \mathcal{D}$ be one particular distribution from this family, where $\theta_i ...
Abhiram Natarajan's user avatar
6 votes
1 answer
2k views

Plotting non-parametric (E)CDF confidence envelopes for comparison

I have previously asked about a way to test whether two samples are drawn from the same distribution (Non-parametric test if two samples are drawn from the same distribution). I was very glad to learn ...
Luke Gorrie's user avatar
1 vote
0 answers
366 views

Distribution (group) comparison based on PCA

I am looking for a way to compare the first k principal components belonging to two separate groups of two-dimensional data, in order to see how similar the two groups are. I do not know which ...
MarkoF's user avatar
  • 11
5 votes
1 answer
680 views

Non parametric estimation/regression for conditional distribution

Context : one continuous variable $Y$ dependent on $X$ ($X$ can be anything) Linear regression, generalized linear model... focus on estimating the conditional expectation $E(Y\mid X)$. I want to ...
Benoit Sanchez's user avatar
0 votes
1 answer
102 views

Which stats should I use when samples are uneven and data are not normally distributed?

My data have four groups: "A" normal group (AN), "B" normal group (BN), "A" group with a clinical diagnosis (AC) and "B" group with a clinical diagnosis (BC). They were tested on a task that included ...
Grace's user avatar
  • 21
