Questions tagged [nonparametric]
Use this tag to ask about the nature of nonparametric or parametric methods, or the difference between the two. Nonparametric methods generally rely on few assumptions about the underlying distributions, whereas parametric methods make assumptions that allow data to be described by a small number of parameters.
838
questions with no upvoted or accepted answers
15
votes
0
answers
469
views
Penalized spline confidence intervals based on cluster-sandwich VCV
This is my first post here, but I've benefited a lot from this forum's answers popping up in Google search results.
I've been teaching myself semi-parametric regression using penalized splines. ...
12
votes
0
answers
1k
views
Computing a bootstrap confidence interval for the prediction error with the percentile and the BCa method
I have two related questions regarding the computation of a non-parametric bootstrap confidence interval for the prediction error.
Setting: I have a sample S from a data population P and a learner L, ...
8
votes
0
answers
319
views
Are there any surveys of the opinions of statisticians on the usefulness of classical rank-based nonparametric statistics?
The following comes from a YouTube video: Robustness in Statistics, which I have tried to quote verbatim.
In Biology and Medicine these procedures are extremely popular, and I don't know why. They're ...
8
votes
1
answer
178
views
What do the terms "nearly-optimal rate", "near-minimax rate", "minimax optimal rate" and "minimax rate" mean in the context of posterior consistency?
Definition: A sequence $\epsilon_n$ is a posterior contraction rate at the parameter $θ_0$ if $$\Pi_n(θ: d(θ, θ_0) ≥ M_n \epsilon_n| X^{(n)}) → 0$$ in $P^{(n)}_{θ_0}$-probability, for every $M_n → ∞$.
...
8
votes
1
answer
389
views
What to do if your regression residuals aren't normally distributed, cannot be transformed and do not conform even when outliers are removed?
I ran a regression in R and my Shapiro-Wilk test showed that some of my residuals are not normally distributed. I cannot transform the data to fit a normal distribution, and even when I remove outliers,...
7
votes
0
answers
287
views
Minimizing MISE to find consistent estimator
Consider kernel regression estimation of the mean function $m$ of the process
$$y_t = m(x_t) + \epsilon_t,$$ where the $\epsilon_t$'s are correlated with covariance function $R(s,t) = \exp \{-\lambda|s-...
7
votes
0
answers
63
views
Estimate fraction of a known distribution in a mixture with unknown second distribution
Suppose I have a set of bulbs, which are known to be healthy. For each bulb I have a value of its brightness. The underlying distribution is not necessarily normal, and possibly has some complex ...
6
votes
0
answers
193
views
Non-uniform p-values from hoeffd function in Hmisc when data sets are independent
When using the function hoeffd in the CRAN package Hmisc I get unusual p-values for pairs of data sets that are independent. The function hoeffd is an implementation of Hoeffding's $D$ statistic. ...
6
votes
0
answers
165
views
Estimating Spline curve by OLS. Is it a good idea to fix the knots at Chebyshev sites?
I am writing my master's degree thesis on a novel method for fixing knots in an adaptive way and while reading the literature I've found many references to the so-called Chebyshev sites. These sites or ...
6
votes
0
answers
505
views
Interpolation and Sample size when Visualizing distributions
Let's assume a stochastic simulation or test with a control variable. The task is to visualize the distribution to demonstrate the effect that is being researched. The objective is to get a smooth plot, ...
6
votes
0
answers
444
views
All vs all post-hoc after Aligned Friedman (k classifiers over multiple datasets)
I have k classifiers and n datasets, and I have only one accuracy measurement (which is actually the average of three independent repetitions of the 5-fold-CV, i.e. average over 15 accuracy values) ...
5
votes
0
answers
27
views
Are these two estimated regression coefficients asymptotically equivalent? If not, which one is more efficient?
Suppose I have $Y=\beta_1X_1+\beta_2X_1X_2+g(X_2)+u$, where $E(u|X_1,X_2)=0$ and $S=g(X_2)+e$ with $E(e|X_2)=0$. I have a random sample $\{Y_i,X_{1i},X_{2i},S_i\}_{i=1}^n$. Suppose I first use a ...
5
votes
0
answers
131
views
How can I make a prediction interval for a future response (not its mean) in regression by using bootstrap?
I'd like to know how I can use bootstrap to predict the confidence interval for a future response (not for its mean) no matter what the theoretical model and error distribution are. I know I can train the ...
5
votes
0
answers
4k
views
How to better understand when to use Weibull AFT versus Cox Model for Failure Data
I am struggling to understand when I should consider using a Cox regression model versus using a Weibull AFT model to predict the end of life of mechanical components.
I have tried to apply the Cox ...
5
votes
0
answers
880
views
How general is the backfitting algorithm?
Hastie & Tibshirani's original approach to fitting generalized additive models was the backfitting algorithm. For a model of the form
$$
y = \alpha + \displaystyle\sum_k f_k(x_k) + \epsilon
$$
...
5
votes
0
answers
1k
views
Empirical multivariate probability integral transform
Is there a 'simple' way to obtain a non-parametric empirical multivariate probability integral transform?
Univariate case
The probability integral transform relates to the transform of any random ...
5
votes
0
answers
522
views
Kolmogorov-Smirnov vs. Kuiper test
I would like to compare two 2D distributions with quantitative variables, illustrated here:
For each "x", there are several measures "y".
I can't assume these distributions are parametric.
A ...
5
votes
0
answers
683
views
Confusion related to Parzen window
I was going through this tutorial related to Parzen window at http://www.cs.utah.edu/~suyash/Dissertation_html/node11.html. However, I have some confusion related to the Parzen window with a Gaussian kernel
...
5
votes
0
answers
4k
views
Calculate Mantel-Haenszel test in R
I would like to have a reality check of my understanding of the MH statistic. I have been trying to reproduce an example of the Mantel-Haenszel test provided in Conover (1999, p. 192-194). The data ...
5
votes
0
answers
102
views
Simultaneous Non-parametric regression and Non-parametric density estimation
Given a collection of $(x,y)$ data, one might do a non/semiparametric regression of $y$ on $x$ to understand how to predict the $E(Y|X)$. Similarly, if one has enough data, it might be useful to take a ...
4
votes
0
answers
442
views
Derivation of k nearest neighbor classification rule
One way to derive the k-NN decision rule based on the k-NN density estimation goes as follows:
given $k$ the number of neighbors, $k_i$ the number of neighbors of class $i$ in the bucket, $N$ the ...
4
votes
1
answer
106
views
Nonparametric tests: how to support the null hypothesis you claim to be testing
Let us assume that we have taken an unbalanced number of independent random samples from 5 different populations, which will be analogous to 5 different locations in this example. Each observation ...
4
votes
0
answers
319
views
Are there nonparametric generative models for datasets?
Typically when I see generative models, e.g., Latent Dirichlet Allocation (JMLR) or Linear/Quadratic Discriminant Analysis (wikipedia LDA), they are probabilistic models that belong to the exponential ...
4
votes
0
answers
90
views
Can Time Varying Coefficient models with a Kalman filter approximate any non-linear function?
I read that Time Varying Coefficients (TVC) models with non-parametric methods can approximate any non-linear function. This is from "Non-Linear Models: Where Do We Go Next - Time Varying Parameter ...
4
votes
0
answers
672
views
Can "non-parametric" tests be achieved with generalized linear models?
I recently read @Kodiologist's answer to a post here looking for clarification on the relation between GLMs and non-parametric tests. His answer is along the lines of "the approach is not non-...
4
votes
0
answers
2k
views
How does the gam library calculate AIC?
I was wondering how the gam library calculates the AIC. I can't find a reference that explains how this package calculates it.
...
4
votes
0
answers
2k
views
How to do a stratified nonparametric test?
I'm trying to use the "coin" (conditional inference) package to perform a stratified nonparametric test for difference in distribution (for count data).
I tried a stratified Mann-Whitney-Wilcoxon ...
4
votes
0
answers
96
views
all regressions: coefficients interpretation
Good morning to all. I open this topic with the intention of it being useful to me but also to many in my situation. I would like to clarify the "interpretation" of the coefficients in the regression. ...
4
votes
0
answers
238
views
multiple Kruskal-Wallis instead of ANOVA for non-normally distributed data
Apologies in advance if I supply too little or too vague information, since I am a complete newbie to stats and to this forum.
So I did a study which evaluated psychopathic personality traits, ...
4
votes
0
answers
88
views
Nonparametric estimation of the logarithm of a density
I was wondering whether there is an equivalent to Kernel Density Estimation to estimate nonparametrically the logarithm of a density. Or if there is any nonparametric method for that. (Taking the ...
4
votes
0
answers
406
views
Minimum sample for non parametric test?
Exactly what is the minimum sample size for a nonparametric test?
Right now I have 3 data sets with really, really small sample sizes:
1. 5 pairs for the Wilcoxon signed-rank test (pretest and posttest in 1 place)
2. ...
4
votes
0
answers
543
views
Multiple comparison of non-normal, heteroscedastic data. What test should I use?
I have a set of brain pathology data. These were obtained by counting certain parameters in the brain. Due to availability of human brains, the number of cases varies a lot across the different groups ...
4
votes
0
answers
44
views
German tank variant: estimate resolution of camera given cropped photo sizes
Make whatever assumptions you like, but I like the flavor of nonparametric techniques.
I have a list of the $x_i$ by $y_i$ resolutions of a number of photos, all cropped from photos taken at the same ...
4
votes
0
answers
408
views
Is this how a Bayesian bootstrap works?
I am a bit new to the whole nonparametric and Bayesian idea, so tell me if this is correct: to estimate, say, the mean of a dataset's population we do the following:
We define a function $f(x)$ that ...
4
votes
0
answers
2k
views
Where is the maximum bias and variance in a histogram as non-parametric density estimator?
I am a little bit confused about bias and variance of non-parametric density estimators and hope you can help me.
Assuming a constant bandwidth and sample size, I am wondering at which points of the ...
4
votes
0
answers
138
views
Rank deficient bootstrap resamples
Despite years of stat courses I'm afraid I may still not completely understand bootstrapping.
My question here relates to nonparametric bootstrapping of regression models. As I understand it, you draw ...
4
votes
0
answers
804
views
ANOVA on ranks?
Once in a statistics class I've seen the suggestion that one non-parametric approach for ANOVA would be to perform the ANOVA analysis on the ranks of the original data (basically as an alternative to ...
4
votes
0
answers
213
views
Nonparametric test for largely skewed count data
My research design looks as follows: an experimental game with 4 participants (human subjects), repeated for 20 rounds. During each round, participants are allowed to form bilateral coalitions which ...
4
votes
0
answers
246
views
Maximum likelihood estimation and density estimation
Let's consider a general signal processing estimation problem where the measurements are modeled as
$${\bf x}[n]={\bf s(\theta)}+{\bf w}[n],$$ where ${\bf w}$ is a non-Gaussian r.v. (noise term) and ${...
4
votes
1
answer
2k
views
Friedman's test to identify best of multiple classifiers on multiple domains
I have several classifiers $f_i\ (i=1, \cdots, N)$ and calculated performance measures on multiple domains $(D)$ for each. Thus, there are $N \times D$ values.
I want to find out (increasing ...
4
votes
0
answers
189
views
Non-parametric estimators for time-varying binomial proportion
I have a bunch of count data associated with time intervals (potentially overlapping and of variable lengths), say
$(s_i, t_i, n_i, N_i)$
where $N_i$ is a count of the total number of events ...
4
votes
0
answers
7k
views
Choosing post-hoc test after Kruskal-Wallis
I have a time series (five time steps) of samples from a population of the same ~60 individuals, each sample being a haphazardly chosen (i.e. not completely random) subset of the 60. The sample sizes ...
3
votes
0
answers
41
views
Can a Gaussian Process predict random events?
I know that we can use Gaussian processes effectively for function approximation and regression. However, suppose there is a sequence of points in time $S = \{s_1, s_2, \dots, s_n\}$, where $s_i$ can ...
3
votes
0
answers
314
views
Estimation of Propensity Score using Random Forests
Suppose that one has a binary treatment $Z$, and assume that $Z \mid X=x \sim \mathrm{Bern}\left(e(x)\right)$.
Furthermore, suppose I want to estimate the propensity score by a random forest. Are there ...
3
votes
0
answers
479
views
Pros and cons of Nadaraya–Watson estimator vs. RKHS method?
Recently I've been reading some materials about nonparametric methods. Two methods related to the word "kernel" raised my interest: the Nadaraya–Watson estimator and the RKHS method.
What's the ...
3
votes
0
answers
66
views
Looking for the Holy Grail of nonparametric regression
Unfortunately, to state precisely the question, I need some formal preliminaries.
Let $d \in \mathbb{N}$.
For each $d^* \in \{1,\dots,d\}$, define $\mathcal{M}_{d^*}$ to be the set of probability ...
3
votes
0
answers
608
views
What does it mean when F values of ANOVAs are not all ~0?
I am running an aligned rank transform ANOVA (a non-parametric ANOVA for an ordinal DV). In my data there are 3 factors and one ordinal DV.
When I create the model:
...
3
votes
1
answer
423
views
Theoretical justification of Parametric bootstrap?
I've been reading about bootstrap, and while it's relatively easy to find theoretical results (consistency and higher-order correctness) for the nonparametric bootstrap (e.g., Asymptotic Statistics by ...
3
votes
0
answers
208
views
Parameter estimation when the likelihood function does not exist
The observations $Z_1, Z_2, \ldots$ are i.i.d. We have
$$Z_k = \sum_{i=1}^\infty \frac{X_{ki}}{2^i},$$
where the $X_{ki}$'s are i.i.d. with a Bernoulli$(p)$ distribution. If $p=\frac{1}{2}$ then $Z_k$ ...
3
votes
0
answers
159
views
When should one use Bradley-Terry instead of gradient boosted trees for pairwise ranking
Both the Bradley-Terry model and gradient boosted trees can be used to learn a ranking from pairwise comparisons (e.g. with libraries choix and XGBoost).
How do they relate to each other? Is there ...