Questions tagged [nonparametric]

Use this tag to ask about the nature of nonparametric or parametric methods, or the difference between the two. Nonparametric methods generally rely on few assumptions about the underlying distributions, whereas parametric methods make assumptions that allow data to be described by a small number of parameters.

838 questions with no upvoted or accepted answers
15 votes
0 answers
469 views

Penalized spline confidence intervals based on cluster-sandwich VCV

This is my first post here, but I've benefited a lot from this forum's results popping up in Google search results. I've been teaching myself semi-parametric regression using penalized splines. ...
generic_user's user avatar
  • 13.5k
12 votes
0 answers
1k views

Computing a bootstrap confidence interval for the prediction error with the percentile and the BCa method

I have two related questions regarding the computation of a non-parametric bootstrap confidence interval for the prediction error. Setting: I have a sample S from a data population P and a learner L, ...
Gitte's user avatar
  • 825
8 votes
0 answers
319 views

Are there any surveys of the opinions of statisticians on the usefulness of classical rank-based nonparametric statistics?

The following comes from a YouTube video: Robustness in Statistics, which I have tried to quote verbatim. In Biology and Medicine these procedures are extremely popular, and I don't know why. They're ...
Galen's user avatar
  • 9,411
8 votes
1 answer
178 views

What do the terms "nearly-optimal rate", "near-minimax rate", "minimax optimal rate" and "minimax rate" mean in the context of posterior consistency?

Definition: A sequence $\epsilon_n$ is a posterior contraction rate at the parameter $θ_0$ if $$\Pi_n(θ: d(θ, θ_0) ≥ M_n \epsilon_n| X^{(n)}) → 0$$ in $P^{(n)}_{θ_0}$-probability, for every $M_n → ∞$. ...
user3911153's user avatar
8 votes
1 answer
389 views

What to do if your regression residuals aren't normally distributed, cannot be transformed and do not conform even when outliers are removed?

I ran a regression in R and my Shapiro-Wilk test showed that some of my residuals are not normally distributed. I cannot transform the data to fit a normal distribution and even when I remove outliers,...
Vivienne's user avatar
  • 491
7 votes
0 answers
287 views

Minimizing MISE to find consistent estimator

Consider kernel regression estimation of the mean function $m$ of the process $$y_t = m(x_t) + \epsilon_t,$$ where the $\epsilon_t$'s are correlated with covariance function $R(s,t) = \exp \{-\lambda|s-...
Shanks's user avatar
  • 765
7 votes
0 answers
63 views

Estimate fraction of a known distribution in a mixture with unknown second distribution

Suppose I have a set of bulbs, which are known to be healthy. For each bulb I have a value of its brightness. The underlying distribution is not necessarily normal, and possibly has some complex ...
dbolotin's user avatar
  • 171
6 votes
0 answers
193 views

Non-uniform p-values from hoeffd function in Hmisc when data sets are independent

When using the function hoeffd in the CRAN package Hmisc I get unusual p-values for pairs of data sets that are independent. The function hoeffd is an implementation of Hoeffding's $D$ statistic. ...
Alex's user avatar
  • 722
6 votes
0 answers
165 views

Estimating Spline curve by OLS. Is it a good idea to fix the knots at Chebyshev sites?

I am writing my master's degree thesis on a novel method for fixing knots in an adaptive way and while reading the literature I've found many references to the so-called Chebyshev sites. These sites or ...
Chaos's user avatar
  • 431
6 votes
0 answers
505 views

Interpolation and Sample size when Visualizing distributions

Let's assume a stochastic simulation or test with a control variable. The task is to visualize the distribution to demonstrate the effect that is being researched. The objective is to get a smooth plot, ...
user3644640's user avatar
6 votes
0 answers
444 views

All vs all post-hoc after Aligned Friedman (k classifiers over multiple datasets)

I have k classifiers and n datasets, and I have only one accuracy measurement (which is actually the average of three independent repetitions of the 5-fold-CV, i.e. average over 15 accuracy values) ...
Pablo's user avatar
  • 351
5 votes
0 answers
27 views

Are these two estimated regression coefficients asymptotically equivalent? If not, which one is more efficient?

Suppose I have $Y=\beta_1X_1+\beta_2X_1X_2+g(X_2)+u$, where $E(u|X_1,X_2)=0$ and $S=g(X_2)+e$ with $E(e|X_2)=0$. I have a random sample $\{Y_i,X_{1i},X_{2i},S_i\}_{i=1}^n$. Suppose I first use a ...
ExcitedSnail's user avatar
  • 2,966
5 votes
0 answers
131 views

How can I make a prediction interval for a future response (not its mean) in regression by using bootstrap?

I'd like to know how I can use the bootstrap to build a prediction interval for a future response (not for its mean), no matter what the theoretical model and error distribution are. I know I can train the ...
Davi Américo's user avatar
5 votes
0 answers
4k views

How to better understand when to use Weibull AFT versus Cox Model for Failure Data

I am struggling to understand when I should consider using a Cox regression model versus using a Weibull AFT model to predict the end of life of mechanical components. I have tried to apply the Cox ...
Py_Mel's user avatar
  • 95
5 votes
0 answers
880 views

How general is the backfitting algorithm?

Hastie & Tibshirani's original approach to fitting generalized additive models was the backfitting algorithm. For a model of the form $$ y = \alpha + \displaystyle\sum_k f_k(x_k) + \epsilon $$ ...
generic_user's user avatar
  • 13.5k
5 votes
0 answers
1k views

Empirical multivariate probability integral transform

Is there a 'simple' way to obtain a non-parametric empirical multivariate probability integral transform? Univariate case The probability integral transform relates to the transform of any random ...
mic's user avatar
  • 4,388
5 votes
0 answers
522 views

Kolmogorov-Smirnov vs. Kuiper test

I would like to compare two 2D distributions with quantitative variables, illustrated here: For each "x", there are several measures "y". I can't assume these distributions are parametric. A ...
recherche888's user avatar
5 votes
0 answers
683 views

Confusion related to Parzen window

I was going through this tutorial related to the Parzen window at http://www.cs.utah.edu/~suyash/Dissertation_html/node11.html. However, I have some confusion related to the Parzen window with a Gaussian kernel ...
user34790's user avatar
  • 6,837
5 votes
0 answers
4k views

Calculate Mantel-Haenszel test in R

I would like to have a reality check of my understanding of the MH statistic. I have been trying to reproduce an example of the Mantel-Haenszel test provided in Conover (1999, p. 192-194). The data ...
blue and grey's user avatar
5 votes
0 answers
102 views

Simultaneous Non-parametric regression and Non-parametric density estimation

Given a collection of $(x,y)$ data, one might do a non/semiparametric regression of $y$ on $x$ to understand how to predict the $E(Y|X)$. Similarly, if one has enough data, it might be useful take a ...
ryan's user avatar
  • 191
4 votes
0 answers
442 views

Derivation of k nearest neighbor classification rule

One way to derive the k-NN decision rule based on the k-NN density estimation goes as follows: given $k$ the number of neighbors, $k_i$ the number of neighbors of class $i$ in the bucket, $N$ the ...
diegobatt's user avatar
  • 426
4 votes
1 answer
106 views

Nonparametric tests: how to support the null hypothesis you claim to be testing

Let us assume that we have taken an unbalanced number of independent random samples from 5 different populations, which will be analogous to 5 different locations in this example. Each observation ...
Ryan's user avatar
  • 351
4 votes
0 answers
319 views

Are there nonparametric generative models for datasets?

Typically when I see generative models, e.g., Latent Dirichlet Allocation (JMLR) or Linear/Quadratic Discriminant Analysis (wikipedia LDA), they are probabilistic models that belong to the exponential ...
Sleepy 17's user avatar
4 votes
0 answers
90 views

Can Time Varying Coefficient models with a Kalman filter approximate any non-linear function?

I read that Time Varying Coefficients (TVC) models with non-parametric methods can approximate any non-linear function. This is from "Non-Linear Models: Where Do We Go Next - Time Varying Parameter ...
one1's user avatar
  • 63
4 votes
0 answers
672 views

Can "non-parametric" tests be achieved with generalized linear models?

I recently read @Kodiologist's answer to a post here looking for clarification on the relation between GLMs and non-parametric tests. His answer is along the lines of "the approach is not non-...
AdamO's user avatar
  • 63.7k
4 votes
0 answers
2k views

How does the gam library calculate AIC?

I was wondering how the gam library calculates the AIC. I can't find a reference that explains how this package calculates it. ...
Eli's user avatar
  • 2,672
4 votes
0 answers
2k views

How to do a stratified nonparametric test?

I'm trying to use the "coin" (conditional inference) package to perform a stratified nonparametric test for difference in distribution (for count data). I tried a stratified Mann-Whitney-Wilcoxon ...
lostisle's user avatar
4 votes
0 answers
96 views

All regressions: interpretation of coefficients

Good morning to all. I open this topic with the intention of it being useful to me but also to many in my situation. I would like to clarify the "interpretation" of the coefficients in the regression. ...
ANDREA NIGRI's user avatar
4 votes
0 answers
238 views

Multiple Kruskal-Wallis instead of ANOVA for non-normally distributed data

Apologies in advance if I supply too little or vague information, since I am a complete newbie in stats and in using this forum. So I did a study which evaluated psychopathic personality traits, ...
debbiemaycry's user avatar
4 votes
0 answers
88 views

Nonparametric estimation of the logarithm of a density

I was wondering whether there is an equivalent to Kernel Density Estimation to estimate nonparametrically the logarithm of a density. Or if there is any nonparametric method for that. (Taking the ...
epsilone's user avatar
  • 786
4 votes
0 answers
406 views

Minimum sample for non-parametric test?

Exactly what is the minimum sample size for a non-parametric test? Right now I have 3 data sets with really, really small sample sizes: 1. 5 pairs for a Wilcoxon signed-rank test (pretest and posttest in 1 place) 2. ...
tanti's user avatar
  • 41
4 votes
0 answers
543 views

Multiple comparison of non-normal, heteroscedastic data. What test should I use?

I have a set of brain pathology data. These were obtained by counting certain parameters in the brain. Due to the availability of human brains, the number of cases varies a lot across the different groups ...
hintursul's user avatar
4 votes
0 answers
44 views

German tank variant: estimate resolution of camera given cropped photo sizes

Make whatever assumptions you like, but I like the flavor of nonparametric techniques. I have a list of the $x_i$ by $y_i$ resolutions of a number of photos, all cropped from photos taken at the same ...
Simon Kuang's user avatar
  • 2,121
4 votes
0 answers
408 views

Is this how a Bayesian bootstrap works?

I am a bit new to the whole nonparametric and Bayesian idea, so tell me if this is correct: to estimate, say, the mean of a dataset's population we do the following: We define a function $f(x)$ that ...
Simon Kuang's user avatar
  • 2,121
4 votes
0 answers
2k views

Where is the maximum bias and variance in a histogram as non-parametric density estimator?

I am a little bit confused about bias and variance of non-parametric density estimators and hope you can help me. Assuming a constant bandwidth and sample size, I am wondering at which points of the ...
jeffrey's user avatar
  • 755
4 votes
0 answers
138 views

Rank deficient bootstrap resamples

Despite years of stat courses I'm afraid I may still not completely understand bootstrapping. My question here relates to nonparametric bootstrapping of regression models. As I understand it you draw ...
MHankin's user avatar
  • 91
4 votes
0 answers
804 views

ANOVA on ranks?

Once in a statistics class I saw the suggestion that one non-parametric approach for ANOVA would be to perform the ANOVA analysis on the ranks of the original data (basically as an alternative to ...
landroni's user avatar
  • 1,133
4 votes
0 answers
213 views

Nonparametric test for largely skewed count data

My research design looks as follows: an experimental game with 4 participants (human subjects), repeated for 20 rounds. During each round, participants are allowed to form bilateral coalitions which ...
user2700264's user avatar
4 votes
0 answers
246 views

Maximum likelihood estimation and density estimation

Let's consider a general signal processing estimation problem where the measurements are modeled as $${\bf x}[n]={\bf s(\theta)}+{\bf w}[n],$$ where ${\bf w}$ is a non-Gaussian r.v. (noise term) and ${...
Arrigo Benedetti's user avatar
4 votes
1 answer
2k views

Friedman's test to identify best of multiple classifiers on multiple domains

I have several classifiers $f_i\ (i=1, \cdots, N)$ and calculated performance measures on multiple domains $(D)$ for each. Thus, there are $N \times D$ values. I want to find out (increasing ...
Chris's user avatar
  • 599
4 votes
0 answers
189 views

Non-parametric estimators for time-varying binomial proportion

I have a bunch of count data associated with time intervals (potentially overlapping and of variable lengths), say $(s_i, t_i, n_i, N_i)$ where $N_i$ is a count of the total number of events ...
Matt's user avatar
  • 316
4 votes
0 answers
7k views

Choosing post-hoc test after Kruskal-Wallis

I have a time series (five time steps) of samples from a population of the same ~60 individuals, each sample being a haphazardly chosen (i.e. not completely random) subset of the 60. The sample sizes ...
hpy's user avatar
  • 639
3 votes
0 answers
41 views

Can a Gaussian Process predict random events?

I know that we can use Gaussian processes effectively for function approximation and regression. However, suppose there is a sequence of points in time $S = \{s_1, s_2, \dots, s_n\}$, where $s_i$ can ...
Hassan Ali's user avatar
3 votes
0 answers
314 views

Estimation of Propensity Score using Random Forests

Suppose that one has a binary treatment $Z$, and assume that $Z=1|X=x \sim Bern\left(e(x)\right)$. Furthermore, suppose I want to estimate the propensity score by a random forest. Are there ...
mich95's user avatar
  • 111
3 votes
0 answers
479 views

Pros and cons of Nadaraya–Watson estimator vs. RKHS method?

Recently I've been reading some materials about nonparametric methods. Two methods related to the word "kernel" raised my interest: the Nadaraya–Watson estimator and the RKHS method. What's the ...
Marksgy's user avatar
  • 31
3 votes
0 answers
66 views

Looking for the Holy Grail of nonparametric regression

Unfortunately, to state the question precisely, I need some formal preliminaries. Let $d \in \mathbb{N}$. For each $d^* \in \{1,\dots,d\}$, define $\mathcal{M}_{d^*}$ to be the set of probability ...
Bob's user avatar
  • 193
3 votes
0 answers
608 views

What does it mean when F values of ANOVAs are not all ~0?

I am running an aligned rank transform ANOVA (a non-parametric ANOVA for an ordinal DV). In my data there are 3 factors and one ordinal DV. When I create the model: ...
RECURSIVE FARTS's user avatar
3 votes
1 answer
423 views

Theoretical justification of Parametric bootstrap?

I've been reading about bootstrap, and while it's relatively easy to find theoretical results (consistency and higher-order correctness) for the nonparametric bootstrap (e.g., Asymptotic Statistics by ...
Xward's user avatar
  • 73
3 votes
0 answers
208 views

Parameter estimation when the likelihood function does not exist

The observations $Z_1, Z_2, \cdots$ are i.i.d. We have $$Z_k = \sum_{i=1}^\infty \frac{X_{ki}}{2^k},$$ where the $X_{ki}$'s are i.i.d. with a Bernoulli$(p)$ distribution. If $p=\frac{1}{2}$ then $Z_k$ ...
Vincent Granville's user avatar
3 votes
0 answers
159 views

When should one use Bradley-Terry instead of gradient boosted trees for pairwise ranking

Both the Bradley-Terry model and gradient boosted trees can be used to learn a ranking from pairwise comparisons (e.g. with libraries choix and XGBoost). How do they relate to each other? Is there ...
vman's user avatar
  • 173