All Questions

4 votes
1 answer
52 views

In what ways is Gaussian Process Regression both parametric and non-parametric?

Gaussian Process Regression is considered a "non-parametric" model. However, the term "non-parametric" is often used imprecisely to mean different things, leading to questions ...
socialscientist's user avatar
0 votes
0 answers
4 views

learning guarantees for gaussian weighting of training points

I have my training data for binary classification that consists of $N$ pairs $$(x_i\in \mathbb{R}^F, y_i \in \{-1, 1\})$$ $i\in [1,\dots,N]$. My classification rule for a new point $x$ is simply $$ \hat{y}(x) = \...
Franco Marchesoni's user avatar
1 vote
0 answers
171 views

Projection pursuit regression

Projection pursuit regression (PPR) is described in Hastie et al.'s The Elements of Statistical Learning in the chapter on neural networks. The algorithm was introduced by Friedman and Stuetzle (1981)....
Estacionario's user avatar
5 votes
2 answers
549 views

Is density estimation the same as parameter estimation?

I was studying parameter estimation from Sheldon Ross' probability and statistics book. Here the task of parameter estimation is described as follows: Is this task the same as density estimation in ...
tail's user avatar
  • 151
4 votes
3 answers
479 views

Perfect Prediction: Why Would We Ever Use a Statistical Model?

Dear statistics experts, I need your help with something that has bothered me for a while now. My problem revolves around perfect prediction and essentially boils down to: Why would we ever set up and ...
This_is_it's user avatar
1 vote
0 answers
39 views

validation and calibration of crop yield data using conditional inference trees

I am trying to validate and calibrate the conditional inference tree model using the crop yield data, and I started by splitting my dataset into training and test sets. After splitting, I had to ...
Jovin Vicent's user avatar
3 votes
0 answers
66 views

Looking for the Holy Grail of nonparametric regression

Unfortunately, to state the question precisely, I need some formal preliminaries. Let $d \in \mathbb{N}$. For each $d^* \in \{1,\dots,d\}$, define $\mathcal{M}_{d^*}$ to be the set of probability ...
Bob's user avatar
  • 193
1 vote
0 answers
159 views

which non parametric test to use for anomalous NN model outputs

Assume I have a bunch of trained NN models for classifying MNIST. All of them except one were trained on the same training set, while the remaining one was trained on a different training set (could have ...
Sam's user avatar
  • 403
1 vote
0 answers
436 views

AIPW and Cross-fitting (Stanford stat361)

I am reading the lecture notes (Stanford stat361: https://web.stanford.edu/~swager/stats361.pdf) written by Stefan Wager. On pages 23-24 the author states that dependent summands become independent after ...
Ivan.lee's user avatar
3 votes
1 answer
215 views

Random forest with nonnegative dependent variable

I have a modeling framework with an outcome that must necessarily be positive. In the training data, the outcome ranges from close to zero to much higher (approximately 0.05 to 100). Is there a way to ...
bob's user avatar
  • 725
5 votes
1 answer
2k views

Is it possible to use variational autoencoders with Non-Gaussian data?

I am dealing with two scenarios: 1) non-Gaussian data distribution and 2) non-stationary data. First, I am planning to use a variational autoencoder for modeling the probability distribution of the ...
Amhs_11's user avatar
  • 333
1 vote
0 answers
135 views

Extraction of modes from a multi-modal density function

I am trying to extract modes from a multi-modal density function and not just peaks. For example, in the two density functions below (images), I would like to extract the curves contained in the black ...
curiosus's user avatar
  • 333
3 votes
1 answer
337 views

What is the difference between sieve estimation and structural risk minimization?

I was wondering if you could help me out. I am quite confused about the difference between sieve estimators (Ulf Grenander) and structural risk minimization (SRM) (Vladimir Vapnik). Could anyone give ...
vshas's user avatar
  • 131
1 vote
0 answers
888 views

what are the main differences between parametric and non-parametric machine learning algorithms?

I am interested in parametric and non-parametric machine learning algorithms, their advantages and disadvantages and also their main differences regarding computational complexities. In particular I ...
john price's user avatar
2 votes
1 answer
39 views

Why might the functional form of a distribution be "inappropriate" for a particular application?

Working through Bishop's Pattern Recognition and Machine Learning (a great read so far!) and on page 67 he says: "One limitation of the parametric approach is that it assumes a specific ...
stochasticmrfox's user avatar
4 votes
0 answers
442 views

Derivation of k nearest neighbor classification rule

One way to derive the k-NN decision rule based on the k-NN density estimation goes as follows: given $k$ the number of neighbors, $k_i$ the number of neighbors of class $i$ in the bucket, $N$ the ...
diegobatt's user avatar
  • 426
1 vote
2 answers
451 views

Which Nonparametric Model to use for Small Time Series?

I have the following data: ...
caproki's user avatar
  • 129
1 vote
1 answer
258 views

Does a non-parametric model necessarily have zero bias?

For a parametric model like linear regression, the bias is often interpreted as "the parameters & architecture you chose are inappropriate for the shape of this dataset". For (one ...
kennysong's user avatar
  • 1,061
2 votes
1 answer
56 views

Quantifying importance of a parameter in neural networks' prediction

Say I'm given a neural network, parameterized by a $d$-dimensional vector $\theta$, and an input $x$. Given the prediction of this model $f_{\theta}(x)$, can I somehow quantify importance of each of $...
SpiderRico's user avatar
1 vote
1 answer
50 views

What are the implications of a nonparametric machine learning algorithm?

I've been looking into the advantages of using a Random Forest classifier and stumbled upon this: "random forests are non-parametric". Looking at the definition of what non-parametric statistics mean, ...
emilaz's user avatar
  • 111
4 votes
1 answer
1k views

Can someone explain why neural networks are highly parameterized?

I understand that neural networks by definition, are a parametric model. If I am correct, Parametric methods make an assumption about the functional form, or shape, of f. For a neural network, what ...
user277337's user avatar
1 vote
1 answer
353 views

How Parzen window density estimate $f_n$ converges to $f$

I am trying to understand how the Parzen window density estimate converges to the actual density function $f(x)$. [Actually, I am trying to learn machine learning on my own using available free resources. Please ...
Nascimento de Cos's user avatar
0 votes
0 answers
14 views

Doubt in kernel-based method - unit hypercube (Parzen window estimate)

I recently started studying pattern recognition on my own. Please clarify the following for me. https://books.google.co.in/books?id=T0S0BgAAQBAJ&pg=PA53&lpg=PA53&dq=hypercube+of+side+h&...
Nascimento de Cos's user avatar
1 vote
0 answers
50 views

Why does a mixture model with Gibbs sampling work?

I just have a question about why Gibbs sampling can correctly estimate parameters from a random initial value. That is to say, we can sample $z$ by: \begin{align} p(z_i=k \,|\, \cdot) &\...
yi li's user avatar
  • 131
2 votes
0 answers
327 views

Is kernelized linear regression parametric or nonparametric?

We know that for linear regression, we can predict: $$\hat{y} = w^Tx +b$$ Where $w$ is the parameter that minimizes the square loss. It is easy to prove that for the final solution using gradient ...
Ibrahim's user avatar
  • 21
1 vote
0 answers
23 views

Why is "consistent nearest neighbour" Non-parametric? [duplicate]

By definition, "consistent nearest neighbour" runs our usual KNN classifier, but instead of viewing $k$ as a hyper-parameter it always sets $k = \lceil \log(n) \rceil$. So far, I have looked up many references and ...
M.Hossein Rahimi's user avatar
3 votes
0 answers
205 views

Parametric vs non-parametric machine learning methods [duplicate]

I looked up many references and websites and researched how to determine whether a method is parametric or non-parametric. I came up with the definitions below. A parametric algorithm has a fixed ...
M.Hossein Rahimi's user avatar
11 votes
1 answer
514 views

Do Stochastic Processes such as the Gaussian Process/Dirichlet Process have densities? If not, how can Bayes rule be applied to them?

The Dirichlet Process and Gaussian Process are often referred to as "distributions over functions" or "distributions over distributions". In that case, can I meaningfully talk about the density of a ...
snickerdoodles777's user avatar
1 vote
1 answer
986 views

Estimating conditional probability with many samples

I am confused about the estimation of conditional probabilities. Suppose I want to predict a binary outcome variable $Y = 0,1$ given $n$ categorical features $X = (X_1, \ldots, X_n)$, i.e. to ...
user227451's user avatar
0 votes
1 answer
184 views

How to know which two hyperparameters are more important in SVM, KNN and MLP?

I am trying to limit myself to a maximum of two hyper-parameters that are important in KNN, SVM and ...
Kim Zac's user avatar
2 votes
0 answers
133 views

Smooth regression algorithms that produce zero training error

I am looking to fit three regression functions $f_1, f_2, f_3:\mathbb{R}^2 \to \mathbb{R}$. For example, let's say $X_1$ is time, $X_2$ is geographic latitude, $f_1$ is the temperature, $f_2$ is the ...
User191919's user avatar
1 vote
1 answer
262 views

Approximate a CDF

Suppose we have $n$ equations with an integral of the form $\int_0^{x_i} F(z)dz = c_i,\ i=1,\ldots,n$ where $F(y)=\mathbb{P}(X \le y)$ is an unknown cumulative distribution function of a non-negative ...
Kumar's user avatar
  • 719
1 vote
0 answers
36 views

Dimension reduction with semi-supervised embeddings

Is there a dimension reduction method (linear or non-linear) where the embeddings/projections of some of the input points are already known in advance and are taken into account during parameter ...
gkcn's user avatar
  • 113
1 vote
1 answer
2k views

Scikit Learn DBSCAN with Dice Coefficient

I am trying to cluster a high dimensional data set - Young People Survey Data https://www.kaggle.com/miroslavsabo/young-people-survey This is my first pass and wanted to give clustering the entire ...
jainp's user avatar
  • 43
44 votes
4 answers
69k views

What exactly is the difference between a parametric and non-parametric model?

I am confused about the definition of a non-parametric model after reading this link, Parametric vs Nonparametric Models, and the answer comments on another question of mine. Originally I thought "parametric vs ...
Haitao Du's user avatar
  • 37.2k
0 votes
0 answers
402 views

Non-parametric non-linear regression with deep learning

I have a situation where I have an increasing list of real numbers $\vec a$ of variable length (generally about 50 numbers but sometimes more). It turns out that these numbers uniquely correspond to ...
rhombidodecahedron's user avatar
8 votes
2 answers
2k views

Bayesian nonparametric answer to deep learning?

As I understand it, deep neural networks are performing "representation learning" by layering features together. This allows learning very high dimensional structures in the features. Of course, it's ...
cgreen's user avatar
  • 1,002
0 votes
0 answers
472 views

Relation between Nonparametric Statistics and Statistical Learning Theory

I used to hear some Statistics professor complaining about Machine Learning theories: "It is just Non-parametric Statistics". And, when I read Vapnik's book "Statistical Learning Theory", it seems he ...
user112758's user avatar
2 votes
1 answer
520 views

Why is a parametric classifier faster to train than a non-parametric one?

In the tutorial Parametric and Nonparametric Machine Learning Algorithms it says that parametric classifiers are faster than non-parametric classifiers. The reason that non-parametric classifiers are ...
AdiT's user avatar
  • 295
2 votes
0 answers
116 views

How to adjust ratings of N items by pairwise comparisons

I have been keeping a list of movies I've seen in a spreadsheet and assigning them numerical rankings that approximate how I feel about them. A few years ago I implemented a program to read in the ...
Pavel Komarov's user avatar
10 votes
1 answer
9k views

Why KNN and SVM with a gaussian are non-parametric models?

I was told that these two are non-parametric models. But I can't figure out why, especially for KNN. Could anyone answer my questions?
Hanamichi's user avatar
  • 653
8 votes
1 answer
1k views

Nonparametric nonlinear regression with prediction uncertainty (besides Gaussian Processes)

What are state-of-the-art alternatives to Gaussian Processes (GP) for nonparametric nonlinear regression with prediction uncertainty, when the size of the training set starts becoming prohibitive for ...
lacerbi's user avatar
  • 5,226
3 votes
0 answers
57 views

Family of flexible parametric mappings $f_\theta:(0,1) \rightarrow \mathbb{R}$?

For the purpose of reparameterizing a model (mostly with the goal of improving MCMC efficiency), I am looking for a family of flexible parametric mappings $f_\theta:(0,1) \rightarrow \mathbb{R}$ such ...
lacerbi's user avatar
  • 5,226
2 votes
1 answer
381 views

Learn a distribution from distributions on samples [closed]

There are many good ways to learn a distribution $p_X$ of an r.v. $X$ over $k$ symbols given many i.i.d. samples $X_1,\ldots, X_n$. The simplest is to use the sample relative frequencies $\hat{f}_X$ as ...
chausies's user avatar
  • 421
1 vote
2 answers
730 views

Machine Learning Procedure for Fractional/Proportional Data?

I am looking for some suggestions of machine learning procedures that work to predict fraction outcomes where the outcome variables $\in [0,1]$. Can you provide me with any suggestions? I thought ...
StatsStudent's user avatar
  • 11.5k
1 vote
1 answer
236 views

Kernel nonparametric regression

One of the methods for nonparametric regression is using kernels. My question is: what are the conditions on the kernel functions in this method? In other words, how can I decide if a given function ...
toroto's user avatar
  • 109
0 votes
1 answer
30 views

How to compute the unconditioned density in $1NN$ classifier?

Suppose I have $50$ training points $x_1$, $x_2,\ldots,x_{50}$ and they are distributed via bimodal Gaussian on the real line. Now, given a new point, for $1NN$, I am trying to find an interval around $x$ ...
JumpJump's user avatar
  • 210
53 votes
9 answers
3k views

Are all models useless? Is any exact model possible -- or useful?

This question has been festering in my mind for over a month. The February 2015 issue of Amstat News contains an article by Berkeley Professor Mark van der Laan that scolds people for using inexact ...
Russ Lenth's user avatar
  • 20.8k
1 vote
0 answers
124 views

What are some examples of applied machine learning problems that require using mixed models?

What are some examples of applied machine learning problems that require using mixed models? I was just introduced to the notion of mixed models. As I understand it, it is a combination of parametric ...
qazwsx's user avatar
  • 737
4 votes
1 answer
2k views

Friedman's test to identify best of multiple classifiers on multiple domains

I have several classifiers $f_i\ (i=1, \cdots, N)$ and calculated performance measures on multiple domains $(D)$ for each. Thus, there are $N \times D$ values. I want to find out (increasing ...
Chris's user avatar
  • 599