All Questions
Tagged with nonparametric density-estimation
39
questions
1
vote
0
answers
40
views
How to show $\sup_{x\in [a,b]}|f_n(x)-f(x)|=O_p(\sqrt{\frac{\log n}{nh}}+h^2)$ when the kernel $K(\cdot)$ is of bounded variation?
Consider the kernel estimate $f_n$ of a real univariate density defined by $$f_n(x)=\sum_{i=1}^{n}(nh)^{-1}K\left\{h^{-1}(x-X_i)\right\}$$
where $X_1,...,X_n$ are independent and identically ...
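The estimator in this excerpt can be evaluated numerically. The following is a minimal sketch, assuming a Gaussian kernel for $K$ (the excerpt only requires bounded variation) and an arbitrary bandwidth `h=0.4`; the names `gaussian_kernel` and `kde` are illustrative and do not come from the question.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard normal density, used here as an assumed choice of K."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

def kde(x, samples, h, kernel=gaussian_kernel):
    """f_n(x) = sum_i (n h)^{-1} K((x - X_i) / h), evaluated at each x."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    samples = np.asarray(samples, dtype=float)
    # One row per evaluation point, one column per observation.
    u = (x[:, None] - samples[None, :]) / h
    return kernel(u).sum(axis=1) / (samples.size * h)

# Simulated data, purely for illustration.
rng = np.random.default_rng(0)
samples = rng.standard_normal(500)
grid = np.linspace(-6.0, 6.0, 1201)
estimate = kde(grid, samples, h=0.4)

# Riemann-sum check that the estimate integrates to roughly one.
area = float(np.sum(estimate) * (grid[1] - grid[0]))
```

The broadcasting step builds an array of size (grid points) x (observations), which is fine for small samples but would need chunking or binning for very large ones.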
1
vote
0
answers
43
views
Why is histogram density estimation nonparametric?
My understanding of histogram density estimation:
For $k$ predefined equal-width bins $(b_0, b_1], (b_1, b_2], ..., (b_{k-1}, b_k]$ and $n$ observations $x_1,...,x_n \in (b_0,b_k]$, we estimate ...
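For reference, the usual histogram density estimate assigns to each bin the value (count in bin) / ($n$ × bin width). The sketch below assumes that standard form, which may differ in detail from what the truncated excerpt goes on to state; `histogram_density` is a hypothetical helper name.

```python
import numpy as np

def histogram_density(x, data, edges):
    """Piecewise-constant density: (count in bin) / (n * bin width)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    data = np.asarray(data, dtype=float)
    edges = np.asarray(edges, dtype=float)
    k = edges.size - 1  # number of bins

    # right=True yields half-open bins (b_{j-1}, b_j], matching the excerpt.
    data_idx = np.digitize(data, edges, right=True)
    counts = np.bincount(data_idx, minlength=k + 2)[1:k + 1]
    heights = counts / (data.size * np.diff(edges))

    # Points outside (b_0, b_k] get density zero.
    x_idx = np.digitize(x, edges, right=True)
    inside = (x_idx >= 1) & (x_idx <= k)
    out = np.zeros_like(x)
    out[inside] = heights[x_idx[inside] - 1]
    return out

# Toy data, chosen only to illustrate the call.
values = histogram_density([0.25, 0.75, 1.5],
                           data=[0.1, 0.2, 0.3, 0.9],
                           edges=[0.0, 0.5, 1.0])
```

The only quantities fixed in advance are the bin edges; the bin heights are determined entirely by the data.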
0
votes
0
answers
85
views
Expected value (and variance) of a Dirichlet Process
Suppose I have a measure $G$ that follows a Dirichlet Process,
$$G \sim DP(H_0,\alpha)$$
where $H_0$ is some base measure. Is there a closed form solution for the expected value of $G$?
5
votes
2
answers
549
views
Is density estimation the same as parameter estimation?
I was studying parameter estimation from Sheldon Ross' probability and statistics book. Here the task of parameter estimation is described as follows:
Is this task the same as density estimation in ...
1
vote
0
answers
251
views
Bias of kernel density estimator of pdf $f$, where $f$ has bounded first derivative $f'$
Let's say the kernel density estimator is given by
$$\hat f(x) = \frac{1}{nh_n} \sum_{i=1}^n K\left(\frac{X_i-x}{h_n}\right),$$ where $h_n \to 0$, $nh_n \to \infty$, $K$ a symmetric probability ...
0
votes
0
answers
40
views
Kernel Density Estimator: Misunderstanding in Taylor Series and the bias of KDE [duplicate]
Let's say the kernel density estimator is given by
$\hat f(x) = \frac{1}{nh_n} \sum_{i=1}^n K(\frac{X_i-x}{h_n})$, where $h_n \to 0$, $nh_n \to \infty$, $K$ a symmetric probability distribution ...
0
votes
0
answers
50
views
How to prove symmetry of a Uniform kernel?
I am trying to prove this kernel is valid,
$$
K(x) = \frac{1}{2}I(-1 < x < 1)
$$
So far I can show that it integrates to 1, but how do I prove $$K(x) = K(-x)$$
Also, how do we show that $K(x) \ge 0$ for ...
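Since $I(-1 < x < 1) = I(|x| < 1)$ and $|-x| = |x|$, the uniform kernel $K(x) = \frac{1}{2}I(-1 < x < 1)$ takes the same value at $x$ and $-x$, and it only takes the values $0$ and $\frac{1}{2}$. A small numerical sanity check of the three properties, offered as a sketch rather than a proof (the grid and tolerance are arbitrary choices):

```python
import numpy as np

def uniform_kernel(x):
    """K(x) = 0.5 * I(-1 < x < 1)."""
    x = np.asarray(x, dtype=float)
    return 0.5 * ((x > -1.0) & (x < 1.0))

grid = np.linspace(-3.0, 3.0, 6001)
step = grid[1] - grid[0]

# Symmetry: K(x) and K(-x) agree at every grid point.
symmetric = bool(np.array_equal(uniform_kernel(grid), uniform_kernel(-grid)))

# Non-negativity: the kernel only takes the values 0 and 0.5.
nonnegative = bool(np.all(uniform_kernel(grid) >= 0.0))

# Riemann-sum approximation of the integral, which should be close to 1.
integral = float(np.sum(uniform_kernel(grid)) * step)
```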
1
vote
0
answers
102
views
Optimal rate of convergence of nonparametric density estimators
Suppose that $X_1, X_2, \dots, X_n$ forms an independent and identically distributed sample from some $d$-dimensional probability distribution with unknown probability density function $f$. Let $x$ be ...
1
vote
0
answers
274
views
Histogram vs. kernel in density estimation
Assume we have a problem of estimation of a density $f(x)$ over an interval $[0, 1]$. Can a regular histogram (i.e. with equal-sized bins) be viewed as some kind of a kernel?
1
vote
0
answers
135
views
Extraction of modes from a multi-modal density function
I am trying to extract modes from a multi-modal density function and not just peaks. For example, in the two density functions below (images), I would like to extract the curves contained in the black ...
1
vote
0
answers
107
views
Convex hull version of density estimation (or lines of constant density)
Background:
So I had a thought, tried it out, and liked what it did. I'm sure someone else has done this. It feels very convenient. It also gives an interesting take on robust nonparametric density ...
0
votes
0
answers
289
views
Building a classifier using Parzen window
Consider the application of the Parzen window method to model a probability density function in a binary classification problem, and assume a training set where the 4 points {−5, −1, 1, 5} belong ...
2
votes
1
answer
39
views
Why might the functional form of a distribution be "inappropriate" for a particular application?
Working through Bishop's Pattern Recognition and Machine Learning (a great read so far!) and on page 67 he says:
"One limitation of the parametric approach is that it assumes a specific ...
2
votes
0
answers
41
views
Unexpected zero on posterior density of Dirichlet process mixture
I was reading this notebook from the PyMC3 documentation about Dirichlet Process Mixtures and, on the last figure, the estimated density reaches almost zero for a particular value, despite the ...
4
votes
0
answers
442
views
Derivation of k nearest neighbor classification rule
One way to derive the k-NN decision rule based on the k-NN density estimation goes as follows:
given $k$ the number of neighbors, $k_i$ the number of neighbors of class $i$ in the bucket, $N$ the ...
0
votes
0
answers
337
views
Is a non-parametric density estimation required for a bimodal distribution?
How to approach the following two cases is clear; I am mentioning them to set up my question.
(Case 1): For data that appears to follow a Gaussian distribution, we can assume the distribution is Gaussian ...
1
vote
1
answer
353
views
How the Parzen window density estimate $f_n$ converges to $f$
I am trying to understand how the Parzen window density estimate converges to the actual density function $f(x)$. [Actually, I am trying to learn machine learning on my own using available free resources. Please ...
3
votes
1
answer
100
views
Usefulness of MISE
I'm currently in a class on nonparametric smoothing, and, while talking about density estimation in general, the professor introduced the notion of MISE (mean integrated square error):
$\text{MISE}\...
4
votes
1
answer
2k
views
Is it appropriate to examine the density plot for time series data?
Usually we use a time plot to examine the behaviour of time series data because it reveals the chronological structure. Does it make sense to look at the data distribution using some non-...
2
votes
1
answer
839
views
Convergence of kernel density estimate as the sample size grows
Let $X\sim\text{Normal}(0,1)$ and let $f_X$ be its probability density function. I conducted some numerical experiments in the software Mathematica to estimate $f_X$ via a kernel method. Let $\hat{f}...
1
vote
0
answers
131
views
What is the resulting distribution of a data set that was originally normally distributed but has been quantized and had all negative values removed?
I am trying to benchmark a seasonal forecasting model and calculate not just the point forecasts but the forecast densities from the model.
To do this, I generated a simulated data set in the ...
5
votes
1
answer
698
views
Expected value and variance of KDE
I need to find the expected value and variance of the KDE given that
$$\begin{aligned}
\text{(i)}\quad & E[u] = 0 \to \int u\,\phi(u)\,du = 0,\\
\text{(ii)}\quad & V[u] = \sigma^2 \to \int u^2\phi(u)\,du = \sigma^2,
\end{aligned}$$
where $\phi$ is the kernel function.
I've ...
1
vote
0
answers
42
views
Difficulties with orthogonal density estimation
I am working on an implementation of an orthogonal density estimator, using the basis
$$ \psi_0(t) = 1, \quad \psi_{2j}(t) = \sqrt{2}\text{cos}(2\pi j t), \quad \psi_{2j+1}(t) = \sqrt{2}\text{sin}(2\...
4
votes
1
answer
1k
views
Properties of Kernel Density Estimators
Given
Let $X \in \mathbb{R}$ be a real-valued random variable with theoretical probability density function (pdf) $f(x)$ and corresponding cumulative distribution function (cdf) $F(x)$. Let $X_1, X_2,...
1
vote
1
answer
160
views
Credibility evaluation - how to model conditional continuous density from multiple variables of various types?
I recently got a dataset of 37000 households with declared income and a few dozen other variables of various types: continuous, discrete, binary.
The task is to automatically (unsupervised) ...
2
votes
2
answers
159
views
Dvoretzky-Kiefer-Wolfowitz Vs. KDE fractional convergence
The DKW bound says, roughly and under very general assumptions, that the empirical CDF of $n$ iid samples of a random variable $X$ converges to the exact CDF of $X$ exponentially with the number of ...
1
vote
2
answers
173
views
Closeness of 2-parametric discrete distributions when first 2 moments are matching
Let $\mathcal{D}$ be a particular 2-parameter univariate discrete distribution family, and let $D(\theta_1, \theta_2) \in \mathcal{D}$ be one particular distribution from this family, where $\theta_i ...
2
votes
1
answer
183
views
What are some of the common techniques for density estimation?
I'm trying to estimate the probability density function of a real random variable given its iid realizations. What are some of the standard techniques to do this?
One method I have heard of is the ...
4
votes
2
answers
4k
views
Leave one out cross validation in kernel density estimation
I am taking a look at:
http://pages.cs.wisc.edu/~jerryzhu/cs731/kde.pdf
where they define the following loss function for kernel density estimates
$$J(h) = \int \hat{f_n}^2(x)dx -2\int\hat{f_n}(x)...
9
votes
2
answers
3k
views
Estimating the gradient of log density given samples
I am interested in estimating the gradient of the log probability distribution $\nabla\log p(x)$ when $p(x)$ is not analytically available but is only accessed via samples $x_i \sim p(x)$.
There ...
1
vote
0
answers
190
views
Optimal bandwidth selection in conditional density estimation
Consider the situation where we are estimating a $d$-dimensional density (with suitable regularity conditions) using kernel density estimation.
[Method 1, conditional density estimation] We can proceed ...
2
votes
1
answer
840
views
Scaling up the bandwidth for kernel density estimation
Suppose I have $(\mathbf{X}_1, \cdots, \mathbf{X}_n)$ from a multivariate distribution $f$. The multivariate KDE is
\begin{align*}
\widehat{f}_\mathbf{H}(\mathbf{x}) = n^{-1}\sum_{i=1}^{n}K_\mathbf{H}(...
1
vote
0
answers
53
views
Nonparametric density estimation, individual probabilities
Consider the problem of doing nonparametric density estimation using a kernel density estimator in the common form
$k(\frac{\textbf{x} - \mathbf{x_{j}}}{h})$,
$k(\textbf{u}) = \begin{cases}
1 & \...
0
votes
0
answers
33
views
Density estimation for points regularly spaced on a grid? Infer spacing between pdf peaks?
Due to a fundamental characteristic of the data, points are clustered together on a 1-D grid-like structure with equal spacing.
Plotting these points in a histogram shows a pdf with several ...
9
votes
2
answers
6k
views
Density estimation for large dataset
I have a unidimensional data set with more than 1000000 observations.
Assuming that those observations are independent realizations of the same random variable, I need to estimate the underlying ...
2
votes
1
answer
381
views
Learn a distribution from distributions on samples [closed]
There are many good ways to learn a distribution $p_X$ of an r.v. $X$ over $k$ symbols given many i.i.d. samples $X_1,\ldots, X_n$. The simplest is to use the sample relative frequencies $\hat{f}_X$ as ...
3
votes
3
answers
223
views
Literature on nonparametric density estimation
I am about to write my bachelor thesis about non-parametric density estimation, especially kernel density estimators and their application in classification. As I am quite new to looking for academic ...
16
votes
3
answers
5k
views
Where is density estimation useful?
After going through some slightly terse mathematics, I think I have a slight intuition of kernel density estimation. But I am also aware that estimating multivariate density for more than three ...
4
votes
3
answers
251
views
Fast multivariate unimodal density estimator
I have a sample $\boldsymbol{x}_i$, for $i = 1,\dots, n$, from a $d$-dimensional density $f(\boldsymbol{x})$ and I would like to estimate this unknown density. In addition, I know that $f(\boldsymbol{...