1 vote
0 answers

How to show $\sup_{x\in [a,b]}|f_n(x)-f(x)|=O_p(\sqrt{\frac{\log n}{nh}}+h^2)$ when the kernel $K(\cdot) $ is of bounded variation?

Consider the kernel estimate $f_n$ of a real univariate density defined by $$f_n(x)=\sum_{i=1}^{n}(nh)^{-1}K\left\{h^{-1}(x-X_i)\right\}$$ where $X_1,...,X_n$ are independent and identically ...
1 vote
0 answers

Why is histogram density estimation nonparametric?

My understanding of histogram density estimation: For $k$ predefined equal-width bins $(b_0, b_1], (b_1, b_2], ..., (b_{k-1}, b_k]$ and $n$ observations $x_1,...,x_n \in (b_0,b_k]$, we estimate ...
0 votes
0 answers

Expected value (and variance) of a Dirichlet Process

Suppose I have a measure $G$ that follows a Dirichlet Process, $$G \sim DP(H_0,\alpha)$$ where $H_0$ is some base measure. Is there a closed form solution for the expected value of $G$?
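A short worked sketch of the standard result, assuming $H_0$ is a probability measure and $\alpha > 0$ (the set $A$ is notation introduced here, not in the question). For any measurable set $A$, the defining property of the Dirichlet process gives $G(A) \sim \mathrm{Beta}\big(\alpha H_0(A),\ \alpha(1 - H_0(A))\big)$, so $$E[G(A)] = H_0(A), \qquad \operatorname{Var}[G(A)] = \frac{H_0(A)\,\big(1 - H_0(A)\big)}{\alpha + 1}.$$ Under these assumptions the mean of $G$ is the base measure, and larger $\alpha$ concentrates $G$ more tightly around it.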
5 votes
2 answers

Is density estimation the same as parameter estimation?

I was studying parameter estimation from Sheldon Ross' probability and statistics book. Here the task of parameter estimation is described as follows: Is this task the same as density estimation in ...
1 vote
0 answers

Bias of kernel density estimator of pdf $f$, where $f$ has bounded first derivative $f'$

Let's say the kernel density estimator is given by $$\hat f(x) = \frac{1}{nh_n} \sum_{i=1}^n K\left(\frac{X_i-x}{h_n}\right),$$ where $h_n \to 0$, $nh_n \to \infty$, $K$ a symmetric probability ...
0 votes
0 answers

Kernel Density Estimator: Misunderstanding in Taylor Series and the bias of KDE [duplicate]

Let's say the kernel density estimator is given by $\hat f(x) = \frac{1}{nh_n} \sum_{i=1}^n K(\frac{X_i-x}{h_n})$, where $h_n \to 0$, $nh_n \to \infty$, $K$ a symmetric probability distribution ...
0 votes
0 answers

How to prove symmetry of a Uniform kernel?

I am trying to prove that this kernel is valid: $$ K(x) = \frac{1}{2}I(-1 < x < 1) $$ So far I can show that it integrates to 1, but how do I prove $$K(x) = K(-x)?$$ Also, how do we show that $K(x) \ge 0$ for ...
1 vote
0 answers

Optimal rate of convergence of nonparametric density estimators

Suppose that $X_1, X_2, \dots, X_n$ forms an independent and identically distributed sample from some $d$-dimensional probability distribution with unknown probability density function $f$. Let $x$ be ...
1 vote
0 answers

Histogram vs. kernel in density estimation

Suppose we want to estimate a density $f(x)$ over the interval $[0, 1]$. Can a regular histogram (i.e. one with equal-sized bins) be viewed as some kind of kernel estimator?
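One way to explore this numerically is the minimal sketch below. It writes the regular histogram as $\frac{1}{n}\sum_i K(x, X_i)$ with $K(x, y) = k \cdot 1\{x \text{ and } y \text{ share a bin}\}$ and compares it with NumPy's histogram. The function name `histogram_density`, the bin count `k`, and the Beta-distributed sample are assumptions made for illustration.

```python
import numpy as np

def bin_index(t, k):
    """Index of the equal-width bin on [0, 1] containing t (1.0 goes to the last bin)."""
    t = np.asarray(t, dtype=float)
    return np.minimum((t * k).astype(int), k - 1)

def histogram_density(x, data, k):
    """Histogram estimate written as (1/n) * sum_i K(x, X_i),
    where K(x, y) = k if x and y fall in the same bin, else 0."""
    x_bins = np.atleast_1d(bin_index(x, k))
    data_bins = bin_index(data, k)
    same_bin = x_bins[:, None] == data_bins[None, :]  # shape (len(x), n)
    return k * same_bin.mean(axis=1)

rng = np.random.default_rng(0)
data = rng.beta(2.0, 5.0, size=500)  # illustrative sample supported on [0, 1]
k = 10
grid = (np.arange(k) + 0.5) / k      # bin midpoints

est = histogram_density(grid, data, k)
ref, _ = np.histogram(data, bins=k, range=(0.0, 1.0), density=True)
print(np.allclose(est, ref))
```

Here $K(x, y)$ depends on which bin $x$ lies in, not only on the difference $x - y$, so this is not a kernel estimator of the usual form $K\{(x - X_i)/h\}$ with one fixed kernel.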
1 vote
0 answers

Extraction of modes from a multi-modal density function

I am trying to extract modes from a multi-modal density function and not just peaks. For example, in the two density functions below (images), I would like to extract the curves contained in the black ...
1 vote
0 answers

Convex hull version of density estimation (or lines of constant density)

Background: So I had a thought, tried it out, and liked what it did. I'm sure someone else has done this. It feels very convenient. It also gives an interesting take on robust nonparametric density ...
0 votes
0 answers

Building a classifier using Parzen window

Considering the application of the Parzen window method to model a probability density function in a binary classification problem, and assume a training set where the 4 points {−5, −1, 1, 5} belong ...
2 votes
1 answer

Why might the functional form of a distribution be "inappropriate" for a particular application?

I am working through Bishop's Pattern Recognition and Machine Learning (a great read so far!), and on page 67 he says: "One limitation of the parametric approach is that it assumes a specific ...
2 votes
0 answers

Unexpected zero on posterior density of Dirichlet process mixture

I was reading this notebook from the PyMC3 documentation about Dirichlet Process Mixtures and, on the last figure, the estimated density reaches almost zero for a particular value, despite the ...
4 votes
0 answers

Derivation of k nearest neighbor classification rule

One way to derive the k-NN decision rule based on the k-NN density estimation goes as follows: given $k$ the number of neighbors, $k_i$ the number of neighbors of class $i$ in the bucket, $N$ the ...
0 votes
0 answers

Is a non-parametric density estimation required for a bimodal distribution?

How to approach the following two cases is clear; I am mentioning them to set up my question. (Case 1): For data that appears to follow a Gaussian distribution, we can assume the distribution is Gaussian ...
1 vote
1 answer

How Parzen window density estimate $f_n$ converges to $f$

I am trying to understand how the Parzen window density estimate converges to the actual density function $f(x)$. [Actually, I am trying to learn machine learning on my own using freely available resources. Please ...
3 votes
1 answer

Usefulness of MISE

I'm currently in a class on nonparametric smoothing, and, while talking about density estimation in general, the professor introduced the notion of MISE (mean integrated square error): $\text{MISE}\...
4 votes
1 answer

Is it appropriate to examine the density plot for time series data?

Usually we use a time plot to examine the behaviour of time series data because it reveals the chronological characteristics. Does it make sense to look at the data distribution using some non-...
2 votes
1 answer

Convergence of kernel density estimate as the sample size grows

Let $X\sim\text{Normal}(0,1)$ and let $f_X$ be its probability density function. I conducted some numerical experiments in the software Mathematica to estimate $f_X$ via a kernel method. Let $\hat{f}...
1 vote
0 answers

What is the resulting distribution of a data set that was originally normally distributed but has been quantized and had all negative values removed?

I am trying to benchmark a seasonal forecasting model and calculate not just the point forecasts but the forecast densities from the model. To do this, I generated a simulated data set in the ...
5 votes
1 answer

Expected value and variance of KDE

I need to find the expected value and variance of KDE given that $$(i) E[u] = 0 \to \int u\phi(u)du=0\\ (ii)V[u] = \sigma^2 \to \int u^2\phi(u)du=\sigma^2$$ where $\phi$ is the kernel function. I've ...
1 vote
0 answers

Difficulties with orthogonal density estimation

I am working on an implementation of an orthogonal density estimator, using the basis $$ \psi_0(t) = 1, \quad \psi_{2j}(t) = \sqrt{2}\text{cos}(2\pi j t), \quad \psi_{2j+1}(t) = \sqrt{2}\text{sin}(2\...
4 votes
1 answer

Properties of Kernel Density Estimators

Let $X \in \mathbb{R}$ be a real-valued random variable with theoretical probability density function (pdf) $f(x)$ and corresponding cumulative distribution function (cdf) $F(x)$. Let $X_1, X_2,...
1 vote
1 answer

Credibility evaluation - how to model conditional continuous density from multiple variables of various types?

I recently got a dataset of 37000 households with declared income and a few dozen other variables of various types: continuous, discrete, binary. The task is to automatically (unsupervised) ...
2 votes
2 answers

Dvoretzky-Kiefer-Wolfowitz Vs. KDE fractional convergence

The DKW bound says, roughly and under very general assumptions, that the empirical CDF of $n$ iid samples of a random variable $X$ converges to the exact CDF of $X$ exponentially with the number of ...
1 vote
2 answers

Closeness of 2-parametric discrete distributions when first 2 moments are matching

Let $\mathcal{D}$ be a particular 2-parameter uni-variate discrete distribution family, and let $D(\theta_1, \theta_2) \in \mathcal{D}$ be one particular distribution from this family, where $\theta_i ...
2 votes
1 answer

What are some of the common techniques for density estimation?

I'm trying to estimate the probability density function of a real random variable given its iid realizations. What are some of the standard techniques to do this? One method I have heard of is the ...
4 votes
2 answers

Leave one out cross validation in kernel density estimation

I am taking a look at : Where they define the following loss function for kernel density estimates $$J(h) = \int \hat{f_n}^2(x)dx -2\int\hat{f_n}(x)...
9 votes
2 answers

Estimating the gradient of log density given samples

I am interested in estimating the gradient of the log probability distribution $\nabla\log p(x)$ when $p(x)$ is not analytically available but is only accessed via samples $x_i \sim p(x)$. There ...
1 vote
0 answers

Optimal bandwidth selection in conditional density estimation

Consider the situation that we are estimating a $d$-dimensional density (with suitable regularity conditions) using kernel density estimation, [Method1,conditional density estimation] We can proceed ...
2 votes
1 answer

Scaling up the bandwidth for kernel density estimation

Suppose I have $(\mathbf{X}_1, \cdots, \mathbf{X}_n)$ from a multivariate distribution $f$. The multivariate KDE is \begin{align*} \widehat{f}_\mathbf{H}(\mathbf{x}) = n^{-1}\sum_{i=1}^{n}K_\mathbf{H}(...
1 vote
0 answers

Nonparametric density estimation, individual probabilities

Consider the problem of doing nonparametric density estimation using kernel density estimator in the common form $k(\frac{\textbf{x} - \mathbf{x_{j}}}{h})$, $k(\textbf{u}) = \begin{cases} 1 & \...
0 votes
0 answers

Density estimation for points regularly spaced on a grid? Infer spacing between pdf peaks?

Due to a fundamental characteristic of the data, points are clustered together on a 1-D grid-like structure with equal spacing. Plotting these points in a histogram shows a pdf with several ...
9 votes
2 answers

Density estimation for large dataset

I have a unidimensional data set with more than 1000000 observations. Assuming that those observations are independent realizations of the same random variable, I need to estimate the underlying ...
2 votes
1 answer

Learn a distribution from distributions on samples [closed]

There are many good ways to learn a distribution $p_X$ of an r.v. $X$ over $k$ symbols given many i.i.d. samples $X_1,\ldots, X_n$. The simplest is to use the sample relative frequencies $\hat{f}_X$ as ...
3 votes
3 answers

Literature on nonparametric density estimation

I am about to write my bachelor thesis about non-parametric density estimation, especially kernel density estimators and their application in classification. As I am quite new to looking for academic ...
16 votes
3 answers

Where is density estimation useful?

After going through some slightly terse mathematics, I think I have a slight intuition of kernel density estimation. But I am also aware that estimating multivariate density for more than three ...
4 votes
3 answers

Fast multivariate unimodal density estimator

I have a sample $\boldsymbol{x}_i$ for $i$ in $1,\dots, n$, from a $d$ dimensional density $f(\boldsymbol{x})$ and I would like to estimate this unknown density. In addition I know that $f(\boldsymbol{...
