This is a question from a mathematical statistics textbook used in a first, introductory course in mathematical statistics for undergraduate students. The exercise follows the chapter on nonparametric inference. Part 1 is quite straightforward, but I am stuck on parts 2–6. I've looked at pages 344–346 of van der Vaart's book Asymptotic Statistics, but that seems to be the course book of a more advanced course. An attempt at a solution is given below. Any help is appreciated.
Exercise
Suppose $x_1, ..., x_n$ are independent and identically distributed (i.i.d.) observations of a random variable $X$ with unknown distribution function $F$ and probability density function $f\in C^m$, for some $m>1$ fixed. Let $$f_n(t)=\frac{1}{n}\sum_{i=1}^n \frac{1}{h}k\left(\frac{t-x_i}{h}\right)$$ be a kernel estimator of $f$, with $k\in C^{m+1}$ a given fixed function such that $k\geq 0$, $\int_{\mathbb{R}} k(u)\mathrm{d}u=1$, $\mathrm{supp} (k)=[-1,1]$ and bandwidth $h=h(u)$ (for the time being unspecified).
- Show that $\mathbb{E}[f_n(t)]=\int_{\mathbb{R}} k(u) f(t-hu)\mathrm{d}u$.
- Make a series expansion of $f$ around $t$ in terms of $hu$ in the expression for $\mathbb{E}[f_n(t)]$. Suppose that $k$ satisfies $\int_{\mathbb{R}} k(u)\mathrm{d}u=1$, $\int_{\mathbb{R}} k(u)u^l\mathrm{d}u=0$ for all $1<l<m$ and $\int_{\mathbb{R}} k(u)u^m\mathrm{d}u<\infty$. Determine the bias $\mathbb{E}[f_n(t)]-f(t)$ as a function of $h$.
- Suppose that $\mathrm{Var}[k(X_1)]<\infty$ and determine $\mathrm{Var}[f_n(t)]$ as a function of $h$.
- Determine the mean square error $\mathrm{mse}[f_n(t)]$ from 2 and 3 as a function of $h$.
- For what value of $h$, as a function of $n$, is $\mathrm{mse}[f_n(t)]$ smallest?
- For the value of $h=h(n)$ obtained from 5, how fast does $\mathrm{mse}[f_n(t)]$ converge to 0, when $n$ converges to $\infty$?
Note: $h=h(u)$ and $1<l<m$ are most likely typos for $h=h(n)$ and $1\leq l<m$, respectively.
Attempt
- By linearity of the expectation, identical distribution of $x_1,...,x_n$, the law of the unconscious statistician and the change of variables $u=(t-x)/h$, \begin{align} \mathbb{E}[f_n(t)]&=\frac{1}{n}\sum_{i=1}^n \mathbb{E}\left[\frac{1}{h}k\left(\frac{t-x_i}{h}\right)\right]\\ &=\mathbb{E}\left[\frac{1}{h}k\left(\frac{t-x}{h}\right)\right]\\ &=\int_{\mathbb{R}}\frac{1}{h}k\left(\frac{t-x}{h}\right)f(x)\mathrm{d}x\\ &=\int_{\mathbb{R}}\frac{1}{h}k(u)f(t-hu)h\mathrm{d}u\\ &=\int_{\mathbb{R}}k(u)f(t-hu)\mathrm{d}u. \end{align}
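As a numerical sanity check of this identity, here is a short simulation sketch. The concrete choices (standard normal $f$, Epanechnikov kernel $k$, $t=0.5$, $h=0.3$) are hypothetical, picked only because this $k$ satisfies $k\geq 0$, $\int k = 1$ and $\mathrm{supp}(k)=[-1,1]$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical concrete choices: f standard normal, k the Epanechnikov kernel
f = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
k = lambda u: np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)  # supp(k) = [-1, 1]

t, h = 0.5, 0.3

# Left-hand side: Monte Carlo estimate of E[(1/h) k((t - X)/h)] with X ~ f
x = rng.standard_normal(200_000)
lhs = np.mean(k((t - x) / h) / h)

# Right-hand side: numerical integral of k(u) f(t - h u) over supp(k) = [-1, 1]
u = np.linspace(-1.0, 1.0, 20_001)
rhs = np.sum(k(u) * f(t - h * u)) * (u[1] - u[0])

print(lhs, rhs)  # the two numbers should agree up to Monte Carlo error
```

Both sides compute the same quantity, so they should agree up to Monte Carlo and quadrature error.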
- From $f\in C^m$, it follows that $$f(t-hu)=\sum_{l=0}^m \frac{f^{(l)}(t)}{l!} (-hu)^l+o((hu)^m).$$ Then from part 1 and linearity of integration, \begin{align} \mathbb{E}[f_n(t)]&=\int_{\mathbb{R}}k(u)\left(\sum_{l=0}^m \frac{f^{(l)}(t)}{l!} (-hu)^l+o((hu)^m)\right)\mathrm{d}u \\ &=\sum_{l=0}^m\int_{\mathbb{R}}k(u)\frac{f^{(l)}(t)(-hu)^l}{l!}\mathrm{d}u+\int_{\mathbb{R}}k(u)o((hu)^m)\mathrm{d}u. \label{remain} \end{align} From the given conditions on $k$, the $l=0$ term reads \begin{equation} \int_{\mathbb{R}} k(u)f(t)\mathrm{d}u=f(t)\int_{\mathbb{R}} k(u) \mathrm{d}u=f(t). \end{equation} The $1\leq l<m$ terms are $$\int_{\mathbb{R}} k(u)\frac{f^{(l)}(t)}{l!} (-hu)^l\mathrm{d}u=\frac{f^{(l)}(t)(-h)^l}{l!}\int_{\mathbb{R}} k(u)u^l\mathrm{d}u=0.$$ Finally, the $l=m$ term is $$ \frac{f^{(m)}(t)(-h)^m}{m!}\int_{\mathbb{R}} k(u)u^m\mathrm{d}u<\infty.$$ The remainder term is given in Misius's answer (+1). Putting it all together: $$\mathbb{E}[f_n(t)] = f(t) + \frac{f^{(m)}(t)(-h)^m}{m!} \int_{\mathbb{R}}k(u)u^m \mathrm{d}u + o(h^m),$$ and thus $$\mathbb{E}[f_n(t)]-f(t)=\frac{f^{(m)}(t)(-h)^m}{m!} \int_{\mathbb{R}}k(u)u^m \mathrm{d}u + o(h^m)=A(t)h^m+o(h^m),$$ where $A(t)=\frac{f^{(m)}(t)(-1)^m}{m!} \int_{\mathbb{R}}k(u)u^m \mathrm{d}u<\infty.$
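The bias order can also be seen concretely by checking that $(\mathbb{E}[f_n(t)]-f(t))/h^m$ stabilises at $A(t)$ as $h\to 0$. A sketch for the hypothetical case $m=2$ (standard normal $f$, Epanechnikov kernel, for which $\int_{\mathbb{R}} u\,k(u)\,\mathrm{d}u=0$ and $\int_{\mathbb{R}} u^2k(u)\,\mathrm{d}u=1/5$):

```python
import numpy as np

# Hypothetical concrete case with m = 2: f standard normal, k Epanechnikov
f = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
fpp = lambda x: (x**2 - 1) * f(x)              # f''(t) for the standard normal
k = lambda u: np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

t = 0.5
u = np.linspace(-1.0, 1.0, 100_001)
du = u[1] - u[0]

mu2 = np.sum(k(u) * u**2) * du                 # ∫ k(u) u² du = 1/5 here
A = fpp(t) * mu2 / 2                           # A(t) with m = 2, since (-1)² = 1

for h in (0.2, 0.1, 0.05):
    bias = np.sum(k(u) * f(t - h * u)) * du - f(t)   # E[f_n(t)] - f(t), via part 1
    print(h, bias / h**2)                      # should approach A as h shrinks
print(A)
```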
- See Misius's answer; the result used in part 4 below is $$\mathrm{Var}[f_n(t)]=\frac{f(t)}{nh}\int_{\mathbb{R}}k^2(u)\mathrm{d}u+o\left(\frac{1}{nh}\right).$$
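The leading-order variance used in part 4, $\mathrm{Var}[f_n(t)]=\frac{f(t)}{nh}\int_{\mathbb{R}}k^2(u)\mathrm{d}u+o(1/(nh))$, can be checked against the exact expression $$\mathrm{Var}[f_n(t)]=\frac{1}{n}\left[\frac{1}{h}\int_{\mathbb{R}}k^2(u)f(t-hu)\mathrm{d}u-\left(\int_{\mathbb{R}}k(u)f(t-hu)\mathrm{d}u\right)^2\right],$$ which follows from $\mathrm{Var}=\mathbb{E}[\cdot^2]-\mathbb{E}[\cdot]^2$ and the same change of variables as in part 1. A sketch with hypothetical choices (standard normal $f$, Epanechnikov $k$):

```python
import numpy as np

# Compare the exact Var[f_n(t)] with the asymptotic f(t)/(n h) ∫k² du
# (hypothetical choices: f standard normal, k Epanechnikov)
f = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
k = lambda u: np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

t, n = 0.5, 100_000
u = np.linspace(-1.0, 1.0, 100_001)
du = u[1] - u[0]
Rk = np.sum(k(u) ** 2) * du                    # ∫ k²(u) du = 3/5 here

for h in (0.2, 0.1, 0.05):
    m1 = np.sum(k(u) * f(t - h * u)) * du      # E[f_n(t)], from part 1
    m2 = np.sum(k(u) ** 2 * f(t - h * u)) * du / h
    var_exact = (m2 - m1**2) / n
    var_asym = f(t) * Rk / (n * h)
    print(h, var_exact / var_asym)             # ratio → 1 as h → 0
```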
\begin{align} \mathrm{mse}[f_n(t)]&=\mathrm{Var}[f_n(t)]+\mathrm{Bias}^2[f_n(t)] \\ &=\left(\frac{f(t)}{nh}\int_{\mathbb{R}}k^2(u)\mathrm{d}u+o\left(\frac{1}{nh}\right)\right)+ \left(A(t)h^m+o(h^m)\right)^2 \\ &=\left(\frac{f(t)}{nh}\int_{\mathbb{R}}k^2(u)\mathrm{d}u+o\left(\frac{1}{nh}\right)\right)+\left(A(t)^2h^{2m}+2A(t)h^mo(h^m)+o(h^{2m})\right) \\ &=\left(\frac{f(t)}{nh}\int_{\mathbb{R}}k^2(u)\mathrm{d}u+o\left(\frac{1}{nh}\right)\right)+\left(A(t)^2h^{2m}+o(h^{2m})+o(h^{2m})\right)\\ &=\left(\frac{f(t)}{nh}\int_{\mathbb{R}}k^2(u)\mathrm{d}u+o\left(\frac{1}{nh}\right)\right)+\left(A(t)^2h^{2m}+o(h^{2m})\right)\\ &\approx \frac{f(t)}{nh}\int_{\mathbb{R}}k^2(u)\mathrm{d}u+A(t)^2h^{2m}. \end{align}
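The approximation can also be checked by direct simulation: draw many samples, compute $f_n(t)$ for each, and compare the empirical mean squared error with the asymptotic formula. A sketch for the hypothetical $m=2$ case (standard normal $f$, Epanechnikov kernel, for which $\int k^2=3/5$ and $\int u^2k=1/5$):

```python
import numpy as np

rng = np.random.default_rng(1)

# Monte Carlo check of mse[f_n(t)] ≈ f(t)∫k²/(n h) + A(t)² h^{2m} for m = 2
# (hypothetical choices: f standard normal, k Epanechnikov)
f = lambda x: np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)
k = lambda u: np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

t, n, h, reps = 0.5, 5_000, 0.1, 800
Rk, mu2 = 3 / 5, 1 / 5                     # ∫k²(u) du and ∫u²k(u) du for this kernel
A = (t**2 - 1) * f(t) * mu2 / 2            # A(t) = f''(t) μ₂ / 2 for m = 2

x = rng.standard_normal((reps, n))         # reps independent samples of size n
fn = np.mean(k((t - x) / h) / h, axis=1)   # f_n(t) for each replication

mse_emp = np.mean((fn - f(t)) ** 2)
mse_asym = f(t) * Rk / (n * h) + A**2 * h**4   # h^{2m} with m = 2
print(mse_emp, mse_asym)                   # same order, ratio close to 1
```

The agreement is only up to the $o$-terms and Monte Carlo noise, so the two numbers match in order of magnitude rather than exactly.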
From the approximation obtained in part 4, it follows that $\mathrm{mse}[f_n(t)](h)$ has an absolute minimum for $h\in(0,\infty)$, since $\mathrm{mse}[f_n(t)](h)\to\infty$ for $h\to 0$ and $h\to \infty$. The absolute minimum is found by differentiating $\mathrm{mse}[f_n(t)](h)$ and solving for $h$ when the derivative equals $0$, that is \begin{equation} h=\left(\frac{f(t)\int_{\mathbb{R}}k^2(u)\mathrm{d}u}{A^2(t)2mn}\right)^{1/(2m+1)}. \end{equation}
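As a sanity check, the closed-form minimiser can be compared with a grid search over the approximate $\mathrm{mse}(h)=C_1/(nh)+A^2h^{2m}$; the constants below are hypothetical stand-ins for $C_1=f(t)\int_{\mathbb{R}}k^2(u)\mathrm{d}u$ and $A=A(t)$:

```python
import numpy as np

# Verify the closed-form minimiser of mse(h) = C1/(n h) + A² h^{2m} by grid search
m, n = 2, 10_000
C1, A = 0.21, -0.026                        # hypothetical stand-in constants

mse = lambda h: C1 / (n * h) + A**2 * h ** (2 * m)
h_star = (C1 / (A**2 * 2 * m * n)) ** (1 / (2 * m + 1))   # formula from part 5

hs = np.linspace(0.5 * h_star, 2.0 * h_star, 100_001)
h_grid = hs[np.argmin(mse(hs))]
print(h_star, h_grid)                       # grid minimiser ≈ closed form
```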
Plugging in the value of $h$ obtained in part 5 into the approximation obtained in part 4, one finds that \begin{equation} \mathrm{mse}[f_n(t)]\propto n^{-2m/(2m+1)}, \end{equation} since both $1/nh$ and $h^{2m}$ reduce to $n^{-2m/(2m+1)}$ for $h\propto n^{-1/(2m+1)}$.
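The collapse of both terms to a single power of $n$ is easy to verify numerically: with $h(n)\propto n^{-1/(2m+1)}$, increasing $n$ tenfold shrinks the approximate mse by exactly a factor $10^{-2m/(2m+1)}$. A sketch with hypothetical stand-in constants:

```python
# Rate check: with h(n) from part 5, mse[f_n(t)] scales as n^{-2m/(2m+1)}
m = 2
C1, A2 = 0.21, 0.026**2                     # stand-ins for f(t)∫k² and A(t)²

def mse_opt(n):
    h = (C1 / (A2 * 2 * m * n)) ** (1 / (2 * m + 1))   # optimal h from part 5
    return C1 / (n * h) + A2 * h ** (2 * m)

ratio = mse_opt(10 * 1_000) / mse_opt(1_000)
print(ratio, 10 ** (-2 * m / (2 * m + 1)))  # both ≈ 0.158 for m = 2
```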