Estimating Expected Order Statistics

Question

I have a fairly basic question that I'm looking for a reference for.

First, a couple definitions. Let's say $X_1,\ldots,X_n$ are IID samples from a distribution $F$ over $[0,1]$. For any $k\in\{1,\ldots,n\}$, we can define the $k$th of $n$ order statistic $X_{k:n}$ to be the $k$th highest of the $n$ samples. Let $\mu_{k:n}$ denote the expected value of $X_{k:n}$.

For any given $k$ and $n$, I would like to estimate $\mu_{k:n}$ for $F$. Specifically, I would like to find an estimator $\hat \mu_{k:n}$ which takes a profile of $m$ IID samples and minimizes the mean absolute error $E[|\hat \mu_{k:n}-\mu_{k:n}|]$ in the worst case over all $F$ (again, distributed over $[0,1]$). Here $m$ is generally distinct from, and larger than, $n$.

What kinds of guarantees are known for this problem (in terms of $k$, $n$, $m$, and maybe $\mu_{k:n}$)? Is there anything that does significantly better than just dividing $m$ into blocks of $n$ samples and computing an empirical mean of the order statistics?
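For concreteness, the baseline mentioned above (split the $m$ samples into disjoint blocks of $n$ and average the $k$th highest value of each block) might look like the following minimal sketch. The function name `block_estimate` is only illustrative, and any leftover samples beyond the last full block are simply discarded.

```python
def block_estimate(samples, k, n):
    """Average the k-th highest value over disjoint blocks of n samples.

    Illustrative baseline only: the trailing len(samples) % n samples
    are ignored.
    """
    num_blocks = len(samples) // n
    if num_blocks == 0:
        raise ValueError("need at least n samples")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    total = 0.0
    for b in range(num_blocks):
        # Sort each block in descending order so index k-1 is the k-th highest.
        block = sorted(samples[b * n:(b + 1) * n], reverse=True)
        total += block[k - 1]
    return total / num_blocks


# Two blocks of size 2: the maxima are 0.9 and 0.6.
print(block_estimate([0.1, 0.9, 0.4, 0.6], k=1, n=2))
```

Each block contributes one independent draw of $X_{k:n}$, so this estimator is unbiased, but it averages only $\lfloor m/n \rfloor$ values. Since those values lie in $[0,1]$, each has variance at most $1/4$, which gives a mean absolute error of at most $1/(2\sqrt{\lfloor m/n \rfloor})$.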

The pdf of the $k^{th}$ order statistic of a distribution with pdf $f(x)$ and cdf $F(x)$ is $$ f_{(k)}(x)=nf(x)\binom{n-1}{k-1}F(x)^{k-1}\left(1-F(x)\right)^{n-k} $$ whence the expectation can be computed in the usual way. Is this helpful? Or are you looking for something else? — Sycorax, Commented Oct 21, 2018 at 4:20
The $k$-th order statistic is a natural estimate of its expectation. — Xi'an, Commented Oct 21, 2018 at 4:46
@Sycorax I'm aware of the formula for order statistics, but I'm not sure how to turn it into an estimator without doing something obvious like taking the empirical CDF and plugging it into the formula for $\mu_{k:n}$. (Which one could presumably analyze with something like DKW, though I'm not sure that's exactly the right tool.) I'm wondering if there's anything better. (And if it wasn't clear before, I'm interested in theoretical guarantees.) — Lemke, Commented Oct 21, 2018 at 15:29

Xi'an · Accepted Answer · 2018-10-29 09:05:15Z

I would like to find an estimator $\hat{\mu}_{k:n}$ which takes a profile of $m$ iid samples and minimizes the mean absolute error $\mathbb{E}[|\hat{\mu}_{k:n}-\mu_{k:n}|]$ in the worst case over all $F$

Minimising the error simultaneously for every possible distribution is impossible, since the class of all distributions includes the Dirac mass at any $a\in (0,1)$, for which the minimiser is the constant estimator $\hat{\mu}_{k:n}=a$.

If no constraint is imposed on $F$, I think the solution has to rely on an empirical cdf $\hat{F}_m$ based on the sample of size $m$, from which an estimate of $\mu_{k:n}$ can be derived by simulation (i.e., bootstrap in this case).
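As a rough illustration of this suggestion, here is a minimal Python sketch (standard library only) of two ways to get an estimate from $\hat{F}_m$. The function names `bootstrap_estimate` and `plugin_estimate` are hypothetical, as are the defaults `reps=2000` and the fixed seed. The first function resamples $n$ points with replacement and averages the $k$th highest. The second computes the same quantity without simulation, using the fact that the number of draws from $\hat{F}_m$ falling at or below the $i$th smallest sample point is $\mathrm{Bin}(n, i/m)$. This is a sketch under those assumptions, not a claim about optimality.

```python
import math
import random


def binom_tail(n, j, p):
    """P(Bin(n, p) >= j), by direct summation (adequate for moderate n)."""
    return sum(math.comb(n, r) * p**r * (1 - p)**(n - r)
               for r in range(j, n + 1))


def plugin_estimate(samples, k, n):
    """Exact mean of the k-th highest of n iid draws from the empirical cdf."""
    xs = sorted(samples)
    m = len(xs)
    j = n - k + 1  # the k-th highest of n is the j-th smallest
    est, prev = 0.0, 0.0
    for i, x in enumerate(xs, start=1):
        # P(j-th smallest draw <= x_(i)) = P(at least j draws land in the
        # lowest i sample points) = P(Bin(n, i/m) >= j).
        cur = binom_tail(n, j, i / m)
        est += x * (cur - prev)
        prev = cur
    return est


def bootstrap_estimate(samples, k, n, reps=2000, seed=0):
    """Monte Carlo approximation of plugin_estimate via resampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(reps):
        draw = sorted(rng.choices(samples, k=n), reverse=True)
        total += draw[k - 1]
    return total / reps


data = [0.0, 1.0]
# The maximum of 2 draws from {0, 1} equals 1 with probability 3/4.
print(plugin_estimate(data, k=1, n=2))
print(bootstrap_estimate(data, k=1, n=2))
```

The bootstrap average converges to the plug-in value as `reps` grows, so the simulation only adds Monte Carlo noise on top of the error already present in $\hat{F}_m$. The plug-in value is itself a weighted average of the sorted sample, with weights depending only on $k$, $n$ and $m$.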
