Theory
Bonett (2006) proposed an approximate confidence interval for the variance and standard deviation of nonnormal distributions. It is based on the variance-stabilizing transformation $\log(\hat{\sigma}^2)$ of $\hat{\sigma}^2$ and an application of the delta method. Specifically, $\operatorname{Var}(\log(\hat{\sigma}^2))\approx\{\gamma_{4} - (n-3)/n\}/(n - 1)$ where $\gamma_4 = \mu_4/\sigma^4$ is the kurtosis and $\mu_4$ is the population fourth central moment. In practice, $\gamma_4$ is unknown and has to be estimated. Bonett suggested
$$
\bar{\gamma_4}=n\sum(Y_i - m)^4/\left(\sum(Y_i - \hat{\mu})^2\right)^2
$$
where $\hat{\mu}$ is the sample mean and $m$ is a trimmed mean with trim proportion $1/\left\{2(n - 4)^{1/2}\right\}$. Curto (2021) proposed modifying this estimator by replacing $m$ with the median $\hat{\mu}_{med}$.
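The delta-method step behind the variance approximation above can be sketched as follows. This is only a sketch, not Bonett's own derivation: it uses the standard formula for the variance of the sample variance, $\operatorname{Var}(\hat{\sigma}^2)=\sigma^4\{\gamma_4-(n-3)/(n-1)\}/n$, and the derivative $1/\sigma^2$ of the logarithm.
$$
\operatorname{Var}\{\log(\hat{\sigma}^2)\}\approx\left(\frac{1}{\sigma^2}\right)^2\operatorname{Var}(\hat{\sigma}^2)=\frac{1}{n}\left\{\gamma_4-\frac{n-3}{n-1}\right\}\approx\frac{\gamma_4-1}{n}
$$
Bonett's expression $\{\gamma_4-(n-3)/n\}/(n-1)$ has the same leading term $(\gamma_4-1)/n$ and differs only in terms of smaller order. The approximate variance no longer depends on $\sigma^2$, which is what makes the log transformation variance-stabilizing.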
Bonett's 100(1-$\alpha$)% confidence interval for $\sigma^2$ is given by
$$
\exp\left\{\log(c\hat{\sigma}^2)\pm z_{1-\alpha/2}\operatorname{se}\right\}\tag{1}
$$
where $z_{1-\alpha/2}$ is the two-sided critical $z$-value from the standard normal distribution, $\operatorname{se} = c\left[\left\{\bar{\gamma_4} - (n - 3)/n\right\}/(n - 1)\right]^{1/2}$ and $c=n/(n - z_{1-\alpha/2})$ is an empirically determined small-sample adjustment factor. Taking the square root of the endpoints of $(1)$ gives a confidence interval for $\sigma$.
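As a minimal sketch of the last step, the interval for $\sigma$ can be obtained by wrapping the variance interval. The block repeats this answer's `gamma4` and `ci_sigma2` so that it runs on its own; the wrapper `ci_sigma` and the exponential test data are hypothetical additions for illustration.

```r
# Kurtosis estimate: numerator centred at the trimmed mean
gamma4 <- function(y) {
  n <- length(y)
  m <- mean(y, trim = 1 / (2 * sqrt(n - 4)))
  n * sum((y - m)^4) / sum((y - mean(y))^2)^2
}

# Interval (1) for the variance
ci_sigma2 <- function(x, alpha = 0.05) {
  n    <- length(x)
  z    <- qnorm(1 - alpha / 2)
  cfac <- n / (n - z)
  se   <- cfac * sqrt((gamma4(x) - (n - 3) / n) / (n - 1))
  exp(log(cfac * var(x)) + c(-1, 1) * z * se)
}

# Hypothetical wrapper (not in the original code): interval for sigma,
# obtained by taking square roots of the endpoints of (1)
ci_sigma <- function(x, alpha = 0.05) sqrt(ci_sigma2(x, alpha))

set.seed(1)
x  <- rexp(100)   # illustrative skewed data
ci <- ci_sigma(x)
ci
```

Because the square root is monotone, the interval for $\sigma$ has the same coverage as the interval for $\sigma^2$.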
Burch (2017) suggested yet another method which is more complicated than the one detailed here.
Example
Using the trimodal distribution given by @Ben, we have
set.seed(142857)
x <- c(rnorm(100, mean = 8,  sd = 4),
       rnorm( 50, mean = 12, sd = 3),
       rnorm(150, mean = 16, sd = 2))
ci_sigma2(x, alpha = 0.05)
[1] 19.46593 26.81422
So an approximate 95% confidence interval for the population variance is $(19.47;\,26.81)$.
Simulation
Let's see how well the interval $(1)$ performs. I will use the same distribution (rmydist) as @knrumsey did, with $n=200$. The value $8.612$ in the code is the variance that each interval is checked against. Here's the code for the simulations:
set.seed(142857)
res <- replicate(1e4, {
  x <- rmydist(200)
  ci_tmp <- ci_sigma2(x, 0.05)
  c(ifelse(ci_tmp[1] < 8.612 && ci_tmp[2] > 8.612, 1, 0),
    diff(ci_tmp))
})
rowMeans(res)
[1] 0.935500 4.641079
The simulated coverage is $0.9355$ (nominal: $0.95$), which is similar to the other methods, while the average width ($4.64$) is a bit larger.
The confidence interval proposed by @Ben has a simulated coverage of $0.919$ with an average width of $4.39$.
library(stat.extend)
set.seed(142857)
res <- replicate(1e4, {
  x <- rmydist(200)
  ci_tmp <- as.data.frame(CONF.var(alpha = 0.05, x = x))
  c(ifelse(ci_tmp$Lower < 8.612 && ci_tmp$Upper > 8.612, 1, 0),
    ci_tmp$Upper - ci_tmp$Lower)
})
rowMeans(res)
[1] 0.919100 4.387456
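The simulations above rely on rmydist, which is not defined in this answer. As a rough sketch of the same coverage check that runs on its own, the helper below takes any sampler and the true variance. The name `coverage_sim`, the compact `bonett_ci`, and the Exp(1) stand-in distribution (variance exactly 1) are assumptions for illustration, and the results will differ from the numbers reported above.

```r
# Compact version of Bonett's interval (1), following this answer's R code
bonett_ci <- function(x, alpha = 0.05) {
  n    <- length(x)
  z    <- qnorm(1 - alpha / 2)
  cfac <- n / (n - z)
  m    <- mean(x, trim = 1 / (2 * sqrt(n - 4)))
  g4   <- n * sum((x - m)^4) / sum((x - mean(x))^2)^2
  se   <- cfac * sqrt((g4 - (n - 3) / n) / (n - 1))
  exp(log(cfac * var(x)) + c(-1, 1) * z * se)
}

# Hypothetical helper: simulated coverage and mean width of a variance CI
coverage_sim <- function(rdist, true_var, n, reps = 1000, alpha = 0.05,
                         ci_fun = bonett_ci) {
  res <- replicate(reps, {
    ci <- ci_fun(rdist(n), alpha)
    c(coverage = as.numeric(ci[1] < true_var && ci[2] > true_var),
      width    = diff(ci))
  })
  rowMeans(res)
}

set.seed(142857)
# Stand-in distribution (an assumption): Exp(1), whose variance is 1
out <- coverage_sim(rexp, true_var = 1, n = 200)
out
```

Passing a different `ci_fun` lets the same helper compare several interval methods on identical settings.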
R code
The following R code implements Bonett's method:
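# Estimate of the kurtosis gamma_4
gamma4 <- function(y) {
  n <- length(y)
  # Trimmed mean with trim proportion 1/{2(n - 4)^(1/2)}
  m <- mean(y, trim = 1/(2*(n - 4)^(1/2)))
  # m <- median(y) # Suggested by Curto (2021)
  n*sum((y - m)^4)/sum((y - mean(y))^2)^2
}

# Approximate 100(1 - alpha)% confidence interval (1) for the variance
ci_sigma2 <- function(x, alpha = 0.05) {
  n <- length(x)
  z <- qnorm(1 - alpha/2)
  cfac <- n/(n - z) # small-sample adjustment factor c
  se <- cfac*((gamma4(x) - (n - 3)/n)/(n - 1))^(1/2)
  exp(log(cfac*var(x)) + c(-1, 1)*z*se)
}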
The function VarCI from the DescTools package implements a number of different methods, including the one from Bonett described above.
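A call might look like the sketch below. It assumes that DescTools is installed and that `VarCI` accepts `method = "bonett"` and `conf.level`, as in its documentation; no output is shown because the package's implementation details may differ slightly from the code above.

```r
# Runs only if DescTools is available (an assumption about the setup)
if (requireNamespace("DescTools", quietly = TRUE)) {
  set.seed(142857)
  x <- c(rnorm(100, mean = 8,  sd = 4),
         rnorm( 50, mean = 12, sd = 3),
         rnorm(150, mean = 16, sd = 2))
  # conf.level = 1 - alpha; returns the variance estimate and the CI limits
  ci_dt <- DescTools::VarCI(x, method = "bonett", conf.level = 0.95)
  print(ci_dt)
}
```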
References
Bonett, D. G. (2006). Approximate confidence interval for standard deviation of nonnormal distributions. Computational Statistics & Data Analysis, 50(3), 775-782.
Burch, B. D. (2017). Distribution-dependent and distribution-free confidence intervals for the variance. Statistical Methods & Applications, 26, 629-648.
Curto, J. D. (2021). Confidence intervals for means and variances of nonnormal distributions. Communications in Statistics - Simulation and Computation, 1-17.