Why is the Central Limit Theorem not giving the correct level of significance?

Question

Consider the following problem: A company manufacturing RAM chips claims the defective rate of the population is $5 \%$. Let $p$ denote the true defective probability. We want to test if:

$H_{0}: p=0.05$
$H_{1}: p>0.05$

We are going to use a sample of $100$ chips from the production to test.

Let $X$ denote the number of defective chips in a sample of $100$.

Let us choose the critical value as $10$; that is, we reject $H_0$ when $X \geq 10$ and do not reject $H_0$ when $X<10$. Now I am interested in finding the corresponding level of significance $\alpha$.

We can do this using the binomial distribution as follows: $$\begin{aligned} \alpha &=\operatorname{Pr}(\text { Type I error })=\operatorname{Pr}\left(\text { reject } H_{0} \text { when } H_{0} \text { is true }\right) \\ &=\operatorname{Pr}(X \geq 10 \text { when } p=0.05) \\ &=\sum_{x=10}^{100} B(x ; n=100, p=0.05), \quad \text { binomial distribution } \\ &=\sum_{x=10}^{100}\binom{100}{x} 0.05^{x} 0.95^{100-x}=0.0282 \:\text{(not sure how the author calculated this)} \end{aligned}$$
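The sum above can be checked numerically. The following is a minimal R sketch that mirrors the formula term by term with base R's `choose` and compares the result with the built-in binomial CDF `pbinom`. The variable names (`terms`, `alpha_exact`, `alpha_cdf`) are illustrative only, and nothing here is claimed to be how the book's author obtained the figure.

```r
# Exact significance level: Pr(X >= 10) for X ~ Binomial(100, 0.05)
n <- 100
p <- 0.05
x <- 10:n

# Each term is choose(n, x) * p^x * (1 - p)^(n - x), as in the sum above
terms <- choose(n, x) * p^x * (1 - p)^(n - x)
alpha_exact <- sum(terms)

# Same quantity from the binomial CDF: 1 - Pr(X <= 9)
alpha_cdf <- 1 - pbinom(9, n, p)

print(round(alpha_exact, 4))
print(round(alpha_cdf, 4))
```

Both printed values should agree with the $0.0282$ quoted above.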

Now I am trying to use the Central Limit Theorem.

Let $X_i$ be the indicator that the $i$th chip is defective, a Bernoulli random variable with PMF (under $H_0$): $$\begin{aligned} &\operatorname{Pr}\left(X_{i}=1\right)=0.05 \\ &\operatorname{Pr}\left(X_{i}=0\right)=0.95 \end{aligned}$$

We have:

$\mu=E[X_i]=0.05$ and $\sigma^2=Var[X_i]=0.0475$.

Also the sample size $n=100$.

Now $X=X_1+X_2+\cdots+X_{100}$, and let $\bar{X}=X/100$ be the sample mean.

So we have $$\operatorname{Pr}(X \geqslant 10)=\operatorname{Pr}(\bar{X} \geqslant 0.1)$$

$$\Rightarrow \operatorname{Pr}(X \geqslant 10)=\operatorname{Pr}\left(\frac{\bar{X}-0.05}{\frac{\sigma}{\sqrt{n}}} \geqslant \frac{0.1-0.05}{\frac{\sigma}{\sqrt{n}}}\right)=\operatorname{Pr}(Z \geqslant 2.29)=1-\Phi(2.29)=0.011$$

But the answer is $0.0282$. Where did I go wrong?

The normal approximation of the binomial distribution is exactly that, an approximation. Usually, one only uses this approximation when the normal condition is satisfied, that is, when $np$ and $n(1-p)$ are both at least $10$. This isn't the case in this problem, where $np=5$. There are also "continuity corrections" to make this approximation more accurate. See here: pressbooks.lib.vt.edu/introstatistics/chapter/…. — user801306, Commented Jan 6, 2022 at 2:19
Using stattrek.com/online-calculator/binomial.aspx, you can find $P(X>10) = 0.011$ and $P(X \ge 10) = 0.0282$. I think the reason is that the point probability $P(X=10)$ cannot be ignored. You may need some "average" correction in the CLT approximation. — nsigma, Commented Jan 6, 2022 at 2:19
Let $X \sim \mathsf{Binom}(n=100,p=.05).$ "Not sure how the author calculated" $P(X \geq 10)=1-P(X \leq 9)=0.0282:$ My guess is by using statistical software. For example, in R, where pbinom is a binomial CDF, the code 1 - pbinom(9, 100, .05) returns $0.02818829.$ Surely the author did not use a normal approximation, because the point is that the normal approximation works poorly for $\mathsf{Binom}(n=100,p=.05).$ — BruceET, Commented Jan 6, 2022 at 17:48
Related to the comment above: in R, one can use binom.test to do an exact binomial test of $H_0: p=.05$ against $H_a: p>.05.$ The code binom.test(10, 100, p=.05, alt="g")$p.val returns the P-value $0.02818829.$ — BruceET, Commented Jan 6, 2022 at 18:07
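The continuity correction mentioned in the comments can be sketched as follows. This is an illustrative R snippet, not code from any commenter: it compares the exact tail $P(X \geq 10)$ with the plain normal tail starting at $10$ and with the normal tail starting at $9.5$. The names `exact`, `plain`, and `corrected` are assumptions introduced for the example.

```r
n <- 100
p <- 0.05
mu <- n * p                    # mean of X under H0: 5
sg <- sqrt(n * p * (1 - p))    # standard deviation of X: about 2.18

# Exact binomial tail Pr(X >= 10)
exact <- 1 - pbinom(9, n, p)

# Normal approximation without correction: Pr(Z >= (10 - mu) / sg)
plain <- pnorm(10, mu, sg, lower.tail = FALSE)

# Continuity-corrected version: Pr(Z >= (9.5 - mu) / sg)
corrected <- pnorm(9.5, mu, sg, lower.tail = FALSE)

print(round(c(exact = exact, plain = plain, corrected = corrected), 4))
```

The corrected value moves toward the exact $0.0282$ but still falls short of it, which is consistent with the remark that $np=5$ is too small for the normal approximation to work well here.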

BruceET · Accepted Answer · 2022-01-06 17:14:22Z

Graphical comment on the poor normal approximation for $\mathsf{Binom}(n=100,p=.05).$ [Using R to make graph.]

R code to make figure:

x = 0:20                          # values of X to plot
bin.pdf = dbinom(x, 100, .05)     # binomial probabilities at those values
mu = 100*.05;  mu                 # mean np
# [1] 5
sg = sqrt(100*.05*.95);  sg       # standard deviation sqrt(np(1-p))
# [1] 2.179449
hdr = "Comparing PDF BINOM(100, .05) with NORM(5, 2.18)"
plot(x, bin.pdf, type="h", lwd=2, main=hdr)    # vertical bars for the binomial
 abline(h=0, col="green2")
 abline(v=0, col="green2")
 curve(dnorm(x, 5, 2.18), add=TRUE, col="blue", lwd=2)   # normal density overlay
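As a numeric companion to the figure, the mismatch can also be tabulated. The sketch below is an illustration under assumptions, not part of the original answer: the cutoffs `k = 8:12` are arbitrary choices, and the names `tail_table` and `below_zero` are hypothetical. It lists exact upper-tail probabilities next to continuity-corrected normal ones, and also computes how much probability the fitted normal curve places below $-0.5$, where the binomial distribution has none.

```r
n <- 100
p <- 0.05
mu <- n * p
sg <- sqrt(n * p * (1 - p))

k <- 8:12   # arbitrary cutoffs chosen for illustration

# Exact Pr(X >= k) = Pr(X > k - 1)
exact <- pbinom(k - 1, n, p, lower.tail = FALSE)

# Continuity-corrected normal approximation to Pr(X >= k)
normal <- pnorm(k - 0.5, mu, sg, lower.tail = FALSE)

tail_table <- data.frame(k = k,
                         exact = round(exact, 4),
                         normal = round(normal, 4))
print(tail_table)

# Normal probability below -0.5, a region the binomial cannot reach
below_zero <- pnorm(-0.5, mu, sg)
print(round(below_zero, 4))
```

At the cutoff used in the question, $k=10$, the normal column understates the exact tail probability, which reflects the right skew of the binomial bars relative to the symmetric normal curve.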
