Confidence Intervals for Proportion

Fei Ye

November 2024

1 Learning Goals for Confidence Intervals


2 Confidence Interval for a Proportion (1 of 2)


3 Confidence Interval for a Proportion (2 of 2)


4 Distribution of Confidence Intervals

By the central limit theorem, when the sample size \(n\) is large, the sample proportion \(\hat{p}\) is approximately normally distributed. The chance that $$p\in \left[\hat{p}-z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}, \hat{p}+z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right]$$ is approximately the same as the chance that $$\hat{p}\in \left[p-z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}, p+z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right].$$

It follows that the critical value \(z_{\alpha/2}\) satisfies the following equation: $$P\left(-z_{\alpha/2}<\dfrac{\hat{p}-p}{\sqrt{\frac{p(1-p)}{n}}}<z_{\alpha/2}\right)=1-\alpha.$$
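As a numerical check of this equation, the following Python sketch finds \(z_{\alpha/2}\) for a few confidence levels and confirms that the central probability \(P(-z_{\alpha/2}<Z<z_{\alpha/2})\) equals \(1-\alpha\). It uses the standard library class `statistics.NormalDist`. The helper names `critical_value` and `central_probability` are illustrative and do not come from the slides.

```python
from statistics import NormalDist

def critical_value(confidence):
    """Return z_{alpha/2} for a two-sided interval (illustrative helper)."""
    # By symmetry, P(Z < z_{alpha/2}) = 1 - alpha/2 = 0.5 + confidence/2.
    return NormalDist().inv_cdf(0.5 + confidence / 2)

def central_probability(z):
    """Return P(-z < Z < z) for a standard normal variable Z."""
    std_normal = NormalDist()
    return std_normal.cdf(z) - std_normal.cdf(-z)

for confidence in (0.90, 0.95, 0.99):
    z = critical_value(confidence)
    # The last column should reproduce the confidence level 1 - alpha.
    print(confidence, round(z, 3), round(central_probability(z), 4))
```

The call `NormalDist().inv_cdf(q)` plays the same role as NORM.S.INV(q) in Excel.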


5 Example: Proportion of Students Taking Buses (1 of 2)

In a random sample of 100 students in college, 65 said that they come to college by bus.

  1. Give a point estimate of the proportion of all students who come to college by bus.

  2. Construct a 99% confidence interval for that proportion.

Solution: A good point estimate is the sample proportion. Here the sample proportion is \(\hat{p}=65/100=0.65\).

Since \(n\hat{p}=100\cdot 0.65=65>10\) and \(n(1-\hat{p})=100\cdot 0.35=35>10\), the sample is large enough for the normal approximation, and the standard error is approximately $$\hat{\sigma}_{\hat{P}}=\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}=\sqrt{\frac{0.65(1-0.65)}{100}}\approx0.048.$$
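The large-sample check and the standard error can also be computed in Python. This is a minimal sketch. The function names `is_large_sample` and `standard_error` are illustrative, and the threshold of 10 is the one used in this example.

```python
from math import sqrt

def is_large_sample(p_hat, n, threshold=10):
    """Check that n*p_hat and n*(1 - p_hat) both exceed the threshold."""
    return n * p_hat > threshold and n * (1 - p_hat) > threshold

def standard_error(p_hat, n):
    """Estimated standard error of the sample proportion."""
    return sqrt(p_hat * (1 - p_hat) / n)

p_hat = 65 / 100
print(is_large_sample(p_hat, 100))           # True
print(round(standard_error(p_hat, 100), 3))  # 0.048
```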


6 Example: Proportion of Students Taking Buses (2 of 2)

At the 99% level of confidence, \(\alpha/2=(1-\text{confidence level})/2=0.005\), and the critical value \(z_{\alpha/2}\) is determined by the equation \(P(Z<z_{\alpha/2})=1-\alpha/2=0.5+\text{confidence level}/2=0.995\). In Excel, the critical value is \(z_{\alpha/2}=\) NORM.S.INV(0.5+99%/2) \(\approx 2.576\).

The margin of error is \(E=z_{\alpha/2}\cdot \hat{\sigma}_{\hat{P}}\approx 2.576\cdot 0.0477\approx 0.123\) (computed with the standard error before rounding to 0.048), and the 99% confidence interval is $$[\hat{p}-E, \hat{p}+E]\approx [0.65-0.123, 0.65+0.123]=[0.527, 0.773].$$

Conclusion: We are 99% confident that the proportion of all students at the college who take the bus is in the interval \([0.527, 0.773]\).
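The whole calculation for this example can be reproduced in a few lines of Python. The sketch below assumes the large-sample condition has already been checked. The function name `proportion_confidence_interval` is illustrative, and `NormalDist().inv_cdf` stands in for NORM.S.INV.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(successes, n, confidence):
    """Return (lower, upper) for a large-sample interval for a proportion."""
    p_hat = successes / n                            # point estimate
    se = sqrt(p_hat * (1 - p_hat) / n)               # estimated standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # critical value z_{alpha/2}
    margin = z * se                                  # margin of error E
    return p_hat - margin, p_hat + margin

lower, upper = proportion_confidence_interval(65, 100, 0.99)
print(round(lower, 3), round(upper, 3))  # 0.527 0.773
```

Because the code carries full precision through every step, it avoids the small discrepancies that arise when intermediate values are rounded by hand.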


7 Factors Affecting the Width of Confidence Intervals


8 Sample Size Determination


9 Example: Minimum Sample Size - Error in Proportion

Suppose you want to estimate the proportion of students at QCC who live in Queens. By surveying your classmates, you find that around 70% live in Queens. Use this as a guess to determine how many students would need to be included in a random sample if you wanted the margin of error for a 95% confidence interval to be less than or equal to 2%.

Solution: We may use \(\hat{p}=0.7\) as a reasonable guess for the population proportion.

At the 95% level, the critical value is \(z_{\alpha/2}=\) NORM.S.INV(0.5+0.95/2) \(\approx 1.96\).

Since the desired margin of error is \(E=0.02\), the minimal sample size is determined by $$n=\left(\frac{z_{\alpha/2}}{{E}}\right)^2\cdot \hat{p}(1-\hat{p})=(1.96/0.02)^2\cdot 0.7\cdot(1-0.7)=2016.84.$$

Since the sample size has to be an integer, we round up: to get a margin of error of no more than 2% at the 95% level, the sample size should be at least 2017.
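The same computation can be sketched in Python. The function name `sample_size_for_proportion` is illustrative. The code uses the unrounded critical value instead of 1.96, so the intermediate value differs slightly from 2016.84, but the rounded-up answer is the same.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(p_guess, margin, confidence):
    """Smallest n whose margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # critical value z_{alpha/2}
    n = (z / margin) ** 2 * p_guess * (1 - p_guess)
    return ceil(n)  # always round up so the margin is not exceeded

print(sample_size_for_proportion(0.7, 0.02, 0.95))  # 2017
```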


10 Example: Minimum Sample Size - Error in Mean

Find the minimum sample size necessary to construct a 99% confidence interval for the population mean with a margin of error \(E =0.2\). Assume that the estimated population standard deviation is \(\sigma=1.3\).

Solution: At the 99% level, the critical value \(z_{\alpha/2}=\) NORM.S.INV(0.5+0.99/2) \(\approx 2.576\).

The desired margin of error is \({E}=0.2\).

The estimated population standard deviation is \(\sigma=1.3\).

Then the minimal sample size is approximately $$n=\left(\dfrac{z_{\alpha/2}\cdot \sigma}{{E}}\right)^2\approx (2.576\cdot 1.3/0.2)^2 \approx 280.4.$$

To get a margin of error of no more than 0.2 at the 99% level, the sample size should be at least 281.
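A Python sketch of this calculation follows. The function name `sample_size_for_mean` is illustrative, and the code assumes, as in the example, that an estimate of the population standard deviation is available.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n whose margin of error for the mean is at most `margin`."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # critical value z_{alpha/2}
    n = (z * sigma / margin) ** 2
    return ceil(n)  # always round up so the margin is not exceeded

print(sample_size_for_mean(1.3, 0.2, 0.99))  # 281
```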


Practice: Confidence Interval of Proportion of Kids


Practice: Confidence Intervals for Product Quality

To understand the reason for returned goods, the manager of a store examines the records on 40 products that were returned during the last year. Reasons were coded by 1 for “defective,” 2 for “unsatisfactory,” and 0 for all other reasons, with the results shown in the table.

0 0 0 0 2 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0
0 0 0 1 0 0 0 0 2 0 2 0 0 0 0 0 0 2 0 0

  1. Give a point estimate of the proportion of all returns that are because of something wrong with the product, that is, either defective or performed unsatisfactorily.

  2. Construct an 80% confidence interval for the proportion of all returns that are because of something wrong with the product.


Practice: Sample Size for Mean with Given Error


Practice: Sample Size for Proportion with Given Error


Practice: Confidence Interval of Proportion Given Table


Lab Instructions in Excel


11 Normal Distributions and Margins of Error


Lab Practice: Confidence Interval for Proportion

Foothill College’s athletic department wants to calculate the proportion of students who have attended a women’s basketball game at the college. They use student email addresses, randomly choose 220 students, and email them. Of the 145 who responded, 22 had attended a women’s basketball game.

Calculate and interpret the approximate 90% confidence interval for the proportion of all Foothill College students who have attended a women’s basketball game.