Topic 6: Two-Way Tables

Fei Ye

January 2024

1. Learning Goals


2. Two-way Frequency Tables


3. Example: Body Image and Gender

The following table summarizes the responses of a random sample of 1,200 U.S. college students, collected as part of a larger survey.

| | About Right | Overweight | Underweight | Row Totals |
|---|---|---|---|---|
| Female | 560 | 163 | 37 | 760 |
| Male | 295 | 72 | 73 | 440 |
| Column Totals | 855 | 235 | 110 | 1,200 |

Source: https://courses.lumenlearning.com/wmopen-concepts-statistics


4. Two-Way Relative Frequency Tables and Probability


5. Example: Probabilities about Body and Gender

The following table shows joint and marginal probabilities of body image and gender.

| | About Right | Overweight | Underweight | Row Totals |
|---|---|---|---|---|
| Female | \(\frac{560}{1200}=46.67\%\) | \(\frac{163}{1200}=13.58\%\) | \(\frac{37}{1200}=3.08\%\) | \(\frac{760}{1200}=63.33\%\) |
| Male | \(\frac{295}{1200}=24.58\%\) | \(\frac{72}{1200}=6.00\%\) | \(\frac{73}{1200}=6.08\%\) | \(\frac{440}{1200}=36.67\%\) |
| Column Totals | \(\frac{855}{1200}=71.25\%\) | \(\frac{235}{1200}=19.58\%\) | \(\frac{110}{1200}=9.17\%\) | \(\frac{1200}{1200}=100.00\%\) |

The following table shows the conditional probabilities of each body image for a randomly selected female or male student; each count is divided by its row total.

| | About Right | Overweight | Underweight | Row Totals |
|---|---|---|---|---|
| Female | \(\frac{560}{760}=73.68\%\) | \(\frac{163}{760}=21.45\%\) | \(\frac{37}{760}=4.87\%\) | \(\frac{760}{760}=100.00\%\) |
| Male | \(\frac{295}{440}=67.05\%\) | \(\frac{72}{440}=16.36\%\) | \(\frac{73}{440}=16.59\%\) | \(\frac{440}{440}=100.00\%\) |
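
The joint, marginal, and conditional probabilities in the two tables above can also be computed programmatically. The following is a minimal Python sketch, not part of the original slides; the variable names (`counts`, `joint`, `marginal_gender`, `conditional`) are illustrative choices.

```python
# Counts from the body image table (rows: gender, columns: body image).
counts = {
    "Female": {"About Right": 560, "Overweight": 163, "Underweight": 37},
    "Male": {"About Right": 295, "Overweight": 72, "Underweight": 73},
}

grand_total = sum(sum(row.values()) for row in counts.values())

# Joint probabilities: each cell divided by the grand total.
joint = {
    gender: {image: n / grand_total for image, n in row.items()}
    for gender, row in counts.items()
}

# Marginal probabilities of gender: each row total divided by the grand total.
marginal_gender = {
    gender: sum(row.values()) / grand_total for gender, row in counts.items()
}

# Conditional probabilities P(image | gender): each cell divided by its row total.
conditional = {
    gender: {image: n / sum(row.values()) for image, n in row.items()}
    for gender, row in counts.items()
}

print(f"P(Female and About Right) = {joint['Female']['About Right']:.2%}")
print(f"P(Female) = {marginal_gender['Female']:.2%}")
print(f"P(Underweight | Male) = {conditional['Male']['Underweight']:.2%}")
```

The only difference between the joint and the conditional calculation is the denominator: the grand total in the first case, the row total in the second.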

6. Example: Community College Enrollment (1 of 2)

The following table summarizes full-time enrollment at a community college by gender and program.

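| | Arts-Sci | Bus-Econ | Info Tech | Health Science | Graphics Design | Culinary Arts | Row Totals |
|---|---|---|---|---|---|---|---|
| Female | 4,660 | 435 | 494 | 421 | 105 | 83 | 6,198 |
| Male | 4,334 | 490 | 564 | 223 | 97 | 94 | 5,802 |
| Column Totals | 8,994 | 925 | 1,058 | 644 | 202 | 177 | 12,000 |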

Solution: \(P(\text{Male})=\dfrac{5802}{12000}= 0.4835=48.35\%.\)

Source: https://courses.lumenlearning.com/wmopen-concepts-statistics


7. Example: Community College Enrollment (2 of 2)

Solution: \(P(\text{Info Tech}|\text{Male})=\dfrac{564}{5802}\approx 0.097=9.7\%.\)

Solution: \(P(\text{Male and Info Tech})=\dfrac{564}{12000}= 0.047=4.7\%.\)

Solution: \(P(\text{Male and Info Tech})=\dfrac{564}{12000}=\dfrac{5802}{12000}\cdot \dfrac{564}{5802}=P(\text{Male})\cdot P(\text{Info Tech}|\text{Male}).\)
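
The identity in the last solution is an instance of the multiplication rule, \(P(A \text{ and } B)=P(A)\cdot P(B|A)\). As a quick numerical check, here is a small Python sketch using the counts from the enrollment table; the variable names are illustrative and not from the original slides.

```python
import math

total = 12000         # all full-time students
male = 5802           # row total for Male
male_info_tech = 564  # cell count for Male and Info Tech

p_male = male / total
p_info_tech_given_male = male_info_tech / male
p_male_and_info_tech = male_info_tech / total

# Multiplication rule: P(Male and Info Tech) = P(Male) * P(Info Tech | Male).
product = p_male * p_info_tech_given_male
rule_holds = math.isclose(p_male_and_info_tech, product)

print(f"P(Male) = {p_male:.4f}")
print(f"P(Info Tech | Male) = {p_info_tech_given_male:.4f}")
print(f"P(Male and Info Tech) = {p_male_and_info_tech:.4f}")
print(f"Multiplication rule holds: {rule_holds}")
```

The comparison uses `math.isclose` rather than `==` because floating-point division can introduce tiny rounding differences.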


Practice: Weights and Heights

The table below relates the weights and heights of a group of individuals participating in an observational study.

| Weight/Height | Tall | Medium | Short |
|---|---|---|---|
| Obese | 18 | 28 | 14 |
| Normal | 20 | 51 | 28 |
| Underweight | 12 | 25 | 9 |

  1. Find the total for each row and column.
  2. Find the probability that a randomly chosen individual from this group is Short.
  3. Find the probability that a randomly chosen individual from this group is Obese and Short.
  4. Find the probability that a randomly chosen individual from this group is Underweight given that the individual is Tall.

Source: https://courses.lumenlearning.com/introstats1/chapter/contingency-tables/


8. Test of (No) Association


9. Example: Association Between Body and Gender (1 of 2)

Is body image related to gender?

| | About Right | Overweight | Underweight | Row Totals |
|---|---|---|---|---|
| Female | 560 | 163 | 37 | 760 |
| Male | 295 | 72 | 73 | 440 |
| Column Totals | 855 | 235 | 110 | 1,200 |

10. Example: Association Between Body and Gender (2 of 2)

Solution: Using a stacked bar chart in Excel, we can compare the conditional body image distributions for females and males side by side.

Stacked Bar Chart for Gender and Body Images

The conditional distributions of body image for females and males differ noticeably: for example, 16.59% of males fall in the Underweight category, compared with only 4.87% of females. These differences are large enough to suggest that the two categorical variables are related.
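
The same comparison can be made numerically. The sketch below is an illustrative Python example, not part of the original slides: it computes the conditional distribution of body image for each gender and the gap between them in each category. It is a descriptive comparison only, not a formal hypothesis test, and the helper name `conditional_distribution` is a hypothetical choice.

```python
counts = {
    "Female": {"About Right": 560, "Overweight": 163, "Underweight": 37},
    "Male": {"About Right": 295, "Overweight": 72, "Underweight": 73},
}


def conditional_distribution(row):
    """Return P(category | row) for one row of counts."""
    row_total = sum(row.values())
    return {category: n / row_total for category, n in row.items()}


female = conditional_distribution(counts["Female"])
male = conditional_distribution(counts["Male"])

# Gap between the two conditional distributions, category by category.
gaps = {category: male[category] - female[category] for category in female}

for category, gap in gaps.items():
    print(f"{category}: female {female[category]:.2%}, "
          f"male {male[category]:.2%}, gap {gap:+.2%}")

# Category in which the two genders differ the most.
largest = max(gaps, key=lambda c: abs(gaps[c]))
print(f"Largest gap: {largest}")
```

If the two variables were unrelated, the conditional distributions would be roughly the same and every gap would be close to zero.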


11. Percentage Reduction of Risk


12. Example: Physicians’ Health Study (1 of 2)

Researchers in the Physicians’ Health Study (1989) designed a randomized double-blind experiment to determine whether aspirin reduces the risk of heart attack. Here are the final results.

| | Heart Attack | No Heart Attack | Row Totals |
|---|---|---|---|
| Aspirin | 139 | 10,898 | 11,037 |
| Placebo | 239 | 10,795 | 11,034 |
| Column Totals | 378 | 21,693 | 22,071 |

Does aspirin lower the risk of having a heart attack?

Solution: We first compute two conditional probabilities: \(P(\text{heart attack}|\text{aspirin})\) and \(P(\text{heart attack}|\text{placebo})\).
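
Both probabilities can be read from the rows of the table by dividing the heart attack count by the row total. Rounded to three decimal places, the arithmetic is:

$$P(\text{heart attack}|\text{aspirin})=\frac{139}{11037}\approx 0.013, \qquad P(\text{heart attack}|\text{placebo})=\frac{239}{11034}\approx 0.022.$$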

Source: https://courses.lumenlearning.com/wmopen-concepts-statistics


13. Example: Physicians’ Health Study (2 of 2)

The result shows that taking aspirin reduced the risk from 0.022 to 0.013.

The percentage reduction of risk is $$ \frac{P(\text{heart attack}|\text{aspirin})-P(\text{heart attack}|\text{placebo})}{P(\text{heart attack}|\text{placebo})}=\frac{0.013-0.022}{0.022}=\frac{-0.009}{0.022}\approx -0.41. $$

Therefore, we conclude that taking aspirin results in a 41% reduction in risk.
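
The calculation above can be sketched in Python as follows. This is an illustrative example rather than part of the original slides; the function name `relative_risk_change` and its parameters are hypothetical.

```python
def relative_risk_change(events_treated, n_treated, events_control, n_control):
    """Return (risk_treated - risk_control) / risk_control.

    A negative value means the treatment group had a lower risk.
    """
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    return (risk_treated - risk_control) / risk_control


# Counts from the Physicians' Health Study table.
change_unrounded = relative_risk_change(139, 11037, 239, 11034)

# Same formula applied to the risks rounded to three decimals, as on the slide.
change_rounded = (0.013 - 0.022) / 0.022

print(f"Using unrounded risks: {change_unrounded:.3f}")
print(f"Using rounded risks:   {change_rounded:.3f}")
```

The two results differ slightly (roughly -0.42 versus -0.41) because the slide rounds each risk to three decimal places before dividing. Either way, the reduction in risk is a little over 40%.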


14. Hypothetical Two-way Tables

A hypothetical two-way table, also known as a hypothetical 1000 two-way table, is a two-way table constructed from given probability conditions with 1000 as the total frequency. It can be used to answer complex probability questions.

Sometimes, the total frequency can be taken to be 10,000 or a higher power of 10 so that joint frequencies are integers.


15. Example: Birth Gender Prediction (1 of 3)

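A pregnant woman often opts to have an ultrasound to predict the gender of her baby. Assume the following facts are known: 48% of babies born are girls; the ultrasound correctly predicts a girl for 9 out of 10 girls; and it correctly predicts a boy for 3 out of 4 boys.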

Use the above facts to answer the following questions.

  1. If the examination predicts a girl, how likely is it that the baby will be a girl?

  2. If the examination predicts a boy, how likely is it that the baby will be a boy?


16. Example: Birth Gender Prediction (2 of 3)

Solution:

Assume that we have ultrasound predictions for 1,000 random babies.

Using those facts, we may create a two-way frequency table.


17. Example: Birth Gender Prediction (3 of 3)

| | Girl | Boy | Row Totals |
|---|---|---|---|
| Predict Girl | \(480\cdot\frac{9}{10}= 432\) | \(520-390 = 130\) | \(432+130=562\) |
| Predict Boy | \(480-432=48\) | \(520\cdot\frac34=390\) | \(48+390=438\) |
| Column Totals | \(48\%\cdot 1000=480\) | \(1000-480=520\) | \(1{,}000\) |

If the examination predicts a girl, the probability that the born baby is a girl is $$P(\text{Girl}|\text{predict girl})=\frac{432}{562} \approx 0.769=76.9\%.$$

If the examination predicts a boy, the probability that the born baby is a boy is $$P(\text{Boy} | \text{predict boy}) = \frac{390}{438} \approx 0.890=89\%.$$
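
The construction of the hypothetical table can be sketched in Python. This example is illustrative and not part of the original slides: the function `hypothetical_table` and its parameter names are hypothetical, and the three input values (48%, 9/10, and 3/4) are read off the table cells above. Exact fractions are used so that the cell counts come out as whole numbers.

```python
from fractions import Fraction


def hypothetical_table(p_girl, p_correct_girl, p_correct_boy, total=1000):
    """Build a hypothetical two-way table of predictions versus outcomes."""
    girls = p_girl * total
    boys = total - girls
    girl_predicted_girl = girls * p_correct_girl  # girls correctly predicted
    boy_predicted_boy = boys * p_correct_boy      # boys correctly predicted
    return {
        "Predict Girl": {"Girl": girl_predicted_girl,
                         "Boy": boys - boy_predicted_boy},
        "Predict Boy": {"Girl": girls - girl_predicted_girl,
                        "Boy": boy_predicted_boy},
    }


table = hypothetical_table(Fraction(48, 100), Fraction(9, 10), Fraction(3, 4))

# Condition on the prediction: divide each cell by its row total.
predict_girl_total = sum(table["Predict Girl"].values())
predict_boy_total = sum(table["Predict Boy"].values())
p_girl_given_predict_girl = table["Predict Girl"]["Girl"] / predict_girl_total
p_boy_given_predict_boy = table["Predict Boy"]["Boy"] / predict_boy_total

print(f"P(Girl | predict girl) = {float(p_girl_given_predict_girl):.3f}")
print(f"P(Boy | predict boy) = {float(p_boy_given_predict_boy):.3f}")
```

Changing `total` does not change the conditional probabilities; a larger power of 10 only helps keep the cell counts whole when the input probabilities have more decimal places.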


Practice: Highway Safety

The table below is based on a 1988 study of accident records conducted by the Florida State Department of Highway Safety.

| | Nonfatal | Fatal | Row Totals |
|---|---|---|---|
| Seat Belt | 412,368 | 510 | 412,878 |
| No Seat Belt | 162,527 | 1,601 | 164,128 |
| Column Totals | 574,895 | 2,111 | 577,006 |

Does wearing a seat belt lower the risk of an accident resulting in a fatality?

Source: https://courses.lumenlearning.com/wmopen-concepts-statistics


Practice: Drug Screening

A large company has instituted a mandatory employee drug screening program. Assume that the drug test used is known to be 99% accurate. That is, if an employee is a drug user, the test will come back positive (“drug detected”) 99% of the time. If an employee is a non-drug user, then the test will come back negative (“no drug detected”) 99% of the time. Assume that 2% of the employees of the company are drug users.

If an employee’s drug test comes back positive, what is the probability that the test is wrong and the employee is in fact a non-drug user?

Source: https://courses.lumenlearning.com/wmopen-concepts-statistics


Lab Instruction in Excel


18. Create Stacked Bar Chart


Lab Practice: Gender vs Program Selection

The following table summarizes results from a study on program selection and gender.

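| | Arts-Sci | Bus-Econ | Info Tech | Health Science | Graphics Design | Culinary Arts | Row Totals |
|---|---|---|---|---|---|---|---|
| Female | 4,660 | 435 | 494 | 421 | 105 | 83 | 6,198 |
| Male | 4,334 | 490 | 564 | 223 | 97 | 94 | 5,802 |
| Column Totals | 8,994 | 925 | 1,058 | 644 | 202 | 177 | 12,000 |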

Use Excel to answer the following question about the study.