Choose appropriate null and alternative hypotheses.
Determine whether the test should be one-sided or two-sided.
Calculate \(Z\)-test statistics and \(t\)-test statistics.
Calculate the \(P\)-value.
Determine whether to reject or fail to reject the null hypothesis.
Interpret the results of a test of significance in context.
The testing procedure starts with an initial assumption that a statement about a population parameter is true.
We test this initial assumption using a random sample. If the initial assumption is really the truth, then the test statistic from a random sample shouldn’t be too far away from the center of the sampling distribution. Conversely, if it is too far away, then we should not believe in the initial assumption.
To determine how far is too far away, we need to specify a threshold in advance: a probability or, equivalently, a critical value.
If the test statistic is at least as extreme as the critical value, then the test is significant enough to allow us to reject the initial assumption. Otherwise, we cannot draw a definite conclusion.
The probability specified in advance measures the chance that the initial assumption is wrongly rejected when it is in fact true.
A statistical hypothesis is a statement about a population parameter.
A hypothesis test is a process that uses sample statistics to test a hypothesis.
To test a population parameter, we choose a pair of hypotheses, the null hypothesis and the alternative hypothesis, which contradict each other.
The null hypothesis, denoted by \(H_0\), is the statement about the population parameter that is assumed to be true.
The alternative hypothesis, denoted \(H_a\), is a statement about the population parameter that is contradictory to the null hypothesis.
Solution: Keep in mind that the null hypothesis should always contain the equal sign. The alternative hypothesis is contrary to the null hypothesis.
The logic of hypothesis testing and two types of error can be summarized in the following table.
Decision | \(H_0\) is true | \(H_0\) is false |
---|---|---|
Reject \(H_0\) | Type I error | Correct decision |
Fail to reject \(H_0\) | Correct decision | Type II error |
The interpretation of hypothesis testing is summarized in the following table.
Decision | If the claim to be tested is in \(H_0\) | If the claim to be tested is in \(H_a\) |
---|---|---|
Reject \(H_0\) | There is enough evidence to reject the claim | There is enough evidence to support the claim |
Fail to reject \(H_0\) | There is not enough evidence to reject the claim | There is not enough evidence to support the claim |
Rejecting the null hypothesis when it is indeed true is called a type I error. The maximum allowable probability of making a type I error is the level of significance, denoted by \(\alpha\).
Failing to reject the null hypothesis when it is false is called a type II error. The probability of a type II error is usually denoted by \(\beta\). The power of a hypothesis test, which equals \(1-\beta\), is the probability of rejecting the null hypothesis when it is false.
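To make \(\alpha\), \(\beta\), and power concrete, here is a minimal Python sketch for a right-tailed \(Z\)-test with known \(\sigma\). The function name `right_tailed_z_power` and all numbers (\(\mu_0=0\), true mean 0.5, \(\sigma=1\), \(n=25\), \(\alpha=0.05\)) are hypothetical choices for illustration and do not come from the text.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_power(mu0, mu_true, sigma, n, alpha):
    """Probability of rejecting H0: mu = mu0 in favor of Ha: mu > mu0
    when the population mean is actually mu_true (sigma known)."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha)          # right-tail critical value
    shift = (mu_true - mu0) / (sigma / sqrt(n))       # true mean in SE units
    return 1.0 - std_normal.cdf(z_crit - shift)


# Hypothetical illustration values (not from the text).
alpha = 0.05
type_i_rate = right_tailed_z_power(0.0, 0.0, 1.0, 25, alpha)  # H0 true
power = right_tailed_z_power(0.0, 0.5, 1.0, 25, alpha)        # H0 false
beta = 1.0 - power                                            # type II error

print(f"P(type I error) = {type_i_rate:.4f}")
print(f"power = {power:.4f}, beta = {beta:.4f}")
```

When the true mean equals \(\mu_0\), the rejection probability is \(\alpha\) itself, which is the type I error rate. When the true mean differs, the same probability is the power.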
Null hypothesis: The person is not pregnant.
Source: An illustration of errors. See also the interactive demonstration of errors and the power.
If \(H_a\) has the form \(\mu\neq \mu_0\) the test is called a two-tailed test.
If \(H_a\) has the form \(\mu<\mu_0\) the test is called a left-tailed test.
If \(H_a\) has the form \(\mu>\mu_0\) the test is called a right-tailed test.
Each of the last two forms is also called a one-tailed test.
To make a decision, one may also compare probabilities. The observed significance (\(P\)-value) of a test statistic is the probability of obtaining a sample statistic at least as extreme as the observed test statistic, assuming that the null hypothesis is true.
\(P\)-Value as Tail area
Sign in \(H_a\) | \(\ne\) | \(<\) | \(>\) |
---|---|---|---|
\(P\)-value | Double of the tail area | Left tail area | Right tail area |
To make a decision, compare the \(P\)-value \(p\) with the significance level \(\alpha\):
reject \(H_0\) if \(p\le\alpha\) and
do not reject \(H_0\) if \(p>\alpha\).
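The tail-area table and the decision rule can be sketched in Python for a \(Z\)-test statistic. This is a minimal illustration using only the standard library. The function names `p_value_from_z` and `decide` are hypothetical.

```python
from statistics import NormalDist


def p_value_from_z(z, alternative):
    """P-value of a z statistic, given the sign in H_a.

    alternative: "!=" (two-tailed), "<" (left-tailed), ">" (right-tailed).
    """
    left_tail = NormalDist().cdf(z)
    right_tail = 1.0 - left_tail
    if alternative == "<":
        return left_tail
    if alternative == ">":
        return right_tail
    if alternative == "!=":
        # Double the smaller tail, i.e. the tail on the side of z.
        return 2.0 * min(left_tail, right_tail)
    raise ValueError("alternative must be '!=', '<', or '>'")


def decide(p_value, alpha):
    """Reject H0 exactly when the P-value is at most alpha."""
    return "reject H0" if p_value <= alpha else "do not reject H0"


p = p_value_from_z(2.0, "!=")
print(f"{p:.4f}", decide(p, 0.05))
```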
Consider the following hypothesis test:
\(H_{0}: p=0.50\) vs. \(H_{a}: p\ne 0.50\), with \(n=360\) and \(\hat{p}=0.56\).
Find the \(P\)-value for the test and make a decision at the 5% level of significance.
Solution: Because \(H_a\) is \(p\ne p_0\) and \(\hat{p}=0.56>p_0\), the \(P\)-value is twice the right tail area, that is, the \(P\)-value equals \(2P(\hat{p}>0.56)\).
We first find the standard error of the null distribution: $$\text{SE}=\sqrt{p_0(1-p_0)/n}=\sqrt{0.5\cdot0.5/360}\approx 0.0264\approx 0.03.$$
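With the standard error rounded to 0.03, the \(P\)-value is approximately 0.0455, which can be calculated with the Excel formula 2*(1-NORM.DIST(0.56,0.5,0.03,TRUE)). With the unrounded standard error (about 0.0264), the \(P\)-value is about 0.023, and the decision is the same.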
Since the \(P\)-value is smaller than \(\alpha\), we reject the null hypothesis \(H_0\).
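As a cross-check of this example outside Excel, the following Python sketch recomputes the \(P\)-value with both the rounded and the unrounded standard error. The helper name `two_tailed_p_value` is hypothetical. The input numbers are those of the example.

```python
from math import sqrt
from statistics import NormalDist

# Values from the example above.
p0, n, p_hat, alpha = 0.50, 360, 0.56, 0.05

se_exact = sqrt(p0 * (1 - p0) / n)   # about 0.0264
se_rounded = round(se_exact, 2)      # 0.03, as used in the Excel formula


def two_tailed_p_value(p_hat, p0, se):
    """Twice the tail area beyond p_hat under the null distribution N(p0, se^2)."""
    null_dist = NormalDist(mu=p0, sigma=se)
    if p_hat > p0:
        tail = 1.0 - null_dist.cdf(p_hat)
    else:
        tail = null_dist.cdf(p_hat)
    return 2.0 * tail


p_rounded = two_tailed_p_value(p_hat, p0, se_rounded)
p_exact = two_tailed_p_value(p_hat, p0, se_exact)

print(f"rounded SE:   P-value = {p_rounded:.4f}")
print(f"unrounded SE: P-value = {p_exact:.4f}")
print("reject H0" if p_exact <= alpha else "do not reject H0")
```

Both versions fall below \(\alpha=0.05\), so rounding the standard error does not change the decision here.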
Decide whether the following statements are true or false. Explain your reasoning.
Suppose we’re conducting a hypothesis test for a population mean. Find the \(P\)-value for each of the following testing scenarios with the given sample size \(n\) and test statistic \(t\).
Lab Instructions in Excel
Let \(Z\) be a standard normal random variable. In Excel, \(P(Z<z)\) is given by NORM.S.DIST(z,TRUE).
Let \(X\) be a normal random variable with mean \(\mu\) and standard deviation \(\sigma\), that is, \(X\sim \mathcal{N}(\mu, \sigma^2)\). In Excel, \(P(X<x)\) is given by NORM.DIST(x,mean,sd,TRUE).
When a cumulative probability \(p=P(X<x)\) of a normal random variable \(X\) is given, we can find \(x\) using NORM.INV(p,mean,sd).
When a cumulative probability \(p=P(Z<z)\) of a standard normal random variable \(Z\) is given, we can find \(z\) using NORM.S.INV(p).
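For readers working outside Excel, the four normal-distribution functions above have close counterparts in Python's standard library class `statistics.NormalDist`. The wrapper names below are hypothetical and chosen only to mirror the Excel names.

```python
from statistics import NormalDist


def norm_s_dist(z):
    """Counterpart of NORM.S.DIST(z,TRUE): P(Z < z) for standard normal Z."""
    return NormalDist().cdf(z)


def norm_dist(x, mean, sd):
    """Counterpart of NORM.DIST(x,mean,sd,TRUE): P(X < x) for X ~ N(mean, sd^2)."""
    return NormalDist(mu=mean, sigma=sd).cdf(x)


def norm_inv(p, mean, sd):
    """Counterpart of NORM.INV(p,mean,sd): the x with P(X < x) = p."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(p)


def norm_s_inv(p):
    """Counterpart of NORM.S.INV(p): the z with P(Z < z) = p."""
    return NormalDist().inv_cdf(p)


print(round(norm_s_dist(1.96), 4))   # left-tail area below z = 1.96
print(round(norm_s_inv(0.975), 4))   # z value with 97.5% of the area to its left
```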
Suppose a Student’s \(t\)-distribution has \(\text{df}=n-1\) degrees of freedom.
To find a probability for a given \(t\)-value
The area of the left tail of the \(t\)-value may be calculated by the function T.DIST(t,df,TRUE).
The area of the right tail of the \(t\)-value may be calculated by the function T.DIST.RT(t,df).
The area of the two tails of the \(t\)-value (\(t>0\)) may be calculated by the function T.DIST.2T(t,df).
To find the critical value for a given probability \(p\)
When the area of the left tail is given, the function T.INV(p,df) may be used.
When the area of both tails is given, the function T.INV.2T(p,df) may be used. This function is useful for constructing confidence intervals.
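Python's standard library has no \(t\)-distribution, so the sketch below approximates the five Excel functions above numerically: tail areas by Simpson's rule on the \(t\) density and critical values by bisection. All function names are hypothetical, and the approach is an illustrative assumption rather than how Excel computes these values. A statistics library would normally be used instead.

```python
from math import exp, lgamma, log, pi


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))


def t_dist(t, df, steps=2000):
    """Left-tail area P(T < t), like T.DIST(t,df,TRUE); steps must be even."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                 # Simpson estimate of P(0 < T < |t|)
    return 0.5 + area if t > 0 else 0.5 - area


def t_dist_rt(t, df):
    """Right-tail area P(T > t), like T.DIST.RT(t,df)."""
    return 1.0 - t_dist(t, df)


def t_dist_2t(t, df):
    """Two-tail area P(|T| > |t|), like T.DIST.2T(t,df)."""
    return 2.0 * t_dist_rt(abs(t), df)


def t_inv(p, df):
    """The t with left-tail area p, like T.INV(p,df); found by bisection."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if p < 0.5:
        return -t_inv(1.0 - p, df)       # use symmetry about 0
    lo, hi = 0.0, 1.0
    while t_dist(hi, df) < p:            # expand until the root is bracketed
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if t_dist(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def t_inv_2t(p, df):
    """Positive t with total two-tail area p, like T.INV.2T(p,df)."""
    return t_inv(1.0 - p / 2.0, df)


print(round(t_inv_2t(0.05, 9), 3))       # critical value for a 95% interval, df = 9
```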
Suppose the population standard deviation is \(\sigma=4.3\). At the significance level \(\alpha=0.02\), construct a standardized rejection region for the following test for the population mean.
Test \(H_0: \mu=21.6\) vs. \(H_a: \mu<21.6\).
Make a decision if a random sample has size \(n=70\) and mean \(\bar{x}=20.5\).
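One possible way to work through this problem in Python is sketched below. It assumes a \(Z\)-test is appropriate because \(\sigma\) is known. The variable names are illustrative, and the left-tailed rejection region is \(z\le z_\alpha\) with \(z_\alpha\) the value that leaves area \(\alpha\) in the left tail.

```python
from math import sqrt
from statistics import NormalDist

# Values from the problem statement.
mu0, sigma, alpha = 21.6, 4.3, 0.02
n, x_bar = 70, 20.5

# Left-tailed test: reject H0 when z <= z_crit, where P(Z < z_crit) = alpha.
z_crit = NormalDist().inv_cdf(alpha)

# Standardized test statistic under H0.
z = (x_bar - mu0) / (sigma / sqrt(n))

reject = z <= z_crit
print(f"rejection region: z <= {z_crit:.4f}")
print(f"test statistic:   z = {z:.4f}")
print("reject H0" if reject else "do not reject H0")
```

Under these assumptions the critical value is about \(-2.054\) and the test statistic is about \(-2.140\), so the statistic falls in the rejection region.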