Summary of Hypothesis Testing
- Make an assumption about a population or populations
(generally about some quantitative aspect such as a mean or a proportion)
- Choose a level of significance (tolerable probability of
rejecting the assumption when it is true)
- Take a random sample from the population
- Calculate an appropriate test (sample) statistic
- Identify the sampling distribution of the test
statistic
- Based on the sampling distribution of the statistic, judge
if the test statistic is too large or too small (too extreme) to be
consistent with the assumption. If so, reject the assumption.
The judgment can be made in either of two ways:
  - using the critical value for the test statistic
  - using the p-value
Simple example:
- µ = 5 (the assumption about the population, i.e. the
null hypothesis)
- Unless explicitly given otherwise, assume the level of
significance to be .05
- Suppose we know the population standard deviation and
it is σ = .05
- Take a sample of 10 random observations from the
population
- Compute test statistic: two equivalent alternatives (equivalent in the
sense that they will always result in the same conclusion):
  - Xbar = 5.04 (arithmetic mean of the sample)
  - z = (Xbar - µ) / (σ / sqrt(n))
    = (5.04 - 5) / (.05 / sqrt(10)) = 2.53,
    where σ is the population standard deviation
    (z is the standard normal variate corresponding to 5.04)
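As a quick check of the arithmetic above, the following Python sketch recomputes the z statistic from the values given in the example. The variable names are illustrative only and are not part of the original notes.

```python
from math import sqrt

mu0 = 5.0      # hypothesized population mean (null hypothesis)
sigma = 0.05   # known population standard deviation
n = 10         # sample size
xbar = 5.04    # observed sample mean

standard_error = sigma / sqrt(n)    # standard deviation of Xbar
z = (xbar - mu0) / standard_error   # standardized test statistic

print(f"standard error = {standard_error:.4f}")  # about 0.0158
print(f"z = {z:.2f}")                            # about 2.53
```

Note that the numerator is Xbar minus µ, so a sample mean above the hypothesized mean gives a positive z.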
- Sampling distributions:
  - The sampling distribution of Xbar is the distribution of all
    possible means of samples of size 10 when the null hypothesis is indeed
    true. In this case it is normal with mean 5 and standard deviation
    .05/sqrt(10) = 0.0158, or about 0.016 (this is sometimes referred to as
    the standard error of the test statistic).
  - The sampling distribution of z is normal with mean 0 and standard
    deviation 1. See illustration
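One way to get a feel for the sampling distribution of Xbar is to simulate it. The sketch below assumes the population itself is normal with mean 5 and standard deviation .05 (an assumption made here for illustration), draws many samples of size 10, and compares the spread of their means with the theoretical standard error. The number of simulated samples and the seed are arbitrary choices.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(0)  # fixed seed so the run is repeatable

mu0, sigma, n = 5.0, 0.05, 10
num_samples = 20000  # number of simulated samples (arbitrary choice)

# Draw many samples of size n from a normal population with mean mu0
# (i.e. the null hypothesis is true) and record the mean of each sample.
sample_means = [
    sum(random.gauss(mu0, sigma) for _ in range(n)) / n
    for _ in range(num_samples)
]

print(f"mean of sample means:      {mean(sample_means):.4f}")
print(f"std. dev. of sample means: {stdev(sample_means):.4f}")
print(f"theoretical std. error:    {sigma / sqrt(n):.4f}")
```

The simulated mean and standard deviation should come out close to 5 and 0.0158 respectively, though the exact digits depend on the random draws.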
- Decision:
At a level of significance of .05, the critical value of Xbar (for an upper-tail test) is µ + z · σ / sqrt(10)
= 5 + 1.645 (.05) / sqrt(10) = 5.026.
What this says is this: if the mean of the population were indeed 5, then only the largest five percent of all possible
sample means would be as large as 5.026 or larger. The observed value of
the test statistic, Xbar = 5.04, is
among these relatively rare extreme values. Thus at this level of
significance (akin to the standard of proof in a court of law) there is
enough contrary evidence to reject the assumption.
Likewise the 95th percentile of the z distribution is 1.645. Again the
observed value of the statistic z = 2.53 is too extreme to be consistent with
the hypothesis.
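The critical-value decision can be sketched in Python using the standard library's `statistics.NormalDist`. This is only an illustration of the reasoning above for an upper-tail test; the variable names are not from the original notes.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 5.0, 0.05, 10, 0.05
xbar = 5.04

standard_error = sigma / sqrt(n)

# 95th percentile of the standard normal distribution
z_crit = NormalDist().inv_cdf(1 - alpha)      # about 1.645
# The same cutoff expressed on the scale of Xbar
xbar_crit = mu0 + z_crit * standard_error     # about 5.026

z = (xbar - mu0) / standard_error

# Both comparisons always lead to the same conclusion.
reject_by_xbar = xbar > xbar_crit
reject_by_z = z > z_crit

print(f"critical z = {z_crit:.3f}, critical Xbar = {xbar_crit:.3f}")
print(f"reject null hypothesis: {reject_by_xbar} (Xbar), {reject_by_z} (z)")
```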
If the mean of the distribution had
actually been 5, the probability of observing an Xbar as high as 5.04 or higher is the p-value of 5.04. This is only
0.0057. Since a value as high as 5.04 is such an unlikely event, we
suspect that the population mean is not 5 (perhaps higher) and reject the null
hypothesis. Equivalently, the probability of a z statistic as
high as 2.53 is the same small value, 0.0057, which is below the .05 level
of significance. We therefore reason: if
the mean had actually been 5, we would not be likely to observe such an
extremely high value for z. Thus there is sufficient evidence to reject the
assumption.
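The p-value reasoning can be checked the same way. The sketch below computes the right-tail probability once on the Xbar scale and once on the z scale, again using `statistics.NormalDist`; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu0, sigma, n, alpha = 5.0, 0.05, 10, 0.05
xbar = 5.04

# Sampling distribution of Xbar when the null hypothesis is true
null_dist = NormalDist(mu=mu0, sigma=sigma / sqrt(n))

# Right-tail probability of a sample mean at least as large as xbar
p_from_xbar = 1 - null_dist.cdf(xbar)

# Same probability computed from the standardized statistic
z = (xbar - mu0) / (sigma / sqrt(n))
p_from_z = 1 - NormalDist().cdf(z)

print(f"p-value = {p_from_xbar:.4f}")   # about 0.0057
print(f"reject null hypothesis: {p_from_xbar < alpha}")
```

Rejecting when the p-value is below the level of significance is the same rule as rejecting when the statistic exceeds its critical value.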
Notes
- You can use Excel functions
to obtain the critical values as well as the p-values of the test statistics.
  - =NORMINV(0.95, 5.0, 0.05/SQRT(10)) = 5.026007409 (critical Xbar value)
  - =NORMDIST(5.04, 5.0, 0.05/SQRT(10), 1) ≈ 0.9943 (left tail probability
    up to 5.04); thus the right tail probability is 1 - 0.9943 = 0.0057
  - =NORMSINV(.95) = 1.645
  - =NORMSDIST(2.53) = 0.994296853 (left tail probability up to 2.53);
    thus the right tail probability is 1 - .9943 = 0.0057
- If we did not know the population standard deviation
and had to estimate it from the same sample using s (the sample standard
deviation), the corresponding test statistic would have been t rather than z.
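To illustrate the last note, here is a rough sketch of the t statistic. The ten observations are hypothetical, made up for this illustration so that their mean is 5.04; they do not come from the original notes. The critical value 1.833 is the usual table value for a one-tailed test at .05 with 9 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical sample of 10 observations (invented for illustration)
sample = [5.02, 5.09, 4.98, 5.06, 5.01, 5.10, 5.03, 4.99, 5.07, 5.05]

mu0 = 5.0             # hypothesized population mean
n = len(sample)
xbar = mean(sample)   # sample mean
s = stdev(sample)     # sample standard deviation (n - 1 in the denominator)

# Same form as z, but with s estimated from the sample in place of sigma
t = (xbar - mu0) / (s / sqrt(n))

t_crit = 1.833        # t table: upper 5% point, n - 1 = 9 degrees of freedom

print(f"Xbar = {xbar:.2f}, s = {s:.4f}, t = {t:.2f}")
print(f"reject null hypothesis: {t > t_crit}")
```

The statistic is compared with the t distribution with n - 1 degrees of freedom rather than with the standard normal distribution.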