Summary of Hypothesis Testing
- Make an assumption about a
population or populations (generally about some quantitative aspect such
as mean, proportion etc.)
- Choose a level of
significance (tolerable probability of rejecting the assumption when it is
true)
- Take (a) random sample(s)
from the population(s)
- Calculate an appropriate test
(sample) statistic
- Identify the sampling
distribution of the test statistic
- Based on the sampling
distribution of the statistic, judge whether the test statistic is too large or
too small (too extreme) to be consistent with the assumption. If so,
reject the assumption. The judgment can be made in either of two equivalent ways:
- using the critical
value for the test statistic
- using the p-value
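To make the level of significance concrete, here is a minimal simulation sketch; it is not part of the original notes. It assumes a normal population whose true mean is 5, a known standard deviation of 0.05, samples of size 10, and a one-sided (upper-tail) z test. The function name `rejection_rate`, the trial count, and the seed are illustrative choices. When the assumption is true, the share of samples that lead to rejection should come out close to the chosen level of 0.05.

```python
import math
import random
from statistics import NormalDist, fmean


def rejection_rate(mu0=5.0, sigma=0.05, n=10, alpha=0.05, trials=20000, seed=1):
    """Estimate how often a TRUE assumption is rejected (hypothetical helper)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)  # upper-tail critical z
    std_error = sigma / math.sqrt(n)          # standard deviation of Xbar
    rejections = 0
    for _ in range(trials):
        # Draw a sample from a population where the assumption really holds.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / std_error
        if z > z_crit:
            rejections += 1
    return rejections / trials


rate = rejection_rate()
print(f"share of true-assumption samples rejected: {rate:.3f}")
```

The printed share varies a little with the seed and the number of trials, but it stays near the level of significance, which is what "tolerable probability of rejecting the assumption when it is true" means in practice.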
Simple example:
- µ = 5 (the assumption
about the population, i.e. the null hypothesis)
- Unless explicitly given
otherwise, assume the level of significance to be .05
- Suppose we know the
population standard deviation and it is σ =
.05
- Take a sample of 10 random
observations from the population
- Compute test statistic: two
equivalent (in the sense that they will always result in the same
conclusion) alternatives:
- Xbar = 5.04
(arithmetic mean)
- z = (Xbar - µ) / (σ / sqrt(n)) = (5.04 - 5) / (.05 / sqrt(10))
= 2.53 ("normal variate of 5.04")
- Sampling distributions:
- The sampling
distribution of Xbar is the distribution of all possible means of samples
of size 10, when the null hypothesis is indeed true. In this case it is
normal with mean 5 and standard deviation of .05/sqrt(10) = 0.016. (this
is sometimes referred to as the standard error of the test
statistic)
- The sampling
distribution of z is normal with mean 0 and standard deviation of 1. See illustration
- Decision:
At a level of significance of .05
the critical value of Xbar is µ + z σ / sqrt(10), using the upper-tail
value z = 1.645: 5 + 1.645 (.05) / sqrt(10) = 5.0260.
What this says is this: If the mean of the population were indeed 5, then
only the largest five percent of all possible sample means would be as large
as 5.0260 or larger. Thus the observed value of the test statistic Xbar =
5.04 is among these relatively rare extreme values. Thus at this level of
significance (akin to standard of proof in a court of law) there is
enough contrary evidence to reject the assumption that µ = 5.
Likewise the 95th percentile of the z distribution is 1.645. Again the
observed value of the statistic z = 2.53 is too extreme to be consistent with
the hypothesis.
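The critical-value reasoning can be sketched in a few lines of Python. This is an illustrative sketch only: it uses `statistics.NormalDist` from the standard library (Python 3.8 or later), the variable names are assumptions of this example, and the test is taken to be one-sided in the upper tail, as the use of the 95th percentile implies.

```python
import math
from statistics import NormalDist

mu0 = 5.0      # hypothesized population mean
sigma = 0.05   # known population standard deviation
n = 10         # sample size
xbar = 5.04    # observed sample mean
alpha = 0.05   # level of significance

std_error = sigma / math.sqrt(n)          # standard deviation of Xbar
z = (xbar - mu0) / std_error              # standardized test statistic
z_crit = NormalDist().inv_cdf(1 - alpha)  # 95th percentile of z
xbar_crit = mu0 + z_crit * std_error      # same cutoff on the Xbar scale

# The two comparisons are equivalent and always agree.
reject_by_xbar = xbar > xbar_crit
reject_by_z = z > z_crit

print(f"z = {z:.2f}, critical z = {z_crit:.3f}")
print(f"critical Xbar = {xbar_crit:.4f}")
print("reject the assumption:", reject_by_xbar and reject_by_z)
```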
If the mean of the distribution
had actually been 5, the probability of observing as large an
Xbar as 5.04 or larger is the p-value of 5.04. This is only 0.0057. Since
a value as high as 5.04 is such an unlikely event, we suspect that the
population mean is not 5 (perhaps higher) and reject the null hypothesis.
Obviously the probability of a value for the z statistic as high as 2.53 has
the same small value of 0.0057. We therefore reason: if the mean had
actually been 5, we would not be likely to observe such an extremely high
value for z. Thus there is sufficient evidence to reject the assumption.
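The p-value reasoning admits the same kind of sketch. Again this is only an illustration under the assumptions of the example (known standard deviation, upper-tail test), and the variable names are choices made here, not part of the notes. The right-tail probability is computed once on the Xbar scale and once on the z scale to show that the two agree.

```python
import math
from statistics import NormalDist

mu0, sigma, n = 5.0, 0.05, 10
xbar, alpha = 5.04, 0.05

std_error = sigma / math.sqrt(n)

# Right-tail probability under the sampling distribution of Xbar.
p_from_xbar = 1 - NormalDist(mu0, std_error).cdf(xbar)

# The same probability under the standard normal distribution of z.
z = (xbar - mu0) / std_error
p_from_z = 1 - NormalDist().cdf(z)

reject = p_from_xbar < alpha
print(f"p-value (Xbar scale) = {p_from_xbar:.4f}")
print(f"p-value (z scale)    = {p_from_z:.4f}")
print("reject the assumption:", reject)
```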
Notes
- You can use Excel functions
to obtain the critical values as well as the p-values of the test statistics.
- =NORMINV(0.95, 5.0, .05/SQRT(10))
= 5.026007409 (critical Xbar value)
- =NORMDIST(5.04, 5.0, .05/SQRT(10), 1)
is about 0.9943 (left tail probability up to 5.04) thus
right tail probability is 1 - 0.9943 = 0.0057
- =NORMSINV(.95) = 1.645
- =NORMSDIST(2.53) = 0.994296853
(left tail probability up to 2.53) thus right tail probability = 1 - .9943
= 0.0057
- If we did not know the
population standard deviation and had to estimate it from the same sample
using the sample standard deviation s, the corresponding test statistic
would have been t (with n - 1 = 9 degrees of freedom) rather than z.
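A minimal sketch of the t statistic mentioned in the last note follows. The ten observations are invented for illustration (chosen so that their mean is 5.04) and do not come from the notes. The critical value 1.833 is the upper 5 percent point of the t distribution with 9 degrees of freedom, taken from a standard t table rather than computed, since the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev

# Hypothetical sample of n = 10 observations (made up for this sketch).
sample = [5.01, 5.08, 4.99, 5.10, 5.03, 5.06, 4.97, 5.09, 5.02, 5.05]
mu0 = 5.0                     # hypothesized population mean

n = len(sample)
xbar = mean(sample)
s = stdev(sample)             # sample standard deviation (divisor n - 1)

# Same form as z, with s in place of the unknown population value.
t = (xbar - mu0) / (s / math.sqrt(n))

t_crit = 1.833                # table value: upper-tail 0.05, 9 degrees of freedom
reject = t > t_crit

print(f"Xbar = {xbar:.2f}, s = {s:.4f}, t = {t:.2f}")
print("reject the assumption:", reject)
```

With a different sample the value of s, and therefore of t, would change even if Xbar stayed at 5.04, which is why the t distribution has heavier tails than z.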