This project studies the conditions that allow us to use analysis of variance (ANOVA) to determine whether a group of populations has a common mean. We begin with several data sets having no known underlying conditions and proceed with the study.
The following charts list the estimated city miles per gallon obtained for samples of 1993 models of cars as reported by Consumer Reports: The 1993 Cars - Annual Auto Issue (April 1993).
Model and City MPG
The populations actually consist of all possible models and not just those listed. Before we can use ANOVA, we must verify or accept the hypothesis that the populations are normally distributed. Use the TESTNORM program with 8 partitions and a 0.05 level of significance to test whether or not we can accept the claim that each of these three populations is normally distributed.
If normality is rejected for one or more of the populations, then we cannot use ANOVA. In this case, proceed to the last section below.
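The internals of TESTNORM are not described here. One common way to test normality with a fixed number of partitions is a chi-square goodness-of-fit test, sketched below under that assumption. The function name, the choice of equal-probability bins, and the use of the sample mean and standard deviation as the fitted parameters are illustrative choices, not the program's documented behavior.

```python
from statistics import NormalDist, mean, stdev

# Upper 5% critical value of chi-square with 5 degrees of freedom
# (8 bins - 1 - 2 estimated parameters), taken from a standard table.
CHI2_CRIT_DF5_05 = 11.070


def chi_square_normality(sample, bins=8):
    """Chi-square statistic for a fitted normal, using equal-probability bins."""
    n = len(sample)
    fitted = NormalDist(mean(sample), stdev(sample))
    # Interior cut points that split the fitted normal into `bins` equal parts.
    cuts = [fitted.inv_cdf(k / bins) for k in range(1, bins)]
    observed = [0] * bins
    for x in sample:
        # The bin index is the number of cut points lying below x.
        observed[sum(x > c for c in cuts)] += 1
    expected = n / bins  # same expected count in every bin
    return sum((o - expected) ** 2 / expected for o in observed)
```

With 8 bins and a 0.05 level of significance, normality would be rejected when the returned statistic exceeds CHI2_CRIT_DF5_05. The approximation is usually considered reasonable only when the expected count per bin is at least about 5, that is, for samples of roughly 40 or more.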
Suppose we have accepted the hypothesis that each population is normally distributed. Second, ANOVA requires that all populations have the same variance. We must therefore test various pairs of populations in order to accept or reject the hypothesis that they have a common variance. To do so, we can use the RATFTEST program.
If X and Y represent one pair of the populations above, then we can denote the unknown variances by VarX and VarY respectively. We wish to test the hypothesis H0: VarX / VarY = 1. If we accept the hypothesis, then we can assume that X and Y have the same variance. Then we test another population against one of these two. We continue until we either accept that all populations have the same variance or find a pair for which we reject the hypothesis.
Compute the sample variance for each population above, and then test the hypothesis that they all have a common variance. If the hypothesis is rejected, then ANOVA cannot be used; thus, proceed to the last section.
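As a rough illustration of the variance-ratio test, the sketch below computes the F statistic for one pair of samples. It assumes RATFTEST performs the standard F test on the ratio of sample variances; the function name and the convention of putting the larger variance in the numerator are choices made here for illustration.

```python
from statistics import variance


def variance_ratio(x, y):
    """Return (F, df_numerator, df_denominator), larger sample variance on top."""
    vx, vy = variance(x), variance(y)  # sample variances (n - 1 denominator)
    if vx >= vy:
        return vx / vy, len(x) - 1, len(y) - 1
    return vy / vx, len(y) - 1, len(x) - 1
```

For a two-sided test at level alpha, the ratio is compared with the upper alpha/2 critical value of the F distribution with the returned degrees of freedom, read from a table or obtained from a calculator.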
Now suppose the conditions of normality and common variance have been accepted. Choose an appropriate level of significance and use ANOVA to test whether the populations have a common mean. If you accept this hypothesis, then you can state your conclusions.
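The one-way ANOVA statistic itself can be sketched as follows. This is a minimal version of the textbook computation, with hypothetical names, and it returns only the F statistic and its degrees of freedom; the critical value or p-value still has to come from an F table or a calculator.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [mean(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

A large F means the sample means differ by more than the within-group variation would explain, which is evidence against a common mean.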
Suppose that you reject the hypothesis that all means are equal. Then at least one pair of populations has different means. To get an idea of which means appear to be different, you can simply look at the values of the individual sample means. Find the populations whose sample means are furthest apart. Then use the T2MNTEST program to test the hypothesis that the difference in means is equal to 0. If you reject the hypothesis, then you have found a pair with different means. Find all such pairs.
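Because a common variance has already been accepted at this stage, a pooled two-sample t statistic is a natural choice for the pairwise comparison. The sketch below assumes that form; whether T2MNTEST pools the variances is not stated here, and the function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y):
    """Return (t, df) for H0: mean of x - mean of y = 0, with pooled variance."""
    nx, ny = len(x), len(y)
    df = nx + ny - 2
    # Pooled estimate of the common variance.
    sp2 = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / df
    t = (mean(x) - mean(y)) / sqrt(sp2 * (1 / nx + 1 / ny))
    return t, df
```

The hypothesis of equal means is rejected when the absolute value of t exceeds the two-sided critical value of the t distribution with df degrees of freedom. Note that repeating the test over many pairs raises the chance of at least one false rejection.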
Suppose the required conditions for ANOVA have not been met. We can still test whether or not the populations all have the same distribution. We can do so with the non-parametric Kruskal-Wallis test using the KRUSKAL program.
If the Kruskal-Wallis test yields rejection of the hypothesis, then we should use the test again on various populations two at a time to determine which pairs have different distributions.
Thus if ANOVA cannot be used, then use the Kruskal-Wallis test to determine if the populations have the same distribution, or to find which pairs of populations have different distributions.
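For reference, the Kruskal-Wallis statistic can be computed from the pooled ranks as in the sketch below. It uses average ranks for tied values and the usual tie correction, which matters for whole-number data such as MPG figures. The function name is hypothetical, and the KRUSKAL program may handle ties differently.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Return (H, df) using average ranks and the standard tie correction."""
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)
    # Assign each distinct value the average of the rank positions it occupies.
    rank = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 through j
        i = j
    rank_term = sum(sum(rank[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    # Tie correction: 1 - sum(t^3 - t) / (N^3 - N) over groups of tied values.
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction, len(groups) - 1
```

H is compared with a chi-square critical value with df degrees of freedom; for three populations at the 0.05 level that value is 5.991. The chi-square approximation is usually considered adequate when each sample has at least about 5 observations.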
Now let Population 1 = Compact & Sporty; Population 2 = Midsize & Vans; and Population 3 = Large & Small.
Rework the project with these three populations.