The Wilcoxon signed rank test is another nonparametric method of testing whether two populations X and Y have the same continuous distribution. To perform the test, we assume that we have independent pairs of sample data from the populations {(x1, y1), (x2, y2), . . ., (xn, yn)}.
We then rank the absolute values of the differences yi - xi. The Wilcoxon test statistic W is the sum of the ranks from the positive differences. Assuming that the two populations have the same continuous distribution (and no ties occur), W has a mean and standard deviation given by
$$\mu = \frac{n(n+1)}{4}$$
and
$$\sigma = \sqrt{\frac{n(n+1)(2n+1)}{24}}$$
where n is the number of pairs.
We test the null hypothesis H0: no difference in distributions. A one-sided alternative is Ha: population Y yields lower measurements. We use this alternative if we expect or see that W is much lower than its expected value, which means that there were fewer and/or smaller positive differences among yi - xi. In this case, the p-value is given by a normal approximation. We let N be a normal random variable with mean µ and standard deviation σ, written N ~ N(µ, σ), and compute the left-tail P(N <= W) (using continuity correction if W is an integer).
If we expect or see that W is much higher than its expected value, then there were more and/or larger positive differences. Now we should use the alternative Ha: population Y yields higher measurements. In this case, the p-value is given by the right-tail P(N >= W), again using continuity correction if needed.
If the two sums of ranks are close, we could use a two-sided alternative Ha: there is a difference in distributions. In this case, the p-value is given by twice the smaller tail value (2*P(N <= W) if W < µ, or 2*P(N >= W) if W > µ).
Again we note that if there are ties, then the validity of this test is questionable.
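The p-value rules above can be sketched in a few lines of Python. This is only an illustration, not the SIGNRANK program itself: the function names and the alternative labels 'lower', 'higher', and 'two-sided' are hypothetical, and the mean n(n+1)/4 and standard deviation sqrt(n(n+1)(2n+1)/24) are the usual null values for the signed rank statistic with n nonzero differences.

```python
import math


def normal_cdf(z):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def signed_rank_p_value(w, n, alternative):
    """Normal-approximation p-value for the positive-rank sum w.

    n is the number of nonzero differences. The labels 'lower',
    'higher', and 'two-sided' are hypothetical names for this sketch.
    """
    mu = n * (n + 1) / 4.0
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    # Continuity correction is applied only when w is an integer.
    cc = 0.5 if float(w).is_integer() else 0.0
    left = normal_cdf((w + cc - mu) / sigma)          # P(N <= w)
    right = 1.0 - normal_cdf((w - cc - mu) / sigma)   # P(N >= w)
    if alternative == "lower":
        return left
    if alternative == "higher":
        return right
    if alternative == "two-sided":
        # Twice the smaller tail, capped at 1.
        return min(1.0, 2.0 * min(left, right))
    raise ValueError("alternative must be 'lower', 'higher', or 'two-sided'")
```

Called as signed_rank_p_value(150, 27, "lower"), this gives about 0.1775, in agreement with the music store example later in this section.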
Before executing the SIGNRANK program, we must enter the X data into list L1 and enter the Y data into list L2 (use xStat and yStat on the TI-86, or c1 and c2 in a list called dist on the TI-89). Then execute the program by entering 1, 2, or 3 to specify the desired alternative X > Y, X < Y, or X does not equal Y.
The program will sort the absolute values of the differences L2 - L1 into list L3 (fStat or c3), but it will disregard any zero differences. The sample size n is then decreased to count only the nonzero differences.
The rank of each sorted measurement in L3 is placed next to it in L4 (LW or c4). All sequences of ties are assigned an average rank. The expected sum of ranks from the positive differences is displayed, followed by the actual sums of the ranks of the positive differences and of the negative differences. The program then displays the P-value for the entered alternative.
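The ranking steps just described can be sketched in Python as follows. This is an illustration under stated assumptions rather than a transcription of the calculator program: the function name signed_rank_sums and its return layout are hypothetical, and the differences are taken as y - x to match L2 - L1.

```python
def signed_rank_sums(x, y):
    """Return (n, expected, pos_sum, neg_sum) for paired samples x and y."""
    # Differences y_i - x_i, with zero differences discarded.
    diffs = [b - a for a, b in zip(x, y) if b != a]
    n = len(diffs)
    # Sort by absolute value so that position i corresponds to rank i + 1.
    order = sorted(diffs, key=abs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the end j of the run of tied absolute values starting at i.
        j = i
        while j + 1 < n and abs(order[j + 1]) == abs(order[i]):
            j += 1
        avg = ((i + 1) + (j + 1)) / 2.0  # average of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg
        i = j + 1
    pos_sum = sum(r for d, r in zip(order, ranks) if d > 0)
    neg_sum = sum(r for d, r in zip(order, ranks) if d < 0)
    expected = n * (n + 1) / 4.0  # expected positive-rank sum under H0
    return n, expected, pos_sum, neg_sum
```

For the made-up pairs x = [10, 12, 9, 15, 11] and y = [12, 11, 9, 18, 9], the one zero difference is dropped, the two differences of absolute value 2 share rank 2.5, and the two rank sums add up to n(n+1)/2 = 10.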
Example. A music store chain is trying to determine if there has been a decrease in new CD sales since a local university allowed students to have access to file sharing servers. Sales at two stores that have generally had the same amount of sales are to be compared. The second store is next to the university in question. Below are unit sales of new releases for a period of 4 weeks (Sunday to Saturday) at each store:
Test the null hypothesis that there is no difference in distribution versus the alternative that sales at Store 2 are lower.
Solution. We enter the data from the stores into lists L1 and L2 (xStat and yStat, or c1 and c2 in a list called dist). Be sure to enter the data in the same order for each list since the sales correspond to specific days. Upon executing the SIGNRANK program with the alternative 1 for X > Y, we see that the expected sum of the positive ranks is 189 while the actual sums of the negative ranks and positive ranks are respectively 228 and 150. The P-value is 0.1775.
If there were no difference in distribution, then there would be a 17.75% chance of the positive ranks among yi - xi summing to as low as 150. (This sum of positive ranks comes from the days on which sales at Store 2 were higher than sales at Store 1.) A p-value of 0.1775 is not low enough to reject the null hypothesis at conventional significance levels such as 0.05, so the data do not give strong evidence that sales at Store 2 are lower.
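The reported P-value can be checked by hand. The sketch below assumes n = 27 nonzero differences (inferred from the expected sum of 189, since 27 * 28 / 4 = 189) and the usual null mean and standard deviation of W, with a continuity correction because W = 150 is an integer:

$$\mu = \frac{27 \cdot 28}{4} = 189, \qquad \sigma = \sqrt{\frac{27 \cdot 28 \cdot 55}{24}} = \sqrt{1732.5} \approx 41.62$$

$$P(N \le 150.5) = P\left(Z \le \frac{150.5 - 189}{41.62}\right) \approx P(Z \le -0.925) \approx 0.1775$$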
1. A study is being conducted on whether entering college students gain weight during the freshman year. Below are the "Before" and "After" weights for a random sample of 30 students.
Test the hypothesis that there is "no difference" in before and after weights vs. the alternative that there is a weight gain.
2. The Sign Test, the Rank Sum Test, and the Signed Rank test may also be used to test hypotheses about the median of one continuous distribution X.
Consider the playing times (in minutes) listed below for a random sample of compact discs released in 1999 on the five major labels. Use each nonparametric test to see if the median playing time is 62 minutes versus the alternative that it is less than 62. (Note: Fill one list completely with 62 and the other with the sample times.)
1. After entering the data into the appropriate lists (L1 and L2, or xStat and yStat, or c1 and c2 in a list called dist) and executing the SIGNRANK program with the alternative 2 for X < Y, we obtain the following results: The expected sum from the positive ranks is 203, while the actual sums from the negative and positive ranks are respectively 83.5 and 322.5. The P-value is 0.00325.
If there were no change in weights, then there would be only a 0.00325 probability of the positive ranks summing to as much as 322.5. This low p-value gives significant evidence to reject the null hypothesis in favor of the alternative.
2. We will enter the value 62 into list L1 (xStat or c1) a total of 36 times. Then enter the sample times into list L2 (yStat or c2).
Sign Test Results: 12 positive changes out of 36; P-value = 0.03262. If the median were 62, then there would be only a 0.03262 probability of there being as few as 12 out of 36 playing times that are more than 62 minutes. We appear to have evidence to reject the hypothesis that the median is 62 in favor of the alternative that it is less than 62.
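The Sign Test P-value above agrees with an exact binomial tail, which can be checked with a short Python sketch. The function name sign_test_lower_tail is hypothetical, and the assumption is that under the null hypothesis each of the n playing times independently exceeds 62 with probability 1/2.

```python
import math


def sign_test_lower_tail(k, n):
    """P(X <= k) for X ~ Binomial(n, 1/2), computed exactly."""
    # Count the outcomes with at most k positive signs out of 2**n.
    favorable = sum(math.comb(n, i) for i in range(k + 1))
    return favorable / 2 ** n


p = sign_test_lower_tail(12, 36)  # 12 positive changes out of 36
```

Here p comes out at about 0.0326, matching the reported value.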
Rank Sum Results: Expected 1st Sum = 1314; actual sums of ranks are 1530 and 1098; P-value is 0.0076. If the median were 62, then there would be only a 0.0076 probability of the playing time ranks summing to as low as 1098. Again we have evidence to reject the null hypothesis.
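The Rank Sum figures can be checked in the same spirit. This sketch assumes the usual normal approximation for the rank sum of one sample of size n1 = 36 against another of size n2 = 36, with no adjustment for ties and a continuity correction on the integer sum 1098:

$$\mu = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{36 \cdot 73}{2} = 1314, \qquad \sigma = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{7884} \approx 88.79$$

$$P(N \le 1098.5) = P\left(Z \le \frac{1098.5 - 1314}{88.79}\right) \approx P(Z \le -2.427) \approx 0.0076$$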
Signed Rank Results: Expected Pos. Sum of Ranks = 333; actual sums of neg. and pos. ranks are 443 and 223; P-value is 0.0427. If the median were 62, then there would be only a 0.0427 probability of the positive signed ranks among yi - 62 summing to as low as 223. All three tests thus give p-values below 0.05, so together they provide evidence that the median playing time is less than 62 minutes.