Minitab doesn't do a chi-square goodness of fit test. You should do this part by hand.
Set up a table something like this. This table should be part of your Word output.
| | Red | Orange | Yellow | Green | Blue | Brown | Total |
|---|---|---|---|---|---|---|---|
| Claimed Percent | | | | | | | 100% |
| Observed | | | | | | | |
| Expected | | | | | | | |
| (Obs - Exp) | | | | | | | 0 |
| (Obs - Exp)² | | | | | | | ----- |
| (Obs - Exp)²/Exp | | | | | | | |
Multiply the claimed percent by the total number of observed M&Ms to find the expected M&Ms for each color. The test statistic is the sum of the last row.
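If you want to check your hand arithmetic, the table can be sketched in a few lines of Python. The claimed proportions and observed counts below are made-up placeholders, not the manufacturer's figures and not your data; substitute your own values. The function name `chi_square_table` is just an illustrative choice.

```python
# Hypothetical values for illustration only: replace them with the claimed
# percentages from the assignment and the counts from your own sample.
colors = ["Red", "Orange", "Yellow", "Green", "Blue", "Brown"]
claimed = [0.20, 0.10, 0.20, 0.10, 0.10, 0.30]  # proportions, should total 100%
observed = [12, 8, 10, 7, 6, 17]                # counted M&Ms of each color


def chi_square_table(claimed, observed):
    """Return the expected counts and the chi-square test statistic."""
    total = sum(observed)
    # Expected count = claimed proportion * total number observed
    expected = [p * total for p in claimed]
    # Last row of the table: (Obs - Exp)^2 / Exp for each color
    contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
    # The test statistic is the sum of the last row
    return expected, sum(contributions)


expected, statistic = chi_square_table(claimed, observed)
for color, o, e in zip(colors, observed, expected):
    print(f"{color:<7} obs={o:>3}  exp={e:6.2f}  (obs-exp)^2/exp={(o - e) ** 2 / e:6.3f}")
print(f"Test statistic = {statistic:.3f}")
```

The expected counts should add up to the same total as the observed counts, which is a quick way to catch a data entry mistake.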
Your test statistic has a χ² distribution with df = # of categories - 1. It is a right-tail test, so the p-value is the area to the right of the test statistic.
The number that Minitab gives you will be the area to the left of the test statistic, but you need the area to the right. Make the appropriate adjustment.
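Because the total area under the curve is 1, the area to the right is 1 minus the area to the left. The sketch below illustrates that relationship in Python without any statistics package; it computes the left-tail area from a standard series for the incomplete gamma function. The function names and the test statistic value are hypothetical, and this is not a description of how Minitab does the calculation.

```python
import math


def chi2_left_area(x, df):
    """Area to the LEFT of x under a chi-square curve with df degrees of freedom.

    Uses the series for the regularized lower incomplete gamma function,
    which is adequate for the moderate values seen in a classroom problem.
    """
    if x <= 0:
        return 0.0
    a = df / 2.0
    z = x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_right_area(x, df):
    """Right-tail p-value: the complement of the left-tail area."""
    return 1.0 - chi2_left_area(x, df)


statistic = 3.25  # hypothetical test statistic, not a real result
df = 6 - 1        # six color categories minus one
print(f"area to the left  = {chi2_left_area(statistic, df):.4f}")
print(f"area to the right = {chi2_right_area(statistic, df):.4f}")
```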
Follow the instructions for question 3 on technology project 3, except make these changes.
Start a new worksheet for this problem.
The Fitted Line plot also contains the regression equation and the value of r², the percent of the variation that can be explained by the regression model. The output in the session window in Minitab gives much of the same information, including an ANOVA table that contains the F test statistic and the p-value that can be used for checking correlation.
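To see where those numbers come from, here is a minimal least-squares sketch in Python. The seat belt and fatality values are invented for illustration and are not the Illinois data; `fit_line` is a hypothetical helper, not a Minitab command.

```python
# Hypothetical data for illustration only (not the Illinois data set).
seatbelt = [40, 55, 60, 64, 68, 71, 74, 76]          # percent usage
fatality = [2.1, 1.8, 1.7, 1.6, 1.5, 1.4, 1.4, 1.3]  # fatality rate


def fit_line(x, y):
    """Least-squares line plus r-squared and the ANOVA F statistic."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    ss_total = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    ss_error = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # unexplained
    ss_regression = ss_total - ss_error                          # explained
    r_squared = ss_regression / ss_total
    # F = MS(regression) / MS(error), with 1 and n - 2 degrees of freedom
    f_stat = ss_regression / (ss_error / (n - 2))
    return slope, intercept, r_squared, f_stat


slope, intercept, r_squared, f_stat = fit_line(seatbelt, fatality)
print(f"fatality rate = {intercept:.5f} + ({slope:.7f}) seatbelt")
print(f"r-squared = {r_squared:.1%},  F = {f_stat:.2f}")
```

Here r² is the explained variation divided by the total variation, which is why it reads as a percent.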
While the p-value can be found from the ANOVA table, the table doesn't give the value of r, the correlation coefficient.
The output gives you the correlation coefficient first and the p-value second. The null hypothesis is that there is no significant linear correlation.
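The two numbers in that output can be reproduced with the standard t test for correlation, sketched below in Python. This is an illustration under assumptions: the data are made up, the function names are hypothetical, and the p-value is approximated by numerically integrating the t density rather than by whatever method Minitab uses.

```python
import math


def pearson_r(x, y):
    """Linear correlation coefficient r."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def t_density(t, df):
    """Height of the Student t curve with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p-value using Simpson's rule on [0, |t|].

    Good enough for moderate t values; steps must be even.
    """
    upper = abs(t_stat)
    h = upper / steps
    area = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df)
    area *= h / 3
    return max(0.0, 1.0 - 2.0 * area)


def correlation_test(x, y):
    """Return r, the t statistic, and the two-tailed p-value (assumes |r| < 1)."""
    r = pearson_r(x, y)
    df = len(x) - 2
    t_stat = r * math.sqrt(df / (1.0 - r * r))
    return r, t_stat, two_tailed_p(t_stat, df)


# Hypothetical data for illustration only.
seatbelt = [40, 55, 60, 64, 68, 71, 74, 76]
fatality = [2.1, 1.8, 1.7, 1.6, 1.5, 1.4, 1.4, 1.3]
r, t_stat, p_value = correlation_test(seatbelt, fatality)
print(f"r = {r:.4f}, t = {t_stat:.3f}, p-value = {p_value:.6f}")
```

For simple regression with one predictor, the square of this t statistic equals the F statistic in the ANOVA table, which is why the two p-values agree.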
One of the assumptions in regression is that the residuals (the differences between the actual values and the values estimated by the regression equation) have a normal distribution. If you turned on the storage of residuals during the fitted line plot part, then you should have a new variable called RESI1 that contains the residuals.
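As a rough illustration of what is stored and how normality can be eyeballed, the Python sketch below computes residuals as actual minus fitted and pairs the sorted residuals with normal scores, the idea behind a normal probability plot. The data are made up, the coefficients are only example values, and the (i - 0.5)/n plotting positions are one common convention that may differ from the one Minitab uses.

```python
from statistics import NormalDist


def residuals(x, y, intercept, slope):
    """Residual = actual y minus the y estimated by the line."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def normal_scores(n):
    """z-scores expected for n sorted values from a normal distribution.

    Uses (i - 0.5) / n plotting positions, an assumed convention.
    """
    return [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]


# Hypothetical data and example coefficients, for illustration only.
seatbelt = [40, 55, 60, 64, 68, 71, 74, 76]
fatality = [2.1, 1.8, 1.7, 1.6, 1.5, 1.4, 1.4, 1.3]
resi = residuals(seatbelt, fatality, intercept=3.03814, slope=-0.0230872)

# If the residuals are roughly normal, these pairs fall near a straight line.
for z, e in zip(normal_scores(len(resi)), sorted(resi)):
    print(f"normal score {z:7.3f}   residual {e:8.4f}")
```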
If you determined that there was significant linear correlation (positive or negative) by rejecting the null hypothesis of no significant linear correlation, then you should use the regression equation given by the computer. This was found when you did the fitted line plot. Your equation should look something like "fatality rate = 3.03814 - 0.0230872 seatbelt" (probably not that exactly).
If, however, you decided that there was no significant linear correlation because you retained the null hypothesis of the correlation test, then you should use the mean of y (y-bar) for the estimated equation. Your equation should be something like "fatality rate = ####" where #### is the numerical value of the mean of the fatality variable. You'll have to do descriptive statistics to find out what that is.
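The choice between the two equations can be written as a small decision rule. In this Python sketch the coefficients are the example values quoted above, while the significance level of 0.05, the mean of 1.6, the p-values, and the function name `choose_model` are all assumptions made for illustration.

```python
def choose_model(p_value, intercept, slope, y_mean, alpha=0.05):
    """Return the prediction function to use.

    Reject the null hypothesis (p-value < alpha): use the regression line.
    Retain the null hypothesis: use the mean of y for every prediction.
    """
    if p_value < alpha:
        return lambda x: intercept + slope * x
    return lambda x: y_mean


# Example coefficients from the text; p-values and mean are hypothetical.
significant = choose_model(0.01, 3.03814, -0.0230872, y_mean=1.6)
not_significant = choose_model(0.20, 3.03814, -0.0230872, y_mean=1.6)

print(f"Significant correlation, 50% usage:    {significant(50):.5f}")
print(f"No significant correlation, 50% usage: {not_significant(50):.5f}")
```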
There was one year that had a really low seatbelt usage rate. Copy your data into another worksheet and remove that row (that way you'll still have the original data should you need to come back to it).
Start this process at the part that says "Making a fitted line plot".
There are a couple of ways to tell which is the better model. You can look at the p-values (smaller means more significant results), the correlation coefficients (farther from zero means stronger correlation), or the value of r². Some of these might be the same, in which case you'll have to look at something else.
Comment on which model is a better predictor of fatality rates in Illinois.
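The comparison can be sketched in Python as follows: copy the data, drop the row with the lowest seat belt usage, and compute r² for both versions. The numbers are invented for illustration and are not the Illinois data, and the helper names are hypothetical.

```python
def r_squared(x, y):
    """Square of the linear correlation coefficient."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy ** 2 / (sxx * syy)


def drop_lowest_x(x, y):
    """Return copies of x and y without the row that has the smallest x.

    The original lists are left unchanged, like keeping the first worksheet.
    """
    i = x.index(min(x))
    return x[:i] + x[i + 1:], y[:i] + y[i + 1:]


# Hypothetical data with one unusually low usage year, for illustration only.
seatbelt = [16, 55, 60, 64, 68, 71, 74, 76]
fatality = [2.3, 1.8, 1.7, 1.6, 1.5, 1.4, 1.4, 1.3]

seatbelt_trimmed, fatality_trimmed = drop_lowest_x(seatbelt, fatality)
print(f"r-squared with all years:        {r_squared(seatbelt, fatality):.4f}")
print(f"r-squared without the low year:  {r_squared(seatbelt_trimmed, fatality_trimmed):.4f}")
```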