Statdisk Instructions for Chapter 10 Technology Exercise

Seatbelt Safety (Question 2)

Start a new worksheet for this problem.

  1. Label columns as seatbelt and fatality
  2. Gather the information for 1985 through 2003 from Figure 1 of the June 2005 Safety Belt Usage in Illinois. This information is from the Illinois Department of Transportation and is at http://www.dot.state.il.us/trafficsafety/seatbelt%20june%202005.pdf. The data is in a chart, so you'll have to read the percents from the top of the bars. Data is available as late as 2005, but we're only able to get information through 2003 for the next part, so we'll stop in 2003.
  3. Gather the information for 1985 through 2003 from the Illinois 2003 Toll of Motor Vehicles Crashes page from the National Highway Traffic Safety Administration at http://www.nhtsa.dot.gov/STSI/State_Info.cfm?Year=2003&State=IL. There is a table toward the bottom of the page that is titled "Fatalities and Fatality Rate per 100 Million VMT". Use the Total Fatality Rate column, not the "Fatalities" column. The data for 2004 is not available, so we're going to stop with 2003.

The National Highway Traffic Safety Administration site listed above is shut down as of April 18, 2006, because of technical difficulties. Here are the data you need from that site (thanks go to Google for holding cached copies of pages).

Year 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 00 01 02 03
Rate 1.18 1.17 1.11 1.20 1.10 0.99 0.86 0.78 0.71 0.74 0.74 0.70 0.60 0.61 0.63 0.61 0.60 0.62 0.60

Making the Fitted Line Plot (part b)

Statdisk will make the graph, but you can't copy and paste it into Word. Use Minitab for this part.

Checking for Significant Linear Correlation (parts d & e)

  1. Choose Analysis / Correlation and Regression
  2. If you put the seatbelt usage in column 1 and the fatality rate in column 2, then all you need to do is click Evaluate, otherwise adjust the columns first.
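
If you want to check Statdisk's output by hand, here is a minimal Python sketch of the two quantities behind the correlation test: the linear correlation coefficient r and the test statistic t = r * sqrt((n - 2) / (1 - r^2)), which has n - 2 degrees of freedom. The function names and the small data set are made up for illustration; they are not the exercise data, and the sketch does not compute the p-value (Statdisk reports that).

```python
import math

def pearson_r(x, y):
    """Linear correlation coefficient r for paired lists x and y."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need paired lists with at least 3 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """Test statistic for H0: no linear correlation (df = n - 2)."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Made-up illustration data, NOT the seatbelt or fatality values.
x = [10, 20, 30, 40, 50]
y = [9.0, 7.5, 7.0, 4.5, 4.0]
r = pearson_r(x, y)
print(round(r, 4), round(t_statistic(r, len(x)), 4))
```

To use it on the exercise, replace x and y with the seatbelt and fatality columns you entered in Statdisk and compare r with the value Statdisk reports.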

Give the Regression Equation (part h)

If you determined that there was significant linear correlation (positive or negative) by rejecting the null hypothesis of no significant linear correlation, then you should use the regression equation given by the computer. Do not write "Y = b0 + b1x"; instead, replace b0 and b1 with their values, and replace Y and x with the names of the variables. Your equation should look something like "fatality = 3.03814 - 0.0230872 seatbelt" (probably not that exactly).
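
As a rough illustration of where b0 and b1 come from and how the finished equation should read, the following Python sketch computes a least-squares line and writes it with the variable names filled in. The helper names fit_line and format_equation are hypothetical, and the coefficients passed to format_equation are simply the sample values quoted above, not the answer to the exercise.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # intercept
    return b0, b1

def format_equation(b0, b1, y_name, x_name):
    """Write the equation with values and variable names substituted."""
    sign = "-" if b1 < 0 else "+"
    return f"{y_name} = {b0:g} {sign} {abs(b1):g} {x_name}"

# Sample coefficients taken from the example sentence in these instructions.
print(format_equation(3.03814, -0.0230872, "fatality", "seatbelt"))
# prints: fatality = 3.03814 - 0.0230872 seatbelt
```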

If, however, you decided that there was no significant linear correlation because you retained the null hypothesis of the correlation test, then you should use the mean of y (y-bar) for the estimated equation. Your equation should be something like "fatality = ####" where #### is the numerical value of the mean of the fatality variable. You'll have to do descriptive statistics to find out what that is.
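
For that second case only, the mean can also be checked outside Statdisk. This short Python sketch averages the 19 fatality rates listed earlier in these instructions (1985 through 2003); the variable names are my own, and you should still confirm the value with Statdisk's descriptive statistics.

```python
from statistics import mean

# Fatality rates per 100 million VMT for 1985-2003, copied from the
# Year/Rate listing earlier in these instructions.
rates = [1.18, 1.17, 1.11, 1.20, 1.10, 0.99, 0.86, 0.78, 0.71, 0.74,
         0.74, 0.70, 0.60, 0.61, 0.63, 0.61, 0.60, 0.62, 0.60]

y_bar = mean(rates)
print(f"fatality = {y_bar:.4f}")
```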

Amtrak Delays (Question 3)

Gathering the Data

Amtrak keeps data available for the last five (5) days only. Since you need at least six days of information, you will need to collect information on more than one date. Do NOT wait until this is due to start it.

  1. Visit the Amtrak website at http://www.amtrak.com/
  2. The center portion of the screen is split into two parts, Fare Finder and Train Status. Go to the Train Status section.
    1. Leave the Departs box empty
    2. Put CHI in the Arrives box
    3. Put 300 in the optional Train No box.
    4. Click Next
  3. Record the delay in minutes for the indicated date. If the train is early, record the delay as negative. If the delay is given in hours and minutes, you need to convert it into minutes before recording.
  4. Change the date to a previous day and click Resubmit
  5. When you are done with the 300 train, change the train number to 22 and go through the cycle with the different dates.
  6. When you are done with the 22 train, change the train number to 324 and repeat the cycle of dates.
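
The conversion described in step 3 can be sketched in a few lines of Python. The function name and its parameters are hypothetical; it only shows the arithmetic (hours times 60 plus minutes, made negative when the train is early).

```python
def delay_in_minutes(hours=0, minutes=0, early=False):
    """Convert a reported delay to minutes; early arrivals are negative."""
    total = hours * 60 + minutes
    return -total if early else total

print(delay_in_minutes(hours=1, minutes=25))    # 1 hr 25 min late -> 85
print(delay_in_minutes(minutes=10, early=True)) # 10 min early -> -10
```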

Entering the Information into Statdisk

When entering information into Statdisk, ignore any missing data. Do not put blank rows in the Statdisk data. It is not important that the days match up horizontally.

  1. Label three columns with the names of the trains.
  2. Enter the delays for each train in the appropriate columns

Summarize the Delay Times (part c)

  1. Choose Data / Descriptive Statistics
  2. Click Evaluate to find the statistics for the first train
  3. Change the column to 2 and click Evaluate. Record the statistics for the second train.
  4. Change the column to 3 and click Evaluate. Record the statistics for the third train.

There is no easy way to combine the numbers using Statdisk. The best way is to retype all the data in a new column and then describe that column. Minitab is once again more powerful.
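
If retyping the data is a nuisance, the combined summary can be sketched in Python instead. The delay values below are made up purely for illustration (they are not real Amtrak data), and the summarize helper is hypothetical. Note that the three lists have different lengths, which is fine because the days do not need to match up.

```python
import statistics

def summarize(values):
    """Basic descriptive statistics for one list of delays (in minutes)."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),   # sample standard deviation
        "min": min(values),
        "max": max(values),
    }

# Made-up delays in minutes, NOT real data; negative means early.
train_300 = [5, -2, 12, 0, 7, 3]
train_22 = [45, 30, 62, 18, 25, 51, 40]
train_324 = [8, 15, -4, 10, 2, 6]

# Combining the trains is just joining the three lists end to end.
combined = train_300 + train_22 + train_324

for name, data in [("300", train_300), ("22", train_22),
                   ("324", train_324), ("all", combined)]:
    print(name, summarize(data))
```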

ANOVA table (parts d - g)

  1. Choose Analysis / One-Way Analysis of Variance
  2. Double check to make sure the proper columns are selected (it defaults to the first three, which is what we want)
  3. Click Evaluate
  4. Copy the results into the ANOVA table. Notice that the order of the columns is different than what is on your worksheet.

The MS(Total) value does not appear in the ANOVA table output because it is not technically part of the ANOVA. However, you can find it by dividing SS(Total) by df(Total).

The test statistic and p-value are the F and p-value entries from the table.

The output also contains the critical F value, but that is not part of the table.
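
To see how the entries of the ANOVA table relate to each other, including the MS(Total) value you have to compute yourself, here is a minimal Python sketch of a one-way ANOVA. The function name and the three small groups are made up for illustration. The sketch stops at the F statistic; the p-value and critical F value still come from Statdisk.

```python
def one_way_anova(groups):
    """Return the SS, df, MS, and F entries of a one-way ANOVA table."""
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    k = len(groups)
    grand_mean = sum(all_values) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-groups (treatment) and within-groups (error) sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ss_total = ss_between + ss_within

    df_between, df_within, df_total = k - 1, n_total - k, n_total - 1
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss": (ss_between, ss_within, ss_total),
        "df": (df_between, df_within, df_total),
        "ms": (ms_between, ms_within, ss_total / df_total),  # last is MS(Total)
        "F": ms_between / ms_within,
    }

# Made-up illustration data, NOT real train delays.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

For this toy data the sums of squares are 26 (between), 6 (within), and 32 (total), so MS(Total) is 32 / 8 = 4 and F is 13 / 1 = 13.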