# Technology Exercise 6: Learning About the World

## Internet Delays (Question 1)

There is a command called traceroute that will show the route that information travels to get from one host on the Internet to another. It also shows the round trip time (in milliseconds) that it takes to get to each point along the route.

We're going to collect information about how quickly sites on the Internet in the United States can reach Richland's web server. Of more use to students might be the reverse: how quickly we can reach sites on the Internet. So, even though we're collecting the time it takes from the remote server to Richland, we'll treat it as the time from Richland to them.

We will collect information from 12 different sites on the Internet, and we will do this 5 times. Connections may be fast at one time of day but slow at another, which is why we repeat the process. The 5 samplings can be at different times of the same day or on different days. I just ask that there be at least three hours between samplings.

You could collect information from home if you have an Internet connection and then bring it into school in an Excel file. Since the traceroute is from the remote website to Richland and not to your home machine, it won't be affected by dial-up or broadband connections.

### Creating the Worksheet

1. Label column 1 as "sample"
2. Label column 2 as "site"
3. Label column 3 as "time"
4. Choose Calc / Make Patterned Data / Simple Set of Numbers
   1. Store the pattern in sample
   2. Start with the first value of 1
   3. End with the last value of 5
   4. List each value 36 times
   5. Click OK
5. Choose Calc / Make Patterned Data / Simple Set of Numbers
   1. Store the pattern in site
   2. Start with the first value of 1
   3. End with the last value of 12
   4. List each value 3 times
   5. Repeat the entire list 5 times
   6. Click OK
6. Save your file in R:\01\tech6 or R:\02\tech6 depending on which section you're in. Use a filename that's unique to your group.

When you gather the information, there will be three times for each location. These should go in separate rows.
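
As a cross-check on the worksheet layout, the two patterned columns can be rebuilt outside Minitab. The following is only a sketch in Python (not part of the Minitab procedure); the constant names are made up for illustration.

```python
# Rebuild the "sample" and "site" columns described by the patterned-data steps.
N_SAMPLES = 5   # sampling sessions
N_SITES = 12    # sites visited per session
N_TIMES = 3     # round-trip times reported by each traceroute

# sample: each value 1..5 listed 36 times (12 sites x 3 times)
sample = [s for s in range(1, N_SAMPLES + 1) for _ in range(N_SITES * N_TIMES)]

# site: each value 1..12 listed 3 times, then the whole list repeated 5 times
site = [k for k in range(1, N_SITES + 1) for _ in range(N_TIMES)] * N_SAMPLES

rows = list(zip(sample, site))
print(len(rows))   # 180 rows in total
print(rows[:4])    # [(1, 1), (1, 1), (1, 1), (1, 2)]
print(rows[36])    # (2, 1): the second sampling starts again at site 1
```

If your worksheet does not end at row 180 with sample 5 and site 12, one of the patterned-data settings was probably entered incorrectly.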

### Gathering the Data

Click on each of the links that follow below. If the traceroute is successful, the last line should indicate www.richland.edu (64.107.104.12) and then have three times in milliseconds (ms or msec). Some sites will include an AS number; you can ignore it.

#### Sample Traceroute Output

```
traceroute to 64.107.104.12 (64.107.104.12), 30 hops max, 40 byte packets
1 inside.fw1.sjc2.mfnx.net (208.184.213.129) 0.293 ms 0.428 ms 0.242 ms
2 99.ge-5-1-1.er10a.sjc2.us.above.net (64.124.216.10) 0.573 ms 0.555 ms 0.596 ms
3 so-2-0-0.mpr3.sjc2.us.above.net (64.125.30.89) 0.487 ms 6.114 ms 0.596 ms
4 so-4-1-0.cr1.ord2.us.above.net (64.125.30.206) 52.288 ms 52.626 ms 52.059 ms
5 pos4-0.mpr1.ord1.us.above.net (208.185.0.198) 51.978 ms 51.946 ms 52.092 ms
6 atm-0-0-0.nap.lincon.net (206.220.243.106) 54.319 ms 56.580 ms 52.386 ms
7 atm2-0-sub02-soib1-peoria-core.peoria.lincon.net (206.166.9.170) 60.125 ms 68.693 ms 74.794 ms
8 atm1-0-sub11-peoria-core-fat-elvis.springfield.lincon.net (206.166.9.186) 60.386 ms 59.775 ms 64.263 ms
9 atm-3-1-0-11-fat-elvis-liberace.springfield.lincon.net (206.166.9.234) 178.928 ms 70.812 ms 132.370 ms
10 10.9.31.2 (10.9.31.2) 66.950 ms 66.038 ms 61.614 ms
11 www.richland.edu (64.107.104.12) 74.816 ms 67.278 ms 61.843 ms
```

Enter the site number (these are already entered if you followed the steps above) and the three times into Minitab. Each site will take up three rows of data. Do not enter the units on the times. The data above would look like this.

| row | sample | site | time |
|-----|--------|------|--------|
| 1 | 1 | 1 | 74.816 |
| 2 | 1 | 1 | 67.278 |
| 3 | 1 | 1 | 61.843 |

Click on each link below to start the traceroute. Be patient; some of the traceroutes can take a while. There will also be a noticeable delay on most of the sites right before you reach Richland, and in many cases you will see a "* * *". This is okay; just wait. After you have entered the information into Minitab, hit the back button on your browser to return here and visit the next site.

Collect all 12 sites' information at the same sitting so that we are comparing similar conditions. Wait at least three hours before repeating the process. If you want to do this at home, create an Excel spreadsheet with the two columns, then bring it into school and copy and paste the information into Minitab.

1. above.net
2. io.com
3. his.com
4. sdsc.edu
5. calweb.com
6. getnet.com
7. princeton.edu
8. opus1.com
9. socket.net
10. wvi.com (rounds to nearest ms)
11. xmission.com (generates entire page before displaying, be patient; rounds to nearest ms)
12. playground.net (generates entire page before displaying, be patient)

### Conducting the Hypothesis Test

Check the assumptions. If they aren't satisfied, address any concerns that you may have in interpreting the data.

To make a probability plot, do the following. You can also use the graphical summary under Descriptive Statistics to get a histogram.

1. Choose Stat / Basic Statistics / Normality Test
2. Select the time variable
3. Click OK

To conduct the hypothesis test, do the following.

1. Choose Stat / Basic Statistics / 1 Sample t
2. Select the variable time
3. Enter the claimed value for the mean in the test mean box.
4. Go into options and make sure the values are set properly.
5. Click OK
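
To see what the 1 Sample t procedure computes, here is a minimal sketch in Python of the test statistic, t = (sample mean - claimed mean) / (s / sqrt(n)). It uses only the three times from the sample output, and the claimed mean `mu0 = 60.0` is a made-up placeholder, not the value from the exercise.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(data, mu0):
    """Return (t statistic, degrees of freedom) for the null hypothesis mean == mu0."""
    n = len(data)
    se = stdev(data) / sqrt(n)   # standard error from the sample standard deviation
    return (mean(data) - mu0) / se, n - 1

times = [74.816, 67.278, 61.843]   # the three times from the sample traceroute
mu0 = 60.0                         # hypothetical claimed mean, for illustration only
t, df = one_sample_t(times, mu0)
print(f"t = {t:.3f}, df = {df}")
```

The sketch stops at the test statistic. The p-value depends on the alternative hypothesis you choose under options, so leave that to Minitab.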

## Email Spam (Question 2)

I have collected some information about spam email and placed it into a file. There is too much information (about 70% of the email we get at Richland gets labeled as spam), so we will need to take a random sample to work with. The data is in the worksheet spam.mtw.

1. Choose File / Open Worksheet
2. Navigate through the system to the tech6 folder for your section.
3. Highlight the spam.mtw file
4. Click Open

### Taking a Random Sample

1. The worksheet has one column in it called "spam". Label another column as "score".
2. Choose Calc / Random Data / Sample from Columns
   1. Sample 35 rows
   2. Sample from the spam column
   3. Store the data in the score column
   4. Do not sample with replacement
   5. Click OK
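
The idea behind these steps, drawing 35 rows without replacement, can be sketched in Python as follows. The `spam` list below is a hypothetical placeholder, not the contents of spam.mtw, and `sample_scores` is an illustrative name.

```python
import random

def sample_scores(spam, k=35, seed=None):
    """Draw k entries from spam without replacement (no row is picked twice)."""
    rng = random.Random(seed)   # a seed makes the draw repeatable; omit it for a fresh sample
    return rng.sample(spam, k)

# Hypothetical placeholder scores standing in for the spam column.
spam = [round(5 + 0.1 * i, 1) for i in range(200)]

score = sample_scores(spam, seed=1)
print(len(score))   # 35
```

Sampling without replacement works by row, so a value can still appear twice in `score` if it occurs in more than one row of the original column.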

### Checking Assumptions

Check your assumptions by getting a graphical summary of the data and a probability plot.

### Testing the Claim

1. Choose Stat / Basic Statistics / 1 Sample t
2. Select the score variable
3. Enter the claimed value for the test mean
4. Go into options and make sure they're set properly for your test.
5. Click OK

## Designated Hitter (Question 3)

### Entering the Data

1. Choose File / New / Minitab Worksheet to create a new worksheet.
2. Label three columns as Team, League, and OBP.
3. Go to the Major League Baseball site at http://mlb.mlb.com/NASApp/mlb/mlb/stats/mlb_sortable_team_stats.jsp and enter these options on the left side in the "Sortable Team Stats" section.
   1. Choose Major League
   2. Choose Hitting Stats
   3. Split by the Entire Season
   4. The Timeframe is 2003 to Date
   5. Click GO. Note that sometimes this site is slow. If it doesn't respond after a few seconds, you may want to click on the stop button and re-click GO.
   6. Click on Team at the top of the data.
4. For each team, enter the name of the team, the League (AL or NL), and the On Base Percentage (OBP)

### Performing the Hypothesis Test

1. Choose Stat / Basic Statistics / 2 Sample t
2. The samples are in one column
3. The samples are in the OBP column
4. The subscripts are in the League column
5. Check the Options and make sure they're set properly
6. Click OK
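
The "samples in one column, subscripts in another" layout can be sketched in Python like this. The OBP values are invented for illustration, and the statistic shown is the unpooled (Welch) two-sample t, which is an assumption here; whether Minitab pools the variances depends on what you choose under Options.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, approximate df) comparing mean(a) to mean(b) without pooling variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical stacked data: one value column (OBP) and one subscript column (League).
league = ["AL", "AL", "AL", "AL", "NL", "NL", "NL", "NL"]
obp = [0.340, 0.333, 0.345, 0.328, 0.331, 0.325, 0.338, 0.322]

# Split the value column into groups using the subscripts.
groups = {}
for lg, value in zip(league, obp):
    groups.setdefault(lg, []).append(value)

t, df = welch_t(groups["AL"], groups["NL"])
print(f"t = {t:.3f}, df = {df:.1f}")
```

The sign of t depends on which league is subtracted from which, so make sure the direction of your alternative hypothesis matches the order of the groups.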

## Better Pitcher at Home? (Question 4)

### Entering the Data

Just use the same worksheet you started for question 3.

1. Label two more columns, one as Home and one as Away.
2. Go to the Major League Baseball site at http://mlb.mlb.com/NASApp/mlb/mlb/stats/mlb_sortable_team_stats.jsp and enter these options on the left side in the "Sortable Team Stats" section.
   1. Choose Major League
   2. Choose Pitching Stats
   3. Split by Home
   4. The Timeframe is 2003 to Date
   5. Click GO. Note that sometimes this site is slow. If it doesn't respond after a few seconds, you may want to click on the stop button and re-click GO.
   6. Click on Team at the top of the data.
3. Enter the Earned Run Average (ERA) values into the Home column for the proper teams. If you clicked on Team as the previous sub-step directs, the teams should be in the same order as you previously entered them. If not, be careful to match up the ERA values with the proper team.
4. Repeat step 2, except this time, Split by Away.
5. Enter the ERA values for each team in the Away column.

### Performing the Hypothesis Test

1. Choose Stat / Basic Statistics / Paired t
2. The first sample is in Home
3. The second sample is in Away
4. Go into options and make sure they are set properly. Note that Minitab compares sample 1 to sample 2, so make sure your alternative is set up properly with Home on the left and Away on the right.
5. Click OK
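
A paired t test is a one-sample t test on the differences, here Home minus Away for each team. The sketch below, in Python, shows that calculation; the ERA values are invented for illustration and `paired_t` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Return (t, df) for the mean of the paired differences first - second."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(first, second)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1

# Hypothetical ERA values, one pair per team, in the same team order.
home = [3.85, 4.10, 4.52, 3.97, 4.75]
away = [4.20, 4.05, 4.88, 4.31, 4.90]

t, df = paired_t(home, away)
print(f"t = {t:.3f}, df = {df}")
```

A negative t here means the home ERA is lower on average. Swapping the two columns flips the sign, which is why the order of Home and Away matters when you set the alternative.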

## Show the T Approaches the Normal (Question 5)

Start this in a new worksheet.

1. Label the first column as x, the second as z, the third as t1, and the fourth as t5.
2. Go to Calc / Make Patterned Data / Simple Set of Numbers. Start at -3 and go to 3 with a step size of 0.01. Store the results into x
3. Go to Calc / Probability Distributions / Normal. Select Probability Density, set the input column to x and the optional storage to z.
4. Go to Calc / Probability Distributions / t. Select Probability Density, set the degrees of freedom to 1, the input column to x and the optional storage to t1
5. Go to Calc / Probability Distributions / t. Select Probability Density, set the degrees of freedom to 5, the input column to x and the optional storage to t5
6. Go to Graph / Plot
   1. Create three plots. For the y-variable, use z, t1, and t5. Use x for each of the x-variables.
   2. Change the data display from symbol to connect.
   3. Go into Edit attributes and change the line type to solid for all three graphs. Choose three different colors for the graphs. Change the line thickness to 2. Make a note of which graph is which color; you may want that information later to label the graphs correctly.
   4. Click on Frame and choose Multiple Graphs. Overlay the graphs on the same page.
   5. Click on Annotation and choose Title. Enter all of the names of the people in your group on the first line and change the font size for that line to 1.0.
   6. Click OK to generate the graph.
   7. Double click on the graph to bring up the toolbar. Click on the big T in the upper right. Click on the graph next to the normal curve and label it "z". Label the Student's t curves with "t=1" and "t=5". You may need to reposition the text after entering it so that it is viewable. You can do that by clicking on the object, holding down the mouse button, and dragging it to a new location.
   8. Remove the label on the vertical axis (you have three graphs, but just one label) by clicking on it and hitting delete.
7. Copy the graph into Word and add an explanation of what we're looking at.
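
For the explanation, it may help to see the same three columns computed directly from the density formulas. This is a sketch in Python using only the standard library; the extra degrees of freedom (30 and 100) are additions for illustration and are not part of the exercise.

```python
from math import exp, gamma, pi, sqrt

def normal_pdf(x):
    """Standard normal density."""
    return exp(-x * x / 2) / sqrt(2 * pi)

def t_pdf(x, df):
    """Student's t density with df degrees of freedom."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

x = [i / 100 for i in range(-300, 301)]   # -3 to 3 in steps of 0.01
z = [normal_pdf(v) for v in x]
t1 = [t_pdf(v, 1) for v in x]
t5 = [t_pdf(v, 5) for v in x]

# Largest vertical gap between each t curve and the normal curve on this grid.
for df in (1, 5, 30, 100):
    gap = max(abs(t_pdf(v, df) - normal_pdf(v)) for v in x)
    print(df, round(gap, 4))
```

The gap shrinks as the degrees of freedom increase, which is the pattern the overlaid graph shows: the t5 curve sits closer to the z curve than the t1 curve does.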