Notes about the September 13, 2004, lecture.

The number one issue on Sep 13 was finding the regression equation. Many of you asked for more paper and pencil work rather than Minitab work. Others of you asked why we're doing so much Minitab if it's not on the test. Let me address the second issue first.

Soapbox mode on

The output from Minitab is on the test. You are not required to go through and enter data and generate the output yourself, but you will be required to interpret the output from Minitab. The first day of class, I told you the emphasis was on understanding the statistics, not finding them. We do not have a heavy emphasis on calculations on the test, nor do we have a heavy emphasis on the technology used to obtain the results. We do have a heavy emphasis on understanding the material.

The test is not the only reason we cover material in this class. In fact, only 62.5% of your grade is based on the tests. Another 12.5% of the grade is based on the technology projects and that does require you to go through and use Minitab. Furthermore, just because something isn't on the test doesn't mean it's not important. I have to write a test that you can get done in the time allocated and if I included everything that I thought was important, you wouldn't have time to finish the test. Instead of complaining that we're covering things that aren't going to be on the test, you should be thankful that not everything we cover is going to be on the test. If it bothers you that much, I can stop telling you what's going to be on the test.

Also, you should be reading your book before I cover material. Many of the questions people had would be answered if you read the text. I am not going to cover everything in class that will be on the test because I am expecting you to read your textbook. Many requested additional paper and pencil practice. The problems at the end of each chapter in the text provide that.

Soapbox mode off

Okay, now for the question at hand.

How do we find the regression equation?

The 50 Major League players with the most at bats as of 5 pm on Sep 13, 2004, were selected, and the number of runs scored (y) was compared with the number of at bats (x).

The Pearson correlation coefficient of atbats and runs was 0.144 and the p-value was 0.318.
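If you want to see where a p-value like that comes from, here is a minimal sketch. It assumes the usual two-sided t test of zero correlation with n - 2 degrees of freedom; the notes do not say which test Minitab ran, so treat that as an assumption. The function name is made up for illustration.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for testing H0: rho = 0, using n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

r = 0.144   # Pearson correlation reported above
n = 50      # number of players in the sample
t = correlation_t_statistic(r, n)
print(f"t = {t:.3f} on {n - 2} degrees of freedom")
```

A t statistic this close to 1 on 48 degrees of freedom is consistent with the reported p-value of 0.318, so the linear relationship between at bats and runs is weak for this group of players.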

Consider the following graph. Each variable has a line drawn at its mean, and each axis is marked off in steps of one standard deviation for that variable.

Use this information (the correlation coefficient, the two means, and the two standard deviations) to find the regression equation.
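The calculation can be sketched as follows, assuming the standard least-squares relationships: the slope is r times (standard deviation of y) divided by (standard deviation of x), and the line passes through the point (mean of x, mean of y). The correlation is the one reported above. The means and standard deviations in the sketch are made-up placeholders, not the baseball values; read the real ones off the graph.

```python
def regression_from_summary(r, x_mean, x_sd, y_mean, y_sd):
    """Return (intercept, slope) of the least-squares line y-hat = b0 + b1 * x."""
    slope = r * y_sd / x_sd                 # b1 = r * s_y / s_x
    intercept = y_mean - slope * x_mean     # line goes through (x-bar, y-bar)
    return intercept, slope

r = 0.144                      # correlation from the notes
x_mean, x_sd = 550.0, 20.0     # PLACEHOLDER values for at bats
y_mean, y_sd = 85.0, 15.0      # PLACEHOLDER values for runs

b0, b1 = regression_from_summary(r, x_mean, x_sd, y_mean, y_sd)
print(f"runs-hat = {b0:.2f} + {b1:.4f} * atbats")
```

Notice that the slope has the same sign as r, and that plugging the mean of x into the equation gives back the mean of y. Both are quick checks you can do by hand on the test.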

See the baseball data used for this example.

Problems 1 and 2 in chapter 8 give more practice where you are given some of the values and asked to find the others.