## Grocery Store Project - Phase 5 - Inferential Statistics

The number in parentheses after each question is the section number the material covers.

#### Q1: Does the mean price of your store differ significantly from the mean of all the stores? (7.4)

In the data description phase, the steering committee found the mean price for all of the data. Compare the mean for your store (set a filter) to that single value. Give results for both regular and sale price. Give the results for both the categories and the combined store data.

Example. If the mean regular price for all of the data is \$1.24, then compare the regular price data for your store to the value 1.24. If the sale price of all items is \$1.22, then compare the sale price data for your store to the value 1.22.

The steering committee should work with their individual groups for this part.

#### Gathering the mean of all the Stores

The steering committee was supposed to do this, so you could look at their output. A couple of pieces were missing, though, so to be safe we'll recreate the information we need.

This problem is most easily done using Statdisk in conjunction with SPSS. Gather the summary statistics for all stores and use those values as the claimed means.

Analyze / Descriptive Statistics / Descriptives

Go into Options and select just the information you need. The only piece of information that you need for all of the stores is the mean, so turn off the standard deviation, minimum, and maximum.

Click "Continue" in the "Descriptives: Options" dialog box.

Click "OK" in the "Descriptives" dialog box.

Now, gather the same information for all of the product categories. To do this, you need to split the file on product category.

Data / Split File
Compare Groups by Product Category [group]

Now, repeat the above Descriptive Statistics with the split file.
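If you want to check the SPSS means outside the menus, the overall mean and the split-file means can be sketched in a few lines of Python. This is only an illustration: the rows, store numbers, and category names below are made-up placeholders, not project data, and the tuple layout is an assumption.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (store, group, regular, price). Not real project data.
rows = [
    (1, "Dairy", 2.00, 1.50),
    (1, "Snacks", 3.00, 3.00),
    (2, "Dairy", 2.50, 2.50),
    (2, "Snacks", 3.50, 2.50),
]

REGULAR, PRICE = 2, 3  # column positions in each row (assumed layout)

def mean_by_category(rows, column):
    """Mean of one price column per product category (like Split File)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row[column])
    return {category: mean(values) for category, values in groups.items()}

overall_regular = mean(row[REGULAR] for row in rows)   # all stores, all products
regular_by_category = mean_by_category(rows, REGULAR)  # all stores, by category
print(overall_regular, regular_by_category)
```

Swapping `REGULAR` for `PRICE` gives the same summary for the sale price.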

#### Getting the information for your store

Now, we're going to select just your store and gather the same information. We also need the standard deviation for your store.

Let's begin by selecting just your store using the select cases command. This is the familiar command where the store equals your store number. I'll use Cub Foods as the example here.

Data / Select Cases
If store = 1

Say "OK" in the "Select Cases" dialog box.

Now, we'll go back and get the descriptive statistics for your store.

Be sure to go into Options and turn on the "Standard Deviation".

Click "Continue" in the "Descriptives: Options" dialog box.

Click "OK" in the "Descriptives" dialog box.

Since the file was still split from before, we now have the descriptives for your store by product category. We just need to get the information for the entire store.

Data / Split File
Analyze all cases

Then run the Descriptive Statistics again.
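The three numbers you need from your store (sample size, mean, and standard deviation) can also be computed by hand as a cross-check. The sketch below assumes a made-up list of rows and uses store number 1 as in the Cub Foods example; none of the prices are real project data.

```python
from statistics import mean, stdev

# Hypothetical rows: (store, group, regular, price). Not real project data.
rows = [
    (1, "Dairy", 2.00, 1.50),
    (1, "Dairy", 4.00, 3.50),
    (1, "Snacks", 3.00, 3.00),
    (2, "Dairy", 2.50, 2.50),
]

def describe(values):
    """Return n, mean, and sample standard deviation for a set of prices."""
    values = list(values)
    return len(values), mean(values), stdev(values)

my_store = [row for row in rows if row[0] == 1]      # like Select Cases: store = 1
n, avg, sd = describe(row[2] for row in my_store)    # regular price, whole store
print(n, round(avg, 2), round(sd, 2))
```

`stdev` uses the sample formula (dividing by n - 1), which is the standard deviation SPSS reports in Descriptives.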

#### Cleaning up the output

We now have output that needs to be cleaned up. Change the headings to be something descriptive.

You will also notice that the output doesn't match up exactly. The output for all the stores is given as all products and then each category, while the output for your store is given as each category and then all products.

You can move things around in the output file, but you need to be a little careful when doing it.

You may want to widen the frame on the left side of the output window so you can see more of the names.

Then, select the last "Descriptives" section by clicking on the yellow icon to the left of it.

Now drag it up to the item you want it to follow and release it. In this case, that means drag it (hold the left mouse button down and move the mouse) to the "Descriptive Statistics" line in the second "Descriptives" section. Notice you get a little red arrow next to where it is going to go. Release the left mouse button to place it.

The problem is now that the information is inside of a different Descriptives group, rather than at the main level like it should be. This, by the way, won't affect how it is printed, it is only for logical grouping of the information.

You can fix this by clicking on the green left arrow at the top of the output window to "promote" the block one level.

When you do that, you will get the sections in the order you would like them in.

Now, print the information and use it in Statdisk to find the information needed.

In Statdisk, choose hypothesis testing for a single mean.

The claimed mean will be the mean from all of the stores. The sample size, mean, and standard deviation come from your store.

Repeat the test for each variable for all of the product categories and the entire store.
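To see what the single-mean test does with those four inputs, here is a minimal Python sketch of a one-sample t test from summary statistics. It is not Statdisk's code: the p-value is approximated by integrating the t density with Simpson's rule, and the store summary (n = 40, mean 1.31, standard deviation 0.25) is invented for illustration. Only the claimed mean of 1.24 comes from the example above.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the t density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                   # P(0 < T < |t|)
    return max(0.0, 1 - 2 * area)

def one_sample_t(n, sample_mean, sample_sd, claimed_mean):
    """Test statistic, degrees of freedom, and two-tailed p-value."""
    t = (sample_mean - claimed_mean) / (sample_sd / math.sqrt(n))
    return t, n - 1, two_tailed_p(t, n - 1)

# Hypothetical store summary tested against the 1.24 example mean.
t, df, p = one_sample_t(40, 1.31, 0.25, 1.24)
print(round(t, 3), df, round(p, 4))
```

A two-tailed p-value is the right one here because Q1 asks whether the store mean differs, not whether it is higher or lower.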

If you include the output from SPSS, then all you need to do for each comparison is give a probability value and a note about whether the mean of your store is significantly different from the mean of all of the stores.

Here is the original output from SPSS in HTML and PDF form.

Here is the final output from SPSS in HTML and PDF form.

#### Q2: Is the sale price for your store significantly cheaper than the regular price? (8.2)

Set a filter for your store and then compare regular price and sale price. The steering committee should use all available data.

Compare the data for the entire store and for each category (split the file).

#### Making the Comparison

This is a dependent case because each product supplies a matched pair of values: its regular price and its price paid. We make the comparison for all products and for each product category. SPSS does this using the paired-samples t test.
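As a rough picture of what the paired-samples t test computes, the sketch below takes the difference for each product and tests whether the mean difference is zero. The five price pairs are invented for illustration, and the function name `paired_t` is hypothetical.

```python
import math
from statistics import mean, stdev

def paired_t(regular, price):
    """Paired-samples t statistic and degrees of freedom from matched lists."""
    diffs = [r - p for r, p in zip(regular, price)]   # one difference per product
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))   # mean / standard error
    return t, n - 1

# Hypothetical matched prices for five products (not project data).
regular = [2.0, 3.0, 4.0, 5.0, 6.0]
price = [1.5, 3.0, 3.0, 4.5, 5.0]
t, df = paired_t(regular, price)
print(round(t, 3), df)
```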

The first thing to do is select just your store. Be sure to use a different store number if you're not Cub Foods.

Data / Select Cases
If store = 1

Also, make sure that the file is not split so that we can analyze all the data from the store.

Data / Split File
Analyze all cases

Ok, now we're ready to make the comparison.

Analyze / Compare Means / Paired-Samples T Test

Select each of the variables that you want to compare and then click the arrow to move them to the "paired variables" window of the dialog box. In this case, we want to compare regular price and price paid.

After you click on the right arrow key, you will get a window that looks like this.

Click "OK" to make the comparison.

Now, split the file and repeat the "paired-samples t test".

Data / Split File
Compare groups by Product Category [group]

#### Warnings

You may get some warnings from SPSS that look something like this.
 No statistics are computed for a split file in the Paired Samples Correlations table. The split file is: Product Category=Fruits and Fruit Juices.

This happens because all of the regular prices and sale prices are the same for the "Fruits and Fruit Juices" product category.

Since all of the prices are exactly the same (identical), there is obviously no difference between the prices. Since all of the differences are zero, the standard deviation will be zero. The formula for the test statistic involves division by the standard error of the mean, and SPSS is intelligent enough to know that you can't divide by zero.

For purposes of your explanation, say that there is no difference in the regular and sale price for fruits and fruit juices.
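The division-by-zero problem is easy to reproduce. In this small sketch (hypothetical prices and a hypothetical helper name), every difference is zero, so the standard deviation is zero and the function declines to compute a test statistic, much as SPSS does.

```python
import math
from statistics import mean, stdev

def paired_t_or_none(regular, price):
    """Return the paired t statistic, or None when all differences are identical."""
    diffs = [r - p for r, p in zip(regular, price)]
    sd = stdev(diffs)
    if sd == 0:
        return None  # standard error is zero, so the test statistic is undefined
    return mean(diffs) / (sd / math.sqrt(len(diffs)))

# Hypothetical category where nothing is on sale: regular price equals price paid.
print(paired_t_or_none([0.99, 1.49, 2.29], [0.99, 1.49, 2.29]))
```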

#### Two Tail Tests

The other thing to watch out for is that SPSS only gives the probability value for a two tail test. Your claim here, though, is that the sale price is significantly cheaper. That's not a two tail test; it's a one tail test.

So, you will need to divide the p-values that SPSS gives by two to find the p-value for this test. This works only when the mean difference is in the direction of the claim (regular price higher than sale price); if it goes the other way, the claim is not supported. Another way to look at it is that if you want alpha=0.05 as a one tail, then anything less than 0.10 from SPSS is significant (since 0.10 divided by 2 is 0.05).
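The conversion can be written out as a tiny helper. This is a sketch with a hypothetical function name; it assumes the difference is taken as regular price minus price paid, so a positive mean difference is in the direction of the claim. When the difference goes the other way, the one tail p-value is one minus half of the SPSS value, which will never be significant.

```python
def one_tailed_p(two_tailed_p, mean_difference):
    """Convert SPSS's two tail p-value to the one tail p-value for this claim.

    mean_difference is regular price minus price paid (assumed convention);
    the claim is that this difference is positive.
    """
    half = two_tailed_p / 2
    return half if mean_difference > 0 else 1 - half

print(one_tailed_p(0.08, 0.15))    # difference in the claimed direction
print(one_tailed_p(0.08, -0.15))   # difference in the opposite direction
```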

#### Output

As always, you should clean up the output from SPSS.

We don't need the "Paired Samples Correlations" table. This is from chapter 9, correlation and regression, and is not needed for what we're doing now. You may just click on it and hit delete.

We do not need the confidence intervals, even though SPSS gives them to us. These are a bit trickier to remove. Basically, you have to double click the box, then highlight the numeric values for the lower and upper limits and hit the Delete key (or go to Edit / Clear).

You can also click and delete some of the labels on the left (they really don't describe anything since there's only one comparison here) to help shorten the width of the output. If you remove enough of the output, it will print on one page, rather than two.

You can also drag and resize the window.

Here is the original output from SPSS in HTML and PDF form.

Here is the final output from SPSS in HTML and PDF form.

#### Q3: Is the mean price of your store significantly different from each of the other stores? (8.5)

There will be four comparisons here. For example, if you are in the Cub Foods group, then you should make the following comparisons: 1) Cub-Eagle's, 2) Cub-Kroger's, 3) Cub-Schnuck's, and 4) Cub-WalMart.

Give the results for both the regular and sale prices. Compare the data for the entire store and each category (split the file).

The steering committee should work with their individual groups for this question.

#### Making the comparisons

Use the "Independent Samples T Test" in SPSS to make the comparisons. Do NOT filter the file using "select cases".

Analyze / Compare Means / Independent-Samples T Test

We're going to test the regular price [regular] and price paid [price], so select those as test variables.

The independent-samples t test has to have a grouping variable (for classification). This is a third variable used to say which groups we want to compare. Since we're comparing one store to another, we want to use the Store Name as the grouping variable.

You have to define which groups we're going to use. Since we want to compare Cub Foods to each of the other stores, we will make four comparisons: Cubs against Eagles (1,2); Cubs against Krogers (1,3); Cubs against Schnucks (1,4); and Cubs against WalMart (1,5).

Click on the "Define Groups" button to specify which groups you want to use.

Then click "Continue" in the "Define Groups" dialog box.

Click "OK" in the "Independent-Samples T Test" dialog box.

Repeat for the other three comparisons, changing the groups for each test.
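The four comparisons follow the same pattern, which the sketch below spells out. It computes the t statistic and degrees of freedom without assuming equal variances (Welch's version), which corresponds to the "Equal variances not assumed" row of the SPSS table. The prices are invented, and the store numbers follow the (1,2) through (1,5) pairs above.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """t statistic and degrees of freedom without assuming equal variances."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical regular prices keyed by store number (1 = Cub Foods in the example).
prices = {
    1: [1.0, 2.0, 3.0, 4.0, 5.0],
    2: [2.0, 4.0, 6.0, 8.0, 10.0],
    3: [1.5, 2.5, 3.5, 4.5, 5.5],
    4: [1.0, 1.0, 2.0, 2.0, 3.0],
    5: [0.5, 1.5, 2.5, 3.5, 4.5],
}

my_store = 1
results = {}
for other in (2, 3, 4, 5):   # the four Define Groups pairs (1,2) ... (1,5)
    results[(my_store, other)] = welch_t(prices[my_store], prices[other])
    t, df = results[(my_store, other)]
    print((my_store, other), round(t, 3), round(df, 2))
```

If you are not in the Cub Foods group, change `my_store` and the list of other store numbers to match your own comparisons.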

That takes care of the comparison of one store to another. You're also supposed to compare the individual product categories, so you need to split the file.

Data / Split File
Compare groups by Product Category [group]

Now repeat the same four comparisons you made before.

#### Output

You'll want to go through and clean up the output and make comments.

You may wish to re-order the output so that the Cubs / Eagles comparison is together, rather than all of the complete store comparisons first and then all of the store by product comparisons.

Definitely go through and remove some of the extra information from the tables, or it will take over 30 pages to print. You can remove the confidence intervals from the tables as described under question 2, but that won't be enough to get it down to a reasonable size.

##### If your table is too wide ... use print to fit

Double click on any table that is too wide to fit on the page. Then choose "Table Properties" by either clicking the right mouse button inside the table or by going to the "Format" menu choice and clicking on "Table Properties".

Under the "Printing" tab of the "Table Properties" dialog box, check the box for "Rescale wide table to fit page".

The checkbox below that, "Rescale long table to fit page", is what you want to use if your table is too long to fit on one page, but you would really like it on one anyway.

Make sure that you remove as many unnecessary columns as possible from the tables first; otherwise, you will get some really small print.

##### Forcing Pagebreaks - Using Page Titles

You will notice on the sample output that each page is labeled with a title.

Click on the output right before where you want the new title to begin. This would normally be the last output of the previous command.

Insert / New Page Title
Enter the text for the title.

Warning! The title will continue from this point until the end of the document unless you go back and insert a blank title.

##### Highlight portions of the text

You may highlight portions of the output that you want to draw people's attention to. For example, I highlighted the product categories that had significant differences in red and bold.

To highlight or change the appearance of text in a table, double click the table, and then highlight the text you wish to change.

• Press ^B to make it bold
• Press ^I to italicize it
• Press ^U to underline it
• Use Format / Font to change the font, size, or color of the text

If you are editing inserted text or titles, then there are pull down menus and icons for each of the above.

I was able to get the output down to 13 pages using the guidelines suggested above.

Here is the original output from SPSS in HTML and PDF form.

Here is the final output from SPSS in HTML and PDF form.

#### Q4: Is there a significant difference in the mean prices of the stores? (11.2)

This will be the same report by all of the groups, so the steering committee should work with their individual groups.

Give the results for both the regular and sale prices. Give the results for each category (split the file) and the overall prices.

If there are significant differences in the mean prices, which stores are different?

Which is the cheapest store? Which is the most expensive store?