Technology Exercise 3: Inferential Statistics
First Amendment (Question 1)
Generating 200 samples of size 1000 (Question 1a)
Don't bother labeling your columns; we'll refer to them as C1, C2, C3, etc.
You can specify a range of columns as C1-C10 (to refer to the ten columns C1,
C2, C3, ..., C10).
This problem satisfies the conditions of a binomial experiment: there is
a fixed number of independent trials, each having exactly two possible outcomes
and the same probability of success. The number of trials is 1000, one person's
answer doesn't affect another's, and there are two possible answers (opposed or
not opposed). Note that "not opposed" isn't the same as "in favor of"; it might
include those who are undecided or don't care. But we need two categories,
success or failure, so success is someone who is opposed and failure is
everyone else.
Since this is a binomial problem, each of the 1000 persons is a Bernoulli
trial with probability of 0.30 of success.
- Choose Calc / Random Data / Bernoulli
- Generate 1000 rows of data
- Store them in C1-C200 (columns 1-200). This allows you to generate all 200
samples at once. Pretty nifty, huh?
- The probability of success is 0.3.
- Click OK
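If you'd like to see what that Minitab command is doing, here is a minimal sketch in Python. This is not part of the assignment; the function name `simulate_samples` and the seed are made up for illustration. It builds 200 columns of 1000 zeros and ones, where each 1 is a person who is opposed.

```python
import random

def simulate_samples(n_samples=200, n=1000, p=0.3, seed=1):
    """Return n_samples lists, each holding n Bernoulli(p) outcomes (1 = opposed)."""
    rng = random.Random(seed)  # fixed seed (an assumption) so the run is repeatable
    return [[1 if rng.random() < p else 0 for _ in range(n)]
            for _ in range(n_samples)]

samples = simulate_samples()
counts = [sum(column) for column in samples]  # successes per sample, like Minitab's X

print(len(samples), "samples of size", len(samples[0]))
print("first five success counts:", counts[:5])
```

Each list plays the role of one of the columns C1-C200.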
Confidence Intervals (Question 1b)
- Choose Stat / Basic Statistics / 1 Proportion
- The samples are in columns C1-C200
- Click on Options
- Check the box that says "Use test and interval based on normal
distribution"
- Optionally (we won't use it here, but my example uses it) change
the test proportion to be 0.3
- Click OK
- Click OK
- Look at each of the 200 lines of output. Each one will contain a confidence
interval inside parentheses like (0.259934, 0.316066)
- Count the number of times that 0.3 does not fall in the confidence interval.
This happens when the lower number is more than 0.300000 (begins with a 0.3)
or the higher number is less than 0.300000 (begins with 0.2)
- Subtract the number of times that the interval does not contain the 0.3
from 200 to find the number of times the value fell in the confidence interval
- Find the percent of times the value of 0.3 fell in the confidence interval.
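The counting steps above can be sketched in Python as a cross-check. This is an illustration under assumptions, not what Minitab runs internally: `normal_interval` is a hypothetical helper that uses the usual normal-approximation formula, p-hat plus or minus z times sqrt(p-hat(1 - p-hat)/n), and the data are freshly simulated with an arbitrary seed.

```python
import math
import random
from statistics import NormalDist

def normal_interval(x, n, level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

rng = random.Random(2)          # arbitrary seed, an assumption
p, n, n_samples = 0.3, 1000, 200

misses = 0
for _ in range(n_samples):
    x = sum(1 for _ in range(n) if rng.random() < p)  # successes in one sample
    low, high = normal_interval(x, n)
    if low > p or high < p:     # 0.3 is outside the interval
        misses += 1

hits = n_samples - misses
print(f"{misses} intervals missed 0.3; {hits} contained it "
      f"({100 * hits / n_samples:.1f}%)")
```

Your own count will differ from run to run, but the percent should land somewhere near 95%.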
Sample Output
Do NOT copy all of the output into your Word document. Make a chart like what
appears below. That is, copy the header and then each case where 0.3 did NOT
fall in the interval. Then go ahead and answer the question about the percent
of the time that 0.3 did fall in the interval.
Please note that your Z-Values and P-Values are not used here, but will look
drastically different if you don't change the test proportion to 0.3 as indicated
in the optional step above.
Variable    X     N  Sample p  95% CI                Z-Value  P-Value
C31       269  1000  0.269000  (0.241516, 0.296484)    -2.14    0.032
C42       341  1000  0.341000  (0.311619, 0.370381)     2.83    0.005
C58       271  1000  0.271000  (0.243452, 0.298548)    -2.00    0.045
C65       262  1000  0.262000  (0.234746, 0.289254)    -2.62    0.009
C85       334  1000  0.334000  (0.304768, 0.363232)     2.35    0.019
C121      271  1000  0.271000  (0.243452, 0.298548)    -2.00    0.045
C122      266  1000  0.266000  (0.238613, 0.293387)    -2.35    0.019
C125      268  1000  0.268000  (0.240548, 0.295452)    -2.21    0.027
C129      341  1000  0.341000  (0.311619, 0.370381)     2.83    0.005
C139      270  1000  0.270000  (0.242484, 0.297516)    -2.07    0.038
C169      272  1000  0.272000  (0.244420, 0.299580)    -1.93    0.053
C173      335  1000  0.335000  (0.305746, 0.364254)     2.42    0.016
In my sample, there were 12 times that the interval did not contain the 0.3.
That means that there were 188 times that did or 94% of the confidence intervals
contain the true population proportion. That's pretty close to the 95% of the
time implied by the confidence level.
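If you want to see where the numbers in a row of that output come from, here is a rough sketch that reproduces the C31 and C169 rows from their X values. The helper name `one_proportion_row` is made up, and the formulas are the standard normal-approximation ones, which I am assuming match what Minitab does because they agree with the output above.

```python
import math
from statistics import NormalDist

def one_proportion_row(x, n, p0=0.3, level=0.95):
    """Normal-approximation interval and z test for a single sample."""
    std_normal = NormalDist()
    p_hat = x / n
    z_star = std_normal.inv_cdf(1 - (1 - level) / 2)
    margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)  # interval uses p-hat
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)       # test uses p0 = 0.3
    p_value = 2 * std_normal.cdf(-abs(z))                 # two-tailed P-value
    return p_hat - margin, p_hat + margin, z, p_value

for label, x in (("C31", 269), ("C169", 272)):
    low, high, z, p_value = one_proportion_row(x, 1000)
    print(f"{label}: ({low:.6f}, {high:.6f})  z = {z:.2f}  p = {p_value:.3f}")
```

This also suggests why C169 can miss 0.3 in its interval while having a P-value just above 0.05: the interval's standard error is built from p-hat, while the test's is built from 0.3, so the two don't have to agree exactly at the boundary.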
Finding the Means (Question 1c)
- Choose Stat / Basic Statistics / Store Descriptive Statistics
- The variables are C1-C200
- Click on Statistics and check Sum. Uncheck everything else.
- Click OK. What this command did was to add up the number of 1's in
each column and create 200 new columns called sum1, sum2, ..., sum200
that each contain the number of successes for each of the 200 samples.
- Unfortunately, those 200 samples need to be in one column so we can work
with them instead of in 200 columns. This step will fix that. Choose Data
/ Stack / Columns.
- Stack the columns sum1-sum200 (or you could use C201-C400)
- Store the stacked data into a new worksheet. You can leave the name
of the worksheet blank or give it a name like "amendment" to
make it easier to find.
- Make sure the Use variable names in subscript column box is not checked.
- Click OK.
- The new worksheet contains two columns: one called Subscripts, which contains
the numbers 1 through 200, and C2, which isn't labeled and contains the number
of successes for each of the 200 samples. Label C2 as "x". In a
binomial experiment, x represents the number of successes.
- Label a blank column as "p" for p-hat. When dealing with proportions,
p-hat (a p with a caret symbol over it) is the sample proportion.
- Choose Calc / Calculator
- Store the results into p
- The expression is "x /1000". The 1000 is our sample size.
- Click OK
- Choose Stat / Basic Statistics / Graphical Summary
- The variable is p
- Click OK
When you copy this into Word, be sure to drag the box a little bit bigger
so that you can read all the numbers.
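As a sanity check on the graphical summary, here is a small sketch (simulated data, arbitrary seed, not your Minitab worksheet) that computes the 200 sample proportions and compares their mean and standard deviation with the values theory predicts for p = 0.3 and n = 1000.

```python
import math
import random
import statistics

rng = random.Random(3)          # arbitrary seed, an assumption
p, n, n_samples = 0.3, 1000, 200

# x = number of successes in each sample; p_hat = x / 1000
x = [sum(1 for _ in range(n) if rng.random() < p) for _ in range(n_samples)]
p_hat = [count / n for count in x]

mean_p_hat = statistics.fmean(p_hat)
sd_p_hat = statistics.stdev(p_hat)
theory_sd = math.sqrt(p * (1 - p) / n)  # standard error of a sample proportion

print(f"mean of p-hat: {mean_p_hat:.4f} (theory: {p})")
print(f"st. dev. of p-hat: {sd_p_hat:.4f} (theory: {theory_sd:.4f})")
```

The mean and standard deviation in your graphical summary should be in the same neighborhood as the theoretical values.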
Be sure to close the worksheet that contains the data in C1-C200 and Sum1-Sum200.
If you don't, the file will be about 75 times bigger than it needs to be and
can easily fill up all of the allocated space on the network drive. When this
happened, students were unable to save their work and lost everything they
had done. So be sure to close that first worksheet before saving your file.
Central Limit Theorem (Question 2)
Create a Discrete Probability Distribution (Question 2a)
A probability distribution is a list of all the values that a random variable
x can assume along with their associated probabilities.
Here is an example of a probability distribution.
x    | -5   | 1    | 3    | 10
p(x) | 0.24 | 0.15 | 0.36 | 0.25
Your probability distribution has to have at least 5 values for x. Remember
that the probabilities have to add up to 1.
Enter your probability distribution into a table in Word. Add one more column
for the total and two more rows to the table for xp(x) and x²p(x)
and complete them. Your table should look similar to this:
x      | -5   | 1    | 3    | 10   | Total
p(x)   | 0.24 | 0.15 | 0.36 | 0.25 | 1.00
xp(x)  | -1.2 | 0.15 | 1.08 | 2.5  | 2.53
x²p(x) | 6    | 0.15 | 3.24 | 25   | 34.39
You will need to use the values from the total column to answer part b. That's
not done with Minitab.
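Here is a short sketch for checking the totals in a table like the example one. I'm assuming part b wants the mean and variance of the distribution (this page doesn't say), which come from the totals as mean = total of xp(x) and variance = total of x²p(x) minus the mean squared.

```python
import math

# Example distribution from the table above; swap in your own values
x = [-5, 1, 3, 10]
p = [0.24, 0.15, 0.36, 0.25]

total_p = sum(p)                                       # must equal 1
sum_xp = sum(xi * pi for xi, pi in zip(x, p))          # total of the xp(x) row
sum_x2p = sum(xi ** 2 * pi for xi, pi in zip(x, p))    # total of the x²p(x) row

mean = sum_xp
variance = sum_x2p - mean ** 2
std_dev = math.sqrt(variance)

print(f"total p(x) = {total_p:.2f}")
print(f"mean = {mean:.2f}, variance = {variance:.4f}, st. dev. = {std_dev:.4f}")
```

For the example table this gives a mean of 2.53 and a variance of 34.39 - 2.53² = 27.9891.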
Creating the Worksheet
- Label the first column as x and the second column as p
- Enter the values for x into the first column
- Enter the corresponding probabilities into the second column
- Label the third, fourth, and fifth columns as mean4, mean25, and mean100,
but don't put anything in them yet.
Simulating Samples (Question 2c)
- Choose Calc / Random Data / Discrete
- Generate 1000 rows of data
- Where you store your data depends on your sample size.
- When n=4, store the data in C101-C104
- When n=25, store the data in C101-C125
- When n=100, store the data in C101-C200
- The values are in the x column
- The probabilities are in the p column
- Click OK
- Choose Calc / Row Statistics
- Select the Mean as the statistic to calculate
- The Input variables depend on your sample size
- When n=4, use C101-C104
- When n=25, use C101-C125
- When n=100, use C101-C200
- Where you store the result depends on your sample size
- When n=4, use mean4
- When n=25, use mean25
- When n=100, use mean100
- Click OK
- Repeat steps 1 and 2 with n=25 and n=100
- Choose Stat / Basic Statistics / Graphical Summary
- The variables are mean4, mean25, and mean100
- Click OK
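The simulation above can also be sketched in Python, using the example distribution from earlier rather than your own. The function name `sample_means` and the seed are assumptions for illustration. Each of the 1000 rows is one sample of size n, and the row mean is what ends up in mean4, mean25, or mean100.

```python
import random
import statistics

# Example distribution; replace with your own x and p(x) values
values = [-5, 1, 3, 10]
probs = [0.24, 0.15, 0.36, 0.25]

rng = random.Random(4)  # arbitrary seed, an assumption

def sample_means(n, rows=1000):
    """Draw `rows` samples of size n and return the mean of each one."""
    return [statistics.fmean(rng.choices(values, weights=probs, k=n))
            for _ in range(rows)]

results = {}
for n in (4, 25, 100):
    means = sample_means(n)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n={n:3d}: mean of sample means = {results[n][0]:.3f}, "
          f"st. dev. = {results[n][1]:.3f}")
```

As n grows, the center of the sample means should stay near the population mean while the spread shrinks, which is the pattern to look for in your three graphical summaries.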
Creating the Graph from a Hypothesis Test (Question 3)
You are asked to create the graph to the right.
- Label two columns of the worksheet as z and p
- Choose Calc / Make Patterned Data / Simple Set of Numbers
- Store the patterned data in z
- Start at -3.5
- Go to 3.5
- Use a step size of 0.1
- Choose Calc / Probability Distributions / Normal
- Choose Probability Density
- The input column is z
- The storage is p
- Choose Graph / Scatterplot / With Connect Line
- The y variable is p and the x variable is z
- Choose Scale
- Under Axes and Ticks, uncheck all boxes for the high y-scale and
the high x-scale. Check all boxes for the low x-scale. The only box
that should be checked under the low y-scale is the axis line.
- Under Reference Lines, show reference lines for X positions of
-1.96 and 1.96. Enter them both separated by a space. NOTE: When
you come back to do the one-tail diagrams, these numbers will change
and there will only be one of them.
- Choose Labels
- Enter a Title of "Two Tailed Hypothesis Test". NOTE:
This will change for a one-tail test. Enter the appropriate title.
- Choose Data View
- Under Data Display, turn off the Symbols
- Click OK to generate the graph
- Now comes the fun part: we get to clean up the graph and annotate it! Make
sure your graph is essentially correct before continuing. If you have to
go back and re-generate the graph using a scatterplot, you will lose all
annotations.
- Most of these involve editing a piece of the graph. There are a few ways
to do this. You can 1) double left click the item, 2) position your mouse
over the item, click the right mouse button, and choose Edit, or 3) left
click on the item and then press Control-T on the keyboard. The instructions
below will just say "edit this part of the graph" and you can use any of
the methods to perform the edit.
- To clean up the labels we don't want, click on the "p" on
the left side of the graph and delete it. Do the same thing for the "z" at
the bottom of the graph.
- Edit the normal curve
- Change the line type to custom with a size of 3
- Edit the y axis scale on the left side of the graph.
- Under Scale, change the minimum scale range to 0 and the maximum
scale range to 0.6.
- Under Show, uncheck all boxes; we don't want a vertical axis at
all
- Edit the x axis scale at the bottom of the graph
- Under Scale, change the minimum scale range to -3.5 and the maximum
scale range to 3.5
- Under Attributes, change the line size to 2
- Under Font, change the font to Arial and the font size to 12
- Edit the reference line at 1.96. (NOTE: This number will change for
the one tail tests)
- Under Attributes, select a Custom line type of dashed lines with
a thickness of 2. Check the box that says to apply attributes to
all reference lines to save having to do this to the other line at
-1.96
- Under text, change it from 1.96 to "Critical Value z=1.96"
- Under font, change the font to Arial, the color to red and the
font size to 12. Check the box to apply font changes to all reference
lines
- Under alignment, set the text angle to 90 degrees, choose a custom
position and make it "Below, to the left"
- Edit the reference line at -1.96 (NOTE: You will not have a second
reference line for a one tail test).
- Under text, change the text to "Critical Value z=-1.96"
- Under alignment, set the text angle to 90 degrees, choose a custom
position and make it "Below, to the right"
- Edit the title and change the font to Arial and the font size to 20
points.
- At this point, we are ready to start adding things to the graph, rather than
just changing what is there. There is a graph annotation toolbar that is
active while you're editing a graph. If the toolbar does not appear on
your screen, then go to Tools / Toolbars / Graph Annotation Tools and turn
it on. Until this point, we have been using the Select mode (the arrow), but
now we're going to switch to the Text mode (the T) or the Line mode (the line).
Click on the appropriate mode to insert the right object. Here are generic
instructions for each type of object.
- Inserting Text
- Click the T from the graph annotation toolbar
- Click on the graph where you would like the upper left corner of
the text to be (this can be changed later)
- Type the text
- Click OK
- After the text is on the screen, you can edit it, change the font
(to Arial), color, or size. You can also drag it to a new location
on the screen since it is very unlikely that you will get it where
you want it the first time
- Fine control over the position of the text can be obtained by using
the arrow keys or shift arrow (for medium control).
- Inserting Lines
- Click on the Line on the graph annotation toolbar
- Click (and hold) the left mouse button where you would like the
line to start
- Drag the mouse to where you would like the line to end. You can
force a horizontal, vertical, or diagonal (45 degree) line by holding
the shift key down while dragging. Let up on the mouse button when
you reach the destination.
- If you don't get the line exactly where you want it, you can drag
the line once it's on the screen. You can lengthen the lines by dragging
the ends, but be careful because the shift key doesn't keep the line
horizontal or vertical at this point.
- You can then edit the line to change the type, color, or size,
or add arrows
- Fine control over the position of the line can be obtained by using
the arrow keys or shift arrow (for medium control).
- Insert the horizontal lines for the areas. You will need three of them,
one for the left portion, one for the middle, and one for the right.
For each line, do the following.
- Change the style to small dashes, the color to dark green, the thickness
to 2 and add arrows where appropriate.
- Use the mouse or arrow keys to carefully align the lines with each
other
- Insert the labels for "Critical Region", "Non-Critical Region", "Retain
Ho", and "Reject Ho". There is not an easy
way to put in an H0, so just use a lowercase O so it is close.
- Change the font to Arial
- Change the font size to 14.
- The labels for the areas are the difficult part. You
can choose the easy way and I'll be okay with that if you want to.
- The easy way.
- Insert text that says "1-a=0.95" and "a/2=0.025". That
is, use the a instead of an alpha.
- Then edit the text and set the font to Symbol and make the size
12
- The more difficult way, but it keeps the Arial font going
- Insert the text, but when you need an alpha, insert three spaces
instead (only necessary if the alpha is in the middle of the line
like the area for the non-critical region).
- Change the font to Arial and the size to 12
- Now go back and create another text object. This one should only
contain the letter a.
- Change the font on this a to be Symbol and the size to be 14
- Carefully position the alpha so that it is in the proper location
within the annotation.
- Okay, now that I think about it, I should have probably just stuck
with the easy way when I created it, but I was trying to get all the
numbers to match up with the same font.
- Don't close your graph out. Make sure you save it as part of your project.
But when you're all done, copy and paste it into Word.
- Now repeat the same thing for the left and right tailed hypothesis tests.
Remember that you will only have one critical value for those and it won't
be at -1.96 any more.
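To check the numbers behind the graph, here is a minimal sketch that builds the same z column (-3.5 to 3.5 in steps of 0.1), the normal density column, and the critical values. I'm assuming the one-tailed tests use the same significance level of 0.05 as the two-tailed example; if your problem uses a different alpha, change it in the code.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# The z column: -3.5 to 3.5 in steps of 0.1 (71 values)
z = [round(-3.5 + 0.1 * i, 1) for i in range(71)]
# The p column: height of the standard normal curve at each z
density = [std_normal.pdf(value) for value in z]

alpha = 0.05  # assumed significance level for all three tests
two_tailed = std_normal.inv_cdf(1 - alpha / 2)   # use plus and minus this value
right_tailed = std_normal.inv_cdf(1 - alpha)
left_tailed = std_normal.inv_cdf(alpha)

print(f"two-tailed critical values: ±{two_tailed:.2f}")
print(f"right-tailed critical value: {right_tailed:.3f}")
print(f"left-tailed critical value: {left_tailed:.3f}")
print(f"peak of the curve: {max(density):.4f}")
```

The peak of the curve is about 0.4, so a y scale maximum of 0.6 leaves room above the curve for the annotations.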