Technology Exercise 3: Randomness & Probability

General Social Survey (Question 1)

Getting the data

  1. Visit the General Social Survey website at http://www.icpsr.umich.edu:8080/GSS/homepage.htm
  2. Click on Analyze
  3. Select Frequencies or crosstabulation and then click Start
  4. Enter these values (without the quotes) on the screen that appears.
    1. The Row variable should be "sex"
    2. The Column variable should be "race"
    3. The Filter variable should be "year(2000)"
  5. Uncheck any percentaging boxes.
  6. Click Run the Table

Entering the data into Minitab

  1. Label three empty columns as "sex", "race", and "frequency"
  2. Enter "male" three times and then "female" three times in the sex column.
  3. Enter "white", "black", and "other" in the race column. Repeat the sequence so the whole sequence is there twice.
  4. Enter the frequencies from the webpage for each pair of categorical variables.

Your data should look something like this (but be sure to use your own counts for the year 2000):

row  sex     race   frequency
1    male    white  1002
2    male    black  144
3    male    other  86
4    female  white  1239
5    female  black  256
6    female  other  105

Saving the data

  1. Choose File / Save Project As
  2. Navigate the file system to the R: drive
  3. Choose section 01 or 02 as appropriate
  4. Choose the tech3 folder
  5. Save your data under a name that is unique to your group.

Be sure to open up this file any time you need to do further work on your project.

Creating the contingency table

  1. Choose Stat / Tables / Cross Tabulation and Chi-Square from the menu
  2. The row variable should be sex and the column variable should be race.
  3. Tell Minitab that the Frequencies are in the frequency column.
  4. Display the Counts and nothing else.
  5. Click OK
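
For anyone who wants to double-check the Minitab table by hand, here is a minimal Python sketch of the same cross tabulation. It assumes the sample counts from the example worksheet above (substitute your own), and the names ROWS, contingency_table, and print_table are made up for this illustration; they are not part of Minitab.

```python
# Sample (sex, race, frequency) rows from the example worksheet.
# Replace these with your own year-2000 counts.
ROWS = [
    ("male", "white", 1002),
    ("male", "black", 144),
    ("male", "other", 86),
    ("female", "white", 1239),
    ("female", "black", 256),
    ("female", "other", 105),
]


def contingency_table(rows):
    """Build {sex: {race: count}} from (sex, race, frequency) rows."""
    table = {}
    for sex, race, freq in rows:
        cells = table.setdefault(sex, {})
        cells[race] = cells.get(race, 0) + freq
    return table


def print_table(table):
    """Print counts with row and column totals (labelled 'All')."""
    races = list(next(iter(table.values())))
    print("sex".ljust(8) + "".join(r.rjust(8) for r in races) + "All".rjust(8))
    for sex, cells in table.items():
        counts = [cells[r] for r in races]
        line = "".join(str(c).rjust(8) for c in counts)
        print(sex.ljust(8) + line + str(sum(counts)).rjust(8))
    col_totals = [sum(cells[r] for cells in table.values()) for r in races]
    line = "".join(str(c).rjust(8) for c in col_totals)
    print("All".ljust(8) + line + str(sum(col_totals)).rjust(8))


table = contingency_table(ROWS)
print_table(table)
```

The row and column totals printed here should match the "All" margins that Minitab shows when only Counts is checked.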

Creating a joint probability distribution

Repeat the steps above for the contingency table (remember that Ctrl+E is a quick way to pull up the last command), except display the Total percents. Probability distributions are typically written as decimals, but Minitab displays them as percents, so convert each percent to a decimal (divide by 100) when you answer the questions later.
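
The Total percents are just each cell count divided by the grand total. A small Python sketch of that arithmetic follows, assuming the sample counts from the example worksheet (not your real data); COUNTS and joint_distribution are hypothetical names chosen for this illustration.

```python
# Sample counts from the example worksheet, keyed by (sex, race).
# Replace these with your own year-2000 counts.
COUNTS = {
    ("male", "white"): 1002,
    ("male", "black"): 144,
    ("male", "other"): 86,
    ("female", "white"): 1239,
    ("female", "black"): 256,
    ("female", "other"): 105,
}


def joint_distribution(counts):
    """Divide every cell by the grand total to get joint probabilities."""
    total = sum(counts.values())
    return {cell: n / total for cell, n in counts.items()}


joint = joint_distribution(COUNTS)
for (sex, race), p in joint.items():
    # Decimals, not percents: multiply by 100 to compare with Minitab.
    print(f"P({sex} and {race}) = {p:.4f}")
print(f"sum of all cells = {sum(joint.values()):.4f}")
```

The six joint probabilities should add up to 1 (up to rounding), which is a quick check that no cell was mistyped.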

Creating conditional probability distributions

Repeat the above steps for a contingency table, except display the Row percents or the Column percents (depending on which conditional percents you want - note that I asked for both). If you check Row percents and sex is your row variable, then you'll get the percent of each sex that are white, black, or other and be able to answer questions like "What is the chance that a randomly selected male is black?"
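
To see what Row percents and Column percents compute, here is a rough Python sketch. It again assumes the sample counts from the example worksheet, and the function names are invented for this illustration. Row percents divide each cell by its row (sex) total; column percents divide by the column (race) total.

```python
# Sample counts from the example worksheet, keyed by (sex, race).
COUNTS = {
    ("male", "white"): 1002,
    ("male", "black"): 144,
    ("male", "other"): 86,
    ("female", "white"): 1239,
    ("female", "black"): 256,
    ("female", "other"): 105,
}


def conditional_on_sex(counts):
    """Row percents as decimals: P(race | sex)."""
    totals = {}
    for (sex, _race), n in counts.items():
        totals[sex] = totals.get(sex, 0) + n
    return {(sex, race): n / totals[sex] for (sex, race), n in counts.items()}


def conditional_on_race(counts):
    """Column percents as decimals: P(sex | race)."""
    totals = {}
    for (_sex, race), n in counts.items():
        totals[race] = totals.get(race, 0) + n
    return {(sex, race): n / totals[race] for (sex, race), n in counts.items()}


given_sex = conditional_on_sex(COUNTS)
given_race = conditional_on_race(COUNTS)

# "What is the chance that a randomly selected male is black?"
print(f"P(black | male) = {given_sex[('male', 'black')]:.4f}")
# The reverse condition is a different question with a different answer.
print(f"P(male | black) = {given_race[('male', 'black')]:.4f}")
```

Note that the two conditional probabilities use the same cell count but different denominators, which is why the handout asks for both tables.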

After you have created all of the probability distributions, you can use them to answer the questions.

Demonstrate the Law of Large Numbers (Question 2)

[Figure: Graph demonstrating the law of large numbers]

Your output for this section will be a graph similar to the one in the figure. My example uses 0.5 as the probability of success for each trial. Be sure you use the probability given to you in class.

Generating the data

  1. Choose File / New / Minitab Worksheet. This data is so very different from the last problem that we'll put it in a different worksheet just to keep it separate.
  2. Label empty columns as "n", "x1", "x2", "x3", "x4", "p1", "p2", "p3", and "p4".
  3. Fill in the numbers from 1 to 1000 into the "n" column
    1. Choose Calc / Make Patterned Data / Simple Set of Numbers
    2. Store the patterned data in "n".
    3. Start with the first value of 1
    4. Go to the last value of 1000
    5. Click OK
  4. Generate the random data
    1. Choose Calc / Random Data / Bernoulli
    2. Generate 1000 rows of data
    3. Store in columns x1, x2, x3, and x4
    4. Use the probability of success that was given to you. Enter it as a decimal.
    5. Click OK
  5. Find the accumulated probabilities
    1. Choose Calc / Calculator
    2. Store the results into "p1"
    3. The expression is "pars(x1)/n"
    4. Click OK
    5. Repeat steps 1-4, but change the variables. Use p2 with x2, p3 with x3, and p4 with x4.
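
The data-generation steps above can be sketched in Python as follows. This is only an illustration of the idea, not what Minitab does internally: it assumes p = 0.5 (the value from my example, not yours), the seed 2024 is arbitrary, and running_proportions is a made-up helper name. The running total of successes plays the role of pars(x1), and dividing by the trial number gives the accumulated probability.

```python
import random
from itertools import accumulate


def running_proportions(p, trials, rng):
    """Simulate Bernoulli(p) trials; return the running proportion of successes."""
    outcomes = [1 if rng.random() < p else 0 for _ in range(trials)]
    partial_sums = accumulate(outcomes)  # same role as pars(x1) in Minitab
    return [s / n for n, s in enumerate(partial_sums, start=1)]


P = 0.5          # assumed example value; use the probability given to you
TRIALS = 1000    # rows of data, matching the "n" column
rng = random.Random(2024)  # arbitrary seed so the run is repeatable

# Four independent series, like columns p1 through p4.
series = [running_proportions(P, TRIALS, rng) for _ in range(4)]

for i, props in enumerate(series, start=1):
    print(f"p{i}: after 10 trials {props[9]:.3f}, "
          f"after {TRIALS} trials {props[-1]:.3f}")
```

Early values can be far from p, while the values after 1000 trials should sit close to it; that narrowing is what the graph in the next section is meant to show.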

Making the Graph

  1. Choose Graph / Time Series Plot / Multiple
  2. The Series are contained in columns p1, p2, p3, and p4.
  3. Turn off the symbols. There are too many data points to show symbols.
    1. Click on Data View
    2. Uncheck the Symbols box
    3. Click OK
  4. Add a reference line at your probability so we can see the values getting close to it.
    1. Click on Time / Scale
    2. Choose the Reference Lines tab
    3. For the Y positions, enter your probability
    4. Click OK
  5. Change the Title
    1. Click on Labels
    2. Use a title like "Law of Large Numbers"
    3. Give it a subtitle with the probability of success, something like "p = 0.5"
    4. Click OK
  6. Click OK

Cleaning up the Graph

  1. Get rid of the legend; it's not necessary for this graph. Do this by right clicking on the legend and selecting Delete from the menu.
  2. Change the line style.
    1. Click on the points on the graph
    2. Click the right mouse button and choose Edit Connect Line
    3. Change the lines to custom instead of automatic
    4. Change the type of line to be a solid line (instead of none)
  3. Change the appearance of the reference line at your probability
    1. Right click on the reference line and choose Edit Y: Ref Line
    2. Change the line to be custom instead of automatic
    3. Change the style of line to be dashed
    4. Change the color of the line to be dark brown or some other color that's not being used
    5. Change the size of the line to be 2
  4. Relabel the axes
    1. Double click on the word "Index" below the X axis
    2. Change the text to be "# of trials"
    3. Click OK
    4. Double click on the word "Data" to the left of the Y axis
    5. Change the text to be "probability"
    6. Click OK

Check your graph and make sure everything is okay. If it is, then copy and paste it into Word.

Jalen Rose (Question 3)

Generating the data

  1. Create a new worksheet to hold the Jalen Rose data.
  2. Label two empty columns as "shots made" and "probability"
  3. The number of shots made can be anywhere from 0 (a really bad game) to 19 (a most excellent game). Jalen can't make fewer than 0 shots, nor can he make more shots than he attempts. Fill in the "shots made" column with the whole numbers from 0 to 19.
    1. Go to Calc / Make Patterned Data / Simple Set of Numbers
    2. You should be able to figure it out on your own by now.
  4. Find the probabilities for each of those shots made.
    1. Choose Calc / Probability Distributions / Binomial
    2. Choose Probability
    3. The number of trials is 19
    4. The probability of success is 0.406
    5. The input column is "shots made"
    6. The output column is "probability"
    7. Click OK
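
If you want to check a few of Minitab's values by hand, the binomial probability of exactly k successes in n trials is C(n, k) p^k (1 - p)^(n - k). Here is a short Python sketch using the n = 19 and p = 0.406 from the steps above; binomial_pmf is a helper name I made up for this illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


N = 19      # number of trials (shots attempted)
P = 0.406   # probability of success on each shot

# One (shots made, probability) pair for each value from 0 to 19.
distribution = [(k, binomial_pmf(k, N, P)) for k in range(N + 1)]

print("shots made  probability")
for k, prob in distribution:
    print(f"{k:10d}  {prob:.6f}")
print(f"total = {sum(prob for _, prob in distribution):.6f}")
```

The probabilities should add up to 1, and the column should match Minitab's "probability" output up to rounding.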

Copying the probability distribution into Word

  1. Highlight the "shots made" and "probability" columns by clicking on the C1 and C2 at the top.
  2. Copy the data
  3. Switch to Word
  4. Paste the data.
  5. Highlight the data in Word
  6. Add a tab at about 1" so that the data lines up with the headings. To add a tab, just click on the ruler where you want the tab to go. If the ruler isn't showing, go to View / Ruler.

As an alternative, you can go to Manip / Display Data in Minitab, display both variables, and then copy and paste into Word as normal. That output includes the row numbers, which aren't needed; that is why I gave the previous instructions.

Making a graph

  1. Choose Graph / Scatterplot / Simple
  2. The Y variable should be "probability"
  3. The X variable should be "shots made"
  4. Add a title by choosing Labels
  5. Click OK to generate the graph

Cleaning up the graph

  1. Fix the scale on the X axis so that it goes from 0 to 19
    1. Right click on the x-axis and choose Edit X scale
    2. Uncheck the Auto check boxes for the minimum and maximum under the scale range
    3. Change the X minimum to be -0.5 and the X maximum to be 19.5. It is necessary to subtract and add 0.5 for the graph to appear properly.
    4. Under Major Tick Positions, click number of ticks and enter 20
    5. Click OK
  2. (Optional, Recommended) Change the color and size of the symbols.
    1. Right click on the points and choose Edit Symbols.
    2. Change to custom instead of automatic
    3. Change the point size to something like 1.5
  3. Click OK

As always, copy and paste the graph into Word and then explain what we're looking at.