# Technology Exercise 2: Gathering Data

This project deals with simulations. There are two kinds of simulations that we're going to look at. One involves finding the chance or likelihood of something happening; the other involves finding the average (expected) number of trials it takes for something to occur.

This project will go much quicker if you work in your teams. Have two of the members perform the simulation part and the third person record their results.

## Having Two Kids of Each Sex? (Question 1)

This simulation looks for the average number of kids you'll have to have in order to have at least two of each sex. We will simulate families and determine the size of each family. We'll repeat this process until we've generated 50 families and then find the average size of the fifty families.

### Setup

There are two sexes to choose from, boy and girl. We'll put those genders into a column and then sample from the column.

1. Label blank columns in Minitab as "sex", "family", and "size".
2. Enter "boy" and "girl" (or "male" and "female") into the sex column.

### Simulation

Since boys and girls can occur more than once, we'll be sure to sample with replacement. Although theoretically possible to have 25 kids without having at least two of each gender, it's highly unlikely. Make the number larger if you don't feel like taking the risk.

1. Choose Calc / Random Data / Sample from Columns
2. Sample 25 rows from the "sex" column and store the samples into "family".
3. Check the Sample with replacement box.
4. Click OK.

### Interpretation of Results

1. Look through the "family" column until you have at least two boys and two girls. Record the number of kids it took in the next available spot in the "size" column.
2. We now simulate another family and determine its size. The easiest way to do this is to hit Control-E and enter. Control-E brings up the last command, which was to generate the first family. You can also click on the "Edit Last Dialog" icon, but it's really quicker just to hit Control-E and enter.
3. Repeat steps 1 and 2 until you have 50 families.
4. Find the descriptive statistics for the family "size".

### Sample Data

Here are some sample families as generated by Minitab. The family size is the number of kids before there are at least two boys and two girls.

| # | Size | Family |
|---|------|--------|
| 1 | 4 | girl, girl, boy, boy |
| 2 | 5 | girl, boy, girl, girl, boy |
| 3 | 6 | boy, boy, boy, girl, boy, girl |

Although you're only asked to find the average family size, you may also want to comment on the shape and spread of the data.
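If you'd like to check your Minitab results another way, the same procedure can be sketched in Python. This is only a rough sketch and not part of the assignment: the function names `size_from_children` and `simulate_family_sizes` are made up for illustration, and Python's `random` module stands in for Sample from Columns. As an added assumption, a family that never reaches two of each sex within 25 kids is simply thrown out and redrawn.

```python
import random
from statistics import mean, stdev


def size_from_children(children, needed=2):
    """Return how many kids it takes to reach `needed` boys and `needed` girls.

    Returns None if the list runs out before that happens.
    """
    counts = {"boy": 0, "girl": 0}
    for position, sex in enumerate(children, start=1):
        counts[sex] += 1
        if min(counts.values()) >= needed:
            return position
    return None


def simulate_family_sizes(n_families=50, max_children=25, seed=None):
    """Simulate families of up to `max_children` kids and record each size."""
    rng = random.Random(seed)
    sizes = []
    while len(sizes) < n_families:
        # Sampling with replacement from the two sexes, like the "family" column.
        children = rng.choices(["boy", "girl"], k=max_children)
        size = size_from_children(children)
        if size is not None:  # assumption: redraw the rare incomplete family
            sizes.append(size)
    return sizes


sizes = simulate_family_sizes(50, seed=1)
print("average family size:", mean(sizes))
print("standard deviation:", stdev(sizes))
print("smallest:", min(sizes), "largest:", max(sizes))
```

Changing `seed` gives a different set of 50 families, just as hitting Control-E does in Minitab.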

## Picking Teams (Question 2)

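Four couples are getting together to play games, and the eight people will be split into four teams of two. We do not want couples to be on the same team. The easiest way to simulate this is to make up four different last names, one for each couple, and put each name in twice. Make sure you do not sample with replacement, since each person can only be picked once.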

### Setup

1. Create columns called "name", "teams", and "good".
2. Pick four last names and enter each one twice into the "name" column.

### Simulation

1. Choose Calc / Random Data / Sample from Columns
2. Sample 8 rows from "name" and store the results into "teams".
3. Make sure that the box for sampling with replacement is not checked.
4. Click OK

### Interpreting the Results

Assume that players 1 and 2 make one team, players 3 and 4 make another, players 5 and 6 make the third, and players 7 and 8 make the fourth team.

1. If any team is made up of two people with the same last name, then we did not get what we wanted (remember, we do not want couples to be on the same team). In that case, enter "no" into the next blank spot in the "good" column. If no team has the same name for both players, then we got what we want and enter "yes" into the next blank row of the "good" column.
2. Hit control-E and enter to repeat the process with another sampling of teams.
3. Repeat steps 1 and 2 until you have simulated the experiment 50 times.
4. Create a frequency table for the "good" column.
    1. Choose Stat / Tables / Tally Individual Variables
    2. Turn on percents
    3. Click OK

We're interested in the percent of "yes" results.

### Sample Data

In the following example, each column represents a different selection of teams. The bottom row shows whether that selection was good (no couples on a team) or not. In Minitab, "good" will be a column rather than a row.

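| # | Teams | Teams | Teams | Teams |
|---|-------|-------|-------|-------|
| 1 | Smith | Jones | Anderson | Williams |
| 2 | Jones | Smith | Smith | Jones |
| 3 | Jones | Williams | Williams | Smith |
| 4 | Anderson | Jones | Jones | Smith |
| 5 | Williams | Smith | Smith | Jones |
| 6 | Smith | Anderson | Williams | Anderson |
| 7 | Anderson | Williams | Anderson | Williams |
| 8 | Williams | Anderson | Jones | Anderson |
| good? | yes | yes | yes | no |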

As you can see in the fourth simulation, team 2 (players 3 and 4) were both Smiths.
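The yes/no check can also be sketched in Python, which may help if you want to run many more than 50 trials. This is a minimal sketch under stated assumptions: the names `has_couple_on_team` and `simulate_teams` are invented here, and `random.Random.sample` plays the role of sampling without replacement.

```python
import random


def has_couple_on_team(order):
    """True if players 1-2, 3-4, 5-6, or 7-8 share a last name."""
    return any(order[i] == order[i + 1] for i in range(0, len(order), 2))


def simulate_teams(n_trials=50, seed=None):
    """Return a list of "yes"/"no" results, one per simulated team selection."""
    rng = random.Random(seed)
    names = ["Smith", "Jones", "Anderson", "Williams"] * 2  # each name twice
    results = []
    for _ in range(n_trials):
        # Drawing all 8 names without replacement is a random ordering.
        order = rng.sample(names, k=len(names))
        results.append("no" if has_couple_on_team(order) else "yes")
    return results


results = simulate_teams(50, seed=1)
print("percent yes:", 100 * results.count("yes") / len(results))
```

The helper `has_couple_on_team` applies the same rule you use when reading the "teams" column by eye.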

## Da Bulls! (Question 3)

### Setup

1. Label two columns as "shot" and "streak".

That's all the setup that's necessary.

### Simulation

1. Choose Calc / Random Data / Bernoulli
2. Generate 19 rows of data
3. Store them in column "shot"
4. The probability of success is 0.406.
5. Click OK

### Interpreting the Results

The Bernoulli distribution will return a 1 40.6% of the time and a 0 the other 59.4% of the time. We will let 1 stand for a shot that was made and 0 stand for a shot that was missed.

1. Look through the 19 shots and see if there are at least four (4) made shots (represented by 1) in a row. If there are, then enter "yes" in the next available spot of the "streak" column. If there aren't, then enter "no" in the next spot of the "streak" column.
2. Hit control-E and enter to generate another 19 shots.
3. Repeat steps 1 and 2 until you've simulated 50 games.
4. Create a frequency table with percents for the "streak" variable.

### Sample Data

| # | streak | shots |
|---|--------|-------|
| 1 | no | 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 0 |
| 2 | no | 0 1 0 1 1 1 0 0 0 0 0 1 0 0 1 1 0 0 0 |
| 3 | no | 1 0 0 0 1 0 1 1 0 0 1 0 0 0 1 1 0 1 1 |
| 4 | yes | 1 1 0 1 1 1 1 1 1 0 1 1 0 0 1 1 1 0 0 |

Notice that it has to be at least four 1's in a row. It is okay to have more, as in the fourth game simulated above. If it should happen that he makes at least four shots in a row twice in a game, you still only count the overall game as "yes" once ... don't count it twice.
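Spotting streaks by eye gets tedious, so here is one way the check could be sketched in Python. Treat it as an illustration only: `longest_run` and `simulate_games` are hypothetical names, and comparing `rng.random()` to 0.406 is used in place of Minitab's Bernoulli generator.

```python
import random


def longest_run(shots):
    """Length of the longest run of consecutive 1s (made shots)."""
    best = current = 0
    for shot in shots:
        current = current + 1 if shot == 1 else 0
        best = max(best, current)
    return best


def simulate_games(n_games=50, n_shots=19, p=0.406, streak=4, seed=None):
    """Return "yes"/"no" per game: was there a run of at least `streak` makes?"""
    rng = random.Random(seed)
    results = []
    for _ in range(n_games):
        # Each shot is a 1 with probability p and a 0 otherwise.
        shots = [1 if rng.random() < p else 0 for _ in range(n_shots)]
        # A game counts once, no matter how many streaks it contains.
        results.append("yes" if longest_run(shots) >= streak else "no")
    return results


results = simulate_games(50, seed=1)
print("percent yes:", 100 * results.count("yes") / len(results))
```

Because only the longest run is compared to 4, a game with two separate streaks is still counted as a single "yes".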

## Shopping Spree (Question 4)

Let's see how you can do with this one on your own. Here are some general steps. You fill in the particulars.

1. Create a column that contains the name of five cards from the deck, one of which should be an ace.
2. Create a column to represent the order of the cards drawn.
3. Create a column that will contain the amount of the shopping spree won based on the position the ace is drawn in.
4. Draw the cards (you decide how many and whether or not replacement is necessary).
5. Enter the amount of money won.
6. Repeat until you've got 50 people who've won.
7. Find the average amount of money won.

## One Pair (Question 5)

### Setup

We need to create a deck of cards. There are thirteen possible values since we don't care about what suit the card is from. However, we can't just put in the thirteen values and use replacement because we might possibly get five of a kind, and that isn't possible in a real poker hand. So, what we have to do is actually enter each of the thirteen possible values four times (one for each suit).

1. Create columns called "cards", "hand", and "pair".
2. Choose Calc / Make Patterned Data / Text Values
3. Store the patterned data in "cards"
4. The text values are "ace 2 3 4 5 6 7 8 9 10 jack queen king". Just separate each one with a space.
5. Either list each value 4 times or list the whole sequence 4 times. Either one is okay, but not both. Use 1 for the other value.
6. Click OK
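To see why either repetition option works but not both, here is a small Python sketch. The variable names are made up for illustration; the lists simply mimic what ends up in the "cards" column.

```python
from collections import Counter

values = "ace 2 3 4 5 6 7 8 9 10 jack queen king".split()

# Option A: list each value 4 times (ace ace ace ace 2 2 2 2 ...).
each_value_four_times = [v for v in values for _ in range(4)]

# Option B: list the whole sequence 4 times (ace 2 ... king ace 2 ... king ...).
whole_sequence_four_times = values * 4

# Doing both repeats everything 4 x 4 = 16 times, which is too many cards.
both = each_value_four_times * 4

print(len(each_value_four_times), len(whole_sequence_four_times), len(both))
print(Counter(each_value_four_times) == Counter(whole_sequence_four_times))
```

The two options differ only in the order of the rows, which doesn't matter once you sample at random.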

### Simulation

1. Choose Calc / Random Data / Sample from Columns
2. Sample 5 cards from the "cards" column and store the results in the "hand" column.
3. Make sure there is no replacement.
4. Click OK

### Interpreting the Results

1. Look at the five cards in your hand. If there is exactly one pair (one value occurs exactly twice and the other three cards are all different values) then enter "yes" in the next available spot of the "pair" column. Otherwise, enter "no" in that spot.
2. Hit control-E and enter to generate another poker hand.
3. Repeat steps 1 and 2 until you've generated 50 poker hands.
4. Create a frequency table with percents for the "pair" variable.

### Sample Data

| # | pair | hand |
|---|------|------|
| 1 | no | 3 7 8 5 ace |
| 2 | yes | 2 7 8 3 2 |
| 3 | yes | jack 2 6 queen 2 |
| 4 | no | 5 8 9 jack 10 |

The 2nd and 3rd rows each had exactly one pair. They both happened to be a pair of 2's, which is the worst pair you can have, but at least it was a pair. Should you happen to get data like "2 7 2 8 2", you would not call that a pair because it's three of a kind. Technically, there is a pair in three of a kind, but we're looking for the probability of getting exactly one pair.
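The "exactly one pair" rule can be sketched in Python as follows. This is a hedged illustration, not the required method: `is_exactly_one_pair` and `simulate_hands` are hypothetical names, and the sketch assumes that two pair and a full house also count as "no", since neither is exactly one pair.

```python
import random
from collections import Counter

VALUES = "ace 2 3 4 5 6 7 8 9 10 jack queen king".split()


def is_exactly_one_pair(hand):
    """True if one value appears exactly twice and the other three all differ."""
    counts = sorted(Counter(hand).values(), reverse=True)
    return counts == [2, 1, 1, 1]


def simulate_hands(n_hands=50, seed=None):
    """Deal 5-card hands from a 52-card deck and record "yes"/"no" for one pair."""
    rng = random.Random(seed)
    deck = VALUES * 4  # each value four times, one per suit
    results = []
    for _ in range(n_hands):
        hand = rng.sample(deck, k=5)  # no replacement within a hand
        results.append("yes" if is_exactly_one_pair(hand) else "no")
    return results


results = simulate_hands(50, seed=1)
print("percent yes:", 100 * results.count("yes") / len(results))
```

Each hand is drawn from a fresh full deck, matching what happens when you rerun Sample from Columns with Control-E.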