Stats: Conditional Probability


Conditional Probability

Recall that the probability of an event occurring given that another event has already occurred is called a conditional probability.

The probability that event B occurs, given that event A has already occurred, is

   P(B|A) = P(A and B) / P(A)

This formula comes from the general multiplication rule and a little bit of algebra: start with P(A and B) = P(A) * P(B|A) and divide both sides by P(A).

Since we are given that event A has occurred, we have a reduced sample space. Instead of the entire sample space S, the sample space is now A, since we know A has occurred. The old rule still applies: a probability is the number of outcomes in the event divided by the number of outcomes in the sample space. Here that is the number in A and B (they must be in A, since A has occurred) divided by the number in A. If you then divide the numerator and denominator of that ratio by the number in the sample space S, you have the probability of A and B divided by the probability of A.
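To make the reduced sample space concrete, here is a minimal Python sketch. The die roll and the events A and B in it are a hypothetical illustration, not an example from these notes. It computes P(B|A) both ways, by counting inside A and by dividing P(A and B) by P(A), to show that the two agree.

```python
from fractions import Fraction

# Hypothetical illustration: one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # event A: the roll is even
B = {4, 5, 6}   # event B: the roll is greater than 3

# Reduced sample space: count only the outcomes that lie inside A.
by_counts = Fraction(len(A & B), len(A))

# Same ratio after dividing numerator and denominator by the size of S.
p_A = Fraction(len(A), len(sample_space))
p_A_and_B = Fraction(len(A & B), len(sample_space))
by_probabilities = p_A_and_B / p_A

print(by_counts, by_probabilities)  # both are 2/3
```

Counting works here only because every outcome of the die is equally likely; the probability form of the formula does not need that assumption.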

Examples

Example 1:

The question, "Do you smoke?" was asked of 100 people. Results are shown in the table.
Sex      Yes   No   Total
Male      19   41      60
Female    12   28      40
Total     31   69     100
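As a rough sketch of how conditional probabilities can be read off a table like this one, the Python below divides a cell count by a row total or a column total. The two questions it answers, and the helper names p_answer_given_sex and p_sex_given_answer, are my own choices for illustration and may not match the questions originally posed with this example.

```python
from fractions import Fraction

# Counts copied from the survey table: counts[sex][answer].
counts = {
    "Male": {"Yes": 19, "No": 41},
    "Female": {"Yes": 12, "No": 28},
}

def p_answer_given_sex(answer, sex):
    """P(answer | sex): cell count divided by the row total."""
    row_total = sum(counts[sex].values())
    return Fraction(counts[sex][answer], row_total)

def p_sex_given_answer(sex, answer):
    """P(sex | answer): cell count divided by the column total."""
    column_total = sum(row[answer] for row in counts.values())
    return Fraction(counts[sex][answer], column_total)

print(p_answer_given_sex("Yes", "Male"))  # 19/60
print(p_sex_given_answer("Male", "Yes"))  # 19/31
```

The numerator is the same cell in both cases; only the "given" changes, and with it the total you divide by.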

After that last part, you have just worked a Bayes' Theorem problem. I know you didn't realize it - that's the beauty of it. A Bayes' problem can be set up so it appears to be just another conditional probability. In this class we will treat Bayes' problems as another conditional probability and not involve the large messy formula given in the text (and every other text).

Example 2:

There are three major manufacturing companies that make a product: Aberations, Brochmailians, and Chompieliens. Aberations has a 50% market share, and Brochmailians has a 30% market share. 5% of Aberations' product is defective, 7% of Brochmailians' product is defective, and 10% of Chompieliens' product is defective.

This information can be placed into a joint probability distribution.
Company         Good                    Defective             Total
Aberations      0.50 - 0.025 = 0.475    0.05(0.50) = 0.025    0.50
Brochmailians   0.30 - 0.021 = 0.279    0.07(0.30) = 0.021    0.30
Chompieliens    0.20 - 0.020 = 0.180    0.10(0.20) = 0.020    0.20
Total           0.934                   0.066                 1.00

The market share for Chompieliens wasn't given, but since the marginals must add up to 1.00, they have a 20% market share.

Notice that the 5%, 7%, and 10% defective rates don't go into the table directly. This is because they are conditional probabilities and the table is a joint probability table. Each defective rate is conditional on the company that made the product. That is, the 7% is not P(Defective), but P(Defective|Brochmailians). The joint probability is P(Defective and Brochmailians) = P(Defective|Brochmailians) * P(Brochmailians).

The "good" probabilities can be found by subtraction as shown above, or by multiplication using conditional probabilities. If 7% of Brochmailians' product is defective, then 93% is good. 0.93(0.30)=0.279.
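The steps above can be sketched in Python: fill in the missing market share, multiply each conditional rate by its company's share to get the joint probabilities, and add down the columns for the marginals. The variable names (share, defect_rate, joint) are mine, and exact fractions are used only to avoid floating-point rounding.

```python
from fractions import Fraction

# Market shares given in the problem; the third is whatever is left over.
share = {
    "Aberations": Fraction(50, 100),
    "Brochmailians": Fraction(30, 100),
}
share["Chompieliens"] = 1 - sum(share.values())

# Conditional probabilities P(Defective | company) from the problem.
defect_rate = {
    "Aberations": Fraction(5, 100),
    "Brochmailians": Fraction(7, 100),
    "Chompieliens": Fraction(10, 100),
}

# Joint probabilities: P(X and company) = P(X | company) * P(company).
joint = {
    company: {
        "Good": (1 - defect_rate[company]) * share[company],
        "Defective": defect_rate[company] * share[company],
    }
    for company in share
}

# Marginal (column) totals.
total_good = sum(row["Good"] for row in joint.values())
total_defective = sum(row["Defective"] for row in joint.values())

for company, row in joint.items():
    print(f"{company:14s} {float(row['Good']):.3f} {float(row['Defective']):.3f}")
print(f"{'Total':14s} {float(total_good):.3f} {float(total_defective):.3f}")
```

The "Good" column here uses the multiplication route; subtracting each defective entry from the company's market share gives the same values.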

The second question asked above is a Bayes' problem. Again, my point is, you don't have to know Bayes' formula just to work a Bayes' problem.

Bayes' Theorem

However, just for the sake of argument, let's say that you want to know what Bayes' formula is.

Let's use the same example, but shorten each event to its one-letter initial, i.e., A, B, C, and D instead of Aberations, Brochmailians, Chompieliens, and Defective.

P(D|B) is not a Bayes' problem. It is given in the problem. Bayes' formula finds the reverse conditional probability P(B|D).

It is based on the fact that the given event (D) is made up of three parts: the part of D in A, the part of D in B, and the part of D in C.

                            P(B and D)
   P(B|D) =  -----------------------------------------
              P(A and D)  + P(B and D)  + P(C and D)

Inserting the multiplication rule for each of these joint probabilities gives

                            P(D|B)*P(B)
   P(B|D) =  -----------------------------------------
              P(D|A)*P(A) + P(D|B)*P(B) + P(D|C)*P(C)

However, and I hope you agree, it is much easier to take the joint probability divided by the marginal probability. The table does the adding for you and makes the problems doable without having to memorize the formulas.
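For anyone who wants to check the formula numerically, here is a small Python sketch that plugs the market shares and defective rates from Example 2 into Bayes' formula. The function name posterior and the dictionary names are my own; the numbers come from the problem statement.

```python
from fractions import Fraction

# P(company): market shares, using the one-letter event names.
prior = {"A": Fraction(50, 100), "B": Fraction(30, 100), "C": Fraction(20, 100)}

# P(D | company): defective rates given in the problem.
likelihood = {"A": Fraction(5, 100), "B": Fraction(7, 100), "C": Fraction(10, 100)}

def posterior(company):
    """P(company | D) by Bayes' formula."""
    numerator = likelihood[company] * prior[company]
    # The denominator is P(D): the three parts of D added together.
    denominator = sum(likelihood[c] * prior[c] for c in prior)
    return numerator / denominator

print(posterior("B"))  # 7/22, which is 0.021 / 0.066, about 0.318
```

The denominator works out to 0.066, the Defective column total from the joint table, so this is the same joint-divided-by-marginal calculation written out the long way.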

