Stats: Normal Approximation to Binomial




Recall that, according to the Central Limit Theorem, the sampling distribution of the sample mean becomes approximately normal when the sample size is sufficiently large, whatever the shape of the underlying distribution (provided it has a finite variance).

It turns out that the binomial distribution can be approximated by the normal distribution if np and nq are both at least 5, where n is the number of trials, p is the probability of success, and q = 1 - p is the probability of failure. Furthermore, recall that the mean of a binomial distribution is np and the variance is npq.
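As a quick illustration of the rule of thumb above, here is a minimal Python sketch that checks whether np and nq are both at least 5 and returns the mean and variance. The function name normal_approx_params is made up for this example; it is not from any library.

```python
def normal_approx_params(n, p):
    """Check the np >= 5 and nq >= 5 rule of thumb for a binomial(n, p).

    Returns (ok, mean, variance), where ok is True when the normal
    approximation is considered acceptable under that rule.
    """
    q = 1.0 - p                      # probability of failure
    ok = min(n * p, n * q) >= 5      # if the smaller passes, so does the larger
    mean = n * p                     # binomial mean
    variance = n * p * q             # binomial variance
    return ok, mean, variance


if __name__ == "__main__":
    print(normal_approx_params(20, 0.5))   # 20 trials, p = 0.5
    print(normal_approx_params(10, 0.1))   # np = 1, so the rule fails
```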

Continuity Correction Factor

There is a problem with approximating the binomial with the normal. That problem arises because the binomial distribution is a discrete distribution while the normal distribution is a continuous distribution. The basic difference is that a discrete probability sits at a single x-value, so it has a height but no width, while a continuous probability is an area under the curve, so it needs both a height and a width.

The correction is to add or subtract 0.5 of a unit at each discrete x-value, so that each whole number x is represented by the interval from x - 0.5 to x + 0.5. This fills in the gaps to make it continuous. This is very similar to the expanding of class limits to form class boundaries that we did with grouped frequency distributions.

Examples
  Discrete    Continuous
  x = 6       5.5 < x < 6.5
  x > 6       x > 6.5
  x >= 6      x > 5.5
  x < 6       x < 5.5
  x <= 6      x < 6.5

As you can see, whether or not the "equal to" is included makes a big difference in the discrete distribution and in the way the conversion is performed. For a continuous distribution, however, equality makes no difference, because the probability at any single point is zero.
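The conversion rules in the examples can be sketched as a small lookup in Python. The function name continuity_correct and the use of relation strings such as ">=" are assumptions made for this illustration only.

```python
def continuity_correct(relation, x):
    """Convert a discrete condition on x into a continuous interval.

    relation is one of "==", ">", ">=", "<", "<=".
    Returns (low, high), the endpoints of the continuous interval.
    """
    inf = float("inf")
    intervals = {
        "==": (x - 0.5, x + 0.5),   # x = 6   becomes 5.5 < x < 6.5
        ">":  (x + 0.5, inf),       # x > 6   becomes x > 6.5
        ">=": (x - 0.5, inf),       # x >= 6  becomes x > 5.5
        "<":  (-inf, x - 0.5),      # x < 6   becomes x < 5.5
        "<=": (-inf, x + 0.5),      # x <= 6  becomes x < 6.5
    }
    return intervals[relation]


if __name__ == "__main__":
    for rel in ("==", ">", ">=", "<", "<="):
        print(rel, continuity_correct(rel, 6))
```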

Steps to working a normal approximation to the binomial distribution

  1. Identify success, the probability of success, the number of trials, and the desired number of successes. Since this is a binomial problem, these are the same things which were identified when working a binomial problem.
  2. Convert the discrete x to a continuous x. Some people would argue that step 3 should be done before this step, but go ahead and convert the x before you forget about it and miss the problem.
  3. Find the smaller of np or nq. If the smaller one is at least five, then the larger must also be, so the approximation will be considered good. When you find np, you're actually finding the mean, mu, so denote it as such.
  4. Find the standard deviation, sigma = sqrt(npq). It might be easier to find the variance and just stick the square root in the final calculation - that way you don't have to work with all of the decimal places.
  5. Compute the z-score using the standard formula for an individual score, z = (x - mu) / sigma, where x is the continuity-corrected value (not the formula for a sample mean).
  6. Calculate the probability desired.
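The steps above can be sketched in Python for one made-up problem: the probability of at least 12 successes in 20 trials with p = 0.5. The numbers and the function names are illustrative assumptions, and the normal probability is computed with math.erf rather than read from a z-table. An exact binomial sum is included only for comparison.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative probability P(Z <= z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def approx_at_least(n, p, x):
    """Normal approximation to P(X >= x) for X ~ binomial(n, p)."""
    q = 1.0 - p
    # Step 3: the smaller of np and nq must be at least 5.
    if min(n * p, n * q) < 5:
        raise ValueError("np and nq must both be at least 5")
    mu = n * p                        # mean
    sigma = math.sqrt(n * p * q)      # Step 4: standard deviation
    x_cont = x - 0.5                  # Step 2: x >= 12 becomes x > 11.5
    z = (x_cont - mu) / sigma         # Step 5: z-score for an individual score
    return 1.0 - normal_cdf(z)        # Step 6: area to the right of z


def exact_at_least(n, p, x):
    """Exact binomial P(X >= x), for comparison only."""
    q = 1.0 - p
    return sum(math.comb(n, k) * p**k * q**(n - k) for k in range(x, n + 1))


if __name__ == "__main__":
    print("approximate:", approx_at_least(20, 0.5, 12))
    print("exact:      ", exact_at_least(20, 0.5, 12))
```

For these values the approximate and exact answers should agree to about two decimal places, which is what "the approximation will be considered good" means in practice.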

