# Stats: Normal Approximation to Binomial

Recall that according to the Central Limit Theorem, the sampling distribution of the sample mean becomes
approximately normal if the sample size is sufficiently large, whatever the shape of the population
distribution (provided it has a finite variance).

It turns out that the binomial distribution can be approximated using the normal distribution if np
and nq are both at least 5, where n is the number of trials, p is the probability of success, and
q = 1 - p is the probability of failure. Furthermore, recall that the mean of a binomial distribution
is np and the variance is npq.
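As a minimal sketch of the rule of thumb above, the following Python function computes the mean np, the variance npq, and whether both np and nq reach the threshold of 5. The function name `check_normal_approx` and the example values n = 20, p = 0.5 are assumptions made for illustration only.

```python
def check_normal_approx(n, p, threshold=5):
    """Return (mean, variance, ok) for a binomial with n trials and success probability p.

    `ok` is True when both np and nq are at least `threshold`
    (the rule of thumb of 5 used in the text).
    """
    q = 1 - p                      # probability of failure
    mean = n * p                   # mean of the binomial: np
    variance = n * p * q           # variance of the binomial: npq
    ok = min(n * p, n * q) >= threshold
    return mean, variance, ok


# Hypothetical example values
mean, variance, ok = check_normal_approx(n=20, p=0.5)
print(mean, variance, ok)

# With n = 10 and p = 0.1, np is only 1, so the approximation is not recommended
print(check_normal_approx(n=10, p=0.1)[2])
```

Checking only the smaller of np and nq is enough, because if the smaller one passes the threshold the larger one does too.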

## Continuity Correction Factor

There is a problem with approximating the binomial with the normal. That problem arises
because the binomial distribution is a discrete distribution while the normal distribution is a
continuous distribution. The basic difference here is that with discrete values, probability sits
at individual points, so we are talking about heights but no widths. With the continuous
distribution, probability is an area under the curve, so we are talking about both heights and
widths.

The correction is to either add or subtract 0.5 of a unit from each discrete x-value, so that each
value x is represented by the interval from x - 0.5 to x + 0.5. This fills in the gaps to make it
continuous. This is very similar to the expanding of class limits to form class boundaries that we
did with grouped frequency distributions.
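To see why widening each value by half a unit helps, the sketch below compares the exact binomial probability of a single value with the area under the approximating normal curve over the matching interval. The values n = 20, p = 0.5, and x = 6 are assumed for illustration; `NormalDist` comes from Python's standard `statistics` module.

```python
import math
from statistics import NormalDist

# Hypothetical example values
n, p = 20, 0.5
q = 1 - p
x = 6

# Exact binomial probability of the single discrete value x (a height with no width)
exact = math.comb(n, x) * p**x * q**(n - x)

# Normal curve with the same mean (np) and standard deviation (sqrt(npq))
approx_dist = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))

# Area over the interval x - 0.5 to x + 0.5 (a height with a width of one unit)
area = approx_dist.cdf(x + 0.5) - approx_dist.cdf(x - 0.5)

print(f"exact P(X = {x})           = {exact:.4f}")
print(f"normal area on 5.5 to 6.5 = {area:.4f}")
```

Without the correction, the area at the single point x = 6 under a continuous curve would be zero, which is why the interval is needed.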

Examples

| Discrete | Continuous |
|----------|---------------|
| x = 6 | 5.5 < x < 6.5 |
| x > 6 | x > 6.5 |
| x >= 6 | x > 5.5 |
| x < 6 | x < 5.5 |
| x <= 6 | x < 6.5 |

As you can see, whether or not equality is included makes a big difference in the discrete
distribution and in the way the conversion is performed. However, for a continuous distribution,
equality makes no difference.
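The conversions in the examples above can be written as a small lookup. This is only a sketch: the function name `continuity_interval` and the use of operator strings are assumptions made for illustration.

```python
import math


def continuity_interval(op, x):
    """Return (low, high) bounds of the continuous interval for the discrete event 'X op x'."""
    if op == "==":
        return (x - 0.5, x + 0.5)
    if op == ">":
        return (x + 0.5, math.inf)    # x itself is excluded, so start above it
    if op == ">=":
        return (x - 0.5, math.inf)    # x itself is included, so start below it
    if op == "<":
        return (-math.inf, x - 0.5)   # x itself is excluded, so stop below it
    if op == "<=":
        return (-math.inf, x + 0.5)   # x itself is included, so stop above it
    raise ValueError(f"unsupported comparison: {op!r}")


for op in ("==", ">", ">=", "<", "<="):
    print(f"x {op} 6  ->  {continuity_interval(op, 6)}")
```

The rule behind every branch is the same: move the boundary half a unit in the direction that keeps the discrete values you want and leaves out the ones you do not.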

Steps to working a normal approximation to the binomial distribution

1. Identify success, the probability of success, the number of trials, and the desired number of
successes. Since this is a binomial problem, these are the same things which were identified
when working a binomial problem.
2. Convert the discrete x to a continuous x. Some people would argue that step 3 should be done
before this step, but go ahead and convert the x before you forget about it and miss the
problem.
3. Find the smaller of np and nq. If the smaller one is at least five, then the larger must also be, so
the approximation will be considered good. When you find np, you're actually finding the
mean, mu, so denote it as such.
4. Find the standard deviation, sigma = sqrt(npq). It might be easier to find the variance and just
stick the square root in the final calculation - that way you don't have to work with all of the
decimal places.
5. Compute the z-score using the standard formula for an individual score, z = (x - mu) / sigma
(not the one for a sample mean).
6. Calculate the probability desired.
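
The steps above can be sketched in Python for one kind of question, the probability of at most k successes. The function names and the example values (n = 20, p = 0.5, k = 6) are assumptions made for illustration, and the exact binomial sum is included only as a check on the approximation.

```python
import math
from statistics import NormalDist


def binom_cdf_exact(k, n, p):
    """Exact P(X <= k) for a binomial with n trials and success probability p."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def normal_approx_at_most(k, n, p):
    """Approximate P(X <= k) with the normal distribution, following the steps in order."""
    q = 1 - p
    x = k + 0.5                        # convert discrete "x <= k" to continuous "x < k + 0.5"
    if min(n * p, n * q) < 5:          # check the smaller of np and nq
        raise ValueError("np and nq should both be at least 5")
    mu = n * p                         # mean
    sigma = math.sqrt(n * p * q)       # standard deviation, sqrt(npq)
    z = (x - mu) / sigma               # z-score for an individual score
    return NormalDist().cdf(z)         # area to the left of z under the standard normal


# Hypothetical example: at most 6 successes in 20 trials with p = 0.5
approx = normal_approx_at_most(6, 20, 0.5)
exact = binom_cdf_exact(6, 20, 0.5)
print(f"normal approximation: {approx:.4f}")
print(f"exact binomial:       {exact:.4f}")
```

Other questions (x = k, x > k, and so on) would change only the conversion step and which area under the curve is taken.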
