Percentiles

The *k*th percentile is the value that is greater than *k* percent
of the data values after ranking them. The *k*th percentile is denoted P_{k}.

Procedure for finding

- Rank the data from lowest to highest
- Multiply the sample size by *k*/100 to find the depth of the *k*th percentile
- If the depth is a whole number, add 0.5. If the depth is not a whole number, round up to the next higher whole number.
- The *k*th percentile is the value in the depth position. If the depth ends in 0.5, then you need to average the two values on either side. For example, if the depth is 19.5, then the *k*th percentile will be the average of the 19th and 20th values.
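
The depth procedure above can be sketched in a few lines of Python. This is a minimal illustration, not a standard library routine: the function name `percentile` and the sample `scores` are made up here, and the sketch assumes *k* is a whole number strictly between 0 and 100.

```python
def percentile(data, k):
    """k-th percentile by the depth method; assumes integer k with 0 < k < 100."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < k < 100:
        raise ValueError("k must be strictly between 0 and 100")
    ranked = sorted(data)                             # rank lowest to highest
    whole, remainder = divmod(len(ranked) * k, 100)   # depth = n * k/100
    if remainder == 0:
        # Whole-number depth: adding 0.5 means averaging the values
        # in positions `whole` and `whole + 1` (1-based).
        return (ranked[whole - 1] + ranked[whole]) / 2
    # Fractional depth: round up to position `whole + 1` (1-based),
    # which is index `whole` in a 0-based list.
    return ranked[whole]


scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]   # made-up sample, n = 10
print(percentile(scores, 25))   # depth 2.5 rounds up to 3, so the 3rd value
print(percentile(scores, 50))   # depth 5 becomes 5.5, so the average of the 5th and 6th values
```

Working with the integer quotient and remainder of n*k by 100 avoids floating-point tests for "is the depth a whole number". Other textbooks and software packages use different percentile rules, so results may differ slightly from this method.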

Formula

The depth of the *k*th percentile is given by n(*k*/100), where n is the sample size. Round up if the result is a decimal; add 0.5 if it is a whole number.

Quartiles

The quartiles are three values that divide the data into four equally sized groups. The first quartile is denoted Q1 and has 25% of the values less than it and 75% of the values greater than it. The second quartile is the same as the median and has 50% of the values less than it and 50% of the values greater than it. The third quartile is denoted Q3 and has 75% of the values less than it and 25% of the values greater than it.

Procedure for finding

- Find Q1 by finding the 25th percentile using the instructions above
- Find the median by finding the 50th percentile using the instructions above
- Find Q3 by finding the 75th percentile using the instructions above

Five Number Summary

The five number summary consists of the minimum, 1st quartile, median, 3rd quartile, and maximum values. Those five numbers divide the data into four equal groups, each containing 25% of the data.

Procedure for finding

- Rank the data from lowest to highest
- Find the minimum and maximum values
- Find the quartiles as instructed above
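
As a rough sketch, the five number summary can be assembled from the percentile depth method described earlier. The names `percentile`, `five_number_summary`, and `scores` are illustrative only, and the helper assumes non-empty data and a whole-number *k* strictly between 0 and 100.

```python
def percentile(data, k):
    """Depth-method percentile; assumes non-empty data and integer k, 0 < k < 100."""
    ranked = sorted(data)
    whole, remainder = divmod(len(ranked) * k, 100)   # depth = n * k/100
    if remainder == 0:
        # Whole-number depth: average the values on either side of depth + 0.5.
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[whole]   # fractional depth: round up to the next position


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    return (
        min(data),
        percentile(data, 25),   # Q1
        percentile(data, 50),   # median
        percentile(data, 75),   # Q3
        max(data),
    )


scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]   # made-up sample
print(five_number_summary(scores))
```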

Interquartile Range

The interquartile range (IQR) is the difference between the third and first quartiles.

Procedure for finding

- Find the quartiles as instructed above
- Subtract the first quartile from the third quartile

Formula

IQR = Q_{3} - Q_{1}

Semiquartile Range

The semiquartile range is half the difference between the third and first quartiles, that is, half the interquartile range.

Procedure for finding

- Find the quartiles as instructed above
- Subtract the first quartile from the third quartile
- Divide the difference by 2

Formula

Semiquartile range = (Q_{3} - Q_{1}) / 2

Midquartile

The midquartile is the midpoint between the first and third quartiles.

Procedure for finding

- Find the quartiles as instructed above
- Add the first and third quartiles together
- Divide by 2
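
The interquartile range, semiquartile range, and midquartile are all built from Q1 and Q3, so one short sketch can cover the three procedures. The function names and the sample `scores` are hypothetical, and the `percentile` helper assumes non-empty data and a whole-number *k* strictly between 0 and 100.

```python
def percentile(data, k):
    """Depth-method percentile; assumes non-empty data and integer k, 0 < k < 100."""
    ranked = sorted(data)
    whole, remainder = divmod(len(ranked) * k, 100)   # depth = n * k/100
    if remainder == 0:
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[whole]


def interquartile_range(data):
    """Q3 minus Q1."""
    return percentile(data, 75) - percentile(data, 25)


def semiquartile_range(data):
    """Half of the interquartile range."""
    return interquartile_range(data) / 2


def midquartile(data):
    """Midpoint of Q1 and Q3."""
    return (percentile(data, 25) + percentile(data, 75)) / 2


scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]   # made-up sample: Q1 = 4, Q3 = 10
print(interquartile_range(scores), semiquartile_range(scores), midquartile(scores))
```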

Formula

Midquartile = (Q_{1} + Q_{3}) / 2

10-90 Percentile Range

The 10-90 percentile range is the difference between the 90th and 10th percentiles. See the trimmed mean for another instance where the data between the 10th and 90th percentiles are used.

Procedure for finding

- Find the 10th percentile using the instructions above
- Find the 90th percentile using the instructions above
- Subtract the 10th percentile from the 90th percentile
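
A small sketch of these three steps, reusing the percentile depth method. The names `percentile`, `percentile_range_10_90`, and `scores` are made up for illustration, and the helper assumes non-empty data and a whole-number *k* strictly between 0 and 100.

```python
def percentile(data, k):
    """Depth-method percentile; assumes non-empty data and integer k, 0 < k < 100."""
    ranked = sorted(data)
    whole, remainder = divmod(len(ranked) * k, 100)   # depth = n * k/100
    if remainder == 0:
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[whole]


def percentile_range_10_90(data):
    """90th percentile minus 10th percentile."""
    return percentile(data, 90) - percentile(data, 10)


scores = [2, 4, 4, 5, 7, 8, 9, 10, 12, 15]   # made-up sample, n = 10
# Depths 1 and 9 are whole numbers, so each percentile is an average of two neighbours.
print(percentile(scores, 10), percentile(scores, 90), percentile_range_10_90(scores))
```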

Formula

10-90 percentile range = P_{90} - P_{10}

Mild Outliers

Mild outliers are data values that lie between 1.5 and 3 times the interquartile range below the 1st quartile or above the 3rd quartile.

Procedure for finding

- Rank the data from lowest to highest. This is not necessary, but it will make life easier
- Find the quartiles as instructed above
- Find the interquartile range by subtracting Q_{1} from Q_{3}
- Find the lower range of values for mild outliers by taking Q_{1}-3*IQR and Q_{1}-1.5*IQR
- Find the upper range of values for mild outliers by taking Q_{3}+1.5*IQR and Q_{3}+3*IQR
- Any data value that falls within the lower range or within the upper range is a mild outlier

Extreme Outliers

Extreme outliers are data values that lie at least three times the interquartile range below the 1st quartile or above the 3rd quartile.

Procedure for finding

- Rank the data from lowest to highest. This is not necessary, but it will make life easier
- Find the quartiles as instructed above
- Find the interquartile range by subtracting Q_{1} from Q_{3}
- Find the lower cutoff for extreme outliers by taking Q_{1}-3*IQR
- Find the upper cutoff for extreme outliers by taking Q_{3}+3*IQR
- Any data value that falls below the lower cutoff or above the upper cutoff is an extreme outlier
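
The mild and extreme outlier rules can be combined into one classification pass. The sketch below is illustrative: `classify_outliers` and the sample `readings` are invented names, and the treatment of values that sit exactly on a cutoff (counted as outliers of that level) is an assumption, since the text does not settle boundary cases.

```python
def percentile(data, k):
    """Depth-method percentile; assumes non-empty data and integer k, 0 < k < 100."""
    ranked = sorted(data)
    whole, remainder = divmod(len(ranked) * k, 100)   # depth = n * k/100
    if remainder == 0:
        return (ranked[whole - 1] + ranked[whole]) / 2
    return ranked[whole]


def classify_outliers(data):
    """Return (mild, extreme) lists of outliers, each in ascending order."""
    q1, q3 = percentile(data, 25), percentile(data, 75)
    iqr = q3 - q1
    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr   # mild cutoffs
    outer_low, outer_high = q1 - 3 * iqr, q3 + 3 * iqr       # extreme cutoffs
    mild, extreme = [], []
    for x in sorted(data):
        # Assumption: a value exactly on a cutoff counts as an outlier of that level.
        if x <= outer_low or x >= outer_high:
            extreme.append(x)
        elif x <= inner_low or x >= inner_high:
            mild.append(x)
    return mild, extreme


# Made-up sample: Q1 = 10.5, Q3 = 16.5, IQR = 6,
# so the mild cutoffs are 1.5 and 25.5 and the extreme cutoffs are -7.5 and 34.5.
readings = [-30, 1, 10, 11, 12, 13, 14, 15, 16, 17, 26, 50]
mild, extreme = classify_outliers(readings)
print("mild:", mild)
print("extreme:", extreme)
```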

Standard Score : z-score

The standard score or z-score is a value that is found by subtracting the mean and then dividing by the standard deviation. No matter what the original mean and standard deviation of the data were, after applying the standardization transformation, the mean will be zero and the standard deviation will be one.

The z-score allows us to compare samples that have different means and standard deviations to see how scores compare relative to their sample.

Procedure for finding

- Find the mean of the data
- Find the standard deviation of the data
- Subtract the mean from the data value
- Divide by the standard deviation
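
The four steps above translate directly into code. This sketch uses Python's `statistics` module and assumes the sample standard deviation (`statistics.stdev`, which divides by n - 1); the text does not say whether the sample or population version is intended. The name `z_scores` and the sample `values` are made up.

```python
import statistics


def z_scores(data):
    """Standardize each value: subtract the mean, divide by the standard deviation."""
    mean = statistics.mean(data)
    sd = statistics.stdev(data)   # assumption: sample standard deviation (n - 1)
    return [(x - mean) / sd for x in data]


values = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample with mean 5
z = z_scores(values)

# After standardizing, the mean should be (numerically) zero
# and the standard deviation should be (numerically) one.
print(statistics.mean(z), statistics.stdev(z))
```

If every data value is identical, the standard deviation is zero and the division fails, so the z-score is undefined in that case.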

Formula

z = (x - mean) / (standard deviation)