# Construction of a Box-and-Whisker plot

## Box-and-whisker plot

A Box‑and‑Whisker plot (Tukey, 1977) is constructed as follows:

- a box is drawn from the **1st** to the **3rd quartile** (the 25th and 75th percentiles)
- a horizontal line is drawn at the **median** (the 50th percentile)
- the Interquartile range (IQR) is calculated: **IQR** = 3rd quartile − 1st quartile
- an imaginary line is drawn at the 3rd quartile + 1.5 × IQR; this is the **Upper Inner fence**
- the highest value (observation, measurement) just below the upper inner fence is the **upper adjacent value**; a horizontal line is drawn at this value
- a vertical line is drawn from the 3rd quartile to the upper adjacent value
- an imaginary line is drawn at the 3rd quartile + 3 × IQR; this is the **Upper Outer fence**
- all values higher than the upper inner fence are always represented in the graph
  - a value higher than the upper outer fence is called a **Far out** value (these are drawn using a different symbol)
  - a value higher than the upper inner fence but not higher than the upper outer fence is called an **Outside value**
- similar lines are drawn at the lower side of the plot

Note that John Tukey did not use the term 'outlier' for 'outside' and 'far out' values.
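The construction steps above can be sketched in a few lines of Python. This is a minimal illustration, not MedCalc's implementation: the function name `box_plot_stats` is hypothetical, and the quartiles are computed with the standard library's linear-interpolation ("inclusive") method, which is an assumption. Other software may define quartiles differently and give slightly different fences.

```python
from statistics import quantiles


def box_plot_stats(values):
    """Return the quantities needed to draw a Tukey box-and-whisker plot."""
    data = sorted(values)

    # Assumption: quartiles by linear interpolation ("inclusive" method).
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1

    # Inner fences at 1.5 x IQR, outer fences at 3 x IQR beyond the box.
    upper_inner, upper_outer = q3 + 1.5 * iqr, q3 + 3.0 * iqr
    lower_inner, lower_outer = q1 - 1.5 * iqr, q1 - 3.0 * iqr

    # Adjacent values: the most extreme observations within the inner fences.
    upper_adjacent = max(x for x in data if x <= upper_inner)
    lower_adjacent = min(x for x in data if x >= lower_inner)

    # Outside values lie between the inner and outer fences;
    # far out values lie beyond the outer fences.
    outside = [x for x in data
               if upper_inner < x <= upper_outer or lower_outer <= x < lower_inner]
    far_out = [x for x in data if x > upper_outer or x < lower_outer]

    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "upper_inner_fence": upper_inner, "upper_outer_fence": upper_outer,
        "lower_inner_fence": lower_inner, "lower_outer_fence": lower_outer,
        "upper_adjacent": upper_adjacent, "lower_adjacent": lower_adjacent,
        "outside": outside, "far_out": far_out,
    }


# Made-up sample with one moderately high and one extremely high value.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40])
for name, value in stats.items():
    print(f"{name}: {value}")
```

In this made-up sample the whiskers end at the adjacent values, while the two largest observations fall beyond the upper inner fence and would be drawn as individual points.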

## Notched Box-and-Whisker plot

A notched Box‑and‑Whisker plot (McGill et al., 1978) is constructed in the same way as the Box-and-Whisker plot described above, but adds notches around the median that represent a confidence interval(*) for the median.

(The illustration does not show all the details of the regular Box-and-Whisker plot)

The notches surrounding the medians provide a measure of the rough significance of differences between the values. Specifically, if the notches about two medians do not overlap in the display, the medians are, roughly, significantly different at about a 95% confidence level (McGill et al., 1978).

In the following example, there is probably no significant difference between the medians in the two samples because the notches overlap.

MedCalc calculates the notches according to McGill et al. (1978), as follows:

Median ± 1.57 × IQR / √N

where IQR is the Interquartile range and N is the number of cases in the sample.

(*) Important: this confidence interval is **not** a 95% confidence interval of the median. It is **a** confidence interval that allows comparison of the medians.
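The notch calculation and the overlap check can be sketched as follows. This is an illustration only, assuming the McGill et al. (1978) notch limits of median ± 1.57 × IQR / √N and quartiles computed with the standard library's "inclusive" interpolation method. The function names `notch_interval` and `notches_overlap` are hypothetical, and the samples are made up.

```python
from math import sqrt
from statistics import quantiles


def notch_interval(values):
    """Return (lower, upper) notch limits: median +/- 1.57 * IQR / sqrt(N)."""
    # Assumption: quartiles by linear interpolation ("inclusive" method).
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / sqrt(len(values))
    return median - half_width, median + half_width


def notches_overlap(sample_a, sample_b):
    """True if the notches of the two samples overlap."""
    low_a, high_a = notch_interval(sample_a)
    low_b, high_b = notch_interval(sample_b)
    return low_a <= high_b and low_b <= high_a


# Made-up samples: b is shifted slightly from a, c is shifted far from a.
sample_a = [1, 2, 3, 4, 5, 6, 7, 8, 9]
sample_b = [x + 2 for x in sample_a]
sample_c = [x + 10 for x in sample_a]

print(notch_interval(sample_a))
print(notches_overlap(sample_a, sample_b))
print(notches_overlap(sample_a, sample_c))
```

Overlapping notches suggest that the medians are probably not significantly different. Non-overlapping notches suggest a difference at roughly the 95% level, in the approximate sense described above.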

## Literature

- McGill R, Tukey JW, Larsen WA (1978) Variations of box plots. The American Statistician, 32, 12-16.
- Tukey JW (1977) Exploratory data analysis. Reading, Mass: Addison-Wesley Publishing Company.

## MedCalc procedures that offer Box-and-Whisker plots

- Box-and-Whisker plot: Box-and-Whisker plot for one variable (no Notched Box-and-Whisker plot)
- Data comparison graphs: Box-and-Whisker plots for two variables
- Multiple comparison graphs: Box-and-Whisker plots for subgroups (one-way classification) of one variable
- Clustered multiple comparison graphs: Box-and-Whisker plots for subgroups (two-way classification) of one variable
- Multiple variables graphs: Box-and-Whisker plots for several variables
- Clustered multiple variables graphs: Box-and-Whisker plots for subgroups of several variables
