Creates a dot plot for a single variable with different graph options such as the inclusion of a Bar, Line or Marker for mean or median, with choice of different error bars for mean (95% CI, 1 SEM, 1 SD, 2 SD, 3 SD, range) or median (95% CI, 25-75 percentiles, 10-90 percentiles, 5-95 percentiles, 2.5-97.5 percentiles, 1-99 percentiles, range), and/or Box-and-whisker plot (Tukey, 1977).
Select the variable of interest, and optionally a filter to include only particular cases in the graph.
Several elements can be selected to add onto the dot plot, and some of these can be combined:
Bar, Horizontal Line and/or Marker for mean or median
The following error bars are available if Bar, Horizontal Line and/or Marker is selected:
- If mean is selected: (none), or 95% CI for the mean, 1 SD, 2 SD, 3 SD, 1 SEM, range
- Note that 2 SEM is not in this list: when the number of cases is large, mean ± 2 SEM corresponds to the 95% confidence interval (CI) for the mean. When the number of cases is small, the 95% CI is calculated as mean ± t * SEM, where t is taken from a t-table (with DF=n−1 and area A=95%).
- Although 1 SEM gives the narrowest error bar, this option is not recommended because the resulting error bar may be highly misleading, especially when the number of cases differs between groups. Preferably, use the 95% CI for the mean to provide a valid graphical comparison of means (Pocock, 1984), or use 2 SD as an indication of the variability of the data.
- If median is selected: (none), or 95% CI for the median, 25-75 percentiles, 10-90 percentiles, 5-95 percentiles, 2.5-97.5 percentiles, 1-99 percentiles, range
- When the number of cases is small, it is possible that the 95% CI for the median is not defined and that it will not be displayed in the graph.
- When you use percentile ranges, take into account the number of observations: you need at least 100 observations for 1-99 percentiles, at least 20 for 5-95 percentiles, at least 10 for 10-90 percentiles and at least 4 for 25-75 percentiles.
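The relationship between the mean-based error bars described above can be illustrated with a minimal Python sketch. It is not the program's own code: the function name mean_error_bars, the sample data, and the t_crit parameter are assumptions made for illustration. The critical value 2.262 is the standard t-table entry for DF=9 and a 95% area; for another sample size, a different table value would be needed.

```python
import math
import statistics


def mean_error_bars(values, t_crit):
    """Return (low, high) limits for several error-bar choices around the mean.

    t_crit is the critical value of Student's t for a 95% interval with
    DF = n - 1, looked up by the caller in a t-table (hypothetical parameter).
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD (n - 1 denominator)
    sem = sd / math.sqrt(n)         # standard error of the mean
    return {
        "1 SEM": (mean - sem, mean + sem),
        "1 SD": (mean - sd, mean + sd),
        "2 SD": (mean - 2 * sd, mean + 2 * sd),
        "95% CI": (mean - t_crit * sem, mean + t_crit * sem),
        "range": (min(values), max(values)),
    }


# Made-up sample with n = 10, so DF = 9 and the t-table value is 2.262.
data = [4.1, 5.0, 5.2, 5.9, 6.3, 6.8, 7.0, 7.4, 8.1, 9.2]
bars = mean_error_bars(data, t_crit=2.262)
for name, (low, high) in bars.items():
    print(f"{name:>6}: {low:.2f} to {high:.2f}")
```

With a small sample such as this one, the 95% CI is wider than mean ± 2 SEM because the t value exceeds 2, and the 1 SEM bar is the narrowest of all, which is why it can give a misleading impression of precision.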
In a Box-and-Whisker plot, the central box represents the values from the lower to the upper quartile (25th to 75th percentile). The middle line represents the median. A line extends from the minimum to the maximum value, excluding "outside" and "far out" values, which are displayed as separate points.
- An outside value is defined as a value that is smaller than the lower quartile minus 1.5 times the interquartile range, or larger than the upper quartile plus 1.5 times the interquartile range (inner fences).
- A far out value is defined as a value that is smaller than the lower quartile minus 3 times the interquartile range, or larger than the upper quartile plus 3 times the interquartile range (outer fences). These values are plotted with a different marker in the warning color (see Colors section of Format graph).
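The fence rules above can be sketched in a few lines of Python. This is an illustration only, not the program's implementation: the function name classify_box_plot_values and the sample data are hypothetical, and the quartiles are computed with the "inclusive" method of Python's statistics.quantiles, which may differ from the percentile method the program uses.

```python
import statistics


def classify_box_plot_values(values):
    """Split values into box, whisker ends, outside values and far out values."""
    # Quartile method is an assumption; other percentile definitions exist.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)   # inner fences
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)       # outer fences

    # Far out values lie beyond the outer fences.
    far_out = [v for v in values if v < outer[0] or v > outer[1]]
    # Outside values lie beyond the inner fences but not beyond the outer ones.
    outside = [
        v for v in values
        if (v < inner[0] or v > inner[1]) and outer[0] <= v <= outer[1]
    ]
    # The whiskers end at the most extreme values within the inner fences.
    inside = [v for v in values if inner[0] <= v <= inner[1]]
    return {
        "box": (q1, median, q3),
        "whiskers": (min(inside), max(inside)),
        "outside": outside,
        "far_out": far_out,
    }


# Made-up sample: 20 falls between the inner and outer fence, 40 beyond both.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40]
result = classify_box_plot_values(data)
print(result)
```

In this sketch a value beyond the outer fence is reported only as far out, not also as outside, which matches the way the two kinds of points are drawn with different markers.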
Option: if the data require a logarithmic transformation, select the Logarithmic transformation option.
When you click an individual observation in the graph, the corresponding case is displayed in a pop-up window (see also Select variable for case identification command). If you double-click an observation, the spreadsheet window will open with the corresponding case highlighted.
Right-click in the graph window and click Info on the context menu to get detailed information on the data represented in the graph (sample size, etc.).
Tukey JW (1977) Exploratory Data Analysis.