A histogram is used to summarize discrete or continuous data. In other words, it provides a visual interpretation of numerical data by showing the number of data points that fall within a specified range of values.
Enter the name of a variable and optionally a filter. If you have previously entered this variable and filter in the box for summary statistics, then this new variable will be selectable in the Variable list (click the button).
- Show Normal distribution: option to have a Normal distribution curve (with Mean and Standard Deviation of the data represented in the histogram) superimposed over the histogram.
- Relative frequency (%): option to express frequencies as percentages.
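The Relative frequency (%) option can be illustrated with a small sketch. It is not the program's own code: the function name `relative_frequencies` and the class counts are hypothetical, chosen only to show how absolute frequencies map to percentages of the total.

```python
def relative_frequencies(counts):
    """Convert absolute class frequencies to percentages of the total."""
    total = sum(counts)
    if total == 0:
        # No cases at all: report 0% for every class rather than divide by zero.
        return [0.0 for _ in counts]
    return [100.0 * c / total for c in counts]


# Hypothetical class frequencies for five histogram bars.
counts = [2, 5, 8, 4, 1]
percentages = relative_frequencies(counts)
```

With this option the shape of the histogram does not change; only the scale of the frequency axis does.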
After you click OK the program collects the data, performs some initial calculations, and displays the following dialog box:
In this dialog box, the program gives the mean, standard deviation, minimum and maximum value for the selected variable. Next, the default lower and upper limits, and the default number of classes in the histogram are displayed. If you prefer values other than these defaults, you can make the necessary changes. The program will not accept a Lower limit greater than the minimum of the variable, or an Upper limit less than its maximum. When you click OK the program will continue with the new settings, but when you click the program will display the histogram with the initial default settings.
This is the histogram for the variable Weight:
The first bar in this histogram represents the number of cases (frequency) with weight ≥ 55 and < 60. The second bar represents the number of cases with weight ≥ 60 and < 65, etc.
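The class convention described above (each bar counts values greater than or equal to its lower bound and less than its upper bound) can be sketched as follows. This is an illustration under assumptions, not the program's implementation: the function `histogram_counts` and the weight values are hypothetical, and the sketch simply ignores values outside the limits, including a value exactly equal to the upper limit.

```python
def histogram_counts(values, lower, upper, n_classes):
    """Count values per class, each class being the interval [start, end)."""
    width = (upper - lower) / n_classes
    counts = [0] * n_classes
    for v in values:
        if lower <= v < upper:
            index = int((v - lower) // width)
            # Guard against floating-point rounding at the top edge.
            index = min(index, n_classes - 1)
            counts[index] += 1
    return counts


# Hypothetical weights; limits 55 to 80 with 5 classes gives a class width of 5.
weights = [55, 57.5, 59.9, 60, 64, 65, 72, 79.9]
counts = histogram_counts(weights, lower=55, upper=80, n_classes=5)
```

Here the value 60 is counted in the second class (60 to 65), not the first, which matches the "≥ 60 and < 65" rule.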
When the Show Normal distribution option is selected, a Normal distribution plot (with Mean and Standard Deviation of the data represented in the histogram) is superimposed over the histogram.
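One common way to draw such a curve on a frequency axis is to multiply the Normal density by the number of cases and the class width. The sketch below assumes this scaling and uses the sample standard deviation (n − 1 denominator); the text does not confirm either choice for the program, and the function name `normal_curve` and the data are hypothetical.

```python
import statistics


def normal_curve(values, lower, upper, n_classes, points=50):
    """Return (xs, ys) for a Normal curve scaled to histogram frequencies."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample SD (n - 1)
    dist = statistics.NormalDist(mean, sd)
    class_width = (upper - lower) / n_classes
    # Density integrates to 1; scaling by n * width makes it comparable to counts.
    scale = len(values) * class_width
    step = (upper - lower) / (points - 1)
    xs = [lower + i * step for i in range(points)]
    ys = [scale * dist.pdf(x) for x in xs]
    return xs, ys


# Hypothetical weights, plotted over limits 55 to 80 with 5 classes.
weights = [55, 57.5, 59.9, 60, 64, 65, 72, 79.9]
xs, ys = normal_curve(weights, lower=55, upper=80, n_classes=5)
```

The curve peaks at the mean of the data, so a histogram whose tallest bars sit far from that peak is a first visual hint of skewness.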
Using the histogram, it can be evaluated visually whether the data are distributed symmetrically, for example following a Normal (Gaussian) distribution, or whether the distribution is asymmetrical (skewed).
When the distribution is not Normal, it cannot be described accurately by the mean and standard deviation; the median, mode, quartiles and percentiles should be used instead. These statistics are reported in the Summary statistics window.
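The difference between these descriptions can be seen in a short sketch with hypothetical, right-skewed data. It uses Python's standard `statistics` module; quartile values depend on the interpolation method, so the numbers a given program reports in its summary statistics may differ slightly from those computed here.

```python
import statistics

# Hypothetical right-skewed sample: one large value pulls the mean upward.
values = [1, 2, 2, 3, 3, 3, 4, 5, 9, 30]

mean = statistics.mean(values)
sd = statistics.stdev(values)
median = statistics.median(values)
modes = statistics.multimode(values)
# Three cut points dividing the data into four groups (default "exclusive" method).
q1, q2, q3 = statistics.quantiles(values, n=4)
```

In this sample the mean is roughly twice the median, while the median and quartiles stay close to where most of the values lie.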
To change the titles, colors or axis scaling used in the graph, see Format graph.