SciStat
 

Multiple Box-and-whisker plot

The Box-and-whisker plot (Tukey, 1977) gives a graphical summary of the distribution of a variable. It is also useful for detecting outliers.

  • In the basic Box-and-whisker plot, the central box represents the values from the lower to the upper quartile (25th to 75th percentile). The middle line represents the median. The whiskers extend from the minimum to the maximum value, excluding "outside" and "far out" values, which are displayed as separate points.
  • An outside value is defined as a value that is smaller than the lower quartile minus 1.5 times the interquartile range, or larger than the upper quartile plus 1.5 times the interquartile range (inner fences).
  • A far out value is defined as a value that is smaller than the lower quartile minus 3 times the interquartile range, or larger than the upper quartile plus 3 times the interquartile range (outer fences). These values are plotted with a different marker in a different color.
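
The fence rules above can be sketched in a few lines of Python. This is only an illustration, not SciStat's own implementation: the function name `classify_values` is hypothetical, and the quartile method (`method="inclusive"` in the standard library) is an assumption, since statistical packages compute quartiles in slightly different ways.

```python
from statistics import quantiles


def classify_values(values):
    """Split values into regular, outside and far out values (Tukey fences)."""
    # Assumption: "inclusive" quartiles; other packages may use another rule.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1

    inner_low, inner_high = q1 - 1.5 * iqr, q3 + 1.5 * iqr  # inner fences
    outer_low, outer_high = q1 - 3.0 * iqr, q3 + 3.0 * iqr  # outer fences

    regular, outside, far_out = [], [], []
    for v in values:
        if v < outer_low or v > outer_high:
            far_out.append(v)   # beyond the outer fences
        elif v < inner_low or v > inner_high:
            outside.append(v)   # between the inner and outer fences
        else:
            regular.append(v)   # within the inner fences

    return {
        "box": (q1, q3),
        # The whiskers span the values that are neither outside nor far out.
        "whiskers": (min(regular), max(regular)),
        "outside": outside,
        "far_out": far_out,
    }


result = classify_values([1, 2, 3, 4, 5, 6, 7, 8, 9, 20, 40])
print(result)
```

With this sample, the quartiles are 3.5 and 8.5, so the inner fences lie at -4 and 16 and the outer fences at -11.5 and 23.5. The value 20 is therefore an outside value and 40 is a far out value.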

Required input

Data model

  • 1 continuous and 1 categorical variable: select this option when you have one continuous variable and one variable that defines a subgroup classification.
  • 1 continuous and 2 categorical variables: select this option when you have one continuous variable and two variables that each define a classification.
  • n continuous variables without subgroups: select this option when you have several continuous variables, without subgroups.
  • n continuous and 1 categorical variable: select this option when you have several continuous variables and one variable that defines a subgroup classification.
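
As a rough sketch of how the four data models differ, the Python snippet below groups a small table into one list of values per box. The column names (`treatment`, `sex`, `weight`, `height`), the data and the helper `boxes_for` are all hypothetical and serve only to illustrate the layouts; they are not part of the program.

```python
from collections import defaultdict

# Hypothetical data: one row per case, one column per variable.
rows = [
    {"treatment": "A", "sex": "F", "weight": 61.0, "height": 165.0},
    {"treatment": "A", "sex": "M", "weight": 78.0, "height": 180.0},
    {"treatment": "B", "sex": "F", "weight": 58.0, "height": 162.0},
    {"treatment": "B", "sex": "M", "weight": 82.0, "height": 178.0},
]


def boxes_for(rows, continuous, categorical=()):
    """Return {(variable, level, ...): [values]} with one entry per box."""
    boxes = defaultdict(list)
    for row in rows:
        levels = tuple(row[name] for name in categorical)
        for variable in continuous:
            boxes[(variable, *levels)].append(row[variable])
    return dict(boxes)


# 1 continuous and 1 categorical variable: one box per treatment.
one_by_one = boxes_for(rows, ["weight"], ["treatment"])
# 1 continuous and 2 categorical variables: one box per treatment/sex pair.
one_by_two = boxes_for(rows, ["weight"], ["treatment", "sex"])
# n continuous variables without subgroups: one box per variable.
n_plain = boxes_for(rows, ["weight", "height"])
# n continuous and 1 categorical variable: one box per variable and treatment.
n_by_one = boxes_for(rows, ["weight", "height"], ["treatment"])

for name, boxes in [("1x1", one_by_one), ("1x2", one_by_two),
                    ("n", n_plain), ("nx1", n_by_one)]:
    print(name, sorted(boxes))
```

Each dictionary key corresponds to one box in the resulting graph, so the number of keys shows how many boxes each data model would draw.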

Variables

  • Select the variables of interest.
  • Optionally select a filter to include a subset of cases.

Options

  • If the data require a logarithmic transformation (e.g. when the data are positively skewed), select the Logarithmic transformation option.
  • Notched box-and-whisker plot: in this variation of the box-and-whisker plot (McGill et al., 1978), confidence intervals for the medians are shown as notches surrounding the medians. If the notches about two medians do not overlap, the medians differ significantly at approximately the 95% confidence level.
  • Markers: includes all the data points in the graph. This option is useful because the graph then does not conceal the actual data.
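
To make the notch comparison concrete, the sketch below uses the notch width commonly attributed to McGill et al. (1978): median ± 1.57 × IQR / √n. Whether the program uses exactly this constant and this quartile method is an assumption, and the function names are hypothetical.

```python
from math import sqrt
from statistics import median, quantiles


def notch_limits(values):
    """Approximate 95% notch around the median: median +/- 1.57 * IQR / sqrt(n)."""
    # Assumption: "inclusive" quartiles; the program may use another rule.
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    half_width = 1.57 * (q3 - q1) / sqrt(len(values))
    centre = median(values)
    return centre - half_width, centre + half_width


def notches_overlap(a, b):
    """True if the notches of the two samples overlap."""
    low_a, high_a = notch_limits(a)
    low_b, high_b = notch_limits(b)
    return low_a <= high_b and low_b <= high_a


group_a = list(range(1, 17))            # 1, 2, ..., 16
group_b = [x + 10 for x in group_a]     # same spread, median shifted by 10
group_c = [x + 2 for x in group_a]      # same spread, median shifted by 2

print(notch_limits(group_a))
print(notches_overlap(group_a, group_b))  # notches do not overlap
print(notches_overlap(group_a, group_c))  # notches overlap
```

Non-overlapping notches, as for `group_a` and `group_b`, suggest that the medians differ; overlapping notches, as for `group_a` and `group_c`, do not allow that conclusion.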

Literature

  • Altman DG (1991) Practical statistics for medical research. London: Chapman and Hall.
  • McGill R, Tukey JW, Larsen WA (1978) Variations of box plots. The American Statistician, 32, 12-16.
  • Tukey JW (1977) Exploratory data analysis. Reading, Mass: Addison-Wesley Publishing Company.
