What is a histogram?
A histogram is a graphical way of representing how often data values fall within particular intervals. The more traditional description is that a histogram is a plot of a frequency table, where the height of each bar tells us how many data points are in the corresponding interval.
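As a minimal sketch of the frequency table behind a histogram, the Python snippet below counts how many values land in each interval and prints one row of `#` marks per interval. The function name `frequency_table`, the sample data, and the choice of half-open intervals (the lower edge is included, the upper edge is not) are assumptions made for illustration only.

```python
def frequency_table(values, bin_width, start=0):
    """Count how many values fall in each interval [lo, lo + bin_width)."""
    counts = {}
    for v in values:
        # Index of the interval that contains v, counted from `start`.
        index = int((v - start) // bin_width)
        lo = start + index * bin_width
        counts[lo] = counts.get(lo, 0) + 1
    # Sort by the lower edge of each interval.
    return dict(sorted(counts.items()))


# Illustrative data, not taken from any real data set.
data = [1, 2, 2, 3, 5, 6, 6, 6, 9]
table = frequency_table(data, bin_width=3)

# Text version of the chart: bar length equals the count for that interval.
for lo, count in table.items():
    print(f"{lo} to <{lo + 3}: {'#' * count}")
```

Changing `bin_width` changes how the same data is grouped, which is why two histograms of one data set can look quite different.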
What is a box plot?
Box plots, sometimes referred to as ‘box-and-whisker plots’, are a visual way of summarizing the basic characteristics of a data set. A box plot shows the highest and lowest values that are not outliers, the middle value (the median), and the values at the first and third quartiles. Outliers are shown as dots beyond the ‘whiskers’. This gives us a simple way to quickly understand the spread of the data and is great for comparing two data sets.
To picture the quartiles, imagine the data set ranked from its lowest to its highest value: Q1 is the middle of the low values (those below the median) and Q3 is the middle of the high values (those above the median). A box plot breaks a data set into four parts: below Q1, Q1 to the median (Q2), the median to Q3, and above Q3. The interquartile range is defined as Q3 – Q1.
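The "middle of the low values, middle of the high values" description can be sketched in Python as follows. This is only one convention: statistical packages use several slightly different quartile methods, so their results may not match this sketch exactly. The function name `quartiles` and the sample data are assumptions for illustration.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention.

    Needs at least two values. With an odd number of values, the
    median itself is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]        # values below the median
    upper_half = ordered[(n + 1) // 2 :]  # values above the median
    return median(lower_half), median(ordered), median(upper_half)


# Illustrative data, not taken from any real data set.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]
q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # interquartile range
print(q1, q2, q3, iqr)
```

For this sample, the low values are 1, 3, 4, 6 and the high values are 8, 10, 12, 15, so Q1 and Q3 are the medians of those two groups.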
The lower edge of the box in the box plot is Q1, the line inside the box is the median, and the upper edge of the box is Q3.
Outliers are typically defined using a distance of 1.5 * (Q3 – Q1), that is, 1.5 times the interquartile range. If a data point lies more than that distance below Q1 or above Q3 (the edges of the box), it is considered an outlier. The whiskers of the box plot are lines from each end of the box out to the farthest data point that is not an outlier.
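The outlier rule and the whisker ends can be sketched in Python as below. The function name `whiskers_and_outliers`, the term "fence" for the cut-off values, and the sample data are assumptions for illustration. Q1 and Q3 are passed in directly; the values used here come from taking the medians of the lower and upper halves of the sample.

```python
def whiskers_and_outliers(values, q1, q3, k=1.5):
    """Return ((low whisker, high whisker), outliers) for a box plot."""
    iqr = q3 - q1
    lower_fence = q1 - k * iqr  # anything below this is an outlier
    upper_fence = q3 + k * iqr  # anything above this is an outlier

    inside = [v for v in values if lower_fence <= v <= upper_fence]
    outliers = [v for v in values if v < lower_fence or v > upper_fence]

    # Each whisker ends at the farthest data point that is not an outlier.
    return (min(inside), max(inside)), outliers


# Illustrative data, not taken from any real data set.
data = [1, 3, 4, 6, 7, 8, 10, 12, 30]
whiskers, outliers = whiskers_and_outliers(data, q1=3.5, q3=11)
print(whiskers, outliers)
```

Here the interquartile range is 7.5, so the cut-offs sit at 3.5 - 11.25 = -7.75 and 11 + 11.25 = 22.25. The value 30 is beyond the upper cut-off and is drawn as a dot, while the upper whisker stops at 12.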