Fundamentals: Scales imply Patterns

When it comes to graphing data, the scale of the variables is the most important factor in deciding which graphical representation is useful. Please pardon me for using the “Iris Data” and the “Titanic Data” as examples, but these data sets are prototypes of multivariate continuous data and multivariate categorical data that everybody can relate to.

The first pair of plots (upper row) should puzzle everybody; only extremists of one graphing method or the other would actually plot the data this way.

The second pair of plots (lower row) uses a SPLOM (scatterplot matrix) to graph the Iris data and a mosaic plot to visualize the Titanic data, i.e., both datasets are plotted in a graph that respects the scale of the data.

Admittedly, this example seems too obvious, but with more complex datasets that mix continuous and categorical variables it can be quite helpful to know how to choose the right (set of) graphics to visualize (and analyze) the data properly.
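To make the idea concrete, here is a minimal Python sketch of "scale first, plot second". It is only an illustration: the function names `infer_scale` and `suggest_plot` are made up for this post, the scale check is a crude heuristic (numbers count as continuous, everything else as categorical), and the sample rows are just a couple of illustrative values, not the full data sets.

```python
from typing import Dict, Sequence


def infer_scale(values: Sequence) -> str:
    """Crude heuristic (an assumption of this sketch): numeric values are
    treated as continuous, everything else as categorical."""
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    return "continuous" if is_numeric else "categorical"


def suggest_plot(columns: Dict[str, Sequence]) -> str:
    """Suggest a multivariate display based on the scales of all columns."""
    if not columns:
        raise ValueError("need at least one column")
    scales = {infer_scale(values) for values in columns.values()}
    if scales == {"continuous"}:
        return "scatterplot matrix (SPLOM)"
    if scales == {"categorical"}:
        return "mosaic plot"
    # Mixed scales: no single default here; decide per variable combination.
    return "mixed scales: choose per variable combination"


# A couple of illustrative rows only, not the complete data sets.
iris = {"sepal_length": [5.1, 4.9], "petal_width": [0.2, 0.2]}
titanic = {"class": ["1st", "3rd"], "survived": ["yes", "no"]}

print(suggest_plot(iris))     # scatterplot matrix (SPLOM)
print(suggest_plot(titanic))  # mosaic plot
```

In real data the heuristic would need refinement, for example for categories coded as numbers, but the order of decisions stays the same: determine the scale, then pick the graphic.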

(To create the graphics yourself, you might use Mondrian.)
