The Good & the Bad [2/2012]
Looking for a map of the French departments, I came across this map of the population density of France at the department level, which can be found on Wikipedia. As you may have guessed, this is this month’s “The Bad”.
At first sight there seems to be a contradiction between the apparently continuous color scale (see here for some thoughts on choropleth maps) and the map, which does not seem to give any decent insight into the geographical distribution of population density. The answer is twofold.
1. The color scale is not continuous but has breaks between green and blue (unless you invert the shades of blue) and between blue and yellow. What we would expect – in less saturated colors – looks like this:
2. For a map showing a continuous quantity, we usually would not choose so many different saturated colors.
Let’s approach “The Good”, as I still need to convince you that there might be a better version of the map. In a perfect world, choropleth maps look smooth and “continuous”. For the map of France we might want to look at the distance to the capital Paris, as France is very centralized. This map uses a monochromatic scale and shows “the perfect world” …
As this one is obviously too trivial, we want to look at the population density as in the above plot (2011 census data from Wikipedia). Using a simple linear scale we would end up with this (useless) map, which uses a color scale that ranges from blue (small values) over white (median values) to red (large values):
Except for Paris and three other departments, all regions appear unpopulated compared to the capital. The extremely skewed distribution, which is shown in the lower left, explains the dilemma.
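To see why a linear scale fails on such skewed data, here is a minimal Python sketch. The density values are made up for illustration (they are not the census figures), and `linear_scale` is a hypothetical helper, not something taken from Mondrian.

```python
def linear_scale(values):
    """Map each value linearly to a position in [0, 1] on the color scale."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Made-up, strongly right-skewed densities (illustration only, not census data).
densities = [15, 30, 50, 75, 100, 150, 300, 1000, 20000]

positions = linear_scale(densities)

# Count how many values are squeezed into the lowest 5% of the color scale.
below_5_percent = sum(p < 0.05 for p in positions)
print(below_5_percent, "of", len(densities), "values share the lowest 5% of the scale")
```

A single extreme value stretches the scale so far that almost everything else ends up in practically the same shade.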
Using the same “trick” as in the original wiki-map, i.e., cutting off all values above 150, we get a map that is easier to read, but one that now lumps together all areas above 150.
(Note: I used the histogram of log(population density) for the legend.)
The result is much better now, but there seem to be too many departments put into a single class.
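The cut-off can be sketched in a few lines of Python. Again, the values are made up for illustration and `clipped_scale` is a hypothetical helper; only the threshold of 150 comes from the text above.

```python
def clipped_scale(values, cap=150.0):
    """Cut off all values above `cap`, then scale linearly to [0, 1]."""
    clipped = [min(v, cap) for v in values]
    lo, hi = min(clipped), max(clipped)
    return [(c - lo) / (hi - lo) for c in clipped]


# Made-up, strongly right-skewed densities (illustration only, not census data).
densities = [15, 30, 50, 75, 100, 150, 300, 1000, 20000]

positions = clipped_scale(densities)

# Every value at or above the cap collapses into the top color class.
in_top_class = sum(p == 1.0 for p in positions)
print(in_top_class, "of", len(densities), "values share the top color")
```

The lower values now spread over the whole scale, but everything at or above the cap becomes indistinguishable, which is the "too many departments in a single class" effect.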
From the data on the log-scale, we already see what would be most desirable, i.e., a distribution of colors that is close to a normal distribution. Using a non-continuous transformation of the variable we display, we can map the color-shades to be normal, which ends up in the following map, which I would classify as “The Good”.
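One way to get normally distributed color shades is a rank-based normal-score transformation. The sketch below is only one possible reading of the idea and not necessarily what Mondrian does internally; the data are made up and `normal_scores` is a hypothetical helper.

```python
from statistics import NormalDist


def normal_scores(values):
    """Replace each value by the standard normal quantile of its rank.

    Ties are not treated specially in this sketch; tied values simply
    receive neighboring ranks in their original order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # (rank - 0.5) / n keeps the probability strictly inside (0, 1).
        scores[i] = standard_normal.inv_cdf((rank - 0.5) / n)
    return scores


# Made-up, strongly right-skewed densities (illustration only, not census data).
densities = [15, 30, 50, 75, 100, 150, 300, 1000, 20000]

scores = normal_scores(densities)
print([round(s, 2) for s in scores])
```

Placing these scores linearly on a blue-white-red scale puts the median at white and spreads the remaining values symmetrically around it, regardless of how skewed the raw data are.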
We now get a fairly good feeling of which regions are highly populated, which ones are close to the median (even with a distinction of being above or below average) and also clearly see the extremely unpopulated departments.
There is a lot more to say about the do’s and don’ts for drawing choropleth maps (which can be found here in Chapter 6). What is even more fun is to play around yourself! Here is the data (unzip and load France.txt with Mondrian) and here is the software – have fun!
(Thanks to Antony for providing the map!)