The Good and the Bad: chicken and egg problem

Robert has a very long and profound post on this chart:

GE Chart by Ben Fry

The whole interactive thing can be found here on the GE site. It seems to be a bit of a provocation that Ben Fry’s company uses a tattered pie chart to visualize data that would definitely be better shown in a line chart (i.e., a time series with age as the time axis). There is the suspicion that the radius is proportional to the quantities, which would really top it all off … Unfortunately we don’t have the data at hand to offer an improved version. Does anyone want to take on the burden of noting all the values by hand?

Apart from all the technical criticism, which includes the cute but completely useless animation, there is the fundamental chicken-and-egg question:

A good visualization should tell us a story about the data that we didn’t know before, and not the other way round, i.e., first knowing the story and then building a visualization around it.

I actually have a hard time finding a story here at all …


  1. Cris says:

    It’s not hard to get to the data. You can download the Java application from the GE website:
    Inside the ZIP file is a JAR file, which is actually just a ZIP file with Java source files inside. If you unzip the JAR file, you’ll find a directory ‘data’ with the data files inside: ‘cond-age-cost.csv’ has 3×12 columns. These seem to be the 12 diseases with three columns each: first total personal cost, second total insurer cost, third number of patients. The 80 rows seem to be the ages, starting at 0 years for the first row. The file ‘subset.csv’ has 12 disease names. These correspond to the 12 groups of columns. Looking at ‘Chronic Bronchitis’, 18 years old on the graph, and the first group of columns in the data file, row 19, I see numbers that match. I also see that the GE application rounded the numbers down!

    Do I understand correctly that the width of the segments is the relative number of patients, and the radius is adjusted so that the area shows total cost? So the radius is sort of related to the individual cost? The data is interesting, but it’s difficult to get anything out of it this way.

  2. […] This is exactly what I mentioned earlier in an older post: […]
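
Following Cris’s description of the data files in the first comment, here is a rough Python sketch of how the wide table could be reshaped into one age series per disease, which is what a line chart would need. Everything about the layout is an assumption taken from that comment and not checked against the real files: no header row, comma-separated values, one disease name per line in ‘subset.csv’, three columns per disease (personal cost, insurer cost, patients), and the first row being age 0. The names load_series and wedge_radius are made up for illustration. The second function only spells out the geometry Cris guesses at: if a segment’s angle stands for the number of patients and its area for total cost, then the radius grows with the square root of cost per patient.

```python
import csv
import io
import math


def load_series(cost_text, names_text):
    """Reshape the wide cost table into {disease: [(age, personal, insurer, patients), ...]}.

    Assumed layout (from the comment above, not verified): no header row,
    three columns per disease, first row is age 0.
    """
    names = [line.strip() for line in names_text.splitlines() if line.strip()]
    rows = [row for row in csv.reader(io.StringIO(cost_text)) if row]
    series = {name: [] for name in names}
    for age, row in enumerate(rows):
        values = [float(v) for v in row]
        if len(values) != 3 * len(names):
            raise ValueError(f"row {age}: expected {3 * len(names)} columns, got {len(values)}")
        for i, name in enumerate(names):
            personal, insurer, patients = values[3 * i:3 * i + 3]
            series[name].append((age, personal, insurer, patients))
    return series


def wedge_radius(total_cost, angle):
    """Radius of a circular sector whose area equals total_cost.

    Sector area is r**2 * angle / 2 (angle in radians), so
    r = sqrt(2 * total_cost / angle).
    """
    return math.sqrt(2.0 * total_cost / angle)
```

With the real files, one would read ‘cond-age-cost.csv’ and ‘subset.csv’ as text and pass both strings to load_series; each resulting list can then be plotted against age as an ordinary line.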
