In December last year a journalist posted this graph to Twitter, with the caption: “This is what happens to your weight when you stop drinking.. (app on my phone tracks weight among other things via digital scales) Weight is the heavy blue line.”
What does this graph appear to show, do you think? It looks like a big drop, but, actually, there is no meaningful information here.
I hate to let graphs like this go unchallenged, because it worries me that they reinforce our misconceptions. As it stands, completely unlabelled, this graph is meaningless. In my reply I focused on the lack of a zero on the scale, but, really, there is more to it than that. You could get a graph exactly like this with a weight loss of 2 grams, or 60 kilograms. You can’t even be sure this graph shows a drop in weight – the scale could be negative.
Without labels, there’s no knowing what we’re looking at. In posting it, the journalist clearly thought it was meaningful, and it’s true that we often believe we understand graphs without really thinking about what they actually say.
The journalist in question got snarky with my reply, and suggested I resist the urge to comment in future. Another journalist jumped in with this response, which was actually really alarming:
“I’d just caution against the assumption that we are being misled like we are a) innumerate and can’t read graphs and b) we don’t have some sense of the probable starting weight of an adult male who is happy to shed a few kilos. It’s OK, we’ve got this!”
Even having had her attention drawn to the issue, she still felt she was data literate enough to make sense of the graph. And this is what worries me. It’s part of why I founded ADSEI. Because if we think we are extracting meaning from a wholly meaningless graph, then we need some serious data literacy help, STAT!
(I’m not naming the journalists here because this is not about shaming them; it’s a really widespread issue.)
Consider the same graph, reproduced below with two different sets of labels. The first one shows a starting point of 108 kg and a finishing point of 100 kg: 8 kg lost! The second shows a starting point of 50 kg and a finishing point of 49 kg: just 1 kg lost. Both sets of labels work equally well on this graph, because there is no scale. No starting point. No finishing point. No indication of the distance covered by the axis.
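If you'd like to see why both sets of labels fit, here is a rough Python sketch. It assumes the plotting app auto-scales the axis so the line fills the plot area, which is my guess at what happened, not something I know about that app. The readings are invented for illustration (they are not the journalist's data), and `plot_positions` is a made-up helper name.

```python
def plot_positions(values):
    """Map values to 0..1 heights, as an auto-scaled, unlabelled axis would draw them."""
    lo, hi = min(values), max(values)
    # Round so tiny floating-point differences don't mask identical shapes.
    return [round((v - lo) / (hi - lo), 6) for v in values]

# Hypothetical shape of the line: 1.0 is the top of the plot, 0.0 the bottom.
shape = [1.0, 0.9, 0.75, 0.4, 0.25, 0.0]

big_loss = [100 + 8 * s for s in shape]    # 108 kg down to 100 kg
small_loss = [49 + 1 * s for s in shape]   # 50 kg down to 49 kg

# Once the labels are stripped away, the two lines are drawn identically.
same_picture = plot_positions(big_loss) == plot_positions(small_loss)
print(same_picture)  # True
```

An 8 kg loss and a 1 kg loss land on exactly the same points of the picture. Only the labels can tell them apart.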
Too often we treat labels on a graph as an optional extra – nice to have, but not super important. The trouble is that, without labels, a graph can give us entirely the wrong idea about the data we’re looking at.
Consider these two basic graphs. They show exactly the same data.
The graph on the left has a y axis starting at 0. The graph on the right has a y axis starting at 94. Both are valid representations of the data, but they give wildly different impressions of the difference between each value. If the y axis were not labelled, you would not be able to tell that it was the same data, nor how significant the difference between each value actually is.
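To put a number on that difference in impression, here is a small Python sketch. The three values are hypothetical (the graphs above don't state theirs), and `drawn_heights` is a made-up helper that assumes a simple bar chart where each bar is drawn from the start of the axis up to its value.

```python
def drawn_heights(values, axis_start):
    """Height of each bar as drawn when the y axis begins at axis_start."""
    return [v - axis_start for v in values]

# Hypothetical data, chosen only so that an axis starting at 94 makes sense.
values = [95, 97, 99]

for axis_start in (0, 94):
    heights = drawn_heights(values, axis_start)
    ratio = max(heights) / min(heights)
    print(f"axis starts at {axis_start}: tallest bar looks {ratio:.2f}x the shortest")

# axis starts at 0: tallest bar looks 1.04x the shortest
# axis starts at 94: tallest bar looks 5.00x the shortest
```

Same data, but in one picture the biggest value looks about 4% larger than the smallest, and in the other it looks five times larger. Neither picture is wrong. You just need the axis labels to know which one you are looking at.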
How often do we carefully read the axes and other labels on a graph? How often do we ask searching questions about the origins of the data, the meaning of the graph, and the validity of the representation? We have a terrible tendency, as a species, to bend at the knees when we see a graph, and treat it as valid and meaningful, without looking closely at the details. We must educate ourselves, and, most importantly, our children, to ensure that we are not so easily fooled.