Greg Jericho, columnist at The Guardian and policy director at the Centre for Future Work, talks with passion about communicating with data. This was a wonderful chat with someone who has really thought through how and why we use data, and what it can do for us. Check it out!
“A good graph just explains what’s going on. Don’t think that data is all about maths, and how technical and how wonderful your maths can be. Sometimes it’s just ‘oh, look, I averaged these things and showed that actually all the movement that seems to have been happening over the last 12 months just turns out to be a lot of noise.'”
“I just kept digging, and looking through the spreadsheets and going through them, and opening up this spreadsheet and finding out what’s in this one… I just sort of built up a knowledge and thought ‘why isn’t anyone else doing this? Isn’t this the obvious thing to do?’ and it wasn’t until I started working as a journalist that I realised no, journalists just read the press release.”
“We don’t actually need to get a quote from a politician to tell us what’s going on, the data has the story.”
“When you start looking at data there is a natural case of just using the most common data. It took a while for me to have the confidence to, for example, now when I’m doing GDP figures I always look at things like real household disposable income per capita. The problem is there is no figure for that in the national accounts. You have to construct it yourself using four different spreadsheets.”
“I’ve gone from just not reading the media release, to not even caring what their numbers are and actually finding my own numbers. Which is always a fun thing, because you find things that you haven’t looked at in the past and think ‘oh, this is something new and cool.'”
“So much of what we take as given in the economic debate really is just based on belief.”
Transcript
Linda: You’re listening to Make Me Data Literate from the Australian Data Science Education Institute.
I’m your host Dr Linda McIver, author of Raising Heretics, Teaching Kids to Change the World.
Let’s go change your mind about data.
Welcome to another episode of Make Me Data Literate.
The very second episode of this podcast was with Juliette O’Brien, who was doing some amazing data journalism around COVID-19, and we haven’t had a data journalist since. So I’m really excited about this episode, particularly because our guest does some amazing data journalism in the space of equality that I find really fascinating and crucial.
So welcome.
Thank you so much for coming.
Greg: It’s very much a pleasure Linda.
Linda: Tell us Greg, who are you and what do you do?
Greg: Hi everyone my name is Greg Jericho.
I have a number of roles really.
I mean, the one that most people will probably know, because I’ve been doing it for the longest, is that I’m an economics columnist at Guardian Australia.
I’ve been doing that for 10 years, and I did write for a few other media outlets in the past, but now I’m exclusively at the Guardian.
But also I guess my day job is as chief economist at the Australia Institute and Centre for Future Work.
So I have been doing that since the start of last year, start of 2022.
So I have those balancing jobs of working as an economist and dealing with policy issues at the Australia Institute and Centre for Future Work, and I’m also still a journalist who writes a column every week.
So in my career I’ve often worn a couple of hats at the same time, and that seems to be what suits me best.
So yeah columnist and economist is a weird combination at times but one that I enjoy.
Linda: It seems to me like they dovetail really nicely that if you encounter a really burning issue in your day job you can then write about it.
Is that how it falls out?
Greg: A little bit.
I mean, to be honest, I try not to use my column to push stuff that we’re doing at the Australia Institute, mostly because I want to respect Guardian Australia.
They don’t employ me to be a lobbyist, in that sense, for the Australia Institute.
Although I certainly have used our research. When I put out a research paper that I think is quite useful and will be interesting to read, such as the one I did earlier this year with a colleague on gender equality, I thought that was worth using because it dovetailed with International Women’s Day, and so I thought, well, that’s actually a good newsworthy thing.
More often than not, I’ll do my weekly column much as I did before I joined the Australia Institute. In fact, part of the reason I’m at both the Australia Institute and Guardian Australia is that they both have a progressive bent, so it’s not a case where I need to think about what I’m doing at the Aus Institute and how I can use that in my Guardian column. Sometimes there are things I think are interesting that I probably wouldn’t do a research paper on at the Australia Institute, because they’re not big enough, but they fit nicely in a column. And sometimes there will be stuff I’m doing research on where I think, actually, this is good research, but it could do with being made clearer for a non-economics audience, and I’ll take a couple of things from it and use them in a Guardian column. But rarely is there a sense where I’m using something from the Aus Institute purely so I can write a column on it.
It very much still remains two jobs, especially because Guardian Australia will often just ask me. Like this week there was a story about the Commonwealth Games being cancelled, so they said, hey, can you write a column on that? Or, oh, there’s something to do with housing, can you write a column on that? So I’m always careful to try and keep them separate. But the good thing is that I’m never having to write things that contradict my overall ethos on economics and politics, because they both marry up nicely. It’s not like in my Guardian column I’m writing about how great markets are and how we should privatise everything. They work well together.
And the great thing is that the Australia Institute didn’t hire me thinking, oh great, we can use your Guardian columns to further our research. They knew there was value in the fact that I’m seen as a credible journalist and writer on economics, and that the things I write about might not be the precise things I’m working on, but they’re often on issues that the Australia Institute cares about in a broader sense.
So yeah, it’s fun, and it gives me that nice balance between working on ongoing or longer things, things that might have a bit more analysis behind them, and that nice daily or weekly news work. There’s always data coming out, so as much as anything my Guardian column might be about what data came out today. That’s more often than not what I’ll be writing about, because news organisations want things to be fresh. With the Australia Institute it’s more about trying to make things topical, which is harder for a news organisation. If you say, oh, I’m writing a column on this, and they ask why, it’s not an issue, and you say, because I want to make it an issue, that’s generally not how they like to operate, unless it’s something we think has been missed, where there has been a massive oversight. But just saying, oh, I want to write something on the stage three tax cuts, when no one’s arguing about that at the moment, and the reason is that we’re doing a paper on it, yeah, that might be a bit uncomfortable, so I try not to do that.
Linda: That makes sense. What did you have to learn to do your work? Was there anything missing from your formal education? Are you self-taught in a bunch of areas or was it all kind of built in?
Greg: I mean, pretty much all self-taught, really. I’ve got an honours degree in economics, and I obviously use the understanding, the theory and the knowledge of how economics works. But when I think about how I use data, how I graph things, how I even analyse things, it was all self-taught.
I started my degree in 1990, and this was back when you just never really saw any ABS data. It was located in Canberra and you had to get hard copies. When I did my honours, the university actually gave us a budget for data. I did my thesis on the manufacturing industry and I needed data on the number of people employed in the manufacturing sector, and you needed to pay for it. So there was no part of the degree that involved, oh, here’s some data that’s come out, let’s have a look at that and what’s going on, GDP has increased. You learned how you construct GDP in the theoretical sense, but never in the sense of, okay, how is this actually measured in the national accounts? You knew it was just Y equals G plus C plus I plus X minus M, and then as you went on you started using calculus to work it all out.
It wasn’t really until I started writing my blog back in 2008-9, and perhaps a bit before that, that I started going on the ABS website and actually looking at the spreadsheets and realising, oh god, that’s what’s in all these things. A lot of what I worked out was just trying to replicate what was in the ABS releases themselves. Even things like contributions to growth weren’t something I learned at university, so it was really a case of, okay, how’d they get that figure? I dug through their methodology, found their formula and went, oh, I can do that. And a whole lot of other things came from just reading other economists.
When I started my blog and was writing about economics outside the media, I was working as a public servant. I wasn’t working as an economist in the public service, I was just a policy officer, and I hadn’t really used my economics knowledge for about 15 years. So it was a real case of, okay, let’s break open the box again and blow out the cobwebs a bit. And I realised that, because I’d done a degree and I’d actually looked into the figures rather than just reading media releases, I knew more about this than a lot of journalists who were writing about it and making it sound like they knew what they were talking about. So I just kept digging, kept looking at things, kept looking at the spreadsheets and going through them, opening up this spreadsheet and finding out what’s in this one, and doing that every day, again and again. I built up a knowledge and thought, why isn’t anyone else doing this? Isn’t this the obvious thing to do?
And it wasn’t really until I started working as a journalist that I realised that, no, journalists mostly just read the media release. There might be the odd one or two good ones who read a bit deeper, and back then there weren’t many of them, but there weren’t many actually looking at the spreadsheets. If I can claim any sort of impact in journalism, it’s that I’ve helped increase the requirement for economics journalists to be able to do a bit more than just look at the media release. Certainly in the past dozen years, since I’ve been doing my thing, doing graphs and using that, there has been an increase in the number of economics journalists who are doing that as well, rather than just repeating the media release, getting a quote from the treasurer and the shadow treasurer, and that’s your economics article. There’s a bit more of, okay, here’s what’s actually going on. We don’t actually need to get a quote from a politician to tell us what’s going on, the data has the story. And that all came about through a lot of trial and error. I look back at some of my old graphs and shudder,
Linda: Why, because the data was wrong or just because of the graphs?
Greg: No, the graphs are terrible. They’re just the go-to Excel ones, the y-axis isn’t even particularly clear about what it’s actually measuring, and they just look terrible. Also, when you start looking at data there is a natural case of just using the most common data. It took a while for me to have the confidence to, for example, now when I’m doing GDP figures I always look at things like real household disposable income per capita. The problem is there is no figure for that in the national accounts. You have to construct it yourself using about four different spreadsheets. It probably took me a couple of years to get to the point where I’m like, yeah, I can actually do this, I have the economics background, I know what I’m doing is right, instead of just going, oh well, the ABS are counting these things, so obviously they’re the things we should be caring about. It’s good that I’m now at the point where I’ve gone from just not reading the media release to not even caring what their numbers are and actually finding my own numbers. Which is always a fun thing, because you find things that you hadn’t looked at in the past and realise, oh, this is something new and cool, and it keeps it all a bit interesting.
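For readers who want to see the shape of that kind of calculation, here is a minimal sketch. It is not Greg’s actual method: it assumes three inputs taken from separate tables (nominal household disposable income, a price deflator and population), and the function names and sample numbers are hypothetical placeholders rather than ABS figures.

```python
def real_income_per_capita(nominal_income_m, deflator_index, population):
    """Real household disposable income per person, in dollars.

    nominal_income_m: nominal household disposable income, $ millions
    deflator_index:   price deflator, where 100 means base-period prices
    population:       number of people
    """
    # Remove the effect of price changes, then share the total across people.
    real_income_m = nominal_income_m / (deflator_index / 100)
    return real_income_m * 1_000_000 / population


def growth_pct(current, previous):
    """Percentage change from the previous period to the current one."""
    return (current / previous - 1) * 100


# Placeholder numbers, purely illustrative (NOT real ABS data).
last_year = real_income_per_capita(300_000, 100.0, 25_000_000)
this_year = real_income_per_capita(318_000, 104.0, 25_500_000)
print(round(last_year), round(this_year))
print(round(growth_pct(this_year, last_year), 2))
```

With these made-up inputs, nominal income rises 6 per cent but the real per-person figure edges down once prices and population growth are taken out, which is the kind of gap a headline number can hide.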
Linda: I think it’s so interesting when you can actually challenge some of the statements as well and go, well, hang on a minute. I think that’s how the Australia Institute first came to my attention, when Matt Grudnoff took the statement by Tony Abbott that you can’t tax your way to a stronger economy and went, I think you can. And so he went and actually did the analysis and went, oh look, turns out you can. I just love that, that you can actually challenge things and go look at the data and go, no, what you’re saying doesn’t track at all.
Greg: Yeah, and so much of what we take as given in the economic debate really is just based on belief. It’s something I did this week in my article on strikes. We’re always told we can’t have industrial action because it reduces productivity, it saps productivity. The Productivity Commission even says this, and everyone says this, and everyone just regurgitates it: when we have a strike, oh, that’s terrible for the economy, it’s going to kill productivity and the like. So I thought, well, we can measure how many days are lost per thousand employees, and we can measure GDP per hour worked, so let’s compare them. And you see there’s absolutely no correlation at all. There’s no sense that when we’re having lots of strikes it’s a period of low productivity, and when we’re having few strikes that’s when productivity has taken off. In fact, kind of oddly, after industrial relations were changed in Keating’s final years, in ’93, when they really did clamp down on strikes, from then until basically the introduction of WorkChoices, periods with high levels of strikes were actually periods of high productivity. Which I don’t think means that strikes cause productivity, but more that when we have periods of high productivity workers probably realise what’s going on, realise there are benefits they’re missing out on, and so they’re more likely to agitate for them. But what it all shows is that there’s no real link, which kind of makes sense when you think about it, because productivity is output per hour, and if you’re on strike, well, there’s no output and there are no hours, so what are we talking about here? It might hurt profits.
I really enjoy being able to challenge a few of these things. One of the reasons I did the household disposable income measure I was talking about before is that back in about, oh, say 2010, 2011, the ABS started referring in their national accounts release to net national disposable income per capita, which is a figure they do have in table one of the national accounts, but they started referring to it as a measure of Australia’s economic well-being. It was quite insightful about how the media operates, and I was part of this as well, in that I picked up on it and went, oh okay, I’m going to use that then, and so did a lot of other journalists. It became, oh, this is just the measure of economic well-being, we’ll refer to this. And it’s one of those figures that, yeah, does kind of capture economic well-being a bit better than GDP.
But around 2016, 2017, when I was working at the Guardian, I was going, why is this figure going up when I know that real wages are flat and things actually aren’t all that great when I look at the household figures? I saw an article by an economist that referred to household disposable income and I thought, well, hang on, I could work that out. So I started realising that there had been a split between the two. They normally did go in sync, and then around 2016 we saw net national disposable income taking off because mining profits were taking off. The thing about national income is that it includes profits, which should be an indicator of economic well-being, because profits should also lead to stronger household incomes. But what we were seeing through the mining export boom was that mining companies were doing great and didn’t need to employ more workers. They’d already built the mines, built the rails, built the wharves, and now they were able to just put stuff on trains and ships and collect the money.
So I started using this, and I actually did get some criticism from the Treasurer’s office at the time. They were saying, hey, hang on, in the past you’ve always used this other measure and said it’s a measure of economic well-being, why are you now saying it’s not? And I said, because the data changed. And rather interestingly, during the pandemic, when we saw household disposable income actually rise because of things like JobKeeper and the doubling of JobSeeker, suddenly they started referring to it, because it was a good story, because it looked better. So it’s always interesting to be able to use that to challenge narratives, and also to challenge things that people take as given without having looked into whether or not they’re true.
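The strikes-versus-productivity comparison Greg describes earlier in this answer can be sketched roughly as below. This is an illustration under stated assumptions, not the analysis from his column: it assumes a plain Pearson correlation between two yearly series, and the function name, variable names and numbers are hypothetical placeholders, not real ABS data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only (NOT real figures): one entry per year.
days_lost_per_1000_employees = [12.0, 30.0, 8.0, 25.0, 15.0, 20.0]
gdp_per_hour_growth_pct = [1.1, 0.9, 1.0, 1.2, 0.8, 1.0]

r = pearson_r(days_lost_per_1000_employees, gdp_per_hour_growth_pct)
print(round(r, 2))
```

A value near zero would be consistent with "no correlation"; a value near 1 or -1 would suggest the two series move together or in opposite directions. Either way, as Greg notes, a correlation on its own says nothing about which one causes the other.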
Linda: Yeah, and that’s one of the things I love about data, that you can actually go and dig in and challenge those statements, especially the things that we take as a given, the articles of faith and ideology, and be able to go, no, that’s not quite right. I think what you were saying about economic well-being, and how it separated into two measures that weren’t tracking the same, must relate to what you were talking about in that article a few months ago (I lost count of how many people sent it to me) about wealth equality and how young people were basically being shafted, and the wealth just wasn’t flowing equally anymore, in that about 20 years ago there had been a deliberate severing of that flow of wealth and of that idea of making wealth equal, so that profits aren’t going to higher wages and they’re not going to young people.
Greg: Yeah, and especially with wealth, because so much of Australia’s wealth, household wealth I guess, as opposed to profits, is locked up in property. When you have a system that was geared, really from 1999, when Howard and Costello changed the way capital gains were taxed and made it a 50% discount, it really just set fire to the ability of people who already owned property, who already had equity to borrow against, to generate wealth. And that’s fine, but it’s not fine when it’s actually making it harder for people to enter the housing market so that they too can generate wealth.
Again, it’s one of those things where data is great, because you’re able to use it to show what people know is right, feel is right. You talk to anyone in their 20s and they know there’s no chance of getting into the housing market. So that was a good story, in that you’re able to use data to show, yes, your feelings are true, here’s the proof. Data is great for calling bullshit, but it’s also good for saying, yeah, what we think is happening actually is happening, and here we can show when and how and why.
With housing it’s just astonishing: the changes in the affordability of housing, the changes in who actually owns housing in terms of age, and also the big change, the one I think is going to have massive ramifications sooner than even the lack of affordability, which is that while in some ways the share of people in their 40s and 50s owning a home hasn’t changed dramatically, what has changed dramatically is the number who are still paying off a mortgage. So we’re going to have many more people entering retirement still holding a mortgage, which has major ramifications, because Australia’s retirement system is built on essentially 80% of people owning a home. There’s a big difference between owning a home with no mortgage and owning a home and still paying a mortgage, especially when you’re deciding, I’m no longer going to have an income. Then add on another 20 years and you’re going to get a much greater number of people who don’t even have a mortgage, they’re renting, and in some ways the issues are going to be very similar. When you’ve got a system built on the idea that, oh, you retire, you don’t have to pay a mortgage or rent, so we can worry about other things, that whole system starts to strain and gets to the verge of collapse when half to 60% of people are not in that position. And that’s kind of where we’re headed.
Linda: Yeah, plus all of the people who raided their super during COVID with government, you know, encouragement and all this sort of stuff. It’s an interesting setup. You spend a lot of your life trying to communicate things with data and show people what the data indicates. Is there anything that you wish everybody knew about data? Is there one magic thing that would make life easier?
Greg: Um, it’s perhaps a weird thing to say, especially coming off the back of what we’ve just been saying, but data is not perfect. I think everyone does have the line that you can prove anything with statistics and things like that, but it’s the realisation that, one, data should not be given undue reverence. I’m speaking more about things outside the official data, where one of the problems we have with our political debate is we’ll get reports and someone will say, well, this is going to create a hundred and thirty-eight billion dollars’ worth of growth or something like that, and it’s like, oh well, there’s a number, that must be true, it’s data. It’s realising that not all data is equal. Just as we don’t trust the words from various different people equally, if the treasurer is saying something and then you get a vox pop and someone’s saying something different, you might question the veracity of this random person, and it’s the same with data. We really need to be a bit more careful about how we treat data.
But in that same regard, you should not be afraid to question data and to want to know, okay, how do we get to this figure? One of the things I’ve really liked being part of, for example with the unemployment rate, is being able to give a bit more clarity on how we get this thing, and realising that not only are there deficiencies to it but also that we perhaps spend a bit too much time caring about it. Not because the figure is wrong, which is always the problem when we’re talking about data in this way, because then you’ll get people saying, I know the ABS are tweaking the data, or the government has told them to. And it’s like, no, no, the data is good, it’s legitimate, it’s just that we care too much about this one figure. There are a lot of other figures we should be worrying about.
So I think that’s the one thing: don’t get spooked by data, and also don’t treat it as holy. They’re just numbers, and we really should be able to talk about them in the way that we talk about comments and things that are made by other people.
Linda: That is, so I build data science projects for schools, and it’s essential to me that they’re working with real data, because otherwise you get a perfect curve, your numbers come out perfectly, and you never actually learn how data works. The first thing I’ve learned to put in all of my projects is, what’s wrong with the data? The fact that there’s something wrong with it doesn’t mean it’s useless, but you have to build that into your calculations and your analysis and go, well, we don’t have the perfect story but it’s good enough to tell us whatever. Actually understanding what’s wrong with it is really essential, I think.
Greg: Yeah, I think sometimes there are overreactions both ways. I always say the unemployment rate certainly is not the be-all and end-all of the economy, but I don’t dismiss it. Whereas you have people saying, oh, who cares about the unemployment rate, it hides so many things, you only have to work one hour a week. And I’m like, yes, that’s true, but there aren’t a lot of things in society and in the economy that are improved when the unemployment rate goes up. So in general I’d always prefer a lower unemployment rate to a higher one. That doesn’t mean the unemployment rate falling means everything’s great. It means, yeah, chances are things are getting better than if the unemployment rate was rising. It’s the same with GDP. It misses so much, but there aren’t a lot of good things that happen when GDP starts falling.
So I think there’s a tendency by some to go, oh, the unemployment rate is at 3.5%, that means everything’s great, and on the other side people go, why are you caring about the unemployment rate, you shouldn’t be thinking about that, you should be looking at x, y and z. And I’m like, generally with data you’re actually in a lot of grey areas, where you should never be saying one piece of data tells us everything. It’s part of a picture, part of a story, and the best research, the best analysis and I think the best journalism is done where they use the data to show a fuller picture rather than say, oh, the government has got the unemployment rate down to 3.5 percent, they are brilliant.
Linda: Yeah, that comes back to accepting the press release as well, doesn’t it? You’ve got one talking point, one sound bite, one headline, nice and clear, and we like the simple ones, so just print what they said in the press release and move on. It’s never that simple.
Greg: Yeah, and even within the data itself. Take the measures we use with things like the unemployment rate, whether we use the seasonally adjusted version or the trend version. We can get really excited by one movement in one month, and then the next month it goes back down and you’re like, yeah, probably nothing happened, it was just the data. The unemployment rate is a survey, and surveys have margins of error and standard deviations, and I think sometimes there’s this sense of, oh no, it’s exactly 3.5 percent, and it went up to exactly 3.7 percent.
It’s interesting, and this gets to bad things done with data. About 15 or 20 years ago there was a lot of work done by some really good bloggers, this is back when blogging was a thing, on polling numbers, really demystifying them, because the polls were reported as being basically 100% accurate. If the ALP had a two-party preferred of 53 one week and then a fortnight later it went down to 52, well, it went down. And there was some really good work by these bloggers, a lot of whom were actual statisticians and economists, who were going, no, you realise it’s a survey of maybe a thousand people, there’s a margin of error of 3.5 percent, so in reality nothing happened. The media at the time really hated this, because the newsworthiness of Newspoll and other polls came from those movements, and they wanted to say, oh, in the past fortnight people have reacted poorly to the government’s latest policy. It’s like, none of that happened, you know, and,
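A rough way to see the bloggers’ point is the textbook 95% margin of error for a share measured in a simple random sample. The sketch below is offered under that assumption only: it is not how any particular pollster computes its published figure (real polls use weighting and other design adjustments, so their stated margins can differ), and the function names are hypothetical.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error, in percentage points, for a share p
    measured in a simple random sample of n people (z=1.96 is ~95%)."""
    return z * math.sqrt(p * (1 - p) / n) * 100


def move_is_within_noise(before_pct, after_pct, n):
    """True if the change between two polls is smaller than the margin
    of error of a single poll (a simple rule of thumb, not a formal test)."""
    moe = margin_of_error(before_pct / 100, n)
    return abs(after_pct - before_pct) < moe


print(round(margin_of_error(0.5, 1000), 1))
print(move_is_within_noise(53, 52, 1000))
```

For a sample of about a thousand people this formula gives roughly three percentage points, so a move from 53 to 52 sits comfortably inside the noise.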
Linda: Nothing happened is a terrible clickbait headline though.
Greg: Exactly, exactly. But the problem is that this has not quite translated when it comes to looking at other data, such as the ABS labour force figures, and the realisation that, no, they’re not out there counting every single person. Yeah, it’s a lot bigger survey, but it’s still a survey. We certainly saw that during the pandemic, when they had to drop the trend, right, because it didn’t make any sense. And also just understanding that maybe we don’t need to get excited about a one percentage point move or even a two percentage point move in one month, let’s wait a little bit and see. But media companies are not in the business of waiting and seeing. They’ll write that things are great this week and then next week they’ll write that things are terrible, and it’s left to the rest of us to try and work out, well, what is actually going on? And so.
Linda: And as far as the media is concerned both of those headlines were successful and fine, because they sold views or advertising or whatever.
Greg: Yeah, yeah. I can remember back in the, I’ll say 2013, election there was a really poor story, in my view, basically attacking Rudd and pointing out that they were going to lose Queensland votes. It was based on a massive jump in the state unemployment rate in Queensland, and the state unemployment rates, especially the seasonally adjusted ones, are notoriously erratic. They bounce up and down because you’re using much smaller sample sizes for each state, and the ABS itself kind of says, don’t use these on a month-to-month basis, maybe you can use them to compare across states, but just use the trend figures. I pushed back on this and said, look, you’re just using a big wild swing in the numbers. I think at the time it was suggesting the Queensland unemployment rate went from 6.1 to 6.7 in one month, or something like that, and it’s just absurd.
And yet that’s the kind of thing we see too often in the media, that real sense where they’re using data to find a story, or they’re reacting to the wrong things in the data and not questioning it, going, well, did this really happen? We see this with polls now. If there is, say, a big 5% jump in any one month, journalists now know they have to acknowledge that this is probably a bit of an outlier, that it might have just been a statistical anomaly. But we don’t do that for other forms of data, where perhaps we should, so you know.
Linda: Yeah, it’s a fine line isn’t it, between ignoring the data and decrying it as useless, and actually being, well I call it rationally skeptical of what you’re looking at, so that you’re asking where are the flaws, because there will be flaws, and are the flaws fatal to what we want to use the data for, or are they just enough to make you a little bit cautious and careful about your results. What are the worst data mistakes you’ve seen, are they,
Greg: Oh mistakes, it’s always hard to know whether it’s a mistake or it’s a willful misuse you know.
Linda: Yeah I have those as separate questions but I’m not sure that they can be separated that clearly.
Greg: I mean there’s been a lot of, not misuse, mistakes with data, I would say an ignorant misuse rather than a willful misuse. For example they’ll compare wages paid in Australia to other nations and show that Australia is an expensive place to work, and then suggest that this gives credence to people like Gina Rinehart saying miners don’t really want to invest in Australia and we should be doing something to reduce wages or make us more competitive. So you’ll just have these two figures where you see cost per hour in Australia compared to cost per hour in, it might be Canada, it might be eastern Africa, where there are iron ore mines. But then when you actually look at other data, such as surveys of mining companies around the world on which countries they believe are the best places to invest, Australia will be number one or two. It’s because there are so many other things that companies care about when they invest, such as: if we open up a mine, are there going to be marauding hordes of gangs coming in and taking over our mine, as might occur in some countries? Do we have to worry about paying bribes to government ministers? Do they actually have an educated workforce, do they have engineers there that we can actually employ? All these other things. And yet it’s that use of data of just, oh, I’ve found one data point and that explains all this, and you know, like, no. It’s not only a bad use of data, it’s a data mistake, it’s really uninformative and misleading to people. I mean other data mistakes, I can remember seeing a thing in the Australian, front page of the Australian, where they were basically trying to suggest that Australia’s income tax was 
terrible when it needed to be flat, and this was back in 2010 and I was sort of down in the bell going this is what they want to do, and I was like no one’s going to go for a flat, I said like this is what they wanted, and sure enough we’re there now at stage three. But I can remember they’ll for example show that people earning under 20,000 pay no tax and people earning over 200,000 pay 20 percent of tax or something like that, and it’s just a terrible, not amazing mistake. I haven’t seen too many mistakes where they’ve actually got the wrong numbers, just mistakes where they’ve really used numbers in a way that doesn’t make any sense. I do see sort of graphical mistakes I guess, and generally it’s because they’ve given a task to some graphic designer in their department rather than actually recreate the data, and so they’ve said oh, the curve sort of goes like this, and they produce a graph that, mate, looks nothing like actual reality. But I can’t think of too many actual mistakes that come to mind. I mean I know I’ve nearly made a couple, and that’s where I think a good lot of that trial and error occurs, because there’ll be times I’ll do something and I’m like wow, that’s amazing, I didn’t expect to see that, and then I realize oh no, I’ve calculated a nine month change rather than a 12 month change or something like that. I’ve just stuffed up the spreadsheet.
Linda: So is that your, is that is that what you use to spot things like when you when you get a result that look too good to be true or,
Greg: Yeah, that’s always a good sign that maybe you’ve made a mistake, and it happens quite a lot, especially when I’ve got multiple spreadsheets open and in one I’ll be looking at the increase in say employment and in another one I’ve got the increase in profits or something, and when I transfer them over I get them in the wrong direction or the wrong order. Or you’ll be looking at one industry and you go wow, the health industry seems to have really lost a lot of workers, what the hell happened there, and then you realise oh no, I’m using the wrong industry, that’s other services or that’s arts and recreation or something like that. So that’s always a good check: if something doesn’t seem right, that’s when you know that something is wrong. And mistakes, I can’t think of any offhand, but I know in the past there have been mistakes by journalists, and it’s a case of you’ve looked at a figure that you think means something and we know it doesn’t mean something because it’s actually kind of erratic, it’s kind of weird. It’s that sense of you didn’t actually understand what you’re looking at, and the data itself might be right but your analysis of what it is is completely wrong.
Linda: Yeah. It’s easy to do. I remember working with a spreadsheet of solar installations, and I use this example all the time now in my workshops, because I just sorted it, and the top 20 postcodes were WA and Queensland. I was like oh, WA and Queensland were the best for solar, that’s really interesting, I didn’t expect that, I didn’t think the policies were supportive of renewable energy and stuff. Then I wrote a little script to calculate the state average, and by the state average WA and Queensland were among the worst. It was only because I wanted to play with a bit of Python and write that script that I discovered that my assumption from just looking at the top 20 was completely wrong. Turned out they had a bunch of new developments, and to meet the sustainability guidelines they had to put solar on. So yeah, I completely misunderstood the spreadsheet because I didn’t look closely enough and I didn’t question that first result. It was a good object lesson.
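A minimal sketch of the check Linda describes, in Python. It is not her script: the postcodes, states and installation counts below are made up for illustration, and the function names are hypothetical. It shows how a "top N" ranking and a per-state average can point in opposite directions when a few postcodes are outliers.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (postcode, state, installations) rows. These are NOT real figures.
rows = [
    ("6000", "WA", 950), ("6001", "WA", 40), ("6002", "WA", 30), ("6003", "WA", 20),
    ("4000", "QLD", 900), ("4001", "QLD", 50), ("4002", "QLD", 35), ("4003", "QLD", 25),
    ("3000", "VIC", 400), ("3001", "VIC", 380), ("3002", "VIC", 390), ("3003", "VIC", 410),
]


def top_postcodes(rows, n):
    """Return the n rows with the most installations (the 'just sort it' view)."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:n]


def state_averages(rows):
    """Return the mean installations per postcode for each state."""
    by_state = defaultdict(list)
    for _postcode, state, installs in rows:
        by_state[state].append(installs)
    return {state: mean(values) for state, values in by_state.items()}


top2 = top_postcodes(rows, 2)
averages = state_averages(rows)

print("Top postcodes:", [(postcode, state) for postcode, state, _ in top2])
print("State averages:", averages)
```

With these invented numbers the two biggest postcodes are in WA and Queensland, yet both states have a lower average than Victoria, which is the same shape of surprise Linda ran into.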
Greg: Yeah, I mean something along those lines, and this is I guess a good example of something that’s both a misuse and a mistake. I can remember the Victorian government making a big claim about the latest national accounts showing that private investment in their state had absolutely exploded. What they didn’t realize is that when you’re doing private investment in the national accounts you also have to take into account governments buying things off private investors, or the privatization of things, which means you actually have to take away one figure from the other, and when you take away that figure you get the accurate amount. They hadn’t done that, and I was actually able to show that when you do the right figure, private investment had actually declined. It only looked like it had gone up because they hadn’t taken into account that there’d been a big privatization or something like that. But by the same token I’ve done that as well, where I’ve kind of forgotten to do it, and you do occasionally make those sort of errors. Occasionally I’ve got the decimal point wrong or written billions instead of millions and things like that, but so long as the actual analysis of the numbers is fine I’m not too worried. It’s when you have just completely got the understanding of what the data was wrong that you really start losing sleep. If you go oh, actually, sorry, I said that there were hundreds of thousands, there’s actually hundreds of millions, so long as the trend was going up and you were saying this is a good thing, it doesn’t really matter, so long as you can quickly fix it before too many people find out. But it’s when you go oh god, I completely got that wrong, and what I was suggesting happened didn’t actually happen, that’s when you really,
Linda: Like the Reinhart-Rogoff error.
Greg: Yeah absolutely yeah
Linda: Which is a classic error in economics which I talk about in my workshops a lot, basically, and two years I think of global economic policy were based on it before they found out, they’ve probably still not included the whole problem in their calculations.
Greg: Yeah, yeah, incredible, and the type of thing that makes everyone lose a bit of sleep.
Linda: Exactly, yeah, what’s the first question you ask when you look at graphs in the media, what do you look for?
Greg: One, I kind of look at it and go okay was this actually derived from the numbers or is this just a graphical representation of things because the old school style was sometimes just sort of get it something close to what we’re after and one of the good things is with a lot of uh online tools now is that you can just use the the actual numbers to get the precise thing always look at I mean the standard things like okay where have they got the the y intercept and things like that and also are they um missing key bits of context um you know are they showing in a way that is either hiding what’s going on or suggesting that there’s more going on than actually is and so looking at things like the scale is always a good thing and it’s something I’ve even sort of done a little bit in that and not so much in articles but just on Twitter like you can just go to websites and find changes in the value of the Australian dollar but quite often the it’ll look like there’s been a huge drop but the scale is so small that you’re actually looking at changes in you know 0.1 percent of the dollar you know now um so it looks like a big fall when it’s just gone from 67.1 US dollars to 67.07 dollars you know it hasn’t really changed at all but the scale is so small so that’s always something I look for and, but to be honest from a technical point of view I also just look at is this actually a good way to present the data or is it confusing or could have I done it better especially if it’s something that on a story that I’ve also written on and I’ll look at their graphs and go yeah that’s a terrible way to use it but more, but so it’s not so much for the media it’s more when I’m looking at sort of government reports and other reports um it’s more that checking to see is this hiding something or is this showing something um that’s really what I kind of like to focus on.
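A rough sketch of the scale point, in Python. The two exchange-rate figures are the ones Greg quotes (67.1 and 67.07); the axis limits are hypothetical, chosen only to contrast a tightly zoomed chart with one whose axis starts at zero.

```python
def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


def drop_as_share_of_axis(old, new, axis_min, axis_max):
    """How much of the chart's vertical height the fall takes up."""
    return (old - new) / (axis_max - axis_min)


old_rate, new_rate = 67.1, 67.07  # figures quoted in the conversation

change = percent_change(old_rate, new_rate)

# Assumed axis ranges, for illustration only.
zoomed = drop_as_share_of_axis(old_rate, new_rate, 67.06, 67.11)
from_zero = drop_as_share_of_axis(old_rate, new_rate, 0.0, 70.0)

print(f"Actual change: {change:.3f}%")
print(f"Share of chart height, zoomed axis: {zoomed:.0%}")
print(f"Share of chart height, axis from zero: {from_zero:.2%}")
```

The same move of a few hundredths of a percent fills most of the zoomed chart and is barely visible on the one that starts at zero, which is why checking the scale comes first.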
Linda: Yeah I think the the idea that it depends on the audience is an important one you know if you’re trying to present to the public, I remember early on in the pandemic, there was an ABC team whose whose data work is normally great and they published a log graph and I called them out and I went people don’t read log graphs people don’t understand log graphs, it doesn’t represent the data in a way that people will understand and they were like but it’s it’s accurate like it’s the right way to present the data yeah yes but your audience isn’t going to get it, and when you read it as a linear scale it tells a different story.
Greg: Yeah, yeah, and that was really true during the pandemic, because it was a real good show of how academics and people who actually understand the data use data differently than real people. So you and I and others, if we were analyzing data, probably would use a log scale because it made sense, yeah, but you’re right, as soon as you put a log scale people go what the hell’s going on here. I mean I have, you know,
Linda: Or they just don’t even do that they just read it as a linear scale and get a completely wrong impression
Greg: Yeah, exactly. I’ve used a log scale maybe once, maybe twice, and I think the few times I’ve used a log scale were for things like looking at changes in the stock market over the last hundred years, where actually using a linear scale was kind of stupid. It makes it look like we’re having massive swings, big falls and big drops down, because the value of the Dow Jones index is 10,000 whereas back during the Great Depression it was 100. So I’ve used log scales like that, but by and large you don’t want to confuse the readers. I mean I always am writing for an audience who I presume don’t like data and don’t understand data and certainly don’t know any economics, and I want to make things as easy as possible for them. When I’ve taught data journalism, I’ve said that a good graph does need a good title, it does need all these things, but the story in itself should be obvious from the graph. It might just be you’ve got six bars and five are pretty constant, then the last one there’s a big jump or a big fall, and you’re like okay, something’s happening there, there’s a story, and then I’ll read the graph to actually find out oh, this is measuring population or it’s measuring deaths or something. But just actually looking at it you go wow, there’s a story. That’s what I’m trying to show to people, that something has happened and this is what has happened, or conversely nothing has happened, and you should be able to show that. That’s the problem: as soon as you start trying to be too technical you’re writing for an audience who are going to look at this graph and go I have no idea what I’m meant to be getting from this, and while academics might be able to go oh yeah, good point, that’s not all that helpful if your data’s not being understood by people who really need to understand it.
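A small sketch of why a log scale suits the long-run stock market example, in Python. The index levels of 100 and 10,000 come from the conversation; the 10 percent fall is an assumed figure used only for illustration.

```python
import math


def linear_drop(before, after):
    """Vertical distance of a fall on a linear axis (index points)."""
    return before - after


def log_drop(before, after):
    """Vertical distance of the same fall on a base-10 log axis."""
    return math.log10(before) - math.log10(after)


fall = 0.10  # assumed 10 percent fall, for illustration

for level in (100, 10_000):
    after = level * (1 - fall)
    print(
        f"Index at {level}: linear drop = {linear_drop(level, after):.0f} points, "
        f"log drop = {log_drop(level, after):.4f}"
    )
```

On a linear axis the fall at 10,000 is drawn a hundred times taller than the identical percentage fall at 100, while on a log axis the two are the same height. That is the case for the log scale, and it is also why a reader who assumes a linear axis will misjudge it.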
Linda: That’s right and you can never rely on people to read the fine print.
Greg: No no no not at all, you know so.
Linda: What excites you about data?
Greg: I like that, well, kind of what we’ve been talking about, it can cut through the bullshit, but it also can reassure people that their feelings about things are not mistaken. Unfortunately it’s not as powerful in persuading people that their feelings are wrong, otherwise we probably wouldn’t have climate change, but if done well it can at least explain to people yeah, that was a thing but it’s not a thing now, or this is now a thing we should be caring about. The classic is actually a boring thing, or in some ways boring: underemployment. I mean no one ever cared about underemployment because there was no reason to care about it. It went up and down with unemployment, so if you wanted to reduce underemployment you reduced unemployment, they went up and down in sync. Then in about 2013, one of these things is not like the other: they went in different directions and everything kind of went a bit skew-whiff. So data was a good way to show that actually now we should be worrying about a bit more than just the unemployment rate, there are other things that are going on, whereas if you’re just relying on people saying things you can often get taken down paths that actually don’t exist, or you’re relying on old views that are no longer relevant. I like that to refute data you need to find other data, in a sense, and I like that I can use it to make arguments and to show things that are going on in the world, and if people disagree with them I need them to come at me with something more than just oh well, you’re a left winger, what do you know. It’s not okay, but these people are living in poverty, you know. And I think you can get trapped into thinking that data is all persuasive. I think that’s one of the big things that I’ve learned over the past sort of 15 years, that people 
will see data and refuse to believe it if they don’t want to but I think data has been very powerful in forcing governments to confront things that they’d rather not confront I like using data and because I find stories that are interesting, I find things going on in the world that I was didn’t realize were going on and and a classic one is just that we were doing at the Australia Institute right now and we’ll be coming out soon a report we’re looking at unemployment and looking at the gross flows of of the labor force so what what happens every month not sort of the changes in the unemployment rate but people who are employed last month what are they doing this month, people who are unemployed last month what are they doing this month and what we found is that most people who are getting new jobs every month aren’t coming from being unemployed they’re coming from being outside the labor force so they weren’t in the labor force last month and now they’re employed, which is a really interesting thing especially because it seems like that share of new employment is increasing and like the share that’s coming from people outside the labor force is increasing which makes you think okay what does that say then about our unemployment rate what is it missing it seems to be missing a lot of people who aren’t technically unemployed but are fairly open to getting a job because they’re now employed they weren’t last month but they are now, now there’s always been a lot of that, yeah I mean there was always a lot of that and a lot of it was school leavers you know they when you’re at school you’re not unemployed because you’re not looking for work school holidays have finished you’ve got a job, but the fact that it’s been growing and not just in January and February when the school leavers are coming on into the labor force makes us think that maybe there are a lot more people who are marginally attached to or you know what we call marginally attached to the labor force and 
not been counted who perhaps are a bit more attached than we think they are and that maybe that means the actual unemployment rate is a bit higher than it is and so maybe the the labor force is not as tight as we think it is and so that’s why wages haven’t exactly been taking off in any great way and it’s those type of things that I hadn’t really considered and until I read this report by some of our economists and I’m like oh that’s that’s a that poses some interesting questions and and I like that data can reveal that because there is so much data that you never know everything and you would look at something and and it and it can force you to question what you had just assumed was so and I like that, I mean to be honest right now I’m looking at the unemployment rate and and essentially it’s been three and a half percent for nearly a year and you’re like how the hell can that be given we’ve jacked up interest rates by 400 basis points things, surely the unemployment rate is going to go up and it’s not and so you’re like what’s going on here why why is this not happening in the way that I thought it would happen and I like that data, if you are open to it, will actually challenge you to to question your assumptions on things and I enjoy that and then I enjoy, I really enjoy just doing graphs and coming up with ways of going look this is something you know that you will never understand how I came up with this figure of you know net disposable household income per capita you need to know how to do a bit of maths and understand economics but I can show you this in a way where you’re going to go bloody hell hell living standards are going to hell and you wouldn’t have been able to show that to anybody or do it yourself but I have been able to use this data to tell a story that is not being told has been missed and can actually affect the debate and I just give my readers especially at The Guardian just give them greater understanding of their world that’s that’s what 
I really enjoy about doing it, is that I want my readers to come away from having read my article with more than they had before, so they’re able to talk about something, understand things more than they used to. Data is just very good at that, because otherwise you’re just repeating someone else’s opinion, someone else’s view of things, and after reading an article you’re left with saying oh well, Jim Chalmers says this is what’s happening. I just like the fact that people who read mine can say well god, look at this, living standards have gone down four percent in the past year, that’s the biggest ever. I find that something that I really get a buzz out of, especially when people come back to me and give me comments and say I had no idea that that was a thing, or I didn’t understand this before and now I do. I just think that’s good for society when that happens, and I know absolutely data is a good way of doing that. I understand that people are scared of data, and so I kind of think it’s beholden to those of us who aren’t to not use it as a shield that protects us and keeps away the populace, but as something we use to show people what is going on: don’t get confused by these figures, we’ve done all this study so that we’re not confused, and now it’s our job to explain to you what’s going on. Rather than oh god, if you can’t be bothered to open up the spreadsheet why should I care about telling you what’s going on, or here’s a graph with seven lines all going around the place and can’t you see that the red line has jumped up and the blue line’s going down. There’s so much interesting stuff, so much stuff that people have no idea is out there, how many releases the ABS does, and the OECD and health agencies and NASA, all these masses of data, and it’s all there and it’s all available, but most people don’t have 
the tools to understand all the time and I just love that that this data can be used to inform and to you know also to entertain but just to to explain what’s going on I mean I did a thing on Twitter last year because I was actually writing a report on inflation, I was a bit bored one Friday afternoon, so I did a couple graphs or no more than a couple dozen graphs where I showed over 40 years the the prices of things the price index of various goods and services compared to CPI and so you’d have two lines CPI sort of going along nice and other lines going up and down or doing various things and I just tweeted them without the actual lines labeled and say okay who can guess what this product is and you know the ones where it was rising really fast until about the 1990s and then it sort of flattened down and people go oh god and you know some people go I know that’s motor vehicles and other people you know ones where the line has just gone off the charts are going I think they might be cigarettes and I’m like yep they’re cigarettes you know and and other things where it’s gone it’s kept falling and they’re like oh it’s computers and you know and it was just and also other graphs where there was huge drops in a couple periods and it was child care because they were the times where child care was subsidized and it was just doing it in that way and getting people to think about okay what’s why did prices go why is this thing going and being able to guess and all that and some of the guesses were wildly wrong and others were oh yeah it was you know they were thinking it was some sort of fruit and I was going no this is bananas or something and yeah that was when the cyclone occurred or things and and it’s that,
Linda: I love how you need that context to understand, to make sense of it.
Greg: Yeah, this thing where you can use data to actually engage people and find stories. It was one of the reasons why I love teaching data journalism when I was lecturing at the University of Canberra, because I’d have all these journalism students who would rock up and they hated the subject because it had data. There’s very few journalism students who want to be in journalism because they like numbers, they’re all writers, that’s what they want to do. Generally the sports journos really like it because they understand stats and batting averages and things like that. But I remember I had this one student who was just scared of spreadsheets, had no idea what was going on. I said look, data is basically often just counting things. One of the exercises was they had to do a data journalism feature article, and I said well look, I’m not expecting you to go into the ABS figures and write an economics article, you’re not an economist, why would you do such a thing, but you can count things. You can find something that you think there’s an interesting story about, where maybe some data can actually reinforce or question people’s views of things. She had read an article where there was talk about discrimination in fashion and in cosmetics, and she realized all these cosmetic brands have websites where they’ve got all their products, and so she went through all of them and basically tabulated the shades of foundation, shades of lipstick, and then also the prices. So she was able to work out what percentage of cosmetic companies at various price ranges were actually offering products that were suitable for women of color, and she was able to show the high-end brands were pretty good but your more common brands that would be available in Woolies and Coles were not so 
good, and was able to tell a story that actually yeah, people, there is a real discrimination going on. She was kind of stunned that oh, that’s what data is, it’s telling stories with numbers. I had another one who was like okay, I’m going to go to every Target, every Kmart, and I’m just going to go to the women’s section and see how many size 16 white t-shirts there are that I can buy. There would always be at least one, but there’d be five size 10s, five size eights, seven size sixes or something, and so she just tabulated it, and she went to every store in the ACT from Gungahlin down to Tuggeranong. I’m like this is what it is. Too often people think data is all about what the ABS puts out, or the OECD or the RBA, and yeah, that’s great for economics, but a lot of data is just counting what is going on in the world and being able to say I went into five shops and there was one size 18 dress and there were 30 of a size less than that. Okay, I’m not saying this is every shop, but I went to every shop in the suburb or in the shopping center, and that’s telling us something. It’s that type of thing that I like with data: so long as you are honest in how you’re going about gathering it and how you’re looking at it, you can really cut through in a way where people go shit, I didn’t realize it was that bad, I knew there was something bad going on but I didn’t realize just how huge it was. It can be really powerful in a way that it isn’t so much when you have women just being quoted saying something. When you’re actually able to put the numbers to it you’ve got oh my god, check this out.
Linda: I love that, that’s why I run projects in schools so that kids can see that data is a tool that they can use to understand and make change in the world and it’s you don’t need machine learning or really complex stats you can do an awful lot with a spreadsheet and even just you know sorting and averaging and drawing a graph with a trend line, it’s amazing what you can do.
Greg: I mean to be honest my whole sort of base comes from when I was doing my honors thesis and my supervisor, and I really stupidly for my own sake, especially in terms of the market, but he would often say yeah god, all you students, and he’d say oh, a lot of the columns, they worry too much about their regressions and all these numbers. He said sometimes a good graph just tells the story really well. And in my head it was oh no, I’ve got to do a good regression, show how powerful my mathematics is, and my mathematics was not all that powerful. Now it’s always like yeah, a good graph just sort of explains what’s going on and shows it. I think it’s a good lesson: don’t think that data is all about maths and how technical and how wonderful your maths can be. Sometimes it’s just oh look, I averaged these things and showed that actually all the movement that’s been happening over the past 12 months seems to have just been a lot of noise and nothing really has changed, and you didn’t have to do much, you just did an average. So I always try to demystify it for people, and certainly with my readers that’s my key thing: I want to do all the hard grunt work for you so it all just seems clear what’s going on, and hopefully it’s interesting and informative.
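A sketch of the "I just averaged these things" idea, in Python. The twelve monthly values are invented for illustration and do not come from any real series. Comparing the average of the first six months with the average of the last six is one simple way to see whether month-to-month jumps add up to any real change.

```python
from statistics import mean

# Hypothetical monthly readings (for example, a rate in percent). Not real data.
monthly = [5.2, 4.8, 5.3, 4.9, 5.1, 4.7, 5.2, 4.8, 5.3, 4.9, 5.1, 4.7]

# Size of each month-to-month move, ignoring direction.
moves = [abs(later - earlier) for earlier, later in zip(monthly, monthly[1:])]
largest_move = max(moves)

first_half = mean(monthly[:6])
second_half = mean(monthly[6:])

print(f"Largest single-month move: {largest_move:.1f}")
print(f"Average, first six months: {first_half:.2f}")
print(f"Average, last six months:  {second_half:.2f}")
```

In this made-up series individual months swing by up to half a point, yet the two half-year averages are the same, so the headline-worthy jumps were noise around a flat level.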
Linda: That’s fantastic, I can see myself using particularly that last 20 minutes in all my workshops. Thank you so much for talking today, it’s been super interesting Greg!
Greg: No problem Linda, it’s been great, always love chatting about data.
Linda: Thanks for listening to Make Me Data Literate. If you’d like to support the work of the Australian Data Science Education Institute you can donate at givenow.com.au/adsei or via the ADSEI website. Let’s change the world together.
