Whose restrictions are tighter?

There has been much squabbling about covid restrictions in NSW and Victoria, with very little evidence or data involved in the discussion. As a data science educator, I often stress that not all data is numeric. A reasonable dictionary definition of data, courtesy of Oxford Languages via Google, is “Facts and statistics collected together for reference or analysis”, and I have to say that the covid debates, all of them, are far too sparing in their use of data, whether facts or statistics. So I decided to collect some facts and compare the restrictions, as listed on the state health department websites. This is an exercise you could easily do in class, and it is a wonderful application of data science.

NSW is currently under 3 different levels of restriction, depending on locale, so I have chosen to compare the Victoria-wide restrictions with the Greater Sydney restrictions, as the latter cover the greatest proportion of the NSW population.

The “too long, didn’t read” version of the comparison is: Victoria’s current lockdown is significantly stricter than the one in Greater Sydney. The only way in which NSW is stricter is the pausing of construction, which has not happened in Victoria.

NSW has quite a lot more retail open than Victoria, and has a 10km travel limit compared with Victoria’s 5km. Childcare remains open in both states, but sporting facilities such as golf, tennis, and bowls clubs are open in NSW and closed in Victoria.

It’s interesting to note the simplicity of Victoria’s mask rules, versus the bewildering complexity of NSW’s. This, to me, explains quite clearly why “masks outside” is important. If the rule is “wear your mask whenever you leave the house”, compliance is much simpler than “wear your mask when any of these conditions apply: (long list of situations where fine judgement applies)”.

It wasn’t difficult to find all of this out, but it did take me around half an hour of trawling through the two websites. All of the people who are on the internet shouting “NSW’s restrictions are tougher than Victoria’s!” and “There’s nothing more we can do, we are locked down as hard as we can” are absolutely playing fast and loose with the facts.

NSW can clearly be locked down harder, and a harder lockdown so far seems to be working to suppress the virus in Victoria. We don’t know for sure that it will work now that the virus has spread so far in NSW, but we certainly don’t know for sure that it would not work. And it seems obvious that we should try, because the consequences of not trying are too awful to contemplate: deaths, long term serious illness, and the almost inevitable spread of covid to the rest of the country.

So before you leap into this, or any, debate armed with opinions and no facts, perhaps you could cautiously check your facts, and consider your options. Ideology is killing us. Facts could save us, if only we were prepared to listen to them.

We urgently need to train everyone, especially our kids, to collect the facts before forming opinions. It’s why I created the Australian Data Science Education Institute, and why I wrote Raising Heretics: Teaching Kids to Change the World (which, by the way, you can pre-order now if you’re in Australia or NZ, or buy from all the usual places when it launches on August 1st).

Below is all of the information laid out for comparison.

Reasons to leave home, NSW: Only leave home if you have a reasonable excuse:

  • obtaining food “or other goods and services” for the personal needs of the household, or for other household purposes, or for vulnerable people, within 10km
  • to go to work if “you cannot reasonably work from home” or you “are an authorised worker living in the locked down areas”
  • For education if it is not possible to do it at home
  • Exercise within 10km
  • Medical or caring reasons, including vaccination

Oh, but there’s a list of other reasonable excuses on another page, including accessing childcare (which remains open), visiting intimate partners, and gathering at Parliament.

Reasons to leave home, Vic:

  • shopping for necessary goods and services
  • care and care giving
  • exercise
  • authorised work and permitted study
  • to visit an intimate partner, single social bubble buddy, or an emergency

You must stay within 5km for shopping or exercise

Facemasks, NSW:

  • indoors when not at home,
  • some outdoor gatherings (working in an outdoor area, next to food & drink or retail, fresh food markets),
  • public transport,
  • major recreation facility such as a stadium,
  • working in hospitality,
  • construction sites,
  • indoors and outdoors at fresh food markets,
  • at covidsafe outdoor gatherings, and at controlled outdoor public gatherings (it was not easy to find out what these are and whether they are currently allowed).
  • Common indoor areas in residential buildings

Facemasks, Vic:

  • Indoors and outdoors whenever you leave your home
  • You do not need to wear a facemask if you are working alone, whether indoors or outdoors, unless another person enters the space.

Permitted retail, NSW:

  • supermarkets
  • grocery stores including butchers, bakeries, fruit and vegetable, seafood
  • other food or drink retailers that predominantly sell or display food or drinks
  • kiosks and other small food and drink premises
  • petrol stations
  • banks and financial institutions
  • hardware, building supplies
  • landscaping material supplies
  • agricultural and rural supplies
  • shops that, in the normal course of business, operate as or sell and display
    • pet supplies
    • newsagents
    • office supplies
    • chemists providing health, medical, maternity and baby supplies or
    • liquor stores
    • post offices
    • garden centres and plant nurseries
    • vehicle hire premises, not including the premises at which vehicles are sold;
    • shops that predominantly carry out repairs of mobile phones
    • laundromats and drycleaners.

“Businesses may continue to operate if they provide goods and services to customers and follow the requirements for wearing of face masks and check-in requirements (for example, using QR codes).” – It’s unclear to me whether this means all retail/service businesses, or only the list of permitted ones.

Shopping must be within 10km of your home, or within your local government area.

Permitted retail, Vic:

  • supermarkets,
  • pharmacies,
  • butchers,
  • bottle shops,
  • petrol stations,
  • post offices,
  • banks,
  • food stores,
  • newsagents,
  • liquor stores,
  • pet stores.
  • Other retail shops will only be available for delivery or contactless click and collect, and workers may attend onsite to facilitate these orders.
  • Cafes and restaurants for take away & delivery only

You can only travel 5km away for shopping unless the nearest essential goods and services are further than 5km, in which case you may travel more than 5km to the nearest provider.

Only one person per day can leave home for necessary goods and services, and only once per day.

Construction: Paused in NSW, operating in Vic.

Social interaction: In both NSW and Vic you can exercise outdoors with your household OR one other person, and visit intimate partners. Vic also allows a single social bubble.

Childcare: Open in NSW and Vic.

Schools: Remote learning in both states.

Outdoor recreation facilities such as tennis clubs, bowls clubs, shooting ranges and golf clubs: Open in NSW, closed in Vic.
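
If you wanted to run this comparison as a classroom exercise, the facts above could be captured as a small table and queried with a few lines of Python. This is only a minimal sketch: the dictionary keys and the `differences` helper are names I have made up for illustration, and the values are simply copied from the lists above.

```python
# Restrictions as listed above; keys and structure are illustrative only.
restrictions = {
    "travel limit (km)": {"NSW": 10, "Vic": 5},
    "construction": {"NSW": "paused", "Vic": "operating"},
    "childcare": {"NSW": "open", "Vic": "open"},
    "schools": {"NSW": "remote learning", "Vic": "remote learning"},
    "outdoor recreation facilities": {"NSW": "open", "Vic": "closed"},
}


def differences(table):
    """Return only the rows where the two states differ."""
    return {
        topic: values
        for topic, values in table.items()
        if values["NSW"] != values["Vic"]
    }


# Print each point of difference side by side.
for topic, values in differences(restrictions).items():
    print(f"{topic}: NSW = {values['NSW']}, Vic = {values['Vic']}")
```

Non-numeric facts like "open" and "closed" are still data, and laying them out this way makes it much harder to argue past each other.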

The importance of scepticism

One of the things ADSEI does in its lesson plans is ask the question: What is wrong with this data?

This is a really crucial question, because there is no such thing as a perfect dataset. All data has issues. Often it’s not the data you want, it’s simply the data you were able to get. For example:

  • whale observations tell you how many whales were seen, when what you really want to know is how many whales were there. Some whales might have breached but not been observed (shades of Schrödinger’s Whale), or swum by without breaching, or even been spotted twice and accidentally counted as two whales when it was really just the one.
  • speed cameras tell you the instantaneous speed of the car when what police really want to know is: has that car exceeded the speed limit at any time on this trip?
  • counting the litter found in the schoolyard tells you how much litter you found, not how much litter was dropped – some of it may have blown away or hidden under things. It also only tells you how much litter was there that day. What if a year level was out on excursion, or it was a wet day timetable…

And even when the data is actually what you want, there may be data that’s missing or flawed for various reasons. For example:

  • Facial recognition systems that were trained on images of faces that were almost exclusively white and male.
  • Phone polls that can’t include people with unlisted numbers.
  • Internet polls that can’t include people without internet access.
  • Surveys where people don’t or can’t tell the truth – for example about healthy eating, or sexuality, or where people don’t actually know the truth, for example about why they did things, or things they don’t remember (like what did you have for breakfast yesterday? Or how often do you eat broccoli?).
  • Skipped data where someone forgot to record a daily observation or the system went down and didn’t record any values.

Consider the reporting around the coronavirus. We have a reported death rate of around 2%, which is highly speculative, because we have no idea how many mild cases of coronavirus there are out there that are not being identified or reported. Some sources report the numbers and stress the uncertainty, while others report them as solid facts.
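
To see why the reported rate is so uncertain, it helps to work through the arithmetic. The sketch below uses made-up round numbers (1,000 confirmed cases and 20 deaths, which gives the 2% figure) and a hypothetical `fatality_rate` function. The counts of undetected cases are assumptions chosen only to show how quickly the rate moves.

```python
def fatality_rate(deaths, confirmed_cases, undetected_cases=0):
    """Deaths as a percentage of all infections, detected or not."""
    total_infections = confirmed_cases + undetected_cases
    return 100 * deaths / total_infections


# Hypothetical numbers for illustration only, not real case counts.
deaths = 20
confirmed = 1000

for undetected in (0, 1000, 4000):
    rate = fatality_rate(deaths, confirmed, undetected)
    print(f"{undetected} undetected cases -> {rate:.1f}% death rate")
```

With no undetected cases the rate is 2%. If there were four undetected mild cases for every confirmed one, the same 20 deaths would give a rate of 0.4%. The deaths did not change; only our knowledge of the denominator did.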

This is a kind of scepticism and critical thinking that we don’t often leave room for – in education, business, or journalism. Often we are in such a rush to get the “right” answer that we don’t have time to pause and evaluate the data we’re working with, to consider the flaws and uncertainty that are built in to any dataset, and any analysis.

If we can teach our students, from pre-school onwards, to question their data, to ask “how many ways is this data flawed?” rather than assuming the data is perfect, then perhaps we can build a world which centres critical thinking and evaluates evidence.

This is why using real datasets rather than nice clean sets of fake numbers is crucially important to teaching data science. Because real world datasets are never nice, clean, and straightforward. There is no need for scepticism and critical thinking in textbook examples. But kids who have used real data in their learning are equipped to tackle real world problems.

Can you share some examples of flawed data? What consequences have you seen from people assuming data is perfect?

 

Using Real Data Projects to Engage Kids with STEM

I want to start by asking you a question: What gets you out of bed in the morning? What really motivates you?

For me it’s the chance to make a difference in the world. It’s wanting to leave the world a better place than I found it.

And that’s something that STEM skills are perfect for. They are for problem solving, for designing better ways to do things. For bringing clean water, clean power, increased food production, solutions to climate change, safer transport, personalised medicine, and a whole host of innovations to the world.

But when I first started teaching in a high school – a science school, no less – we were teaching “STEM” as “fun stuff”. Drawing pretty pictures. Making robots follow a line. Playing with toys.

How many of us are motivated, I mean really motivated, by toys? Some of us are, especially technical people! But those are generally the people we’ve already GOT in tech! I’m much more interested in the people we haven’t got yet.

All too often we ask those kids who are not already into tech to get out of bed for the chance to have fun. And fun is great – I like to have fun, we all do! And not all of my fun is finding an interesting new dataset and analysing the hell out of it, I promise. I do have other ways of having fun besides writing an interesting new Python script. Really I do. But fun doesn’t get me out of bed in the morning. Fun is a hobby. A diversion. A toy. That’s not what we need kids to understand about STEM.

We are handing our kids a world in desperate need of creative solutions. Of innovation and entrepreneurship. Of change.

And we’re telling them that STEM is fun! It’s for designing 3D jewellery. It’s sparkly. It’s pink. It’s useless.

We are doing kids a huge disservice. They’re kids, therefore fun is the way to reach them, right? It’s like saying we want more women in tech so we’re going to paint some things pink and offer some courses in the chemistry of makeup (a real suggestion that was made at an actual school). It’s like saying “women do hardware too, let’s sell them some pink hammers.” (and that’s also a real example)

When we were teaching computing using “fun toys” the overwhelming feedback I was getting – from science students – was “Why are you making me do this? It’s not relevant to me. I don’t want to do it.”

Can you guess what happened when we made the year 10 computer science course a data science course instead of a “fun toys” course? We were teaching the same basic coding skills. We still had them learning about selection, iteration, variables, and functions. But now we were using real datasets and finding real questions to answer, real problems to solve. Do you know what happened?

Suddenly they could see the point. They found it useful. They found themselves using the skills in other subjects, especially in project work. And the numbers who went on to the year 11 elective computer science subject increased by around 30%, with double the number of girls.

And none of it was pink!

In that first data science course I had a student who was super interested in politics, and there was a federal election, so we used data from the Australian Electoral Commission. Turns out you can download CSV files containing every single vote from any Australian election.

We used the Senate votes for Victoria from the 2016 Federal election. The file was over 3 million lines of CSV, each containing the polling booth, the electorate, and a 151 position comma separated string holding the contents of every box on the ballot paper.

3 million lines of CSV won’t even open in Excel, so the kids had to program just to open the file. They learned about using a small section of the file in order to test their code, so that it didn’t take ages to run. They learned about what questions a dataset could answer.
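
Here is a minimal sketch of that "test on a small section first" idea in Python, using only the standard library. The file name `senate_votes.csv` and the `read_sample` helper are placeholders for illustration, not the actual file name or the code the students wrote.

```python
import csv
from itertools import islice


def read_sample(path, max_rows=1000):
    """Read the header plus at most max_rows data rows from a CSV file.

    The file is streamed, so a 3-million-line file never has to fit
    in memory just to try out an idea.
    """
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)                    # first line: column names
        rows = list(islice(reader, max_rows))    # stop early on big files
    return header, rows


# Example usage (the path is hypothetical):
# header, rows = read_sample("senate_votes.csv", max_rows=1000)
# print(header)
# print(len(rows), "rows loaded for testing")
```

Once the code works on the first thousand rows, the same loop can be pointed at the whole file.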

They found their own questions – from which party’s voters were more likely to follow the how to vote cards, to where Pauline Hanson voters came from. They asked questions about their own electorate or polling booth and how they compared to the whole state. About female representation and share of the below-the-line vote. About preference flows and about how polling compares to actual results. Every student asked a different question, which meant that every student had to write different code to find the answer (goodbye plagiarism!).

And then the important part happened: they had to visualise their results. To create an image, more interesting than an ordinary graph, that conveyed their results in a convincing, valid, and compelling way.

They learned about channels of information, about the human visual system and attention. About colour blindness and the problems with the rainbow scale. They learned which types of graph are appropriate for different types of data, and how to customise their graphs so that they don’t mislead their audience.

As well as learning to analyse and visualise data themselves, they also learned to be critical data thinkers, reviewing graphs and statistics they are presented with using critical questions like “How was that data collected? What was the sample size? And where is the zero on that scale?”
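
The "where is the zero?" question can be made concrete with a little arithmetic. In this sketch, which uses made-up values and a hypothetical `drawn_height_ratio` helper, two bars differ by only 4%, but starting the axis at 48 instead of 0 makes one bar look twice as tall as the other.

```python
def drawn_height_ratio(a, b, axis_start=0):
    """How many times taller bar b is drawn than bar a
    when the vertical axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)


# Made-up values for illustration.
a, b = 50, 52

print(drawn_height_ratio(a, b))                 # axis starts at zero
print(drawn_height_ratio(a, b, axis_start=48))  # truncated axis
```

The data is identical in both cases. Only the choice of scale changes the story the graph appears to tell.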

We have a tendency to bend at the knees when presented with statistics and graphs. It seems to automagically make information more credible. But they are very easy to manipulate. So it’s crucial, in this era of fake news and anti-science, that our kids learn to be critical thinkers.

Another reason we need our kids to learn data science skills is the increasing dominance of Big Data and Machine Learning in every aspect of our lives. They are determining our healthcare and our access to home loans. They’re directing our traffic and influencing our consumption and behaviour – even our votes! They’re controlling our justice systems and our borders. But how many of you really feel like you have a good understanding of how the algorithms that do these things actually work? How many of you are confident in the fairness, impartiality, and accuracy of these systems?

And this is a highly educated audience. Think about that for a moment. These systems are running our lives and we have no say in how they operate. We don’t even understand them.

So it’s crucial that we educate upcoming generations to have informed, intelligent conversations about these systems. So that we can have that long delayed community conversation around the way we manage our data – and the way it manages us.

And to do that, we need to engage kids with data in the classroom. To show them its relevance, and to build their Data Science and technological skills.

The problem with finding cool datasets and building them into interesting lessons is that it’s hugely time consuming and highly skilled work. When I used the electoral data it took me hours to make sense of the dataset. I couldn’t even find anyone in the electoral commission who could explain it to me, so I had to derive it from first principles. The only reason I had the capacity to do that is that I was part time, so I used my own time, unpaid, to find the dataset, make sense of it, and build a project around it. Most teachers simply don’t have the time to do that – or, to be honest, the skills.

It’s also important to acknowledge that student motivation is not the only issue we face in teaching tech in schools. The problems are many. Tech has an image problem almost as bad as teaching does! So kids don’t see themselves as the type of people who go into tech (and this affects boys as well as girls).

We attract the kinds of people into tech that we already have – generally people with a very narrow personality and background distribution. This conference is obviously full of the exceptions to that rule. 🙂 But it’s a real problem if you want innovative solutions that meet the needs of everyone, not just the tech nerds of the world.

We lack skilled teachers, in part because the correlation between that classic tech personality type and the kind of person who loves to teach seems to be, frankly, quite low, but also because if you have tech skills you can EASILY earn a LOT more and work a LOT less hard by NOT going into teaching. But we also have a large cohort of teachers who are flat out terrified of technology. So if we force those teachers to teach our shiny new Digital Technologies curriculum, they can’t help but convey that fear to their students.

That’s why I founded the Australian Data Science Education Institute (which, by the way, is a registered charity). To find and make sense of the datasets, to build cool projects around them that are aligned with the curriculum, and to train teachers in the skills they need to incorporate data science into their teaching. We start from where teachers are and build their skills gradually, in the context of their own disciplines.

We don’t expect them all to program on day one. We start with spreadsheet skills and projects that both teachers and students find relevant and interesting.

Using Data Science teaches kids why STEM matters, and gives them the opportunity to use STEM skills to change the world. So we use this template for finding, analysing, and solving problems in the local community.

  • Find a problem
  • Measure it
  • Analyse the measurements
  • Communicate the results
  • Propose a solution
  • Implement the solution
  • MEASURE IT AGAIN

And that’s the crucial part that we need to make the default position anywhere where we try new things: That we measure & analyse them to see if they work. Because in governments, in schools, in businesses: too often we see new programs implemented as a matter of ideology, and the only “assessment” that happens is for the champion of the program to say “It was awesome!”

And when you say “How do you know?” Everyone goes suspiciously quiet and changes the subject.

Incidentally, that’s why ADSEI collects feedback data on all of its courses, and why we’re also building a feedback mechanism for our online resources.

We also have a template for exploring global issues:

  • Find a dataset
  • Explore & Understand it – and this means understanding the domain, a fact we tend to lose sight of.
  • Find a question it can answer
  • Analyse it to find the answer
  • Communicate your results

ADSEI’s ultimate goal, of course, is to put itself out of business. To build Data Science into the way teachers are trained to teach. To build a community of Data Scientists and teachers who can support each other by sharing resources, project ideas, and cool datasets.

I think my job is safe for the moment!

For now we have grants from the Victorian Department of Education and Training, Google, and the Great Barrier Reef Foundation. We’ve developed teaching resources for Monash University, CSIRO, and the Digital Technologies Hub. We have delivered workshops and talks at conferences and schools, and we are working with the wonderful people at Pawsey Supercomputing Centre and the West Australian Marine Science Institute.

And ADSEI has only been in existence for 18 months.

Over the next few months we’ll be running workshops in Perth, Melbourne, and Alice Springs.

Next year in October we’ll also be running the Inaugural International Conference on Education and Outreach in Data Science and High Performance Computing, with the support of the awesome Australasian eResearch Organisation – Sponsors welcome!

So if any of this sounds like a mission you can get behind, join the Slack channel, check out the website, send me an email (linda@adsei.org) or tweet at me wildly. Because Data Literacy and Data Science skills are something all kids need to experience, before they decide that Data Science is too hard, too boring, or not relevant to them!

If Data Science is going to drive us to the future, I want to put all of our kids in the driver’s seat!