Using Real Data Projects to Engage Kids with STEM

I want to start by asking you a question: What gets you out of bed in the morning? What really motivates you?

For me it’s the chance to make a difference in the world. It’s wanting to leave the world a better place than I found it.

And that’s something that STEM skills are perfect for. They are for problem solving, for designing better ways to do things. For bringing clean water, clean power, increased food production, solutions to climate change, safer transport, personalised medicine, and a whole host of innovations to the world.

But when I first started teaching in a high school – a science school, no less – we were teaching “STEM” as “fun stuff”. Drawing pretty pictures. Making robots follow a line. Playing with toys.

How many of us are motivated, I mean really motivated, by toys? Some of us are, especially technical people! But those are generally the people we’ve already GOT in tech! I’m much more interested in the people we haven’t got yet.

All too often we ask those kids who are not already into tech to get out of bed for the chance to have fun. And fun is great – I like to have fun, we all do! And not all of my fun is finding an interesting new dataset and analysing the hell out of it, I promise. I do have other ways of having fun besides writing an interesting new Python script. Really I do. But fun doesn’t get me out of bed in the morning. Fun is a hobby. A diversion. A toy. That’s not what we need kids to understand about STEM.

We are handing our kids a world in desperate need of creative solutions. Of innovation and entrepreneurship. Of change.

And we’re telling them that STEM is fun! It’s for designing 3D jewellery. It’s sparkly. It’s pink. It’s useless.

We are doing kids a huge disservice. They’re kids, therefore fun is the way to reach them, right? It’s like saying we want more women in tech so we’re going to paint some things pink and offer some courses in the chemistry of makeup (a real suggestion that was made at an actual school). It’s like saying “women do hardware too, let’s sell them some pink hammers” (and that’s also a real example).

When we were teaching computing using “fun toys” the overwhelming feedback I was getting – from science students – was “Why are you making me do this? It’s not relevant to me. I don’t want to do it.”

Can you guess what happened when we made the year 10 computer science course a data science course instead of a “fun toys” course? We were teaching the same basic coding skills. We still had them learning about selection, iteration, variables, and functions. But now we were using real datasets and finding real questions to answer, real problems to solve. Do you know what happened?

Suddenly they could see the point. They found it useful. They found themselves using the skills in other subjects, especially in project work. And the numbers who went on to the year 11 elective computer science subject increased by around 30%, with double the number of girls.

And none of it was pink!

In that first data science course I had a student who was super interested in politics, and there was a federal election, so we used data from the Australian Electoral Commission. Turns out you can download csv files containing every single vote from any Australian election.

We used the Senate votes for Victoria for the 2016 Federal election. The file held over 3 million lines of csv, each containing the polling booth, the electorate, and a 151-position comma-separated string holding the contents of every box on that ballot paper.

3 million lines of csv won’t even open in Excel, so the kids had to program just to open the file. They learned about using a small section of the file in order to test their code, so that it didn’t take ages to run. They learned about what questions a dataset could answer.
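As a rough sketch of the "test on a small section first" idea, the Python below reads only the first few rows of a csv file. The function name, column names, and sample rows are all invented for illustration; they are not the AEC's actual file layout.

```python
import csv
import io
from itertools import islice


def read_sample(csv_file, max_rows=1000):
    """Return the header and at most max_rows data rows from an open csv file."""
    reader = csv.reader(csv_file)
    header = next(reader)  # the first line holds the column names
    rows = list(islice(reader, max_rows))  # stops early, so the rest is never read
    return header, rows


# A tiny stand-in for the real file. Every name and value here is made up.
fake_file = io.StringIO(
    "Electorate,Booth,Preferences\n"
    'Northside,Booth 1,"1,2,,"\n'
    'Northside,Booth 2,",1,,2"\n'
    'Southside,Booth 3,"2,,1,"\n'
)

header, rows = read_sample(fake_file, max_rows=2)
print(header)
print(len(rows), "rows read")
```

With a real download you would pass in something like open(path, newline="") instead of the in-memory stand-in, and raise max_rows once the code behaves.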

They found their own questions – from which party’s voters were more likely to follow the how to vote cards, to where Pauline Hanson voters came from. They asked questions about their own electorate or polling booth and how they compared to the whole state. About female representation and share of the below-the-line vote. About preference flows and about how polling compares to actual results. Every student asked a different question, which meant that every student had to write different code to find the answer (goodbye plagiarism!).
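To give a flavour of what one such question might look like in code, here is a minimal sketch that compares first preferences in one electorate with the whole sample. It assumes a simplified (electorate, booth, preferences) layout with invented rows; the real dataset is more complicated than this, and the helper names are hypothetical.

```python
from collections import Counter


def first_preference_position(preferences):
    """Return the 0-based box position marked '1', or None if no box is."""
    boxes = preferences.split(",")
    return boxes.index("1") if "1" in boxes else None


def tally_first_preferences(rows, electorate=None):
    """Count first preferences by box position, optionally for one electorate."""
    tally = Counter()
    for row_electorate, _booth, preferences in rows:
        if electorate is not None and row_electorate != electorate:
            continue
        position = first_preference_position(preferences)
        if position is not None:
            tally[position] += 1
    return tally


# Invented rows in the assumed (electorate, booth, preferences) layout.
rows = [
    ("Northside", "Booth 1", "1,2,,"),
    ("Northside", "Booth 2", ",1,,2"),
    ("Southside", "Booth 3", "1,,2,"),
    ("Southside", "Booth 4", ",,,"),  # blank ballot: no first preference
]

state = tally_first_preferences(rows)
local = tally_first_preferences(rows, electorate="Northside")
print("whole sample:", dict(state))
print("Northside only:", dict(local))
```

A different question means a different filter or a different thing to count, which is why every student ended up with different code.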

And then the important part happened: they had to visualise their results. To create an image, more interesting than an ordinary graph, that conveyed their results in a convincing, valid, and compelling way.

They learned about channels of information, about the human visual system and attention. About colour blindness and the problems with the rainbow scale. They learned which types of graph are appropriate for different types of data, and how to customise their graphs so that they don’t mislead their audience.

As well as learning to analyse and visualise data themselves, they also learned to be critical data thinkers, reviewing graphs and statistics they are presented with using critical questions like “How was that data collected? What was the sample size? And where is the zero on that scale?”
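The question about the zero can be made concrete with a little arithmetic. The sketch below is an illustration only, with made-up numbers and a hypothetical helper name: it compares how tall two bars are drawn relative to each other when the axis starts at zero and when it starts just below the smaller value.

```python
def apparent_ratio(larger, smaller, axis_start=0.0):
    """How many times taller the larger bar is drawn than the smaller one."""
    if smaller <= axis_start:
        raise ValueError("axis_start must be below both values")
    return (larger - axis_start) / (smaller - axis_start)


# Made-up poll results: 52% versus 50%.
honest = apparent_ratio(52, 50)                    # axis starts at zero
truncated = apparent_ratio(52, 50, axis_start=49)  # axis starts at 49

print(f"axis from 0:  larger bar drawn {honest:.2f} times as tall")
print(f"axis from 49: larger bar drawn {truncated:.2f} times as tall")
```

Starting the axis at 49 makes a two-point gap look like a threefold difference, which is exactly the kind of thing a critical reader learns to check for.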

We have a tendency to bend at the knees when presented with statistics and graphs. It seems to automagically make information more credible. But they are very easy to manipulate. So it’s crucial, in this era of fake news and anti-science, that our kids learn to be critical thinkers.

Another reason we need our kids to learn data science skills is the increasing dominance of Big Data and Machine Learning in every aspect of our lives. They are determining our healthcare and our access to home loans. They’re directing our traffic and influencing our consumption and behaviour – even our votes! They’re controlling our justice systems and our borders. But how many of you really feel like you have a good understanding of how the algorithms that do these things actually work? How many of you are confident in the fairness, impartiality, and accuracy of these systems?

And this is a highly educated audience. Think about that for a moment. These systems are running our lives and we have no say in how they operate. We don’t even understand them.

So it’s crucial that we educate upcoming generations to have informed, intelligent conversations about these systems. So that we can have that long delayed community conversation around the way we manage our data – and the way it manages us.

And to do that, we need to engage kids with data in the classroom. To show them its relevance, and to build their Data Science and technological skills.

The problem with finding cool datasets and building them into interesting lessons is that it’s hugely time consuming and highly skilled work. When I used the electoral data it took me hours to make sense of the dataset. I couldn’t even find anyone in the electoral commission who could explain it to me, so I had to derive it from first principles. The only reason I had the capacity to do that is that I was part time, so I used my own time, unpaid, to find the dataset, make sense of it, and build a project around it. Most teachers simply don’t have the time to do that – or, to be honest, the skills.

It’s also important to acknowledge that student motivation is not the only issue we face in teaching tech in schools. The problems are many. Tech has an image problem almost as bad as teaching does! So kids don’t see themselves as the type of people who go into tech (and this affects boys as well as girls).

We attract the kinds of people into tech that we already have – generally people with a very narrow personality and background distribution. This conference is obviously full of the exceptions to that rule. 🙂 But it’s a real problem if you want innovative solutions that meet the needs of everyone, not just the tech nerds of the world.

We lack skilled teachers, in part because the correlation between that classic tech personality type and the kind of person who loves to teach seems to be, frankly, quite low, but also because if you have tech skills you can EASILY earn a LOT more and work a LOT less hard by NOT going into teaching. But we also have a large cohort of teachers who are flat out terrified of technology. So if we force those teachers to teach our shiny new Digital Technologies curriculum, they can’t help but convey that fear to their students.

That’s why I founded the Australian Data Science Education Institute (which, by the way, is a registered charity). To find and make sense of the datasets, to build cool projects around them that are aligned with the curriculum, and to train teachers in the skills they need to incorporate data science into their teaching. We start from where teachers are and build their skills gradually, in the context of their own disciplines.

We don’t expect them all to program on day one. We start with spreadsheet skills and projects that both teachers and students find relevant and interesting.

Using Data Science teaches kids why STEM matters, and gives them the opportunity to use STEM skills to change the world. So we use this template for finding, analysing, and solving problems in the local community.

  • Find a problem
  • Measure it
  • Analyse the measurements
  • Communicate the results
  • Propose a solution
  • Implement the solution

And that’s the crucial part that we need to make the default position anywhere we try new things: That we measure & analyse them to see if they work. Because in governments, in schools, in businesses: too often we see new programs implemented as a matter of ideology, and the only “assessment” that happens is for the champion of the program to say “It was awesome!”

And when you say “How do you know?” Everyone goes suspiciously quiet and changes the subject.

Incidentally, that’s why ADSEI collects feedback data on all of its courses, and why we’re also building a feedback mechanism for our online resources.

We also have a template for exploring global issues:

  • Find a dataset
  • Explore & Understand it – and this means understanding the domain, a fact we tend to lose sight of.
  • Find a question it can answer
  • Analyse it to find the answer
  • Communicate your results

ADSEI’s ultimate goal, of course, is to put itself out of business. To build Data Science into the way teachers are trained to teach. To build a community of Data Scientists and teachers who can support each other by sharing resources, project ideas, and cool datasets.

I think my job is safe for the moment!

For now we have grants from the Victorian Department of Education and Training, Google, and the Great Barrier Reef Foundation. We’ve developed teaching resources for Monash University, CSIRO, and the Digital Technologies Hub. We have delivered workshops and talks at conferences and schools, and we are working with the wonderful people at Pawsey Supercomputing Centre and the West Australian Marine Science Institute.

And ADSEI has only been in existence for 18 months.

Over the next few months we’ll be running workshops in Perth, Melbourne, and Alice Springs.

Next year in October we’ll also be running the Inaugural International Conference on Education and Outreach in Data Science and High Performance Computing, with the support of the awesome Australasian eResearch Organisation – Sponsors welcome!

So if any of this sounds like a mission you can get behind, join the Slack channel, check out the website, send me an email, or tweet at me wildly. Because Data Literacy and Data Science skills are something all kids need to experience, before they decide that Data Science is too hard, too boring, or not relevant to them!

If Data Science is going to drive us to the future, I want to put all of our kids in the driver’s seat!

Primary School Data Science Template

People often assume that Data Science in Schools has to be secondary school only, because how could primary kids do Data Science? The truth is that Data Literacy and Analysis skills can be built into the curriculum from as young as 5 years old. And it’s really important that kids learn Data and Tech skills early, because by the time they get to secondary school we’ve already lost a lot of them, believing that these skills are too hard, not relevant to them, or just not interesting. We need to show them early on that Data Science is a useful tool that they are more than capable of mastering.

So how can primary kids do data science? Like any other data science project, it’s crucial to put it in context, so the kids can see the point.

So Step One is: Find a problem the kids care about

It might be litter in the playground, traffic at pickup time (or, to put it in a way kids will really relate to – how long they have to wait to be picked up, or how far they have to walk to the car!), or access to play equipment.

Step Two: Measure the problem

Count and identify the litter, time how long people have to wait to be picked up, measure how far people have to walk to the car, or count the number of people who get to use the monkey bars every lunchtime for a week.

Step Three: Analyse the measurements

For younger kids, that might simply mean sorting the rubbish into categories (eg chip packets, icy pole wrappers from the canteen, and sandwich bags or cling wrap from home), or organising the drop off or play equipment measurements by year level or by day. For older kids you might enter it into a spreadsheet and use a formula to calculate some averages over the week or by area or year level.
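For classes that are ready to move beyond a spreadsheet, the same averaging can be sketched in a few lines of Python. The waiting times and year levels below are invented, and average_by_group is just an illustrative name.

```python
from collections import defaultdict
from statistics import mean


def average_by_group(measurements):
    """Average the values for each group in a list of (group, value) pairs."""
    grouped = defaultdict(list)
    for group, value in measurements:
        grouped[group].append(value)
    return {group: mean(values) for group, values in grouped.items()}


# Invented pickup waiting times, recorded as (year level, minutes).
waits = [
    ("Year 1", 4), ("Year 1", 6),
    ("Year 3", 10), ("Year 3", 12), ("Year 3", 8),
]

for year_level, average in average_by_group(waits).items():
    print(year_level, "waited", average, "minutes on average")
```

Swapping the year level for the day of the week or the playground area gives the other breakdowns without changing the function.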

Step Four: Communicate your results

This is where you graph or visualise your results. The littlies can “graph” the results by stacking up blocks to represent the different categories. Green blocks for chip packets, blue ones for icy pole wrappers, etc. This is a great, tangible exercise in data representation. Older kids can draw graphs or do them in a spreadsheet like Excel or Google Sheets. It helps to get them to draw pictures and labels on their graphs to make them more interesting and compelling.
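The stacked-blocks idea translates directly into a text "graph" that needs no charting library at all. This is only a sketch with invented litter counts; block_graph is a hypothetical helper name.

```python
def block_graph(counts, block="#"):
    """Return one line per category, with one block per item counted."""
    width = max((len(name) for name in counts), default=0)
    return "\n".join(
        f"{name.ljust(width)} | {block * n}" for name, n in counts.items()
    )


# Invented litter counts from one lunchtime.
litter = {"chip packets": 5, "icy pole wrappers": 3, "cling wrap": 2}
print(block_graph(litter))
```

Each "#" stands in for one physical block, so kids can check the printed graph against the towers they built by hand.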

Step Five: Propose a solution

Think of a way you might solve the problem. For litter the kids might come up with nude food day campaigns, or a change to the way food is available in the canteen – such as using larger chip packets and handing out small paper bags with chips in them, instead of lots of small plastic packets. For traffic it might be that pickup times can be staggered by year levels, or older kids might be encouraged to walk further and be picked up a block or two away.

Step Six: Implement your solution

This can be a whole school initiative, and involves a lot of communication, using the graphs from Step Four to tell the community what’s happening and why.

Step Seven: Measure again to see how well it worked

This is my favourite step, often sadly missing from political initiatives. Once you’ve tried to fix something, you need to measure it again to see if you actually made any difference.

You can even repeat steps 3 to 7 with several different solutions to compare which ones work better.
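Comparing "before" and "after" can be as simple as a percentage change. The sketch below uses invented litter counts and made-up solution names, purely to show the shape of the comparison.

```python
def percent_change(before, after):
    """Percentage change from before to after; negative means a reduction."""
    if before == 0:
        raise ValueError("cannot compute a percentage change from zero")
    return (after - before) * 100 / before


# Invented counts: pieces of litter before, and after trying each solution.
before_count = 40
after_counts = {"nude food day": 30, "paper bags at the canteen": 22}

changes = {
    name: percent_change(before_count, after)
    for name, after in after_counts.items()
}
best = min(changes, key=changes.get)  # the most negative change is the biggest drop

for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
print("Biggest reduction:", best)
```

One measurement before and one after is a very small sample, so it is worth talking with the kids about what else might have changed that week.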

I love this template because it is the essence of STEM – It’s a science experiment, devised by the kids, with rigorous measurement and evaluation. Maths and Technology are used in handling the data, and you can use Engineering to design your solution, or even to measure the problem if you’re looking at environmental conditions like heat, noise, or water and want to use some sensors.

You can scale the technology use up or down depending on available resources and where your students are up to. There are no robots with parts to fail. And the best part is that the motivation is built in. The kids are learning that STEM and Data Science are tools you can use to solve real problems in your community. They’re not just a bit of fun that’s not relevant to their futures.

ADSEI is developing more projects like these over the next year, as well as building a network of teachers interested in sharing their ideas and supporting each other to introduce integrated STEM and Data Science in the classroom. Jump onto the mailing list to stay in touch, and feel free to share your own ideas in the comments on this post!

Why robots are a disaster for tech education

It’s very tempting to see robots and other shiny tech toys as fantastic motivators for STEM education. After all, who doesn’t love playing with cool toys? Unfortunately this kind of hardware has huge drawbacks in the classroom. To show you why, let me tell you a story.

On the weekend I took my kids to Oz Comic Con. My 11 year old, Jen, is a HUGE tech nerd and loves all things hardware, software, mathematical, and, of course, STAR WARS. Dressed as a Jedi and wielding a lightsaber, Jen was magnetically drawn to the stall selling Star Wars drones. Jen had been saving for Comic Con for months, so the $50 cost, while more than they have ever spent on anything before, was well within their reach.

I did a quick bit of online research and it seemed like a good buy.

Behold Jen’s X-Wing in all its glory.


You can imagine the excitement when we got it home, but we were out to dinner that night and didn’t have time to unbox and charge it. The next day Jen bounced out of bed and went straight to the box. Eating, drinking, and other necessities of life were not on the agenda, so it was lucky it was a public holiday and I didn’t have to try to get them to school.

Once charged (the drone), batteries installed (the controller), and with the beginner-pilot’s safety cage installed, we fired it up. The controller even buzzed when we inserted batteries and had Yoda saying “feel the force!”. The excitement was INTENSE. The instructions said to power up the controller and the drone, flip the left hand lever up and down, whereupon it would beep, and the flashing lights would then stop flashing to show that the devices were synced.

But there was a catch. Beeping occurred as expected, but the lights on both devices continued to flash. We powered both devices off and on again. We tried different batteries. We even went shopping for new batteries. We spent all day trying to get the damned thing to work, to no avail. Three days later it still didn’t work and we were waiting for tech support from the drone company to reply to our emails.

Now you may think we were doing something wrong – and perhaps we were – but I have a PhD in Computer Science, and my husband is an Electrical Engineer. If we can’t make it work, what hope does your average teacher have?

Unlike with programming, a student, a teacher, and even an electrical engineer have very little hope of debugging a device such as this one, because there is no feedback. There’s no way of knowing its internal state. Short of taking the device apart and resoldering each of the connections and testing each component (not skills taught in your typical primary education course last I checked), there’s no way to troubleshoot these things.

Whether Robot, Raspberry Pi, or Arduino, hardware all suffers from these issues. There’s a significant chance that they won’t work out of the box. Even if they do, connections come loose and they might stop working mid-lesson, or not work next time they come out of the cupboard. And what we teach kids with these kinds of intensely frustrating experiences – when they are trying to do the same things as everyone else, but for them it doesn’t work – is that these problems are insurmountable. That they have no control over technology, no power to fix it when it breaks, and no way of understanding how it does what it does.

These are not the lessons we want to be teaching our kids.

*Update: The company got back to us the day after I wrote this, and very quickly replaced the drone. 10 days after the initial purchase we have a drone that works – but Jen’s enthusiasm – and confidence – has taken a severe battering.