The importance of scepticism

One of the things ADSEI does in its lesson plans is ask the question: What is wrong with this data?

This is a really crucial question, because there is no such thing as a perfect dataset. All data has issues. Often it’s not the data you want, it’s simply the data you were able to get. For example:

  • whale observations tell you how many whales were seen, when what you really want to know is how many whales were there. Some whales might have breached but not been observed (shades of Schrödinger’s Whale), or swum by without breaching, or even been spotted twice but accidentally counted as two whales when it was really just the one.
  • speed cameras tell you the instantaneous speed of the car when what police really want to know is: has that car exceeded the speed limit at any time on this trip?
  • counting the litter found in the schoolyard tells you how much litter you found, not how much litter was dropped – some of it may have blown away or hidden under things. It also only tells you how much litter was there that day. What if a year level was out on excursion, or it was a wet day timetable…

And even when the data is actually what you want, there may be data that’s missing or flawed for various reasons. For example:

  • Facial recognition systems that were trained on images of faces that were almost exclusively white and male.
  • Phone polls that can’t include people with unlisted numbers.
  • Internet polls that can’t include people without internet access.
  • Surveys where people don’t or can’t tell the truth – for example about healthy eating, or sexuality, or where people don’t actually know the truth, for example about why they did things, or things they don’t remember (like what did you have for breakfast yesterday? Or how often do you eat broccoli?).
  • Skipped data where someone forgot to record a daily observation or the system went down and didn’t record any values.
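Skipped observations are one of the easier flaws to check for. Here is a minimal Python sketch, assuming you have a list of the dates on which something was recorded. The dates and the function name are invented for illustration.

```python
from datetime import date, timedelta

def missing_days(observed, start, end):
    """List every date from start to end (inclusive) with no observation."""
    seen = set(observed)
    total_days = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(total_days))
    return [day for day in all_days if day not in seen]

# Made-up record of the days a reading was written down.
observed = [date(2020, 3, 2), date(2020, 3, 3), date(2020, 3, 5), date(2020, 3, 6)]

gaps = missing_days(observed, date(2020, 3, 2), date(2020, 3, 6))
print("No observation on:", [d.isoformat() for d in gaps])
```

Finding the gap is only the start. You still have to decide what it means: was there nothing to see that day, or was nobody looking?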

Consider the reporting around the coronavirus. We have a reported death rate of around 2%, which is highly speculative, because we have no idea how many mild cases of coronavirus there are out there that are not being identified or reported. Some sources report the numbers and stress the uncertainty, while others report them as solid facts.
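To see why a figure like that is so shaky, here is a minimal Python sketch. The numbers are invented purely for illustration (they are not real case counts), and the function name is my own. It shows how the apparent death rate shrinks as the number of unreported mild cases grows.

```python
def death_rate(deaths, reported_cases, unreported_cases=0):
    """Deaths as a percentage of all cases, reported or not."""
    total_cases = reported_cases + unreported_cases
    return 100 * deaths / total_cases

# Hypothetical numbers, for illustration only.
deaths = 20
reported = 1000

for unreported in (0, 1000, 4000):
    rate = death_rate(deaths, reported, unreported)
    print(f"{unreported} unreported cases: {rate:.1f}% death rate")
```

The deaths are the same in every row. Only the guess about the cases nobody counted changes, and the rate moves with it.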

This is a kind of scepticism and critical thinking that we don’t often leave room for – in education, business, or journalism. Often we are in such a rush to get the “right” answer that we don’t have time to pause and evaluate the data we’re working with, to consider the flaws and uncertainty that are built into any dataset, and any analysis.

If we can teach our students, from pre-school onwards, to question their data, to ask “how many ways is this data flawed” rather than assuming the data is perfect, then perhaps we can build a world which centres critical thinking and evaluates evidence.

This is why using real datasets rather than nice clean sets of fake numbers is crucially important to teaching data science. Because real world datasets are never nice, clean, and straightforward. There is no need for scepticism and critical thinking in textbook examples. But kids who have used real data in their learning are equipped to tackle real world problems.

Can you share some examples of flawed data? What consequences have you seen from people assuming data is perfect?


Using Real Data Projects to Engage Kids with STEM

I want to start by asking you a question: What gets you out of bed in the morning? What really motivates you?

For me it’s the chance to make a difference in the world. It’s wanting to leave the world a better place than I found it.

And that’s something that STEM skills are perfect for. They are for problem solving, for designing better ways to do things. For bringing clean water, clean power, increased food production, solutions to climate change, safer transport, personalised medicine, and a whole host of innovations to the world.

But when I first started teaching in a high school – a science school, no less – we were teaching “STEM” as “fun stuff”. Drawing pretty pictures. Making robots follow a line. Playing with toys.

How many of us are motivated, I mean really motivated, by toys? Some of us are, especially technical people! But those are generally the people we’ve already GOT in tech! I’m much more interested in the people we haven’t got yet.

All too often we ask those kids who are not already into tech to get out of bed for the chance to have fun. And fun is great – I like to have fun, we all do! And not all of my fun is finding an interesting new dataset and analysing the hell out of it, I promise. I do have other ways of having fun besides writing an interesting new Python script. Really I do. But fun doesn’t get me out of bed in the morning. Fun is a hobby. A diversion. A toy. That’s not what we need kids to understand about STEM.

We are handing our kids a world in desperate need of creative solutions. Of innovation and entrepreneurship. Of change.

And we’re telling them that STEM is fun! It’s for designing 3D jewellery. It’s sparkly. It’s pink. It’s useless.

We are doing kids a huge disservice. They’re kids, therefore fun is the way to reach them, right? It’s like saying we want more women in tech so we’re going to paint some things pink and offer some courses in the chemistry of makeup (a real suggestion that was made at an actual school). It’s like saying “women do hardware too, let’s sell them some pink hammers.” (and that’s also a real example)

When we were teaching computing using “fun toys” the overwhelming feedback I was getting – from science students – was “Why are you making me do this? It’s not relevant to me. I don’t want to do it.”

Can you guess what happened when we made the year 10 computer science course a data science course instead of a “fun toys” course? We were teaching the same basic coding skills. We still had them learning about selection, iteration, variables, and functions. But now we were using real datasets and finding real questions to answer, real problems to solve. Do you know what happened?

Suddenly they could see the point. They found it useful. They found themselves using the skills in other subjects, especially in project work. And the numbers who went on to the year 11 elective computer science subject increased by around 30%, with double the number of girls.

And none of it was pink!

That first data science course I had a student who was super interested in politics, and there was a federal election, so we used data from the Australian Electoral Commission. Turns out you can download csv files containing every single vote from any Australian election.

We used the senate votes for Victoria for the 2016 Federal election. Over 3 million lines of CSV, each containing the polling booth, the electorate, and a 151-position comma-separated string holding the contents of every box on that ballot paper.

3 million lines of CSV won’t even open in Excel, so the kids had to program just to open the file. They learned about using a small section of the file in order to test their code, so that it didn’t take ages to run. They learned about what questions a dataset could answer.
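Testing on a small section of a big file can look something like the sketch below. This is not the code my students wrote, just one way to do it in Python. The file it reads is a tiny stand-in built on the spot, and the column names are made up rather than taken from the real AEC layout.

```python
import csv
import os
import tempfile
from itertools import islice

def read_sample(path, max_rows=1000):
    """Return the header and the first max_rows data rows of a CSV file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(islice(reader, max_rows))
    return header, rows

# Build a tiny stand-in file so the sketch runs anywhere.
# The column names are invented, not the real AEC ones.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_votes.csv")
with open(demo_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["electorate", "booth", "preferences"])
    for i in range(50):
        writer.writerow([f"Electorate {i % 5}", f"Booth {i}", "1,2,,3"])

header, sample = read_sample(demo_path, max_rows=10)
print(header)
print(len(sample), "rows read for testing")
```

Because islice stops after max_rows, the rest of the file is never read, which is what keeps the test runs quick. Once the code works on the sample, you point it at the whole file.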

They found their own questions – from which party’s voters were more likely to follow the how to vote cards, to where Pauline Hanson voters came from. They asked questions about their own electorate or polling booth and how they compared to the whole state. About female representation and share of the below-the-line vote. About preference flows and about how polling compares to actual results. Every student asked a different question, which meant that every student had to write different code to find the answer (goodbye plagiarism!).

And then the important part happened: they had to visualise their results. To create an image, more interesting than an ordinary graph, that conveyed their results in a convincing, valid, and compelling way.

They learned about channels of information, about the human visual system and attention. About colour blindness and the problems with the rainbow scale. They learned which types of graph are appropriate for different types of data, and how to customise their graphs so that they don’t mislead their audience.

As well as learning to analyse and visualise data themselves, they also learned to be critical data thinkers, reviewing graphs and statistics they are presented with using critical questions like “How was that data collected? What was the sample size? And where is the zero on that scale?”

We have a tendency to bend at the knees when presented with statistics and graphs. They seem to automagically make information more credible. But they are very easy to manipulate. So it’s crucial, in this era of fake news and anti-science, that our kids learn to be critical thinkers.

Another reason we need our kids to learn data science skills is the increasing dominance of Big Data and Machine Learning in every aspect of our lives. They are determining our healthcare and our access to home loans. They’re directing our traffic and influencing our consumption and behaviour – even our votes! They’re controlling our justice systems and our borders. But how many of you really feel like you have a good understanding of how the algorithms that do these things actually work? How many of you are confident in the fairness, impartiality, and accuracy of these systems?

And this is a highly educated audience. Think about that for a moment. These systems are running our lives and we have no say in how they operate. We don’t even understand them.

So it’s crucial that we educate upcoming generations to have informed, intelligent conversations about these systems. So that we can have that long delayed community conversation around the way we manage our data – and the way it manages us.

And to do that, we need to engage kids with data in the classroom. To show them its relevance, and to build their Data Science and technological skills.

The problem with finding cool datasets and building them into interesting lessons is that it’s hugely time consuming and highly skilled work. When I used the electoral data it took me hours to make sense of the dataset. I couldn’t even find anyone in the electoral commission who could explain it to me, so I had to derive it from first principles. The only reason I had the capacity to do that is that I was part time, so I used my own time, unpaid, to find the dataset, make sense of it, and build a project around it. Most teachers simply don’t have the time to do that – or, to be honest, the skills.

It’s also important to acknowledge that student motivation is not the only issue we face in teaching tech in schools. The problems are many. Tech has an image problem almost as bad as teaching does! So kids don’t see themselves as the type of people who go into tech (and this affects boys as well as girls).

We attract the kinds of people into tech that we already have – generally people with a very narrow personality and background distribution. This conference is obviously full of the exceptions to that rule. 🙂 But it’s a real problem if you want innovative solutions that meet the needs of everyone, not just the tech nerds of the world.

We lack skilled teachers, in part because the correlation between that classic tech personality type and the kind of person who loves to teach seems to be, frankly, quite low, but also because if you have tech skills you can EASILY earn a LOT more and work a LOT less hard by NOT going into teaching. But we also have a large cohort of teachers who are flat out terrified of technology. So if we force those teachers to teach our shiny new Digital Technologies curriculum, they can’t help but convey that fear to their students.

That’s why I founded the Australian Data Science Education Institute (which, by the way, is a registered charity). To find and make sense of the datasets, to build cool projects around them that are aligned with the curriculum, and to train teachers in the skills they need to incorporate data science into their teaching. We start from where teachers are and build their skills gradually, in the context of their own disciplines.

We don’t expect them all to program on day one. We start with spreadsheet skills and projects that both teachers and students find relevant and interesting.

Using Data Science teaches kids why STEM matters, and gives them the opportunity to use STEM skills to change the world. So we use this template for finding, analysing, and solving problems in the local community.

  • Find a problem
  • Measure it
  • Analyse the measurements
  • Communicate the results
  • Propose a solution
  • Implement the solution

And that’s the crucial part that we need to make the default position anywhere where we try new things: That we measure & analyse them to see if they work. Because in governments, in schools, in businesses: too often we see new programs implemented as a matter of ideology, and the only “assessment” that happens is for the champion of the program to say “It was awesome!”

And when you say “How do you know?” Everyone goes suspiciously quiet and changes the subject.

Incidentally, that’s why ADSEI collects feedback data on all of its courses, and why we’re also building a feedback mechanism for our online resources.

We also have a template for exploring global issues:

  • Find a dataset
  • Explore & Understand it – and this means understanding the domain, a fact we tend to lose sight of.
  • Find a question it can answer
  • Analyse it to find the answer
  • Communicate your results

ADSEI’s ultimate goal, of course, is to put itself out of business. To build Data Science into the way teachers are trained to teach. To build a community of Data Scientists and teachers who can support each other by sharing resources, project ideas, and cool datasets.

I think my job is safe for the moment!

For now we have grants from the Victorian Department of Education and Training, Google, and the Great Barrier Reef Foundation. We’ve developed teaching resources for Monash University, CSIRO, and the Digital Technologies Hub. We have delivered workshops and talks at conferences and schools, and we are working with the wonderful people at Pawsey Supercomputing Centre and the West Australian Marine Science Institute.

And ADSEI has only been in existence for 18 months.

Over the next few months we’ll be running workshops in Perth, Melbourne, and Alice Springs.

Next year in October we’ll also be running the Inaugural International Conference on Education and Outreach in Data Science and High Performance Computing, with the support of the awesome Australasian eResearch Organisation – Sponsors welcome!

So if any of this sounds like a mission you can get behind, join the Slack channel, check out the website, send me an email, or tweet at me wildly. Because Data Literacy and Data Science skills are something all kids need to experience, before they decide that Data Science is too hard, too boring, or not relevant to them!

If Data Science is going to drive us to the future, I want to put all of our kids in the driver’s seat!

Primary School Data Science Template

People often assume that Data Science in Schools has to be secondary school only, because how could primary kids do Data Science? The truth is that Data Literacy and Analysis skills can be built into the curriculum from as young as 5 years old. And it’s really important that kids learn Data and Tech skills early, because by the time they get to secondary school we’ve already lost a lot of them, believing that these skills are too hard, not relevant to them, or just not interesting. We need to show them early on that Data Science is a useful tool that they are more than capable of mastering.

So how can primary kids do data science? Like any other data science project, it’s crucial to put it in context, so the kids can see the point.

So Step One is: Find a problem the kids care about

It might be litter in the playground, traffic at pickup time (or, to put it in a way kids will really relate to – how long they have to wait to be picked up, or how far they have to walk to the car!), or access to play equipment.

Step Two: Measure the problem

Count and identify the litter, time how long people have to wait to be picked up, measure how far people have to walk to the car, or count the number of people who get to use the monkey bars every lunchtime for a week.

Step Three: Analyse the measurements

For younger kids, that might simply mean sorting the rubbish into categories (eg chip packets, icy pole wrappers from the canteen, and sandwich bags or cling wrap from home), or organising the drop off or play equipment measurements by year level or by day. For older kids you might enter it into a spreadsheet and use a formula to calculate some averages over the week or by area or year level.
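If your older kids are ready to move beyond the spreadsheet, the same averages can be worked out in a few lines of Python. This is only a sketch: the wait times are made up, and the function name is my own.

```python
from collections import defaultdict
from statistics import mean

# Made-up measurements: (year level, minutes waited at pickup).
waits = [
    ("Year 1", 4), ("Year 1", 6),
    ("Year 3", 10), ("Year 3", 8), ("Year 3", 12),
    ("Year 6", 15), ("Year 6", 5),
]

def average_by_group(records):
    """Group the values by their label and average each group."""
    groups = defaultdict(list)
    for group, value in records:
        groups[group].append(value)
    return {group: mean(values) for group, values in groups.items()}

averages = average_by_group(waits)
for level, avg in averages.items():
    print(f"{level}: {avg:.1f} minutes on average")
```

Swap the year level for the day of the week or the area of the playground and the same function does the job.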

Step Four: Communicate your results

This is where you graph or visualise your results. For the littlies they can “graph” the results by stacking up blocks to represent the different categories. Green blocks for chip packets, blue ones for icy pole wrappers, etc. This is a great, tangible, exercise in data representation. Older kids can draw graphs or do them in a spreadsheet like Excel or Google Sheets. It helps to get them to draw pictures and labels on their graphs to make them more interesting and compelling.

Step Five: Propose a solution

Think of a way you might solve the problem. For litter the kids might come up with nude food day campaigns, or a change to the way food is available in the canteen – such as using larger chip packets and handing out small paper bags with chips in them, instead of lots of small plastic packets. For traffic it might be that pickup times can be staggered by year levels, or older kids might be encouraged to walk further and be picked up a block or two away.

Step Six: Implement your solution

This can be a whole school initiative, and involves a lot of communication, using the graphs from Step Four to tell the community what’s happening and why.

Step Seven: Measure again to see how well it worked

This is my favourite step, often sadly missing from political initiatives. Once you’ve tried to fix something, you need to measure it again to see if you actually made any difference.

You can even repeat steps 3 to 7 with several different solutions to compare which ones work better.
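Comparing solutions can be as simple as working out the percentage change for each one. Here is a rough Python sketch with invented litter counts. The solution names come from the examples above, but the numbers are made up.

```python
def percent_change(before, after):
    """Percentage change from before to after (negative means a drop)."""
    return 100 * (after - before) / before

# Made-up litter counts: pieces found per day, before and after each idea.
trials = {
    "nude food day": (120, 90),
    "paper bags at the canteen": (120, 60),
}

for solution, (before, after) in trials.items():
    change = percent_change(before, after)
    print(f"{solution}: {before} -> {after} pieces ({change:+.0f}%)")

# The most negative change is the biggest drop in litter.
best = min(trials, key=lambda name: percent_change(*trials[name]))
print("Biggest drop:", best)
```

A single before and after count is still flawed data, of course. A windy day or a year level out on excursion could explain the change just as well as your solution, so measure on more than one day if you can.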

I love this template because it is the essence of STEM – It’s a science experiment, devised by the kids, with rigorous measurement and evaluation. Maths and Technology are used in handling the data, and you can use Engineering to design your solution, or even to measure the problem if you’re looking at environmental conditions like heat, noise, or water and want to use some sensors.

You can scale the technology use up or down depending on available resources and where your students are up to. There are no robots with parts to fail. And the best part is that the motivation is built in. The kids are learning that STEM and Data Science are tools you can use to solve real problems in your community. They’re not just a bit of fun that’s not relevant to their futures.

ADSEI is developing more projects like these over the next year, as well as building a network of teachers interested in sharing their ideas and supporting each other to introduce integrated STEM and Data Science in the classroom. Jump onto the mailing list to stay in touch, and feel free to share your own ideas in the comments on this post!

Computing with Purpose

I am increasingly angry about the pinkification of Computer Science and Engineering. Pinkification is the presentation of computing and STEM skills as being about 3D printing jewellery, drawing pretty pictures, or somehow involving fluffy animals, in a desperate attempt to interest girls. Aside from how insulting it is to assert that painting something pink and sticking a few sparkly things on it is a great way to attract girls – because we are fundamentally shallow, apparently – there are many pinkified programmes trying to attract girls to CS and, as far as I can tell, they are having very little impact on diversity in Computing and other technical fields.

This doesn’t surprise me. For years I taught at a science school and my boss insisted that we had to teach Computer Science to the year 10s through a “fun” approach – drawing pretty pictures, making robots follow lines, this sort of thing. And the single most common feedback was “why are you making me do this? It’s just not relevant to me!”

Bear in mind these were science students, and if science students think computing isn’t relevant to them, then we’re really doing something wrong. Computing is integral to science now, and I am constantly meeting scientists who lament their lack of computing skills, and tell me it’s limiting their work in the worst way.

When we finally shifted to teaching computing skills in the context of data science – using real data sets and authentic problems, giving the students the opportunity to make a real difference in the world – the change was dramatic. We studied everything from election results to microbats, from climate change to neuroscience, and the number of girls choosing to pursue further study in computing doubled (from 5 to 10 – still distressingly low, but a big step forward!). But it wasn’t just girls who got more interested. A lot of the boys could finally see the point of computing. Data Science engaged kids in computing skills in a way the “fun stuff” never did.


You see, the lack of girls in CS is only the easily measurable side of our lack of diversity. The big problem is that we are, for the most part, only getting the types of people in computing that we already have. The stereotypical kids who are already interested in computing, have been coding more or less from birth, and who really aren’t interested in much else.

We need to motivate a much wider range of people to at least try computing, to see if it’s something they might be interested in. We also need everyone to be Data Literate so that we can think critically about data and graphs we’re shown, and so that we can engage in intelligent conversations as a society about which kinds of Data Science are ok and which ones we’re not comfortable with.

Motivation is key, but I have come to the conclusion that fun isn’t actually terribly motivating. It interests me that “fun” is often still seen as the best way to attract kids to a subject. At best, if it actually is fun, using the “fun” approach to STEM skills may introduce it as a hobby, or a fun way to spend a few hours, but it’s hardly inspiring as a career choice, because it lacks a sense of purpose. It also sells our kids painfully short – they like fun, sure, but more than anything kids today are worried about their future, and the future of the planet. They want to make a difference.

Nicky Ringland, one of the greatest change makers in Computer Science Education in Australia, recently sent a tweet that finally gave me the phrase I’ve been looking for. She said girls get very engaged in “Computing with Purpose”, and added that the bonus is you also engage more boys with this kind of computing, not just girls.

That’s it. That’s what attracted me to computing. It’s what engaged my students with Data Science, and it’s why I started the Australian Data Science Education Institute. To show teachers and students that computing has a purpose. It’s not just something that’s been randomly jammed into the curriculum. It’s not about teaching to an exam. It’s not something to do because the teacher told you to. It’s something you can actually use to understand and even change the world.

Now that is a purpose everyone can get behind!

Making Intentional Data Scientists


When I was a kid, my cousin Chris gave me his old Commodore 64, and I learnt to program in BASIC. There was no Google, there were no online tutorials (heck, there wasn’t an “online” yet as far as most people knew!). I just had a computer to plug into my tv. A tape drive. A keyboard. And a book telling me how to do simple stuff in BASIC.

I didn’t do anything big or clever. I just did enough to get hooked. But I still didn’t see it as a possible career.

In secondary school I spent an absurd amount of time playing the Infocom Hitchhiker’s Game. A text-based adventure that was surprisingly code-like, and based on my favourite sci fi book at the time, The Hitchhiker’s Guide to the Galaxy. I wasn’t great at it. I don’t think I ever even finished the game. But, again, it was enough to get me hooked. To show me that computers were fun, and that I could actually control them.

Fast forward a few years and I went to uni intending to study biology. Specifically genetics. And I did, but I also picked Computer Science as a fill in subject because I needed one more subject in first year.

I did not love it.

In fact, I spent a lot of time talking about how much I hated it.

Initially we were learning how to cut and paste, which I found intolerably simple, although it didn’t stop the class tutor from needing to ask me how I’d done it. And then we jumped from intolerably simple to incomprehensibly complex, without any apparent middle ground. Suddenly we were programming in assembler.

We were using Macintoshes that had all kinds of bizarre quirks. My favourite was when we were learning to program in PDP-8 assembler, and the editor would, occasionally, silently add invisible characters which then triggered errors when we tried to run the program. As a teacher and a usability specialist this horrifies me now, because it was teaching us that if our code didn’t work, it wasn’t necessarily due to anything we could see or understand. There might be some secret voodoo magic causing the problem.

The only way you could find the characters was to use the arrow key to painstakingly crawl across the line, character by character, looking for the one place where you had to use the arrow key twice to move one space. Then you deleted that character that wasn’t there, and your code would magically work. I hated it. And yet… something called to me.

I remember solving a tricky problem, leaning back feeling triumphant, and kicking out the power cord that powered all of the machines around me. (My teenage daughter says that’s “the most mum thing” she’s ever heard.)

That was how I felt all the way through my computer science classes. Barely competent. Surrounded by guys who seemed to have been doing this for years, and who were much better at it than I was. And frequently having some kind of catastrophic failure just when it seemed like things were FINALLY working.

And yet… by third year the only thing I was studying was Computer Science. I don’t remember enjoying anything from the CS course in my first or second year. But the siren song of the third year subjects – artificial intelligence, computer graphics, image processing, bizarre programming languages (I’m pretty sure that wasn’t actually what the subject was called) – somehow was enough to keep me on the hook.

I graduated with average marks and travelled for a bit, and then I got a job in a software company. It was an unmitigated disaster. It was a small firm with senior management (who were father and son) regularly screaming at each other in the open plan office. I was doing software testing, which, as you can imagine, made me super popular with the developers.

I did NOT want this to be my life.

Eventually I weaseled my way into Computer Science honours, more as an escape strategy than anything else, where I struggled but got through with the support of my postgrad friends. And then something pivotal happened. I still don’t know exactly why, but Damian Conway offered me a PhD project that really spoke to me – designing a programming language for teaching programming.

Ironically, by the time I finished my PhD the one thing I knew for sure was that a programming language designed specifically for teaching programming, and not for real projects, would never work for teaching, because the one thing kids learning Computer Science wanted to do was REAL STUFF. Not play with children’s toys.

But I learnt a lot about usability and a lot about computer science education, got the PhD, and became an academic. I loved the teaching, and put my heart and soul into it, but I never really got into the research. I was looking for a way to make a difference, a real difference, and I couldn’t find it.

When my second baby was due and the department was offering a round of redundancies, I took one. For four years I cast around doing freelance writing, pro bono communications work at Oxfam Australia, and even being a project officer for the Australian Breastfeeding Association, but nothing really clicked.

Then in late 2009 I got a call from a friend in my old department giving me the opportunity to help design the Computer Science curriculum at a new Science School, opening in 2010. By 2011 I was teaching there while doing my teaching qualification part time. Working at this amazing school, with thoroughly remarkable kids.

None of these steps in my career were planned. I never intended to go on to do Computer Science, even though it turned out I loved it. The first few years of undergrad were off putting. At school, even though I was repeatedly told girls could do anything, “anything” seemed to mean traditional careers like law and medicine. Computing just never crossed my mind.

And in fact medicine was my first preference when I applied to uni. Not getting in was the best thing that could have happened to me.

At any step along the way, any one of the accidents that set me onto the next part of this path might not have happened, and I would have gone on believing I wasn’t very good at this stuff, even though it was fun. I’d have gone on believing it wasn’t for me. Many times along the way I’ve been told I couldn’t cut it. Wasn’t good enough to do what I was trying to do. Should leave it to people who were better than me.

Those voices still ring in my ears sometimes, and pump up my imposter syndrome. Incidentally I have a theory that the only people who don’t suffer from some degree of imposter syndrome are actually sociopaths. I’ve learned to lean on my friends in those moments, and let their encouragement drown out the imposter monster.

At John Monash Science School I developed Computational Science assignments that enabled kids to work on real projects that made a difference in the world. My first class of year 11s did Cancer Research. I’ve had groups do marine biology. Neuroscience, Genetics, Microbiology, Physics, and Psychology projects. I’ve had year 11s present their work at academic conferences. And the projects that worked the best were ALL data-related.

The scientists we worked with, for the most part, had limited, if any, computational skills, and they all admitted it was limiting their research.

What’s more, the students coming through – even at a science school! – didn’t always recognise the importance of computation, and in their science research projects they were doing things with their data to make a data scientist weep. To be honest I saw a fair bit of that in academia as well.

And at the same time the data science industry was growing and becoming a driving force – but with very few checks and balances. As a society, we have so few people able to even begin to understand the underlying concepts, we’ve had no chance to rein in this mad rush to data riches and say “hey, is that ethical?” “hey, is that good for us?” or even “hey, is that result RIGHT?”

Meanwhile I had very few girls choosing my elective year 11 Computer Science class, even at JMSS, but often the ones that did said they never would have chosen to study CS if I hadn’t shown them how useful it could be, and that it was something they could actually do! For many of them the only experience of Computing at school had been tedious, step-by-step manipulations of images in Photoshop. Or worse – learning to format Word and PowerPoint documents! I mean, seriously? Kill me now!

For my first few years at JMSS, against my loud objections, we were teaching year 10s computing using toy languages based on Scratch, and giving them “fun” things to do like drawing pretty pictures and controlling robots.

And they hated it.

They couldn’t see the point.

Many of them cheated just to get through.

And the year 11 elective CS class stayed small, and mostly male. We got the kids who were already interested in computing (despite the year 10 class!) but we didn’t get anyone else.

And then sanity FINALLY prevailed and we started teaching data science in year 10 instead of “fun toys”. And suddenly kids were saying “This is amazing! This is so useful! I’m using this stuff everywhere!”

The year 11 class nearly doubled in size. We went from a maximum of 5 girls in the class to 10 – in just one year!

Finally, they could see the point.

We used real datasets.

We did real projects.

They found their own questions and analysed the data to find the answers.

They designed their own, hand-drawn visualisations to communicate the results. It was an outstanding success.

And so I quit teaching.

Because although I was having the most amazing time in my own classes, and my students were doing real things, and making real change, for kids elsewhere nothing was changing. I wanted kids everywhere to have these opportunities.

I wanted all of the potential accidental data scientists out there to have the opportunity to become real data scientists. Because it wasn’t just the girls we were scaring away from Computing & Data Science, and STEM in general, with tedious or toy computing classes. It was all of the kids who didn’t look like your stereotypical computer scientist.

It was all of the kids who had never tried Computing or never had any fun with it. All of the kids who didn’t know what Data Science was. All of the kids whose cousin hadn’t encouraged them to program when they were 10. All of the kids who never found a text-based adventure game they loved. All of the kids who never tried Data Science but might turn out to be amazing at it.

Because those kids who don’t gravitate to Computer Science naturally are the kids who ask different questions. Who try different solutions. Who find ways to integrate CS with other things they love, whether it’s Biology, or History, or Literature, or healthcare, or sport. It’s those kids who will revolutionise the Data Science industry. Who will hold us to account on ethics, on accuracy, on validity. Who will champion privacy and open government, and who will find solutions to our most desperate problems.

And so the Australian Data Science Education Institute, ADSEI, was born! We’re a registered charity that’s teaching teachers to use real data science projects in all of their teaching, from primary school upwards, and right across the curriculum. We’re giving kids a chance to make a real difference in their communities using data science. And developing a new generation of data literate, computationally skilled, critical thinkers who know the power of Data Science and STEM disciplines to solve real problems.

I set ADSEI up as a charity, because I didn’t want funding to ever be a barrier to accessing these kinds of projects. I’m teaching teachers to put Data Science into every subject across the curriculum because we will never have enough skilled Computer Science and Data Science teachers. And it’s not enough to offer kids voluntary, out-of-hours classes that they can sign up for because, again, we’re only preaching to the converted. Only getting the kids who choose this stuff. But all kids need the chance to experience the power of Data Science, the power of authentic, integrated STEM projects that give them the opportunity to effect real change in their communities.

Schools we work with run real data science experiments with unknown outcomes: They find a problem in their community, whether it’s traffic, litter, or sustainability. It might be access to sporting facilities or overcrowded public transport. They measure the problem. They analyse that data and communicate it visually. They propose a solution. Implement the solution. And then, and this is the important bit: They measure it again! So that they know whether their solution worked and, if so, how well. (Maybe we could teach our politicians that.) Then they can move on, or they can try another solution.
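
To make that measure, change, measure-again loop concrete, here is a minimal sketch in Python of how a class might compare litter counts from before and after trying a solution. The numbers are made up purely for illustration, and the function name `summarise_change` is hypothetical, not part of any ADSEI resource.

```python
from statistics import mean


def summarise_change(before, after):
    """Compare average daily counts before and after a solution is tried."""
    before_avg = mean(before)
    after_avg = mean(after)
    change = after_avg - before_avg
    # Guard against dividing by zero if nothing was counted beforehand.
    percent = 100 * change / before_avg if before_avg else float("nan")
    return {
        "before_avg": before_avg,
        "after_avg": after_avg,
        "change": change,
        "percent_change": percent,
    }


# Made-up daily litter counts (pieces found per day), for illustration only.
litter_before = [42, 38, 51, 45, 40]  # the week before the new bins went in
litter_after = [30, 28, 35, 33, 29]   # the week after

result = summarise_change(litter_before, litter_after)
print(f"Average before: {result['before_avg']:.1f}")
print(f"Average after:  {result['after_avg']:.1f}")
print(f"Change: {result['percent_change']:.1f}%")
```

A drop in the average is only the start of the conversation. Five days of counts is a small sample, and the same sceptical questions still apply: was it a wet week, was a year level out on excursion, did some of the litter simply blow away?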

Or they use real datasets, like the Happiness Index, voting data, or Renewable Energy installations, to explore issues that are relevant to their world and their future.

They’re learning technical skills, scientific enquiry, communication, and maths skills, plus all of the topics related to their chosen problem or dataset. And they’re creating positive change in their communities. All with the power of Data Science.

We’re aiming for a generation of kids who have the chance to become intentional Data Scientists instead of accidental ones. For a generation that is science and data literate, and that knows the power of STEM to change the world.

How can you help? Volunteer: help us find & annotate datasets, and come up with cool projects that we can put on the website as free resources. Tell schools about us. Take our projects into schools and help them run them. Or tell businesses about us and encourage them to sponsor our teacher workshops. Together, we can be a data science education revolution.


This is an edited version of my talk at #AWSCommunity She Builds on AWS day in Melbourne. There it was called “The Accidental Data Scientist”.