Making Intentional Data Scientists


When I was a kid, my cousin Chris gave me his old Commodore 64, and I learnt to program in BASIC. There was no Google, there were no online tutorials (heck, there wasn’t an “online” yet as far as most people knew!). I just had a computer to plug into my TV. A tape drive. A keyboard. And a book telling me how to do simple stuff in BASIC.

I didn’t do anything big or clever. I just did enough to get hooked. But I still didn’t see it as a possible career.

In secondary school I spent an absurd amount of time playing the Infocom Hitchhiker’s game. A text-based adventure that was surprisingly code-like, and based on my favourite sci-fi book at the time, The Hitchhiker’s Guide to the Galaxy. I wasn’t great at it. I don’t think I ever even finished the game. But, again, it was enough to get me hooked. To show me that computers were fun, and that I could actually control them.

Fast forward a few years and I went to uni intending to study biology. Specifically genetics. And I did, but I also picked Computer Science as a fill in subject because I needed one more subject in first year.

I did not love it.

In fact, I spent a lot of time talking about how much I hated it.

Initially we were learning how to cut and paste, which I found intolerably simple, although it didn’t stop the class tutor from needing to ask me how I’d done it. And then we jumped from intolerably simple to incomprehensibly complex, without any apparent middle ground. Suddenly we were programming in assembler.

We were using Macintoshes that had all kinds of bizarre quirks. My favourite was when we were learning to program in PDP-8 assembler, and the editor would, occasionally, silently add invisible characters which then triggered errors when we tried to run the program. As a teacher and a usability specialist this horrifies me now, because it was teaching us that if our code didn’t work, it wasn’t necessarily due to anything we could see or understand. There might be some secret voodoo magic causing the problem.

The only way you could find the characters was to use the arrow key to painstakingly crawl across the line, character by character, looking for the one place where you had to use the arrow key twice to move one space. Then you deleted that character that wasn’t there, and your code would magically work. I hated it. And yet… something called to me.

I remember solving a tricky problem, leaning back feeling triumphant, and kicking out the power cord that powered all of the machines around me. (My teenage daughter says that’s “the most mum thing” she’s ever heard.)

That was how I felt all the way through my computer science classes. Barely competent. Surrounded by guys who seemed to have been doing this for years, and who were much better at it than I was. And frequently having some kind of catastrophic failure just when it seemed like things were FINALLY working.

And yet… by third year the only thing I was studying was Computer Science. I don’t remember enjoying anything from the CS course in my first or second year. But the siren song of the third year subjects – artificial intelligence, computer graphics, image processing, bizarre programming languages (I’m pretty sure that wasn’t actually what the subject was called) – somehow was enough to keep me on the hook.

I graduated with average marks and travelled for a bit, and then I got a job in a software company. It was an unmitigated disaster. It was a small firm with senior management (who were father and son) regularly screaming at each other in the open plan office. I was doing software testing, which, as you can imagine, made me super popular with the developers.

I did NOT want this to be my life.

Eventually I weaseled my way into Computer Science honours, more as an escape strategy than anything else, where I struggled but got through with the support of my postgrad friends. And then something pivotal happened. I still don’t know exactly why, but Damian Conway offered me a PhD project that really spoke to me – designing a programming language for teaching programming.

Ironically, by the time I finished my PhD the one thing I knew for sure was that a programming language designed specifically for teaching programming, and not for real projects, would never work for teaching, because the one thing kids learning Computer Science wanted to do was REAL STUFF. Not play with children’s toys.

But I learnt a lot about usability and a lot about computer science education, got the PhD, and became an academic. I loved the teaching, and put my heart and soul into it, but I never really got into the research. I was looking for a way to make a difference, a real difference, and I couldn’t find it.

When my second baby was due and the department was offering a round of redundancies, I took one. For four years I cast around doing freelance writing, pro bono communications work at Oxfam Australia, and even being a project officer for the Australian Breastfeeding Association, but nothing really clicked.

Then in late 2009 I got a call from a friend in my old department giving me the opportunity to help design the Computer Science curriculum at a new Science School, opening in 2010. By 2011 I was teaching there while doing my teaching qualification part time. Working at this amazing school, with thoroughly remarkable kids.

None of these steps in my career were planned. I never intended to go on to do Computer Science, even though it turned out I loved it. The first few years of undergrad were off-putting. At school, even though I was repeatedly told girls could do anything, “anything” seemed to mean traditional careers like law and medicine. Computing just never crossed my mind.

And in fact medicine was my first preference when I applied to uni. Not getting in was the best thing that could have happened to me.

At any step along the way, any one of the accidents that set me onto the next part of this path might not have happened, and I would have gone on believing I wasn’t very good at this stuff, even though it was fun. I’d have gone on believing it wasn’t for me. Many times along the way I’ve been told I couldn’t cut it. Wasn’t good enough to do what I was trying to do. Should leave it to people who were better than me.

Those voices still ring in my ears sometimes, and pump up my imposter syndrome. Incidentally I have a theory that the only people who don’t suffer from some degree of imposter syndrome are actually sociopaths. I’ve learned to lean on my friends in those moments, and let their encouragement drown out the imposter monster.

At John Monash Science School I developed Computational Science assignments that enabled kids to work on real projects that made a difference in the world. My first class of year 11s did Cancer Research. I’ve had groups do Marine Biology, Neuroscience, Genetics, Microbiology, Physics, and Psychology projects. I’ve had year 11s present their work at academic conferences. And the projects that worked the best were ALL data-related.

The scientists we worked with, for the most part, had limited, if any, computational skills, and they all admitted it was limiting their research.

What’s more, the students coming through – even at a science school! – didn’t always recognise the importance of computation, and in their science research projects they were doing things with their data to make a data scientist weep. To be honest I saw a fair bit of that in academia as well.

And at the same time the data science industry was growing and becoming a driving force – but with very few checks and balances. As a society, we have so few people able to even begin to understand the underlying concepts, we’ve had no chance to rein in this mad rush to data riches and say “hey, is that ethical?” “hey, is that good for us?” or even “hey, is that result RIGHT?”

Meanwhile I had very few girls choosing my elective year 11 Computer Science class, even at JMSS, but often the ones that did said they never would have chosen to study CS if I hadn’t shown them how useful it could be, and that it was something they could actually do! For many of them the only experience of Computing at school had been tedious, step-by-step manipulations of images in Photoshop. Or worse – learning to format Word and PowerPoint documents! I mean, seriously? Kill me now!

For my first few years at JMSS, against my loud objections, we were teaching year 10s computing using toy languages based on Scratch, and giving them “fun” things to do like drawing pretty pictures and controlling robots.

And they hated it.

They couldn’t see the point.

Many of them cheated just to get through.

And the year 11 elective CS class stayed small, and mostly male. We got the kids who were already interested in computing (despite the year 10 class!) but we didn’t get anyone else.

And then sanity FINALLY prevailed and we started teaching data science in year 10 instead of “fun toys”. And suddenly kids were saying “This is amazing! This is so useful! I’m using this stuff everywhere!”

The year 11 class nearly doubled in size. We went from a maximum of 5 girls in the class to 10 – in just one year!

Finally, they could see the point.

We used real datasets.

We did real projects.

They found their own questions and analysed the data to find the answers.

They designed their own, hand-drawn visualisations to communicate the results. It was an outstanding success.

And so I quit teaching.

Because although I was having the most amazing time in my own classes, and my students were doing real things, and making real change, for kids elsewhere nothing was changing. I wanted kids everywhere to have these opportunities.

I wanted all of the potential accidental data scientists out there to have the opportunity to become real data scientists. Because it wasn’t just the girls we were scaring away from Computing & Data Science, and STEM in general, with tedious or toy computing classes. It was all of the kids who didn’t look like your stereotypical computer scientist.

It was all of the kids who had never tried Computing or never had any fun with it. All of the kids who didn’t know what Data Science was. All of the kids whose cousin hadn’t encouraged them to program when they were 10. All of the kids who never found a text-based adventure game they loved. All of the kids who never tried Data Science but might turn out to be amazing at it.

Because those kids who don’t gravitate to Computer Science naturally are the kids who ask different questions. Who try different solutions. Who find ways to integrate CS with other things they love, whether it’s Biology, or History, or Literature, or healthcare, or sport. It’s those kids who will revolutionise the Data Science industry. Who will hold us to account on ethics, on accuracy, on validity. Who will champion privacy and open government, and who will find solutions to our most desperate problems.

And so the Australian Data Science Education Institute, ADSEI, was born! We’re a registered charity that’s teaching teachers to use real data science projects in all of their teaching, from primary school upwards, and right across the curriculum. We’re giving kids a chance to make a real difference in their communities using data science. And developing a new generation of data literate, computationally skilled, critical thinkers who know the power of Data Science and STEM disciplines to solve real problems.

I set ADSEI up as a charity because I didn’t want funding to ever be a barrier to accessing these kinds of projects. I’m teaching teachers to put Data Science into every subject across the curriculum because we will never have enough skilled Computer Science and Data Science teachers. And it’s not enough to offer kids voluntary, out-of-hours classes that they can sign up for because, again, we’re only preaching to the converted. Only getting the kids who choose this stuff. But all kids need the chance to experience the power of Data Science, the power of authentic, integrated STEM projects that give them the opportunity to effect real change in their communities.

Schools we work with run real data science experiments with unknown outcomes: They find a problem in their community, whether it’s traffic, litter, or sustainability. It might be access to sporting facilities or overcrowded public transport. They measure the problem. They analyse that data and communicate it visually. They propose a solution. Implement the solution. And then, and this is the important bit: They measure it again! So that they know whether their solution worked, and if so how well? (Maybe we could teach our politicians that.) Then they can move on, or they can try another solution.
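To make the “measure it again” step concrete, here is a minimal sketch of the kind of before-and-after comparison a class might do. Everything in it is an assumption for illustration: the daily litter counts are made up, and `percent_change` is a hypothetical helper, not something taken from an ADSEI resource.

```python
from statistics import mean


def percent_change(before, after):
    """Return the percentage change in the average measurement.

    A negative result means the average went down after the solution
    was implemented.
    """
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline average is zero, so percent change is undefined")
    return (mean(after) - baseline) / baseline * 100


# Hypothetical data: pieces of litter counted in the school yard each day,
# for four days before and four days after the class tried its solution.
litter_before = [40, 50, 45, 45]
litter_after = [30, 36, 33, 36]

change = percent_change(litter_before, litter_after)
print(f"Average before: {mean(litter_before):.1f} pieces per day")
print(f"Average after:  {mean(litter_after):.1f} pieces per day")
print(f"Change: {change:.1f}%")
```

A handful of measurements like this can’t prove the solution caused the change, which is a good conversation to have with the class: what else might have been different between the two weeks?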

Or they use real datasets, like the Happiness Index, voting data, or Renewable Energy installations, to explore issues that are relevant to their world and their future.

They’re learning technical skills, scientific enquiry, communication, and maths skills, plus all of the topics related to their chosen problem or dataset. And they’re creating positive change in their communities. All with the power of Data Science.

We’re aiming for a generation of kids who have the chance to become intentional Data Scientists instead of accidental ones. For a generation that is science and data literate, and that knows the power of STEM to change the world.

How can you help? Volunteer: help us find and annotate datasets and come up with cool projects that we can put on the website as free resources. Tell schools about us. Take our projects into schools and help them run them. Or tell businesses about us and encourage them to sponsor our teacher workshops. Together, we can be a data science education revolution.


This is an edited version of my talk at #AWSCommunity She Builds on AWS day in Melbourne. There it was called “The Accidental Data Scientist”.

Fairness is not the default

KJ Pittl from Google spoke brilliantly in May at C3DIS (The Collaborative Conference on Computational and Data Intensive Science) about fairness in Machine Learning. Although I’ve thought and read a lot about this topic, her talk was electrifying. I want to try to capture here the points that I thought were key, and none registers more strongly than this one:

“Humans have not got a history of being fair. Fair is not the default.”

To back up this point, KJ used the following slides, which really speak for themselves.


I am almost certain that none of these situations came about by malicious intent. They were just design decisions by a small group of people, for a small group of people, and they simply assumed it would work for everyone the way it worked for them.

But right there, that’s why we urgently need diversity in tech, and in data science. Because as long as the groups that are designing our future are largely homogeneous, they won’t be able to say “But are there any people of colour in our image set?” – a question that could have averted this:

[Image: screenshot]

or to say “Hey, do you know that blind people won’t be able to use this device to enter their PINs?”

Or “But what happens if you’re in a wheelchair or pushing a pram?”

or “What if you’re homeless?” “What if you have kids?” “What if you’re part time?” “What if English isn’t your native language?” “What if your eyesight isn’t great?” “What if you have food allergies?” “What if you’re a refugee?” “What if you don’t have a car?” or any one of the myriad questions that might prevent us from designing a future that inadvertently locks a section of our population out.

Diversity helps us design better solutions, but it also helps us ask important questions of the solutions we have. And given that, by default, our systems will not be fair, inclusive, or equitable, we really want to make sure those questions get asked.

Why robots are a disaster for tech education

It’s very tempting to see robots and other shiny tech toys as fantastic motivators for STEM education. After all, who doesn’t love playing with cool toys? Unfortunately this kind of hardware has huge drawbacks in the classroom. To show you why, let me tell you a story.

On the weekend I took my kids to Oz Comic Con. My 11 year old, Jen, is a HUGE tech nerd and loves all things hardware, software, mathematical, and, of course, STAR WARS. Dressed as a Jedi and wielding a lightsaber, Jen was magnetically drawn to the stall selling Star Wars drones. Jen had been saving for Comic Con for months, so the $50 cost, while more than they have ever spent on anything before, was well within their reach.

I did a quick bit of online research and it seemed like a good buy.

Behold Jen’s X-Wing in all its glory.


You can imagine the excitement when we got it home, but we were out to dinner that night and didn’t have time to unbox and charge it. The next day Jen bounced out of bed and went straight to the box. Eating, drinking, and other necessities of life were not on the agenda, so it was lucky it was a public holiday and I didn’t have to try to get them to school.

Once the drone was charged, the controller had its batteries, and the beginner-pilot’s safety cage was installed, we fired it up. The controller even buzzed when we inserted batteries and had Yoda saying “feel the force!”. The excitement was INTENSE. The instructions said to power up the controller and the drone, flip the left-hand lever up and down, whereupon it would beep, and the flashing lights would then stop flashing to show that the devices were synced.

But there was a catch. Beeping occurred as expected, but the lights on both devices continued to flash. We powered both devices off and on again. We tried different batteries. We even went shopping for new batteries. We spent all day trying to get the damned thing to work, to no avail. Three days later it still didn’t work and we were waiting for tech support from the drone company to reply to our emails.

Now you may think we were doing something wrong – and perhaps we were – but I have a PhD in Computer Science, and my husband is an Electrical Engineer. If we can’t make it work, what hope does your average teacher have?

Unlike with programming, a student, a teacher, and even an electrical engineer have very little hope of debugging a device such as this one, because there is no feedback. There’s no way of knowing its internal state. Short of taking the device apart and resoldering each of the connections and testing each component (not skills taught in your typical primary education course last I checked), there’s no way to troubleshoot these things.

Whether it’s a robot, a Raspberry Pi, or an Arduino, all hardware suffers from these issues. There’s a significant chance that it won’t work out of the box. Even if it does, connections come loose and it might stop working mid-lesson, or not work next time it comes out of the cupboard. And what we teach kids with these kinds of intensely frustrating experiences – when they are trying to do the same things as everyone else, but for them it doesn’t work – is that these problems are insurmountable. That they have no control over technology, no power to fix it when it breaks, and no way of understanding how it does what it does.

These are not the lessons we want to be teaching our kids.

*Update: The company got back to us the day after I wrote this, and very quickly replaced the drone. 10 days after the initial purchase we have a drone that works – but Jen’s enthusiasm – and confidence – has taken a severe battering.

ADSEI in the news

ADSEI has been in the news lately. Check out our Executive Director, Dr Linda McIver, on ABC Radio Sydney’s Focus Program, talking about Big Data and data literacy.

There was a profile piece on Linda in the Australian Financial Review, in BOSS magazine.

And an op-ed in The Age, the Sydney Morning Herald, and other Fairfax publications on why kids need to be data literate.

Linda also gave a recent YOW night talk on how kids can solve our data problems with Citizen Data Science.

Data Science Education is an idea whose time has clearly come!