Making Intentional Data Scientists


When I was a kid, my cousin Chris gave me his old Commodore 64, and I learnt to program in BASIC. There was no Google, there were no online tutorials (heck, there wasn’t an “online” yet as far as most people knew!). I just had a computer to plug into my TV. A tape drive. A keyboard. And a book telling me how to do simple stuff in BASIC.

I didn’t do anything big or clever. I just did enough to get hooked. But I still didn’t see it as a possible career.

In secondary school I spent an absurd amount of time playing the Infocom Hitchhiker’s game. A text-based adventure that was surprisingly code-like, and based on my favourite sci-fi book at the time, The Hitchhiker’s Guide to the Galaxy. I wasn’t great at it. I don’t think I ever even finished the game. But, again, it was enough to get me hooked. To show me that computers were fun, and that I could actually control them.

Fast forward a few years and I went to uni intending to study biology. Specifically genetics. And I did, but I also picked Computer Science as a fill-in subject because I needed one more subject in first year.

I did not love it.

In fact, I spent a lot of time talking about how much I hated it.

Initially we were learning how to cut and paste, which I found intolerably simple, although it didn’t stop the class tutor from needing to ask me how I’d done it. And then we jumped from intolerably simple to incomprehensibly complex, without any apparent middle ground. Suddenly we were programming in assembler.

We were using Macintoshes that had all kinds of bizarre quirks. My favourite was when we were learning to program in PDP-8 assembler, and the editor would, occasionally, silently add invisible characters which then triggered errors when we tried to run the program. As a teacher and a usability specialist this horrifies me now, because it was teaching us that if our code didn’t work, it wasn’t necessarily due to anything we could see or understand. There might be some secret voodoo magic causing the problem.

The only way you could find the characters was to use the arrow key to painstakingly crawl across the line, character by character, looking for the one place where you had to use the arrow key twice to move one space. Then you deleted that character that wasn’t there, and your code would magically work. I hated it. And yet… something called to me.

I remember solving a tricky problem, leaning back feeling triumphant, and kicking out the power cord that powered all of the machines around me. (My teenage daughter says that’s “the most mum thing” she’s ever heard.)

That was how I felt all the way through my computer science classes. Barely competent. Surrounded by guys who seemed to have been doing this for years, and who were much better at it than I was. And frequently having some kind of catastrophic failure just when it seemed like things were FINALLY working.

And yet… by third year the only thing I was studying was Computer Science. I don’t remember enjoying anything from the CS course in my first or second year. But the siren song of the third year subjects – artificial intelligence, computer graphics, image processing, bizarre programming languages (I’m pretty sure that wasn’t actually what the subject was called) – somehow was enough to keep me on the hook.

I graduated with average marks and travelled for a bit, and then I got a job in a software company. It was an unmitigated disaster. It was a small firm with senior management (who were father and son) regularly screaming at each other in the open plan office. I was doing software testing, which, as you can imagine, made me super popular with the developers.

I did NOT want this to be my life.

Eventually I weaseled my way into Computer Science honours, more as an escape strategy than anything else, where I struggled but got through with the support of my postgrad friends. And then something pivotal happened. I still don’t know exactly why, but Damian Conway offered me a PhD project that really spoke to me – designing a programming language for teaching programming.

Ironically, by the time I finished my PhD the one thing I knew for sure was that a programming language designed specifically for teaching programming, and not for real projects, would never work for teaching, because the one thing kids learning Computer Science wanted to do was REAL STUFF. Not play with children’s toys.

But I learnt a lot about usability and a lot about computer science education, got the PhD, and became an academic. I loved the teaching, and put my heart and soul into it, but I never really got into the research. I was looking for a way to make a difference, a real difference, and I couldn’t find it.

When my second baby was due and the department was offering a round of redundancies, I took one. For four years I cast around doing freelance writing, pro bono communications work at Oxfam Australia, and even being a project officer for the Australian Breastfeeding Association, but nothing really clicked.

Then in late 2009 I got a call from a friend in my old department giving me the opportunity to help design the Computer Science curriculum at a new Science School, opening in 2010. By 2011 I was teaching there while doing my teaching qualification part time. Working at this amazing school, with thoroughly remarkable kids.

None of these steps in my career were planned. I never intended to go on to do Computer Science, even though it turned out I loved it. The first few years of undergrad were off-putting. At school, even though I was repeatedly told girls could do anything, “anything” seemed to mean traditional careers like law and medicine. Computing just never crossed my mind.

And in fact medicine was my first preference when I applied to uni. Not getting in was the best thing that could have happened to me.

At any step along the way, any one of the accidents that set me onto the next part of this path might not have happened, and I would have gone on believing I wasn’t very good at this stuff, even though it was fun. I’d have gone on believing it wasn’t for me. Many times along the way I’ve been told I couldn’t cut it. Wasn’t good enough to do what I was trying to do. Should leave it to people who were better than me.

Those voices still ring in my ears sometimes, and pump up my imposter syndrome. Incidentally I have a theory that the only people who don’t suffer from some degree of imposter syndrome are actually sociopaths. I’ve learned to lean on my friends in those moments, and let their encouragement drown out the imposter monster.

At John Monash Science School I developed Computational Science assignments that enabled kids to work on real projects that made a difference in the world. My first class of year 11s did cancer research. I’ve had groups do marine biology, neuroscience, genetics, microbiology, physics, and psychology projects. I’ve had year 11s present their work at academic conferences. And the projects that worked the best were ALL data-related.

The scientists we worked with, for the most part, had limited, if any, computational skills, and they all admitted it was limiting their research.

What’s more, the students coming through – even at a science school! – didn’t always recognise the importance of computation, and in their science research projects they were doing things with their data to make a data scientist weep. To be honest I saw a fair bit of that in academia as well.

And at the same time the data science industry was growing and becoming a driving force – but with very few checks and balances. As a society, we have so few people able to even begin to understand the underlying concepts, we’ve had no chance to rein in this mad rush to data riches and say “hey, is that ethical?” “hey, is that good for us?” or even “hey, is that result RIGHT?”

Meanwhile I had very few girls choosing my elective year 11 Computer Science class, even at JMSS, but often the ones that did said they never would have chosen to study CS if I hadn’t shown them how useful it could be, and that it was something they could actually do! For many of them the only experience of Computing at school had been tedious, step-by-step manipulations of images in Photoshop. Or worse – learning to format Word and PowerPoint documents! I mean, seriously? Kill me now!

For my first few years at JMSS, against my loud objections, we were teaching year 10s computing using toy languages based on Scratch, and giving them “fun” things to do like drawing pretty pictures and controlling robots.

And they hated it.

They couldn’t see the point.

Many of them cheated just to get through.

And the year 11 elective CS class stayed small, and mostly male. We got the kids who were already interested in computing (despite the year 10 class!) but we didn’t get anyone else.

And then sanity FINALLY prevailed and we started teaching data science in year 10 instead of “fun toys”. And suddenly kids were saying “This is amazing! This is so useful! I’m using this stuff everywhere!”

The year 11 class nearly doubled in size. We went from a maximum of 5 girls in the class to 10 – in just one year!

Finally, they could see the point.

We used real datasets.

We did real projects.

They found their own questions and analysed the data to find the answers.

They designed their own, hand-drawn visualisations to communicate the results. It was an outstanding success.

And so I quit teaching.

Because although I was having the most amazing time in my own classes, and my students were doing real things, and making real change, for kids elsewhere nothing was changing. I wanted kids everywhere to have these opportunities.

I wanted all of the potential accidental data scientists out there to have the opportunity to become real data scientists. Because it wasn’t just the girls we were scaring away from Computing & Data Science, and STEM in general, with tedious or toy computing classes. It was all of the kids who didn’t look like your stereotypical computer scientist.

It was all of the kids who had never tried Computing or never had any fun with it. All of the kids who didn’t know what Data Science was. All of the kids whose cousin hadn’t encouraged them to program when they were 10. All of the kids who never found a text-based adventure game they loved. All of the kids who never tried Data Science but might turn out to be amazing at it.

Because those kids who don’t gravitate to Computer Science naturally are the kids who ask different questions. Who try different solutions. Who find ways to integrate CS with other things they love, whether it’s Biology, or History, or Literature, or healthcare, or sport. It’s those kids who will revolutionise the Data Science industry. Who will hold us to account on ethics, on accuracy, on validity. Who will champion privacy and open government, and who will find solutions to our most desperate problems.

And so the Australian Data Science Education Institute, ADSEI, was born! We’re a registered charity that’s teaching teachers to use real data science projects in all of their teaching, from primary school upwards, and right across the curriculum. We’re giving kids a chance to make a real difference in their communities using data science. And developing a new generation of data literate, computationally skilled, critical thinkers who know the power of Data Science and STEM disciplines to solve real problems.

I set ADSEI up as a charity because I didn’t want funding to ever be a barrier to accessing these kinds of projects. I’m teaching teachers to put Data Science into every subject across the curriculum because we will never have enough skilled Computer Science and Data Science teachers. And it’s not enough to offer kids voluntary, out-of-hours classes that they can sign up to because, again, we’re only preaching to the converted. Only getting the kids who choose this stuff. But all kids need the chance to experience the power of Data Science, the power of authentic, integrated STEM projects that give them the opportunity to effect real change in their communities.

Schools we work with run real data science experiments with unknown outcomes. They find a problem in their community, whether it’s traffic, litter, or sustainability. It might be access to sporting facilities or overcrowded public transport. They measure the problem. They analyse that data and communicate it visually. They propose a solution. Implement the solution. And then, and this is the important bit: they measure it again! So that they know whether their solution worked, and if so, how well. (Maybe we could teach our politicians that.) Then they can move on, or they can try another solution.

Or they use real datasets, like the Happiness Index, voting data, or Renewable Energy installations, to explore issues that are relevant to their world and their future.
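
To make the “measure it again” step concrete, here is a minimal sketch in Python of the kind of before-and-after comparison a class might run. The litter counts and the `percent_change` helper are invented for illustration; they are not taken from any real school project, and a real analysis would want more data and some thought about what else changed between the two measurements.

```python
from statistics import mean


def percent_change(before, after):
    """Return the percentage change in the mean from `before` to `after`."""
    baseline = mean(before)
    if baseline == 0:
        raise ValueError("baseline mean is zero; percentage change is undefined")
    return (mean(after) - baseline) / baseline * 100


# Hypothetical daily litter counts (pieces picked up per day),
# made up purely for this example.
litter_before = [40, 35, 45, 40]  # measured before the solution
litter_after = [30, 28, 32, 30]   # measured again afterwards

change = percent_change(litter_before, litter_after)
print(f"Mean daily litter count changed by {change:.1f}%")
```

A negative number means the problem got smaller after the solution was put in place. It does not prove the solution caused the change, which is a good discussion to have with the class.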

They’re learning technical skills, scientific enquiry, communication, and maths skills, plus all of the topics related to their chosen problem or dataset. And they’re creating positive change in their communities. All with the power of Data Science.

We’re aiming for a generation of kids who have the chance to become intentional Data Scientists instead of accidental ones. For a generation that is science and data literate, and that knows the power of STEM to change the world.

How can you help? Volunteer: help us find & annotate datasets and come up with cool projects that we can put as free resources on the website. Tell schools about us. Take our projects into schools and help them run them. Or tell businesses about us and encourage them to sponsor our teacher workshops. Together, we can be a data science education revolution.


This is an edited version of my talk at #AWSCommunity She Builds on AWS day in Melbourne. There it was called “The Accidental Data Scientist”.
