A fantastic conversation with Quantum Computing specialist and all round fascinating thinker, Dr Pascal Elahi.
The worst data mistakes I’ve seen, besides, let’s say, using obviously biased data to prove a result you want, is usually having data and being like, I’m gonna ignore the bits that contradict the answer I want to get, and then getting the answer you want to get because you’ve made sure that you’ve really selected the data to kind of fit that.
On LLMs:
…on average it got a good answer. It’s like, it’s completely wrong, but that’s because, you know, the way we were using the technique, it can lend itself to catastrophic errors. It’s not what it’s good at. So it’s okay. I mean, it’s fine that it got catastrophically wrong sometimes and on average okay. It doesn’t work if you’re trying to build a jet engine. You don’t want the catastrophically wrong answer.
On Quantum computing: Suddenly you can maybe do not just like really novel science, but maybe solve things that you couldn’t solve before that were really challenging to solve.
Linda: what excites you about data?
Pascal: What it might tell you that you didn’t know to ask about the universe.
Transcript
Linda: Welcome back to another episode of Make Me Data Literate. We’re on a bit of a Pawsey Supercomputing Center roll at the moment, so I’m very excited to introduce to you Dr. Pascal Elahi. Oh, I should have asked you how to pronounce your surname.
Pascal: You got it.
Linda: I got it? Excellent. Very good. Thanks so much for coming.
Pascal: Thanks very much. Yeah, I’m happy to be here.
Linda: Fantastic. So who are you and what do you do?
Pascal: Yeah, so my name is Pascal. Oh, I work at the Pawsey Supercomputing Research Center my Formal title is quantum super computing research lead which essentially amounts to I’m leading a group at Pawsey to look at the use of quantum computers and how to integrate them with supercomputers to do scientific computing. So the idea is to try to You know get to the point of kind of making use of this new technology which we will go into probably a little bit and Apply it to scientific problems or research problems or in some case industry problems, but the idea is open science questions And and see how we can manage to do this considering it’s a very new novel technology. I lead a small team. So that’s the lead part. And the research is because it is very open-ended research, right? We don’t necessarily and this is what’s really interesting about the position and the job – we don’t necessarily know the exact answer We have an idea of what the solution would look like to try to get this to happen the integration of quantum computers and supercomputers But there’s a lot of open-ended questions there like what’s the best use, where to go? And so there’s the research component. There’s also just research in like the general science Aspect as well, but there’s also research in just like practical problems. We don’t really know the answers to.
I’m originally from Canada and I now live in Australia. So I’ve moved around the world, and I think it’s kind of important to realize that the background at Pawsey, and also at a lot of places I’ve been to, is always diverse. And it’s really interesting to have a lot of different perspectives, because you kind of get really novel ideas coming out of that. And so the fact that I’m a Canadian living in Australia, and now also Australian, is not pure happenstance. There’s a lot of people who do move about across the world, kind of exploring new cultures and new ideas and then bringing their culture and ideas to another place. And then you get a nice snowballing effect of more ideas happening, and that’s kind of my position now.
I’ve got an interesting team from a couple different places as well
Linda: That’s a beautiful little summary of the value of diversity. But also your job is a classic example of jobs that didn’t exist when you were at school. Like, it didn’t exist, I think, until you got your job. That was the first.
Pascal: Yeah, it’s very much a very novel role. I mean, quantum computers were an idea when I was doing high school. I’m in my 40s, so that tells you when I was doing high school. But it was just kind of an idea, and now it’s coming more to fruition. But even now this role is very new, and that is not uncommon in this kind of field, where you have a role and you essentially do stuff which four years ago you wouldn’t have thought that you would be doing, because it’s an accumulation of a lot of different things happening in the world, research and discoveries, that suddenly lead to a new possibility of doing something you hadn’t thought of before.
Linda: That leads nicely into a little question that I’m kind of sneaking in there for everyone who works in high-performance computing: how did you get into high-performance computing?
Pascal: So it’s an interesting path. So in Canada, I did my PhD in astrophysics. I did computational astrophysics and in the computational astrophysics I wanted to understand dark matter and dark energy and that required running simulations. So I learned how to code I learned how to run, On super computers actually, one of the key things out of the start of the research was My supervisor for my PhD says this is gonna be some interesting research. We’re probably gonna have to use supercomputers to do Like virtual toy universes at the scale that we need to would require super computers. so I was like, oh, that’s really cool I mean I’ve used computers. I like coding. I’ve never used a super computer It’s like neither have I so you can go and learn how to do that and you’ll teach me and I was like, okay
So I was driven by the science. I really wanted to try to understand dark matter and dark energy. I got into computational astrophysics, and computational astrophysics took me from Canada to China to the UK to Australia, looking at science questions on trying to understand how galaxies form and, you know, dark matter. In the universe, we seem to see that a lot of the energy budget and matter budget of the universe is invisible. We don’t really know what it is, so it’s still a mystery, and I was really interested in solving that by essentially making complex toy universes using computers and physics. So, having those toy universes do the physics to understand that.
so I that means that I got exposed to doing Computing and scientific computing it at like quite a large scale using super computers I was never intending to kind of go down that path. It literally was like I want to know what dark matter is because I had done some experiments In my undergrad looking at neutrino physics. So it’s a different sort of physics, but it was like particle physics looking at making little particle detectors But and working in one of the giant detectors underground. I was just kind of really interested in that question and then Started going kind of like well, how do I answer that question? Well, I need to make a toy universe. How do I make a toy universe? Okay, let’s make it in a computer.
And so I know I like games I like the idea that so there’s kind of progress there But I was never thinking I’m gonna be the person who’s gonna be helping other people doing high-performance computing. This isn’t really something I was thinking about and then when I was in Australia and I Was in a research role at the University of Western Australia, And wanted to kind of stay in Australia. So it was also the fact that I was in Perth, young daughter, Son about to be born, Covid! Okay We don’t really want to move. How do we also move during when Australia’s Western borders were shut down? Yeah, and I was kind of looking for Positions available like what jobs would be available. I didn’t really want to there were possibilities of kind of continuing on an academic career So part of a university staying in the academic Sphere, but I wanted to kind of progress. I want to try something different I also didn’t want to move and it happened that Pawsey supercomputing Center was looking to hire someone, so I was like Okay, you know, I’ve done I mean I’ve run lots of stuff on Pawsey super computers. I’ve broken Pawsey super computers, And and so it happened and then Since then right it’s been kind of really interesting to be in the side of trying to enable Other people doing science while also doing some science because I was really trying to be like, oh, it’s really been fun also helping Like people who are doing bioinformatics be able to run on These computers to try to do novel research, and other people who are trying to do novel research now in quantum computing. But like that aspect of trying to enable science and also doing some science So that real exploration has been really interesting and that’s kind of like it doesn’t sound too divergent a path But I was never thinking of this at all, even in the high school when I was doing some coding the first language I learned myself. I learned turbo Pascal
Linda: I remember Turbo Pascal.
Pascal: Exactly so I was like computers are cool. I remember building like my own little system But I didn’t view it as a career, right? Yeah, just like a little thing to kind of plug away at And I want to do science and so I was doing science and then it’s like, oh cool. I get to use my skills with computing But as I said like even knowing that I was using super computers at my PhD It never really occurred to me like there’s a there’s a like there is a career there And it’s only later on like, yeah, okay, There’s a real career there.
And so, yeah, it’s important to realize that. And now in the career, I mean, the position is totally new, so it wasn’t even something on my radar even like five years ago. I wasn’t necessarily thinking of trying to do quantum computing.
So now my role as a quantum super computing person is really new. there are these new positions where people are trying to really build quantum computers And they try to get it to be part of a computing framework to solve an actual problem Not like really solving the problem of quantum computing, but the quantum computing solving a problem So again, I mean I was like, ah cool opportunity. Let’s take it.
But it’s always a meandering route. I know very few people who have had the set path of like, this is what I want to be, and then they just follow that path. It doesn’t work like that. It does happen, but generally the meandering path is always the way. It’s not like you get excited and just completely pivot, but you know, it’s actually, I’ve got skills I can use there. I didn’t realize that even that was a thing, right?
Linda: Yeah,
Pascal: somewhere else has got the same need of skills and it’s an interesting path to take. Let’s let’s maybe explore
Linda: I love that and the idea that everything you’ve ever done Combines to give you the skills and to make you the person that you are it brings so much to the role that you know If you perhaps if you’ve done a straight and narrow just doing all the things you needed to do to take that particular role Even if you could have done that which you couldn’t because the role doesn’t even exist but you know if you take a very linear path, maybe you don’t get the The richness and diversity of skills, you know, we’ve already talked about how diversity is important, It’s important in your own life as well as in your team.
Pascal: Yeah, I actually I Really like I mean, I feel lucky that I’ve gotten the breadth of experience Because I find this to be very useful just having that like academic career having a bit of an industry career having different perspectives just because of the path that I’ve taken and Understanding other people’s intersections with that path and kind of feeling like okay We’ve got all sort of different skills and some skills you think I mean one of the things I will harp on is communication skills, right?
So I remember thinking Yeah, I’m gonna be that solitary scientist that just solves everything like you know, that’s like nonsense, right? But the communication skills was not something I worried about. I’m not bad at it. I was never necessarily worried about public speaking, but I never thought of it as a thing to practice or a skill. Just like whatever. And it’s actually quite important to be able to communicate an idea. if you have an idea and you don’t communicate the idea Then you don’t really have the idea because then it just stays with you It doesn’t get to be a really good idea. You have to let it out in the world, right? You have to communicate it and so
Linda: Yeah,
Pascal: I would have never thought that would have been a critical skill, and it is a critical skill. But also I do think the idea of making sure that you have breadth matters. I’m not gonna rant too much, but I do feel like it’s a bit sad that the education system in Australia, especially in high school, tends to tell people to pick your path now, before you know what the outcome of that path is gonna be. And they’re like, pick your path now, don’t go for breadth, these are the courses you take. Okay, you’re good now, you can go to university or something. And I do feel like it’s missing the idea that if you wanted to be a marine biologist, you might be like, I don’t need to know mathematics. And you’re like, well, no, a really good statistical background will really help your research. If you are purely in, I don’t know, linguistics, you might be like, I don’t need to program. But suddenly large language models are a thing, and understanding language at a mathematical level is super useful.
Suddenly if you if you just coded in the background But you really like language and poetry you suddenly have like, wait a second Poetry and programming? like yes, it can be a thing. and I Wish there was more Realization and maybe a real push to kind of like Do the the necessary things but always grab adjacent things, like always grab a tangential thing that you’re kind of interested in and Explore it because you don’t know you might go down that path. But also that that expression of those skills might be useful to you. You didn’t just you didn’t know
Linda: 100% I mean I went to university intending to do a degree in genetics Yeah, and by third year I was only doing computer science and I still couldn’t really tell you how that happened But I just kind of picked all the things that I was interested in and you know, that’s what that’s what fell out But Is it different in Canada? you said the Australian education system?
Pascal: I think so. In Canada we don’t have like the HSC… well, in Ontario. Each province’s system is slightly different, but there isn’t necessarily the HSC standardization fear factor, or ATAR fear factor, where you have to maximize your score, and so you do the things you think you’re gonna get the best score in, and you just focus on those. And so then there’s a real push, not necessarily saying you must drop a subject, but if you take it you get a mark, and if that mark isn’t the best ever you might be like, oh no.
I don’t like that the idea of like, that pushing of like risk aversion. And I don’t recall that in my, you know, Like it didn’t happen in high school when I was taking that, there you could do like a bunch of stuff. Like I did, I really enjoyed my creative writing classes. As well as, and I felt it was super useful. like I think, English class, the creative writing class where we really had to like construct stories and like really tell a story and so on but also understanding that and conveying… it was really useful later on. I think it’s super useful now
And I really also enjoyed world history. Like, I love history. But, you know, if I was gonna do a physics degree in university, a world history mark would not have helped me. Who cares? But they did care, in the sense they’re like, okay, at least there’s breadth there, right? You have to show some breadth. And even in my undergrad there were elective courses you could take, and I made sure that I took elective courses that I thought were interesting but completely unrelated. Like, it didn’t serve me in any purpose of getting my degree, other than, if I got a good mark, I filled up a credit. It’s an elective. So I took economics. And also here, right, in Australia, you do an undergrad that typically is quite focused. There’s not a lot of electives you can take. There’s some, but there’s a hard push to focus, and then if you want to do further studies, you go on immediately. You can do like a masters, where there’s a capstone project, or you can do a PhD, but they’re usually on very short time scales. So there’s a real focus: because you have such a short time scale, you must focus here.
Linda: Yeah,
Pascal: and that’s great, but then I do find that people in Australian education lack some of the high-level, broader view, and some other skill sets that could be useful. And I do feel like it’s the education system that’s kind of trying to be like, this is your pigeonhole. You said you want to be a pigeon. You must follow this path.
Linda: Yeah,
Pascal: it’s like no no, you can come back!
Linda: I Couldn’t agree with you more and in fact my Oldest did a degree in environmental science, but all of their electives They wound up picking the things that looked interesting to them turned out to be almost entirely politics and sociology. and After their undergrad they went on to do – they’re doing now – honors in sociology, because they found this is – like they’ve we’ve always seen them You know leaning heavily towards environmental science and marine biology and stuff, But then suddenly they met this whole new world of sociology and suddenly they’re really into it, But the skills that they learned in in their environmental science degree are Making them, I think a more rigorous and, Oh What is even the word, just a better Scientist in the social sciences sphere, In a way that a lot of people who’ve gone from social science undergrads to social science Honors and postgrad don’t necessarily get that, because they haven’t got that training in the scientific method, and then you know what makes a good research study. So the the breadth there has really, first of all It’s shown them a path that they didn’t know existed But secondly that it’s given them skills to, you know bring a whole new A much stronger approach to to the social sciences
Pascal: Can I ask a question because then they do they feel like oh wow this is unexpected Skilling up that I mean, you know, do they realize how useful it’s been to have this meandering path?
Linda: Yeah, but in the sense that they’re frustrated by The people who don’t do that in the field, they’re like Mum! The data management of these people, like the they don’t understand anything about data! I might get in trouble for putting this on the podcast
Pascal: But that’s good, that’s really good, right, like they feel like they can provide perspective that other people that they are there working with which is like, oh I didn’t even contemplate to think about that, and that’s that’s really good, right?
Pascal: That’s, yeah, that’s why I really think that the nexus of skills, kind of like the confluence of many different skills, is a really important thing, and a confluence of also many different paths, that are like, oh yeah, that’s some cool novel stuff that happens there.
Linda: Yeah, 100%. And in fact I went to a talk some years ago now from a Nobel Prize-winning physicist. I can’t remember his name, I’m going blank, but he said that he moved after his PhD, I think, into a new field, and he felt like, I’ve just wasted all that training I did in the old field, like, I know nothing now in this new field. But it was less than a year into that that he made his Nobel Prize-winning discovery. And he thinks now that it’s because he moved into a new field, because he brought the perspectives and the skills and the techniques from the old field and applied them to this new field in a way that people who’d been in that field the whole time didn’t ever think to do.
Pascal: Yeah,
Linda: so the the novelty of coming into a new field gave him the opportunity to make a new discovery
Pascal: Yeah, and it probably I mean it’s one of these things as well like if you If you switch a little bit it can be really exciting as well. So you get a first perspective, but you’re also kind of you know, You’re pretty eager to try to like, oh I can maybe maybe I can you know do something interesting, or help out people, and so on.
And I think that’s really great, and that’s the thing that I would love to be more part of the standard practice of education: this idea of making sure that you learn through failure and also learn a skill and then apply it in a different field. Right? And kids are pretty open about learning stuff, so, you know, if you chat about, I don’t know, the history of tea-making, and then accounting, they’re like, how, why do these two things matter? And you’re like, well, you can see how economics drove some of the stuff that’s happening.
Linda: Yeah,
Pascal: and they’re like, oh, okay There’s interesting stuff. I also feel like I mean I really enjoyed my art class So Because it was it was just a kind of way and I do know I’m not necessarily like an artist, But I like the perspective of being having like, okay, there’s a blank canvas. I’m gonna try to do something
Linda: Yeah,
Pascal: and I just, that approach can be super useful, right?
Linda: Yeah,
Pascal: and I wish that was a more common bit of learning: learning through failure, learning through applying skills that you learn here and being like, okay, and now for something completely different. Not the Spanish Inquisition, but, you know, no one expects, for this reason, no one expects a novel application of ideas. So you try something completely different, right?
I think that’s a really useful education Tool that I feel is unrealized. I would love it to happen more often
Linda: Yeah, I talk a lot in my work about the importance of recognizing that there is no such thing as a perfect answer to a real problem, and so you have to actually assume that you failed to some extent, and look for where you failed, or, you know, where it’s not perfect and where it could be better, that kind of angle.
And I had a long conversation in the podcast I just recorded with Kat Ross about one of her big fails, and how she just brought it up organically in the conversation. I was like, I love that you’re talking about that, and she said everyone’s got one, you know. Like, everyone’s got one of these, and if we don’t talk about them then you’re gonna try to hide it, whereas actually what we want to do is talk about it and bring it out into the open that this is science. This is how you do science: you make mistakes, and you write up the mistakes as much as you write up the real discoveries. And I think that’s so important, to normalize the fact that we make mistakes. We’re not perfect, and not being perfect is not actually, you know, an indictment of your character and abilities. It’s part of your humanity.
Pascal: Yeah, I would agree. I’d say one thing now, maybe now that I feel like I’ve been doing this for years and I feel a lot more confident, is that if I’m giving a scientific presentation I usually aim for: I want to tell you where I’m confused and why I’m confused about the research that we’re doing and the answers we’re getting. Not to say that I’ve got the solution, but, like, I thought I did and I don’t, and so this is why I don’t, and I’m confused. Because I do remember a very interesting presentation that really stuck in my mind. It was during my PhD. There was a Russian mathematician, it was a string theory conference, and he just did a blackboard presentation, like, just literally chalk, right? It’s very old school. But his basic premise was, I don’t understand why this works so well but then has no predictive ability. Like, why? And the entire presentation was why he was confused.
Linda: Yeah,
Pascal: and he wasn’t trying to explain, like, oh, this is my novel idea. He really was like, these are the ideas I’ve got, and can you understand why I’m confused? Does anybody have the answers? He was asking the audience for the answer to his presentation. And, yeah, I actually learned more there than I did otherwise. And as I said before, I just assumed that you present fully complete ideas, you know exactly what the answer is, you kind of give good answers. And I’ve really tried to give this sense: if you don’t know, say you don’t know. It’s okay.
It’s not like people... If you don’t know and it’s something that you should know, then maybe it’s interesting, like, you might have to go back and think about why you didn’t know. But having somebody be like, this is my lack of knowledge, it is okay to admit. Or, like, I think we did an approximation, it seems to not work, why? I thought it would work. I think it’s fine to have those (Linda: I agree) out in the open, because that is actually where a lot of the novelty comes from, from trying, getting a new idea, right?
Linda: Yep. Well, you know, penicillin was a mistake, right? The classic example. It’s been a bit useful that one Um, I I did warn you that we would diverge from the questions and indeed we have so to Come back to them briefly. Um, is there anything that you wish everyone knew about data? Like one thing where you’re like if people really understood this, it would it would be so much better
Pascal: That’s a hard question to answer. I think the thing that I find most often is missed Is uh biases in in data.
Linda: Yeah,
Pascal: So, you know, you accumulate some data, and it’s understanding what that accumulation process is, like, how it could introduce some bias in the data, how you might undersample or oversample certain features in a data set. And, I mean, doing some of the AI, sort of machine learning... I’m not a huge fan of the term artificial intelligence. I prefer machine learning; to me they’re all machine learning techniques. But in the machine learning world there’s the introduction of bias, and even just in the same way, if you’re doing some fitting of a line to some data, really understanding your certainty in a measurement, and the biases involved in how much you might actually sample the real gamut, is I think quite often missed. And that’s the thing I wish people understood about data. It’s like, okay, I’ve got the data set. I’ve got accumulated survey results. Okay, I’ve sampled a large portion of Australia, and I’ve found that a lot of people like ketchup, right?
They’re like, okay, so that’s fine but then you might be like well, what people are going to respond to this survey? So some of the stuff we do also for Survey results is like, you know, if you have a very strong opinion more likely to Answer, so you might see and this is happening right, polarization of people like, oh, this is the definite This this is the one truth and the evidence is no, this is the other truth. And the people who are maybe in between Might not respond because they’re like well we’re in between. I don’t really have a strong opinion one way or the other um
So the thing that I wish people understood about data more Is that the process of collecting data can introduce biases And that’s not necessarily a bad thing But it’s just understanding the limitations of the data that you’ve acquired
Right and then it can tell you something about like the model you’re going to try to use to explain the data. All models are wrong, but some of them are useful, right? It’s that, but also that comes out of the fact that You know, you have data, but you don’t have the infinite true data set you have a sample
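A minimal sketch of the self-selection effect Pascal describes above, where people with strong opinions are more likely to answer a survey. Everything in it is an assumption made for illustration, not something from the conversation: opinions are drawn uniformly on a scale from -1 to 1, the chance of responding is taken to equal the strength of the opinion, and the function name `simulate_survey` is hypothetical.

```python
import random


def simulate_survey(population_size=100_000, seed=0):
    """Compare the share of strong opinions in a population vs. among survey responders.

    Assumptions (illustrative only):
    - each person's opinion is uniform on [-1, 1]
    - the probability of responding equals abs(opinion),
      so people in the middle rarely answer
    """
    rng = random.Random(seed)
    opinions = [rng.uniform(-1, 1) for _ in range(population_size)]

    # Self-selection: stronger opinions respond more often.
    responders = [o for o in opinions if rng.random() < abs(o)]

    def share_strong(values):
        # Fraction of people whose opinion strength exceeds 0.5.
        return sum(abs(v) > 0.5 for v in values) / len(values)

    return share_strong(opinions), share_strong(responders)


if __name__ == "__main__":
    population_share, responder_share = simulate_survey()
    print(f"strong opinions in the population: {population_share:.2f}")
    print(f"strong opinions among responders:  {responder_share:.2f}")
```

Under these assumptions about half of the population holds a strong opinion, but roughly three quarters of the responders do, so the survey looks more polarized than the people it was meant to describe.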
Linda: yes
Pascal: And that’s the thing I think probably I wish more people understood. I’d be curious If if you would – what would your answer to that question be? because that’s that’s mine I think the bias is the thing that comes out to me
Linda: Ooh, no one’s ever asked me that! I mean, every interviewee’s answer is different, and every time I’m like, oh man, yes, that would change so much. So, you know, that’s my whole job. I want everyone to understand everything about data! But the bias one is huge, and it’s been really fascinating to me as I delve more into the types of bias, and, you know, thinking about, well, selection bias is a problem. And if you think about selection bias, as you’ve been talking I’m thinking, wow! It’s amazing how everyone on my podcast has, you know, some points of view where I’m just like, oh man, yes. And there’s this real commonality of principles, and then I’m like, it’s possible there’s a little selection bias there, because I invite people on my podcast that I like talking to and that have interesting things to say. You know, maybe there’s a whole bunch of people out there who don’t believe that, but I’m just not meeting them and I’m not getting into big conversations with them, because we believe different things. It’s easy to think or to believe that the people you surround yourself with are representative of the world, but they’re not. They’re representative of people you want to spend time with, and so that’s a classic case of selection bias. Who are you asking?
Pascal: Yeah, it’s one of these things that uh, I mean I generally feel like if you asked any human on the planet, you and you really if you got uh, if you sampled everybody, You should end up with a very confused look on your face as to what Humans’ motivations are because that seems a sensible answer.
Linda: Yeah,
Pascal: but even even stuff like that You know, we we can I think people can feel like okay. there’s biases when you ask humans, but there’s biases even like there was a big tension in um the cosmological, I mean still a tension, so we still don’t understand what’s going on, but there was a discrepancy between measurements of the cosmic microwave background – this is the leftover afterglow of the early universe – and you can use models to predict what the expansion rate of the universe should be now. And then people looked at using what they call standard candles, looking at stars nearby that they figured they knew How bright they should be, so they could how intrinsically bright they are? So that depending on how far away there you could figure out, Okay, well, that’s their distance. And they use this laddering up and then eventually they use supernova explosions, which they Are quite complicated explosions so stars blowing themselves to bits, but you can They seem to have an intrinsic sort of profile distribution of light curves, Like how bright they get and then how well they dim To get a distance of where they are how far away they are and then you can start using that to figure out the distances to lots of galaxies that are nearby and then figure out what the what do you think The local universe and I mean here I’m talking big scales, but still says the expansion rate of the universe should be so the universe is still expanding and there’s a big tension.
And a lot of people are like, well, it’s biases in what we’re doing when we’re trying to do the standard candles, because the other one’s a big afterglow, but also we make an assumption in the model that may introduce a bias in the results we’re getting. And so initially both sides were like, this result’s more biased than that result. Both knew that they had a model that was trying to do its best to describe what we were seeing, and that they both might have a bias that would change the tension. They were hoping the tension would go away, so you’d have the two values agree. They’ve not agreed, so now it says something interesting. But for a long time it was a real question: are you getting a biased result, given that you are doing your best to try to sample certain things, but it’s a very specific type of sampling of what’s going on in the universe?
So even what you might think is a bit less… like it sounds esoteric, but I mean it’s not complicated. We’re just trying to measure how fast the universe is expanding. That is a challenging process. But it’s not like you’re getting the confluence of, let’s say, the human confusion. Here it’s still also “well, we tried to look at stars and stars exploding, and we know that we don’t see all of them, and some of them are not necessarily as old. There’s some diversity. We don’t know how well we sample that diversity. We don’t know what the biases are.”
It’s very hard to gauge. But they could be biased results, and you get a disagreement, and there’s a question of whether the disagreement is real or a bias, and it seems to be real. But it’s an interesting sort of physics one where you’re like, oh, there’s a huge debate as to who had the right answer and which answer was more biased than the other.
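[Editor's sketch: the standard-candle idea can be written down with the textbook distance modulus, m - M = 5 log10(d / 10 pc), followed by Hubble's law, v = H0 d. The magnitudes and the recession velocity below are illustrative assumptions, not measurements, and the sketch ignores dust, redshift corrections and every calibration step of the real distance ladder.]

```python
def distance_parsecs(apparent_mag, absolute_mag):
    """Invert the distance modulus m - M = 5*log10(d / 10 pc) for d."""
    return 10 ** ((apparent_mag - absolute_mag + 5) / 5)

def expansion_rate(velocity_km_s, distance_mpc):
    """Hubble's law v = H0 * d, solved for H0 in km/s/Mpc."""
    return velocity_km_s / distance_mpc

# Illustrative supernova: assumed intrinsic (absolute) magnitude and an
# assumed observed (apparent) magnitude. Both numbers are made up.
absolute_mag = -19.3
apparent_mag = 15.7

distance_mpc = distance_parsecs(apparent_mag, absolute_mag) / 1e6
h0 = expansion_rate(7000.0, distance_mpc)  # assumed velocity of 7000 km/s

print(f"Distance: {distance_mpc:.1f} Mpc, implied H0: {h0:.1f} km/s/Mpc")

# If the assumed intrinsic brightness is off by 0.1 magnitudes, the
# inferred distance, and with it the expansion rate, shifts by a few percent.
biased_mpc = distance_parsecs(apparent_mag, absolute_mag + 0.1) / 1e6
print(f"With a 0.1 mag bias: H0 = {expansion_rate(7000.0, biased_mpc):.1f}")
```

The last two lines show why the argument about which side was "more biased" mattered: a small systematic error in how bright the candles are assumed to be moves the answer directly.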
Linda: It’s particularly interesting because we think of physics as all, you know, hard numbers, and it’s all measurable, it’s all straightforward. It’s not confusing and messy and squishy like biology things and human things. And oh, it turns out it’s actually quite messy. Maybe not squishy for the most part, but definitely messy.
Pascal: Yeah, definitely. I mean, it’s one of the things whenever you get a complicated system. No matter if it’s a virus, a cell, just fluids moving, like water moving and diffusing, it can be quite challenging to understand exactly, and to predict as well, because that’s the big thing of science.
You want to be able to be predictive as well. What could happen? We’re pretty good at it, right? I mean, we’ve got lots of technology proving we’ve got pretty good explanations for lots of things. But there are approximations that work really well most of the time, but occasionally you can see there’s a spot missing, and you’re like, is that spot important? I don’t know. Sometimes?
Linda: What are the worst data mistakes you’ve seen?
Pascal: Uh, the worst data mistakes I’ve seen, besides, let’s say, using obviously biased data to prove a result you want, is also usually having data and being like, I’m gonna ignore the bits that contradict the answer I want to get, and then getting the answer you want to get because you’ve made sure that you’ve really selected the data to kind of fit that…
Linda: Yeah
Pascal: …is one of them. The other one is, as I said, also misunderstanding what the data is. Like you have data and then you’re trying to pose a question, and it’s like, that data doesn’t really tell you much about the question you’re posing.
Linda: Yeah,
Pascal: Which, I do feel like this actually happens quite a bit. This actually comes into, we were mentioning AI, right? So there actually is a real thing there with the LLMs, the large language models. When you get to very large models, okay, you want it to be fluent in a language, so you grab everything that you can grab, right? And then you end up with something where occasionally you’re like, well, it’s hallucinating all the time. Why is it giving wrong answers? Well, it’s gotten very good at writing language. But because you’ve just grabbed a smorgasbord of English, French, Italian, Japanese, Chinese, right, it’s just learned how to structure phrases. Yeah, and those phrases are sort of unconstrained, because you’ve got a wide variety of topics, and so then you can pose a question and be like, that’s not true. And why is it not true? Well, if you then confined the data, it limits the scope of the questions you could ask. It’s not going to necessarily answer everything, but it might have a higher probability of giving something that is a correct statement, and
I do feel like this lesson seems to have been lost. Well, I don’t know if it’s truly been lost. I feel like they have different motivations, but for some of the big AI companies there’s a harvesting of everything, and then training enormous models on that enormous set of phrases, and then getting AI that can be quite convincing about what it states. But you haven’t curated the data to be like, can you answer questions on plant biology? Right, and then you might be like, okay, it can answer questions on plant biology. It will always refer to plant biology stuff. If you ask questions about how to make a table, we’re like, wow, that’s not in the lexicon of the stuff it’s learned.
But that’s one really good example of people missing, or seemingly at least (I don’t think that’s the case, but seemingly) missing the understanding that if you want to have this as your kind of tool, this is the data that you should use.
And then normally people being, I think, knowingly misled. The idea is that, okay, we’ve grabbed everything because it sounds more impressive. It would be like, we’ve grabbed all the data in the world and then used machine learning techniques to come up with probable phrases in English or Chinese or something. But that’s totally fine.
But then if you were trying to come up with a real good understanding of the language and the grammar that’s used… and there’s also, I don’t know how it would manage, you’ve got slang that varies across time. So suddenly things that made sense in the 60s, right, as slang, are not present in the data. So you’re like, well, why don’t I understand the slang it’s using? And we’re like, well, it’s because it grabbed the data that’s available, which has newer slang. And that’s a great example of, I would say, probably mis-marketing, but let’s say it could also be just data misuse, right? You’ve just done the wrong thing with that data.
Linda: Yeah, and you know, the slang doesn’t just vary across time, it varies from country to country. Even two English-speaking countries can use the same word to mean different things. So in the US an entree is what we would call the main course, and here an entree is what the US would call an appetizer. So if you use the word entree, you get two different meanings, or two different interpretations. How is a language model supposed to deal with that?
Pascal: Yeah, exactly.
Linda: You know, it seems like the AI industry has forgotten, or is trying to pretend that it’s not true, that machine learning systems are really good, or can be really good, at very specific problems. So, you know, one of my favorite Pawsey projects is the machine learning system that can predict a crisis coming in someone with a traumatic brain injury
Pascal: Yes, I think it’s a great example
Linda: half an hour in advance, and so they can start treating it before it actually happens, because once it’s happened it’s kind of screwed. But if you can catch it as it’s starting you can prevent catastrophic outcomes. So it’s very good at that one specific thing. No one would dream of asking that system to tell you what’s a restaurant that does gluten-free in Como. It would be nonsensical. But that’s kind of what we’re doing with these large language models, these chatbots, where they’re trying to make them be all things to all people, and we just don’t have that technology. What we have is technology that does one thing really well. In the case of chatbots, it creates a plausible sentence. That’s it. That’s what it does. Not gives you answers, not solves problems.
Pascal: Yeah, that’s the big thing, right? It’s a great example of, I mean, as I said, I see it as an open question whether it’s data misuse or just that they know that the outcome is just plausible answers, and they’re happy with plausible answers for the queries that people are posing. I mean, I use smaller language models, so I tend to run locally a small language model where I know it’s been trained on some good C++ and some Rust and some Python, and I ask it questions about programming to refresh my memory. Not necessarily to write the code, because it also writes incorrect answers, but it is a pretty good starting point for actually going for something that’s a bit more complicated. Whereas if you try to do this general model… I mean, as you said, the actual techniques that are being used – the transformers, the attention mechanisms – don’t lend themselves to true generalization with somehow a reasoning behind it.
They are like, I’m going to give you a plausible answer. And I will say, it’s also as you mentioned, there’s different interpretations for what a word means in English, like lorry and truck. You know, all these nuances. And I feel like all these nuances can get lost. It’s nice to have a bit of language diversity, and suddenly if you pose questions in this way you kind of lose some of that language diversity. But also, now think of something that is thematically positive, right? So you write a statement in English, it’s meant to be positive, and then I can write statements in French and ask, is it a positive or negative outcome?
And if I use the models that have been trained in France, they’re quite good at understanding French, but if I use other models, they’re pretty terrible at understanding what the French is actually trying to convey as a positive or negative outcome. I remember we had a training session, and one of the visiting PhD students is Italian, and I said go ahead and try it, and he’s like, okay. So he wrote a statement in English and asked the AI, is it positive or negative from the perspective of the person who’s writing this statement? And he tried a few times. English was okay, and then he tried Italian. He’s like, oh, it’s just random guessing. I was like, yeah, because it hasn’t been trained enough on that. But also people have tried to, you know, be like, oh yeah, it can definitely do the translation. It probably can, but you always need to revise, because they’ve just grabbed a whole smorgasbord of everything. If you grab all the data and ask a silly question, you can get silly answers, but you ask a serious question, you can still get silly answers. It doesn’t know.
Linda: Yeah, yeah. The concept of truth just isn’t in there. And with this technology, I can’t see any way you can build it in. You can’t bolt fact checking on to a large language model. It doesn’t really work. But also, you know, the sentiment stuff, particularly in Australia, can be really tricky, particularly shorn of emotional tone. If I say “that’s sick”, if you just have it in text, in Australia that could mean that’s the greatest thing I’ve ever seen, or it could mean that’s really twisted and awful.
Pascal: Yeah,
Linda: I think if you just see it on a page, shorn of context, there is no way to know. And if you hear someone say it you can probably tell from the tone of voice, but that’s not something that we have the technology to make sense of, because the sense isn’t in the statement, you know, the sense is in the whole context, and the facial expressions and tone of voice, all that kind of stuff, and we’re just nowhere near being able to wrangle that with any level of accuracy.
Pascal: No, and this is one of the things where I feel like this is again a data misuse, right? If you just grab text and then don’t annotate the text to explain context and also emotion, like sarcasm, all that stuff… Unless you say, okay, “that’s sick”, and then qualify it and train the model to say this is what it meant, right, like just with tokens, then it might be like, okay. But if you just have “that’s sick”, shorn of context, it is just random guessing. If you go, oh, that’s sick, what do I mean? It should give you random guessing, because it doesn’t know. Like I said, it’s fundamentally the algorithms, but also the data that’s used to train that algorithm, and how could it ever know, right?
Sarcasm is a big thing. Even some simple stuff: I was looking at using physics-informed neural networks to try to reproduce physics results with a machine learning technique, where you train the technique to try to minimize the difference between its answer and the actual answer. The issue there is that you can get stuff which is on average quite good, but then has violations of the laws of physics as part of the solution, because on average it got a good answer.
But it’s completely… like if you said, what’s the answer, it’s like, oh no, that’s completely 100 percent wrong. And I was thinking of one where it’s trying to model the flow of fluid into an engine, like a little piston engine. And so they do these simulations where they’re looking at the airflow in and airflow out. And then the answer came in and it had some weird asymmetry, like the flow in should be the flow out because it’s a confined wall and so on. But it also had weird changes in the flow. So you had air flowing in, and it would flow in, flow in, flow in, and then suddenly flow out, flow out, flow out, and then flow in, flow in, flow in, and it’s like, why is there no shock in the flow? If you take fluid and move it in one direction and then suddenly reverse the direction, there’ll be turbulence, and they didn’t have any of that.
It just got a small, slightly opposite-direction answer to where the flow is going, but on average it got a good answer. It’s completely wrong, but that’s because of the way we were using the technique. It can lend itself to catastrophic errors.
Linda: yeah
Pascal: It’s not what it’s good at. So it’s okay. I mean, it’s fine that it got it catastrophically wrong sometimes and on average okay. It doesn’t work if you’re trying to build a jet engine. You don’t want the catastrophically wrong answer.
Right, but I understand trying stuff. I’m going, okay, on average it’s kind of good, but I understand that I need to really think about the answer it’s giving me rather than just accept it.
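[Editor's sketch: a toy version of "good on average, catastrophically wrong in places". This is not a physics-informed neural network and not the engine simulation Pascal describes. The flow values are made-up numbers, chosen so that the average error is small while the predicted flow direction is reversed at a few points.]

```python
def mean_squared_error(predicted, reference):
    """Average squared difference, the kind of score training minimizes."""
    return sum((p - r) ** 2 for p, r in zip(predicted, reference)) / len(reference)

def direction_violations(predicted, reference):
    """Indices where the predicted flow points the opposite way to the reference."""
    return [i for i, (p, r) in enumerate(zip(predicted, reference)) if p * r < 0]

# Hypothetical inflow velocities at eight time steps (positive = flowing in).
reference_flow = [1.0, 0.8, 0.5, 0.1, 0.05, 0.1, 0.5, 0.8]

# Hypothetical model output: identical except for small reversed values
# in the middle, i.e. air briefly "flowing out" when it should flow in.
predicted_flow = [1.0, 0.8, 0.5, -0.1, -0.05, -0.1, 0.5, 0.8]

print("Mean squared error:", mean_squared_error(predicted_flow, reference_flow))
print("Steps with reversed flow:", direction_violations(predicted_flow, reference_flow))
```

The averaged score looks harmless, while the second check flags three steps where the answer is physically wrong. That is the reason to interrogate the output rather than accept the average.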
Linda: Yeah, and that is not the way it’s being marketed, or the way people are being encouraged to use it, or the way people are being told that it behaves. It’s fine if you’re dealing with someone who knows to really interrogate and check, but it’s not fine when you’re putting it out as a, you know, magic answer machine.
Pascal: Yeah, that really… not to use too strong a word, but it really angers me. When they say, you know, just take it, and people have a tendency, right? If it looks convincing you’re like, okay, it’s probably right. And in this case, I mean, I know vibe coding has come out, right, where people are, like, coding with human responses. But I’m like, if you’re not an expert programmer, you can introduce security holes if it’s, like, a web portal. There’s lots of stuff that can go wrong. So the idea is then you have to spend the time debugging what the AI has produced. So you look at the code, and to do that you normally have to be a pretty good expert at the code.
So it’s a helpful tool for someone who is already an expert, which also really angers me, because there’s a digital divide being pushed right there. Because to get good answers you have to really know quite a lot already and then vet. And so the people who benefit are the people who’ve already gained lots of skills. And they typically also have lots of infrastructure, so they can run big models and they don’t worry about the energy and the cost and so on. And then you’re driving a digital divide for people who are trying to learn, who don’t necessarily have the money to spend on pro versions of these AIs, and don’t necessarily have the skills to look at answers. They get answers and then get maybe a result that is less than ideal, and then people are like, “oh, you’re not good.” And I’m like, oh no, that’s not what’s happening.
Linda: Yeah, there’s just so many issues there. I’m not going to ask you to explain quantum computing in, you know, two minutes, because it can’t be done. But what is the potential that you see? When I was at Pawsey a couple of weeks ago, I could see how excited you were about your job. What is it that you think could come out of this?
Pascal: So there’s two things, right? One of these is on the quantum computing side: some of the necessary steps, just on an engineering aspect, are really novel and probably have applications completely elsewhere. Like the ability to take lasers and move atoms around pretty precisely, for one of the quantum computers, is pretty cool. That’s pretty awesome, moving an individual atom around with a laser. But that probably has applications elsewhere.
So it kind of excites me that there’s unknown applications of just the engineering thing that people are trying to do for quantum computing.
But the novel thing that I think is really interesting is that quantum computers are good at tasks that we find, classically – like if you pose them on a classical computer – are just really challenging problems to solve. And suddenly you have a tool that could solve that particular mathematical problem super quickly. And then it’s like, okay, well, that solves that problem really quickly, but where does that problem fit in? Where do we encounter that particular posed mathematical problem? Probably in a huge number of places, where people thought, okay, you know, we’re gonna try to do some approximations. Suddenly you can maybe do not just really novel science, but maybe solve things that you couldn’t solve before, that were really challenging to solve. Like, we’re looking at stuff where, you know, with the standard machine learning techniques, you need quite a bit of data to get it to converge, to give you kind of goodish answers, right? And then with stuff like health data, it’s like, how do you grow the data set? If you have 600 patients, it’s not like I can arbitrarily go and grab another patient. I have 600, right? There are possibilities in quantum machine learning to pick up more of the features that would give you a better diagnostic tool, which you simply would be really challenged to do classically. So there’s this idea of a really novel technique, a novel tool, that if we can get it working happily with computers and kind of provide ease of access… and this is a big thing for me, easy access, so that you don’t get bogged down in the really hardcore details. It’s like if you had to chat to a machine in the machine instruction set, right? I don’t want to have people focus on that. I want to have people focusing on “this is the problem”.
Have conversations with people who have an understanding of quantum computing, pose the problem well, and then get solutions, and then suddenly go, wait a second, that solved that problem, but I have a similar problem. Can we remap it to what we just solved? Yes?
We can now maybe do something even more novel. Like, obviously you can maybe model molecules with quantum computers, so you could really do maybe personalized rapid drug discovery, because you could be like, this is the protein that’s affected. We can do protein folding classically, but, you know, looking at drugs and manufacturing drugs is a hard thing. So suddenly you could come up with really novel stuff faster than before, which I think is the really exciting bit. There’s this new space where suddenly there’s a new computational technique. It’s like when computers just came around, right? Suddenly you had computers, and people thought of them as just keyboards and typewriters, and some people were like, “no, it’s a modeling tool for the real world”.
And suddenly that is good at certain things, and then other things start breaking out. Suddenly, oh, we have a new modeling tool for the real world that is good at the things that the other one was not good at. Ah! New tools in the toolkit to kind of solve some interesting problems.
That’s what really excites me: applying that, trying to get it working. Democratizing access so people can try their ideas on quantum computers. And then getting that snowball effect, where you suddenly get not just people kind of mulling over mathematically “this is maybe a posed problem”, but people going, okay, I can map… It’s as if you try to map a problem to a different way of looking at it. So you change the perspective and suddenly you have a part that can be well posed for a quantum computer. And the other bit that can’t, well, I can use a supercomputer. And now I get to do something really novel that otherwise would be very challenging. Or possibly, as I said, in some of these cases maybe you get goodish answers, but you’re not happy with a goodish answer. You want a slightly better answer, and you can maybe suddenly get a slightly better answer, or maybe a much better answer.
Linda: That’s awesome. And that ties in nicely to the last question, which is what excites you about data?
Pascal: So this is going to be probably from my physics perspective, but just what it might tell you that you didn’t know to ask about the universe. I love the idea that, you know, as I said, people were looking at stars, looking at galaxies, and then looking at the motions of stars and realizing that the data implied that there was way more matter than we could see around our galaxy. We didn’t even know to pose the question of what is dark matter, right? It was just like, you had matter. And now we’ve had to pose the question, what is dark energy? And then suddenly the new question is, well, it’s not what we thought before. So I love the idea that data can really expand our understanding of the universe in a way that leaves us, let’s say, properly confused.
Like it’ll be like, oh, I don’t quite understand what’s going on there, but I didn’t even know that I should be looking there. And I really like that. It’s like you discovered a treasure you didn’t even know was there in front of your face, till you’re like, oh, there is something interesting here that’s going on. Why? I didn’t even know it was an interesting thing. I didn’t even know it was a thing. So that’s what I really love about data: it can really lead down some interesting paths.
Linda: So it’s not just finding new answers. It’s like figuring out that there are new questions to ask.
Pascal: Exactly. That’s really cool. Thank you so much.
Linda: This has been a fabulous chat. I’m always surprised by where these conversations go, and it’s always somewhere really interesting. Thanks so much for your time, Pascal. It’s been great.
Pascal: Thanks very much, Linda.
