The Dejargonizer

AI Safety: How to Build Artificial Intelligence Without Destroying Humanity

September 20, 2023 · Season 2, Episode 8
Amir Mizroch

AI expert Rebecca Gorman from Aligned AI simplifies AI concepts like alignment and existential risk. She explains how her company Aligned AI helps AIs learn faster, identify unknowns, and act cautiously - just like a parent teaches a toddler.

Listen
Apple Podcasts, Spotify, Google Podcasts, Audible, or anywhere you get podcasts.

Connect
LinkedIn
Twitter
Newsletter

Email: dejargonizerpod@gmail.com

Transcript

Rebecca: One classic example is the camel-cow example. You have an artificial intelligence that's trained on camels in the desert and cows in pastures. The AI can look like it's correct a hundred percent of the time, but the way it's identifying camels is it sees a tan pixel, and the way it's identifying cows is it sees a green pixel. That's actually not the way that you or I would identify a cow or a camel. If we sent this AI out into the real world, it's going to make a lot of mistakes. It's going to see camels and cows in all sorts of places where there are no camels or cows. If we don't send it out into the real world but give it a picture of a camel in a pasture, it'll think it's a cow. The point of this simple example is to demonstrate a problem that all artificial intelligences have.
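
A toy sketch can make that failure concrete. The snippet below is purely illustrative, not Aligned AI's code: the "model", the pixel counts, and the data are invented, and the function simply stands in for a classifier that has learned the background colour rather than the animal.

```python
# Illustrative sketch of shortcut learning, as in the camel/cow example.
# The "model", features, and data below are invented for illustration.

def shortcut_classifier(background_pixels):
    """Stand-in for a model that learned the background colour, not the animal."""
    tan = background_pixels.count("tan")
    green = background_pixels.count("green")
    return "camel" if tan > green else "cow"

# Data like the training set: camels on sand, cows in pastures -> looks 100% accurate.
print(shortcut_classifier(["tan"] * 95 + ["green"] * 5))   # "camel" (correct)
print(shortcut_classifier(["green"] * 95 + ["tan"] * 5))   # "cow" (correct)

# Distribution shift: a photo of a camel standing in a green pasture gives the
# model the same background signal, so it confidently answers "cow".
print(shortcut_classifier(["green"] * 95 + ["tan"] * 5))   # "cow" (wrong: it's a camel)
```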

Amir: B2B services, new standards for quality, technological leadership, and operating excellence. Where it really made a difference was that it had a very leveraged effect. System overload. The tech industry is in an age of messaging instead of story, and it's getting worse. It was an ownership stake in the underlying headcount that you need to generate that cash flow, which is a final scalability. How big can you make it to be considered a disrupting marketplace? I'm going to tell you about what really means system. Hi, I'm Amir Mizroch, a communications advisor and former tech editor at The Wall Street Journal. I speak to tech founders and put them through The Dejargonizer, a zero-jargon zone podcast. Welcome to the Dejargonizer. Today we're talking about artificial intelligence with someone who's been working at it for the past 20 years.

Rebecca: I am Rebecca Gorman, the Founder and CEO of Aligned AI. We're working on making artificial intelligence safer and therefore more usable by helping it to understand concepts in more of the way that human beings do, so that we can tell it what's important to us and it will actually have the ability to respect that.

Amir: On to The Dejargonizer.

Rebecca: Thanks for having me.

Amir: To me, it feels like AI has just landed in such a big way over the past year. But it hasn't just come out of nowhere. People have been talking about AI since the forties, fifties, sixties.

Rebecca: I actually built my first artificial intelligence over 20 years ago.

Amir: How old were you when you built your first one?

Rebecca: I suppose 18.

Amir: Were you studying computer science at school?

Rebecca: My friends who were programmers told me that studying computer science was for people who didn't program at home and do their own side projects. So it wasn't cool. I studied business, and if I had studied computer science, I probably wouldn't have made my own AI 20 years ago. But in business, we were actually teaching people to make AIs and making AIs because we could use them to make predictions and decisions about how to maximize business value.

Amir: You also have a philosophy degree.

Rebecca: I do. A lot of people in philosophy have been talking about artificial intelligence for a long time. They've been talking about it at a higher, philosophical level. My interest in artificial intelligence really began at the data science level, the limitations in what you can do with artificial intelligence based on how it fundamentally works.

Amir: You were coding this 20 years ago. Could you set the stage? It's now 2023; where are we in this whole AI thing?

Rebecca: The first people to build computers actually would've seen those as a form of artificial intelligence. Most people don't know this, but before we had digital computers, we actually had human beings whose job title was "computer." They computed things. Every time we give computers more of an ability that human beings have and technology hasn't had before, it's very impressive at first. People are excited at first, and then eventually we become used to it and we say, well, actually computers can't do everything that humans do. It's not that interesting after all; it's not actually going to take all of our jobs. We become accustomed to it, it becomes less exciting, and we're less worried until there's another advance in what computers can do.

Amir: That's our human dance with AI, right? What are researchers like you actually looking for there: consciousness? Intelligence? Dumb it down for us.

Rebecca: After the creation of the digital computer, the next development in AI was symbolic reasoning. It was amazing to make computers able to reason like we do, but that reasoning wasn't grounded in the real world. It couldn't really handle ambiguities. And then we realized we could add statistics to AI, basically, to help artificial intelligence understand a distribution of data instead of a single data point. That was the next stage of artificial intelligence. And then people worked out that we could scale that up: just by scaling up the amount of data and the amount of compute, you could actually get a lot more out of the same techniques.

Amir: Let's pause on that for a second, because I think we've introduced a couple of concepts here that take us into data and compute.

Let's talk about the company Aligned AI. Why "aligned"? What is this word "alignment" that I'm seeing a lot more of?

Rebecca: So the word alignment in artificial intelligence means alignment with human values: artificial intelligence really respecting and following the values of human beings. My co-founder, Dr. Stuart Armstrong, and I came together because we realized that there are fundamental problems in how artificial intelligences are built that keep them from actually understanding concepts in the way that human beings do.

Amir: You talked about aligning the software, aligning the programs, aligning AI with human values. I just want to understand what these human values are, because you turn on the news and you can see certain human values, as you can see in conflicts all over the world.

Rebecca: The really interesting thing about our approach is that we don't have to define a set of human values, and we're not intending to do that. The problem with artificial intelligence today, the way it's built today, is that it can't follow any set of human values. It can't follow any set of human concepts. The reason AI seems magical, the reason it seems like it works, is that it's operating in constrained environments that look very much like the environments it was trained on. Let's say I thought that everything red was a stop sign.

That would work as long as I don't see anything red that's not a stop sign. So let's say I'm wearing this red jacket and I walk down the street, and an AI that thinks red means stop sign sees me. It's going to stop, if it knows to stop at stop signs, which might be a bad thing if there are cars behind it, for example.

So every time this artificial intelligence sees red, it stops. If the only red things it ever sees are stop signs, it looks like it's working perfectly. As long as you continue to provide only that kind of data, it keeps working perfectly. You don't know that your artificial intelligence is finding the wrong features, or too narrow a set of features, until something else that's red is introduced into the equation.

Amir: Would an example of that be autonomous cars that get into accidents?

Rebecca: There was a case of an autonomous vehicle several years back. After this autonomous vehicle hit a human, they went back and tried to figure out why.

It turns out the human was crossing where there was no crosswalk, and humans crossing where there was no crosswalk were underrepresented in the training data. So this autonomous vehicle couldn't recognize the human, because the human was crossing in an unexpected place. This is still a problem with the way that artificial intelligence is being developed.

And this is the kind of problem we're trying to fix: to help AI identify a human no matter what situation they're in, no matter what the shadows are, no matter where they're crossing the street, no matter what the color of their skin is, no matter what age they are. That is something that you and I can do naturally; we developed it when we were quite young. Artificial intelligence is still developing that skill.

Amir: I understand what the risk is with autonomous cars killing a person here, or driving a car off a cliff to avoid hitting children. But the numbers there don't add up to, let's say, an AI nuclear Armageddon. What are the stakes beyond those examples? Some really famous AI practitioners around the world take it to an existential level. What are we actually talking about?

Rebecca: There are a few different broad categories people are talking about in existential risk. One of those categories is: what if you have artificial intelligence that has interests counter to ours? Maybe it knows what our interests are, it knows what we want, but it doesn't care and it's going to do something else.

That's not the problem we're addressing, because that's not a problem we have today. Today, artificial intelligence doesn't know what our interests are; it doesn't know what we want. It can look like it does, as long as it stays on a narrow data distribution that looks like what we've trained on. The real risk is that we make this AI that doesn't understand our concepts, doesn't understand our values, and doesn't understand what we really want it to do. We give it a huge amount of power, and it operates in an environment that we haven't prepared it for.

Now it can hurt us, not because it's malicious or because it has its own agenda, but because it's trying to follow our agenda and doesn't understand robustly what our agenda is, or doesn't understand how to apply that agenda to the set of data it's looking at. Think about the senses that an AI has. You and I, we have taste, we have sight.

We have the ability to hear, we have touch, and an AI has the data streams we provide it. That's the AI's touch and sight and sound, and that's what it has to go on. The way it parses and understands that information is the way we teach it to parse and understand that information. And to date, the industry has not been teaching artificial intelligence to parse and understand that information in a way that it will keep doing what we want it to do, keep respecting our values, and keep respecting safety when it receives information it has not seen before.

Amir: That really was good. I think that cleared up a lot for me. What would be great is a clear, easy-to-understand example of how Aligned AI is actually doing that, whether it's research that you've done, a product that you're working on, or conversations you've had. How are you making that happen?

Rebecca: The part that we're excited about is the ability for artificial intelligence to monitor itself and behave safely in new situations by acting conservatively, to get a better idea of the new concepts it's experiencing in a data set it hasn't seen before, and then to ask humans interpretable questions about the new situation it's in, so that it can adjust.

Amir: When you say you're helping the AI act conservatively until it gets feedback, what do you mean by conservatively? How does it know what a conservative step to take is when it sees new data that it hasn't seen before?

Rebecca: In order to do that, it needs a little bit of information about human priorities. So for instance, if we tell it, "if you think you might see a pedestrian, assume it's a pedestrian until we've given you more information," we help it to suspect that it sees a pedestrian when other technologies would not be able to give it that insight.
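
A minimal sketch of what "acting conservatively" could look like in code, assuming a perception model that outputs a probability; the thresholds, function name, and actions here are hypothetical, not Aligned AI's actual system.

```python
# Hypothetical sketch: default to the cautious interpretation when uncertain.
# The thresholds, function name, and actions are invented for illustration.

SUSPICION_THRESHOLD = 0.2   # assumed value: "you think you might see a pedestrian"
CONFIDENCE_THRESHOLD = 0.9  # assumed value: "you are fairly sure you see one"

def choose_action(pedestrian_probability: float) -> str:
    """Map the model's estimated pedestrian probability to a driving action,
    erring on the side of assuming a pedestrian is present."""
    if pedestrian_probability >= CONFIDENCE_THRESHOLD:
        return "brake: pedestrian detected"
    if pedestrian_probability >= SUSPICION_THRESHOLD:
        # Uncertain: treat it as a pedestrian until more information arrives.
        return "slow down, treat as pedestrian, ask for more information"
    return "proceed normally"

print(choose_action(0.95))  # brake: pedestrian detected
print(choose_action(0.40))  # slow down, treat as pedestrian, ask for more information
print(choose_action(0.05))  # proceed normally
```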

Amir: I like that. Can we take that a few levels up, to, say, companies that use AI to make pharmaceuticals, drugs?

Rebecca: Someone tried to make an AI that could recognize skin cancer, and the AI determined that if it sees a ruler in the picture, it's skin cancer.

It turns out that in the training set images, if you have a picture of a non-cancerous mole, there's not usually a ruler in the picture, but with a cancerous tumor, there's often a ruler next to the skin cancer. Therefore cancer looks like a ruler, according to this artificial intelligence.

Another classic example: somebody trained an AI to detect collapsed lungs in X-rays.

And it worked great in the lab. They took it out into the field, and suddenly it didn't work anymore. So, back to the lab to figure out why it wasn't working. It turned out that a lot of the pictures of collapsed lungs in their training dataset were actually collapsed lungs that had already been treated with a chest drain. A chest drain looks like a little white blob with a line coming off of it, and it's much easier to recognize that white blob with a line coming off of it than it is to recognize all the complicated features of a collapsed lung on an X-ray.

So the artificial intelligence decided that a collapsed lung looked like a chest drain.

Amir: Where along that process would Aligned AI come into the picture?

Rebecca: We can come in at many different steps along the way. If this collapsed-lung detector was using our technology, they'd get an interpretable question: is this blob with the line coming off of it what you mean by a collapsed lung? And the practitioner can say, no, it's not.
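
Sketched as code, that kind of interpretable question might look something like the following. This is an assumed workflow, not Aligned AI's actual product; the feature names, importance scores, and threshold are made up for illustration.

```python
# Hypothetical sketch: surface a dominant feature to a human before trusting it.
# Feature names, importance scores, and the threshold are invented for illustration.

def review_concept(label: str, feature_importance: dict, threshold: float = 0.8) -> None:
    """Ask a human whether a feature the model leans on heavily really defines the concept."""
    for feature, share in feature_importance.items():
        if share >= threshold:
            answer = input(f"Is '{feature}' what you mean by '{label}'? (yes/no) ")
            if answer.strip().lower().startswith("n"):
                print(f"'{feature}' flagged as a shortcut; gather examples of '{label}' without it.")

review_concept(
    "collapsed lung",
    {"white blob with a line (chest drain)": 0.92, "rib shadows": 0.05},
)
```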

Amir: Okay. I want to see if I get this right. The service or product that you are delivering is concept extrapolation, which is, I guess, about human values of, you know, don't hurt, don't harm, the safety alignment. What characterizes this concept extrapolation program that you're developing?

Rebecca: If you look at a human baby: a human baby takes in dramatically less visual information and language information in order to understand the world around it than an artificial intelligence does. A human baby learns how to talk, becomes a child and then an adult, and now you're having this conversation with me.

You've heard orders of magnitude less language than ChatGPT did in order to be trained, and yet you do a better job at it. Artificial intelligence today is very inefficient at learning from the data it receives. And at the end, when it's doing an okay job, it still doesn't hold the kinds of concepts that we do.

And so it fails in weird ways. 

Amir: If we go back to the baby: if the baby grabs, let's say, a sweet from its brother or sister, the brother or sister is going to get up, smack it, take that sweet back, and then there's going to be a fight and crying. That's how the baby learns not to grab, or not to hit, that kind of stuff.

Rebecca: If that baby had no idea why its sibling hit it when it took the candy, and it takes hundreds of times to figure out why the sibling hit it when it took the candy, it's going to be very, very slow at learning that information. If it understands, oh, it's because I took the candy, after, say, the second or third time, it's learning more quickly.

But if it thinks, oh, maybe they hit me because it's 2:00 PM, or maybe they hit me because there's a shadow on the wall, that's how artificial intelligence today is operating. It's taking a very, very long time to understand the feedback we're giving it.

Amir: I just want to make sure that people understand exactly what it is. So, a company that makes autonomous cars calls you up and says, look, our algorithms, our AIs, are learning too slowly.

We're killing too many pedestrians. Sorry, we really need to stop doing that. We've heard that Aligned AI gives AI the ability to learn from its mistakes quicker. I think that's what you're saying. Or to learn that in a new context, when it sees something it hasn't been trained on or isn't sure of, it needs to act conservatively.

And in that case, that would mean the following actions: stop the car, assume that you're going to hurt someone, and shut down.

Rebecca: Concept extrapolation helps the artificial intelligence do more of what it should do and less of what it shouldn't do. 

Amir: It's like parenting. It's basically parenting.

Rebecca: Yeah. Basically parenting, yeah.

Amir: That actually leads me to my last question, Rebecca. We have Elon Musk and Geoff Hinton and all these really big tech names and AI names on the one hand saying AI is extremely powerful, it's going to become actually intelligent and destroy us all. And then you have people on the other side, no less qualified, saying that's all rubbish, it's not magic. It kind of feels like magic now for a lot of people.

 Can you give us your reality check on that?

Rebecca: If you had a really, really powerful AI that's taking in various data we're giving it, and for some reason we've given it a huge amount of power, maybe to make synthetic viruses or explosives, or we've given it power over bulldozers and drones, and we've given it some objective and told it to maximize that objective, then if it doesn't understand human concepts and human values, it could do something we don't want it to do. You have the famous example of the paperclip factory, where someone's told a paperclip AI to make lots of paperclips, and they haven't told the AI, don't make so many paperclips that you turn all the humans into paperclips, and don't turn the Earth into paperclips. In the thought experiment, the factory turns everybody into paperclips.

Amir: Sounds painful.

Rebecca: There's another possibility that is kind of more concerning, and it's more of where we're headed today: AI destroys humanity because it's very powerful, it's trying to do what we want, and it just doesn't really understand what we want. The real onus is on us, because we've told it the wrong thing and we don't realize that machine learning is teaching the AI different concepts than what we intend for it to learn. So that's how we are approaching reducing the existential risk. If we can help artificial intelligence to communicate with us by understanding the concepts that we're giving it, then we can start to tell it values. We can say, don't harm a human, and it knows roughly what harm is.

Now it can behave safely. Until it can do that, until it can know what a human being is, most of the time at least, and know what harm is, most of the time at least, we don't have a safe artificial intelligence, and the most we can do is try to restrict the power of that AI so that it won't harm anyone by accident.

Amir: I feel like I've got the bottom line, which is, without getting really into the lines of code, the direction is: teach it certain concepts, teach it to identify when it doesn't know, and then to act in ways we intend it to act and less in ways we don't intend it to act.

Rebecca: We're helping artificial intelligence to develop more robust concepts that more closely resemble a human being's concepts, and because of that, giving it the capacity to behave more safely in a larger variety of situations. Safety is really something that makes it possible to use new technologies. We believe that every artificial intelligence that's built should be safe, and we're developing our company strategy around our desire for all artificial intelligence that anyone is building, in any industry, to not harm human beings.

Amir: That was great. I really connected with that.

Rebecca: You got it. Thank you so much.

Amir: Thanks for listening to The Dejargonizer.