Attention is like a cat—if you’re like, “oh, I will buy a nice camera and spend a lot of time in fancy video editing software making this look nice,” it will spurn you, and also if you’re like, “oh last time I got drunk and threw something together I got a lot of views so this time I’ll do the same” it will ignore you, but if you are just trying to do a nice job and not think too hard about it, it will come sit in your lap and force you to pet it until you forget what you were trying to write about (oh hi kitty…)
(Although uh actually the follow-up video I posted on the heels of the one I did where I had a couple ciders and talked about Internet space games and software engineering is actually doing kind of eye-watering numbers—10k views as of this writing and I swear it was 2k a second ago, so I uh I guess I’m a Star Citizen youtuber now?? The cat is staring at me again. Don’t read too much into this. Like most cats he’s a bad analogy.)
Anyway the point being that the Internet has just spent the better part of the last week talking about “AI safety” and as someone with a glancing background in actual safety (exhibit A: the entire rest of this blog going back to 2012) I’m kind of cranky about it, because it’s clear that, with obvious exceptions (Dr. K—, and … probably at least one other person, but Dr. K— is the only one I know), nobody involved has a good definition of what ‘safety’ is, let alone what ‘AI’ is, frankly let alone what we’re even afraid of besides “I watched Terminator 2 at a friend’s sleepover when I was 7 and couldn’t sleep for a week” or was that just me
(The best science fiction movie of the 20th century, and you can and will fight me on that, but actually it was Jurassic Park that I watched at 7 and that scarred me for life, which is also an incredible movie but just not quite as good as Terminator 2.)
And here’s where I drop the drunken ramble bit, to the extent that it’s a bit: The folks I studied, coming up in the field of safety/security/privacy/whatever we call it, define ‘safety’ as “freedom from unacceptable loss.” (Okay, technically Leveson defines ‘safety’ as the “absence of accidents” and ‘accidents’ as events involving “an unplanned and unacceptable loss” (PDF p. 32, print p. 11), but I think the transitive property holds, and I would link to Wikipedia but it’s useless here. A=B=C therefore A=C. Clearly.)
And, what. The. FUCK. Are the unacceptable losses we’re worried about in the AI safety context?
“AI gets super-intelligent, malevolent, and kills us all” is, sure, an unacceptable loss, because of the “kills us all” bit, but, nobody who’s worried about that rounds it to “LLM-induced mass casualty event.” Maybe we should? It hasn’t happened yet though it’s clearly coming, and whether you think it’s likely to look more like Terminator or Jonestown (or Heaven’s Gate) tells you more about you than about me really.
(“Unacceptable to whom?” was also the immediate question of anybody to whom I gave this definition for a few years, and, yeah, that’s the question isn’t it. All these AI systems are, and are going to continue to be, not just Californians but Northern Californians, and no you can’t tell me that because all the VCs decamped for Wyoming or Texas or Miami or whatever Motel 6 Elon Musk lives in that they’re not Northern Californians, they absolutely are, everything you hate about them was here before them and will live on long after they’ve run out of money and senesced into a pleasant-for-them retirement of being assholes at HOA meetings, and been replaced by a new batch of assholes with money and zero other qualifications—and I quite like Motel 6s, they were the nice motels we stayed in growing up, they’re too good for him.)
(The cat, sensing that I’m getting closer to my conclusion, has shown up to take advantage of this opportunity to stand on my keyboard and contribute to the blog post.)
Where was I? Right! Unacceptable losses. I want to be clear that I’m not trying to throw anyone I work with under the bus; I’ve been saying this for so long, on Twitter and… mostly on Twitter, that I needed to write it down somewhere, even in this ridiculous fashion.
“AI turns us all into paperclips” or strawberries or whatever is an unacceptable loss, sure, but so is “wizard turns us all into hamsters” and the story of change there is about as clear and specific as the story of change in the first one. Or. I dunno. “Eaten by a fuzzy green zebra.” Maybe it’s AI powered. Sure.
This is the other bit, like, so we’re unclear on ‘safety’, sure—and I don’t feel like I’ve given you an ironclad case there that would pass my former mock trial coaches’ muster but also roll with me, see above attempts at get-out-of-jail-free caveats, but, worse, we’re unclear on what ‘AI’ is.
I sure really do fucking hope, he says, speaking to certain friends in particular, that when I say “‘AI turns us all into paperclips’ is an incoherent fear” that you haven’t, like, installed your LLM as the top-level optimization algorithm on a paperclip plant… and how would you even do that? And then … roll its … what … down through all the sub-optimizers and sub-sub-optimizers and … sub-sub-sub-… basically what I’m asking is do you know what even goes into making shit, bro
Or farming and harvesting shit, in the case of Elon Musk’s strawberry example.
Plausibly some undocumented workers from Guatemala are going to have some opinions when their bosses tell them that the AI requires them to murder some people in order to plant more strawberry fields over their corpses.
I mean. Elon doesn’t deserve them. But those workers will have said opinions. Said workers being, in the main, you know, not enormous gits, like he is.
Doing anything requires that you have sensors, to take input from the physical world; a model of the physical world, which those sensor inputs get fed into to predict how the physical world will evolve; actuators, which you can use to act on the physical world; and also some goal, which is usually, like, “don’t die,” but is sometimes slightly more sophisticated, like, “don’t die _in an embarrassing way_”.
Any intelligence will have all of these things, whether meat-based or silicon-based, whether its body is a quarter-acre’s worth of data center sucking down a big town’s worth of power and producing a big town’s worth of heat or about 10 lbs of cranky furry meat that eats Meow Mix and shits in a litter box: the essential loop remains the same.
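(If it helps to see the shape of the thing written down, here’s a minimal sketch of that loop in Python. Every name in it is made up for illustration; this is the shape, not anybody’s actual system, and definitely not a claim about how any particular AI is built.)

```python
# The loop, as a sketch. All names hypothetical; the point is the
# shape: sensors -> model -> goal -> actuators, forever.

class Agent:
    def __init__(self, sensors, model, actuators, goal):
        self.sensors = sensors      # take input from the physical world
        self.model = model          # predict how the world will evolve
        self.actuators = actuators  # act on the physical world
        self.goal = goal            # usually "don't die"

    def step(self):
        observations = [s.read() for s in self.sensors]  # sense
        prediction = self.model.update(observations)     # predict
        action = self.goal.choose_action(prediction)     # decide
        for actuator in self.actuators:                  # act
            actuator.act(action)

    def run(self):
        while True:  # meat or silicon, same essential loop
            self.step()
```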
To bring this whole ramble back around—if we’re specific about what ‘AI’ means—at the moment, an LLM running in a server rack somewhere to which we’ve fitted a text input and a text output?? For reasons. Sure. Why not. But we could fit other things.
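(And in those terms, here’s today’s “AI,” reusing the hypothetical `Agent` sketch from above. The stubs are, again, entirely made up; the thing to notice is how narrow the sensors and actuators currently are.)

```python
# Today's "AI," expressed in the loop above: an LLM as the model,
# one text box in, one text box out. All stubs, all hypothetical.

class TextInput:
    def read(self):
        return input("> ")  # someone types at it

class SomeLLM:
    def update(self, observations):
        return " ".join(observations)  # handwave: the rack of GPUs goes here

class TextOutput:
    def act(self, action):
        print(action)  # it types back

class PleaseTheHuman:
    def choose_action(self, prediction):
        return prediction  # handwave: emit text the human will like

chatbot = Agent(
    sensors=[TextInput()],
    model=SomeLLM(),
    actuators=[TextOutput()],
    goal=PleaseTheHuman(),
)
# chatbot.run()  # blocks waiting for someone to type, as chatbots do
```

Swap `TextInput` for a camera feed and `TextOutput` for a robot arm and you’ve fitted “other things”; the loop doesn’t change, only the sensors and actuators do.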
And if we’re specific about what ‘safety’ means, i.e. freedom from losses which are unacceptable to me, Kevin Riggle*, sitting right here, in [redacted city] in [redacted state] at 2:10 A.M. Eastern Standard Time in the year of our Lord Twenty-Twenty-Three.
I dunno, where does that leave us? Somewhere better than “we shouldn’t do things which make Northern Californians yell at us on Twitter” (also hi, I am a Northern Californian) and somewhere worse than “we have solved this forever, we can continue to build shit free from these pesky Northern Californians and their pesky opinions” probably. Idk.
Again, dropping the bit, to the extent that it is a bit, tl;dr: Let’s be really fucking specific about what unacceptable losses we’re worried about, when we talk about AI safety. And let’s also be really fucking specific about what we mean by AI, when we talk about AI.
And with that I leave you with this cat, who doesn’t exist, but who might as well have.