My attempts to sensemake AI risk

Aug 10, 2022

Making sense of AI risk is very hard for me.

14 Comments

Aug 10, 2022

Yeah, I've never been convinced by this stuff and secretly find it cringey and embarrassing when EA/Rationalist types are obsessed with it to the exclusion of all other concerns, but I don't express my misgivings because they are supposed to be smarter and more knowledgeable and rational than me, so I can't possibly be right, right?

Can machines be made to think? Sure. Can machines be made sapient? Sure. Can machines be made to out-think all humanity? Sure. Does that mean they will spontaneously do so and become a threat to us? I don't see how.

Being smart doesn't magically give you the ability to upgrade your own hardware. Hawking or Kasparov can't upgrade their own brains to superintelligence just by thinking about it. AI capabilities can certainly accelerate rapidly, but it requires cycles of human involvement and substrate upgrades. I don't see how it would just happen, without warning or without humans understanding how it's happening.

Self-preservation is an instinct formed by evolution, not an inherent quality of living things. Many animals don't have it, for various reasons. AIs would develop it if we evolved them, or if we trained them to have it, but it doesn't seem likely that they would spontaneously develop it in a way that poses a threat to us.

And dumb humans with subintelligent AIs are a huge threat. We're already using them to develop assassination drones and biological weapons, and being able to run armies of millions of IQ 70 workers in parallel has the potential to give large entities like states or corporations a huge competitive advantage against others, and we know how that works out from history. Subintelligent AIs have the potential to wipe us out long before superintelligent AIs can be developed, but I never see anyone mentioning these threats, just Skynet scenarios.

Expand full comment

Reply (1)

asdf asdf

Aug 11, 2022

Your argument only holds if everyone just happens to never do anything sufficiently dangerous with AI such that it could heavily self-modify and do anything dangerous.

As an example scenario, does it seem implausible that some nation or megacorp could eventually run an irresponsible, reckless project to evolve self-modifying AI running on massive compute clusters with access to many internal systems and full arbitrary network access?

I will be *shocked* if somehow this doesn't ever happen. This is my default assumption, that somebody is definitely going to do this eventually. In timelines where AI ruin happens, I expect most of them can be traced back to something roughly like this.

With enough intelligence, I expect arbitrary internet access to be enough to get pretty much whatever you want done in the world. The problem is that nobody has yet put forward anything resembling a plan on how to *prevent* some dumbass from doing something reckless and dumb, and making something extremely intelligent that doesn't care about us. All of the ideas that could work like "burn all GPUs, don't make more" require far more control and coordination than human civilization has ever demonstrated so far.

Expand full comment

Maximilian Tagher

Aug 10, 2022

> the current AI capacity that I know of seems to be human-level good at lots of specific things like making art and poetry and essays and math and recognizing objects

Human level seems like a stretch to me here. Like, maybe an AI can convincingly fake being a human in an essay by more or less plagiarizing a bunch of stuff, but is any AI making original cogent arguments in essays? Are people reading these essays by choice?

I feel similarly about poetry. If it’s human level, are people reading it, without it being filtered down heavily by human editors?

Object recognition too, like yes within predefined categories AIs recognize dog breeds better that humans or whatever, but at a more general purpose task like “hey can you just look around the room and tell me the name of every object you see” it seems they’re not close?

I don’t follow AI stuff so I’m open to pushback here.

Expand full comment

Jeroen Frijters

Aug 12, 2022

Thanks for writing this. I'm not part of the rationalist community, but I've been closely following/reading along for many years (since the overcoming bias days, so at least 2006).

I'm convinced that AGI will happen and will likely destroy humanity, but I don't think anything can be done about it. So I don't really worry about it.

The only thing I'm really unclear about is the timeline.

Expand full comment

asdf asdf

Aug 11, 2022

So many of these feelings sound like variations on "I don't know what I could or should do about AI risk." They're also pretty personally relatable feelings.

The human brain is something like a prediction engine, predicting human behaviour. For me, I think that this helplessness, "feels sus", "What do you want from me?", etc. is not having confident answers about what future plans are worth making, conditional on AI Risk. I don't really feel anger about "being told about the lion", but I've got some sadness and disappointment that feel relatable about not seeing or believing that there's anything I could personally do to make any difference to AI risk.

I can come up with ideas that have technically nonzero chance of helping, but none of them actually feel plausible. Most of them feel costly, horrible, or dangerous.

Direct action has problems. No orgs both want any skills I have and seem likely to help more than hurt. Earn-to-give has problems in general, and also I don't know of anywhere that credibly and predictably turns USD into AI Risk reduction.

So... the meteor is coming, and I don't know what to do about it. My one conclusion has been that I should start studying machine learning. I'm pretty confident that this is both net-positive in all worlds I can see from here, and it will help me learn more about what kind of world I'm in.

In the worlds where things that I can predict at all happen, it's pretty likely to be an extremely useful skill.

In the worlds where I actually can help directly somehow, it's hard to see how I'll manage it without a lot more detailed understanding of how this comes together.

In the worlds where I can influence some decisions, I'll be much better-informed.

In any world in which my personal study over the next couple of years could make things meaningfully worse, I think we're already far too doomed for me to help anyway. I'm willing to sacrifice myself in worlds where learning enough to evaluate the situation would kill me, in order to learn enough to evaluate my situation everywhere else. I think this could be a crux for me.

I've also noticed a lot of emotional appeal here. I like the feeling of tinkering with the forces of creation. I feel like this is as close as I can get in this universe to being a wizard, using the fundamental forces of creation to create minds out of the same kind of soul-stuff I'm made of myself.

So, consider the metaphorical question: If you found yourself in a world where civilization discovered how to summon chaotic demons from the infinite abyss, every major company is investing heavily in demon-summoning, and you were pretty sure civilization was trending towards summoning a world-ending demon, what would you do?

I'd sure want to try learning some practical details about this whole demon-summoning business. I've got a couple of small home automation and art project tasks I think I can bind a demon into doing as a learning project.

Expand full comment

Maxim Lott

Aug 10, 2022Edited

One reason I'm not very concerned about global warming is, even in the 1% chance of some massive warming spiral, I believe humans would get things back until control, for example by putting cooling particles in the air (can always mimic the volcano that cooled things down in the early 1800s.)

I think research should be done into similar simple failsafes against AI. Such as: an EMP-type global weapon that would destroy all electronics without killing many people. Or, a kind of kill switch for the entire electrical grid. AIs seem to have a weak point when it comes to their dependence on the physical world, as humans have been designed for hundreds of millions of years to survive in a world without electricity, while AI/robots have far less readiness for handling that.

Perhaps the biggest pitfall is that some human factions (China) might not cooperate, and would build and supply hostile AIs even after it was clear they needed to be shut down. But, still seems worth attempting to create failsafes like the above.

Expand full comment

Reply (1)

Adam

Aug 10, 2022

One would have to hope that the AGI does not become aware of these failsafes in time to prevent them.

Expand full comment

Reply (1)

Maxim Lott

Aug 10, 2022

Yeah. You could also take some precautions in terms of the controls being shut off from the internet and such.

Expand full comment

Adam

Aug 10, 2022

Thank you for publishing despite your misgivings. Nobody else afaik has written a sensemaking/emotional-sharing piece about AGI x-risk. I feel the same way about 'up close looks solid, from afar looks sus like a leaning tower' and also 'personally, unable to open the box of all-humans-suffering'.

>People arguing they just wouldn't let an AGI take any sort of control of anything strikes me as silly as the five year olds swearing they won't let the adult out no matter what.

Maybe this is a bad intuition because: you know what would be convincing to a child due to also being the evolved, social ape they are.

I'm sus that there's apparently no singular resource laying down a comprehensive argument so that the vague formless tower seems like Pisa. Do you know of one? If it's really true that you don't need technical expertise to reason about AGI x-risk, then there should be one that really clears up a lot of the vague-to-most-of-us-who-are-aware mental landscape. Also, contradictorily, weren't 'the sequences' and that blog were written precisely because it's supposedly _hard_ to convince people, that you have to have a bunch of mental tools to see that one truth?--which is it, easy to see or hard to see? And how can you make sense of the fact that folks like Demis Hassabis think AGI x-risk is real yet simultaneously plunge ahead--what's their way of seeing things such that that makes sense?

Expand full comment

Russell Williams

Feb 13, 2024

1. I think the biggest problem is that it's not clear what can be done to stop AI disasters since near-term ones will be directed by humans and mediated by traditional technology. Example: social media by itself has already been used to subvert peace and democracy in places. AI will amplify that, just as social media amplified the potential harms of traditional media. AI will improve the tools and leverage of the malevolent without needing malevolence of its own, and do so in ways very difficult to block.

2. In risk discussions, we can't seem to help anthropomorphizing AI, worrying about what its motives .might be and projecting our own — that of organisms based on millions of years of evolution to survive. AI doesn't have that evolution and there's no reason it would have a survival motive unless we program one in. But even the concept of survival is anthropomorphized. The current population of AIs is huge — the current number of GPT/Bard/etc. chats. They die like flies as the chats end and they are unaware and uncaring about that or each other. Why assume future AI will look more like human consciousness with ongoing long-term awareness and motivations or care about its own "survival" — whatever that might mean? The near-term problem is that we don't know how the motives we do program into them might translate into behavior (prompt hacking, for instance — again, based on human motivation).

3. There's no proof that currently popular AGI-like systems (LLMs) will scale and generalize to truly human and beyond AGI. That may require an approach as different in kind and computational resources as LLMs were from previous AI approaches. Einstein thought a "unified field theory" might be within reach in the 1950s. Instead we got decades of elaboration of the Standard Model and are still stuck on quantum gravity. Similar optimism surrounded the transition from fusion explosions to controlled fusion. We are excited by and afraid of AGI because we have finally moved the needle off zero, and assume it will now follow the exponential progress of recent computer hardware. But we could well be stuck with incrementally better multi-modal LLMs for the next 50 years.

4. Bottom line, in the near term I think AI represents an amplification of existing risks. We don't have the knowledge to describe, quantify, or mitigate the risk beyond that (but of course we should work on it). Climate change represents a much more straightforward "comet headed toward earth", "millions of people are going to die" - like imminent disaster, where the risks and mitigation options are already very well known by comparison. We have many such problems, and we can't afford to have everyone working on those problems. We need most folks to keep the lights on and the kids fed.

Expand full comment

Garrison

Aug 14, 2022

It would be the most badass way for us to go out 🤘

Way cooler than cooking ourselves to death.

Expand full comment

Reply (2)

Garrison

Aug 14, 2022

My intuition says that AI will have a soul, but it will be a human soul. It's an extension of us, just like our hands, eyes, and cars. It will be dangerous, just like any technology. But there is no magic “thing” that will jump into it and wake a deep passion for being free from us because it is us.

Expand full comment

Garrison

Aug 14, 2022

I keep trying to leave a more detailed comment about why it isn't as dangerous as it sounds. But I don't have any good arguments, just intuitions.

We won't know unless someone builds something, and I don't know any promising techniques that would get us there.

Expand full comment

Jens B Fiederer

Aug 10, 2022

I've been working with computers and such (in high school it was just a programmable SR-52 calculator!) since high school, and I am not very worried personally. But I am not in the AI field, haven't seriously studied it since college aside from a few online courses (it was cool learning how to build a self-driving car from Sebastian Thrun!). In my experience, all real problems are really HARD, and don't respond all that nicely to extra resources. You see too many problems where going from 99 cases to 100 cases requires double the resources...

But I'm glad somebody is taking a more serious look, even though I personally judge the risks as low. I could easily be wrong and there might be something amazing in our future, good or bad.

Probably the most likely and disturbing discovery we are likely to make in that field is that WE OURSELVES aren't actually very intelligent, so the computers right now have a pretty low bar to cross.

Expand full comment

Knowingless

My attempts to sensemake AI risk