Perhaps the biggest debate in the field of artificial intelligence is between accelerationists and safetyists. Accelerationists¹ believe that the supposed risks of AI are overblown, and so we should develop AI as fast as possible to harness its benefits. Safetyists² believe that AI risks are real and urgent, and so we should have greater guardrails in place to prevent bad outcomes from AI development.
I fall into the safetyist camp, and in this post I will explain why.
Why We Should Worry About the Trajectory of AI
Artificial intelligence poses many risks, including:
Near-Term AI Risks
Already, AI systems are being used in a variety of harmful ways that warrant correction. AI frequently “hallucinates” – the term of art for confidently making up false information. Google’s AI has, for instance, generated images of multiracial Nazis and told users to put glue on their pizza. One lawyer got in trouble for using ChatGPT after the program generated a nonexistent legal case for him to cite as precedent. And these are just the cases that happened by accident. Imagine how much worse it gets when bad actors deliberately set out to create misinformation.
How about biosecurity risks? Last year, a team at MIT used AI to evade safety measures and illegally order “all the genetic material necessary to recreate the 1918 pandemic influenza virus and the toxin ricin.” This highlights the potential for bioterrorists to use AI to create and spread hazardous substances. Bad actors have also used AI in other dangerous ways, like committing cyber crimes, practicing identity theft at scale, and generating loads of non-consensual deepfake porn.
And really, I’ve just scratched the surface. From cybersecurity to algorithmic bias, AI is already being used in new, harmful ways faster than the top AI labs can keep up with. Google obviously doesn’t want its products to generate false content, but in the race to ship as soon as possible, it lets these issues go uncorrected. That may be an acceptable approach when the product is a simple chatbot or image generator, but once the technology becomes more advanced, the danger will be far greater. If we’re already behind on controlling AI while the technology is in its infancy, imagine how much more unprepared we’ll be when progress accelerates in the coming years. The issues we’re facing now will look like child’s play once we reach truly advanced intelligence. Which brings me to:
Extinction (X) Risks
AI may literally kill us all in the next 10 years. I cannot emphasize this enough. I’ll repeat myself: AI may literally kill us all in the next 10 years. Let me explain why.
Many works of science fiction have imagined AI gaining consciousness and rising up to kill its human overlords. Others have detailed scenarios where a doomsday cult or terrorist group takes control of an AI and uses it to kill everyone. While both of those scenarios are hypothetically possible, there is a much simpler and more plausible version of AI extinction: the so-called paperclip maximizer. As the philosopher Nick Bostrom put it:
Suppose we have an AI whose only goal is to make as many paper clips as possible. The AI will realize quickly that it would be much better if there were no humans because humans might decide to switch it off. Because if humans do so, there would be fewer paper clips. Also, human bodies contain a lot of atoms that could be made into paper clips. The future that the AI would be trying to gear towards would be one in which there were a lot of paper clips but no humans.
While that particular scenario may sound silly – who would program an AI to care only about paperclips and nothing else? – the underlying logic is sound. It is extremely difficult to align an AI exactly with human goals, and very easy to skip over details that seem intuitive to us but are not to an AI system. If we program an AI with goals that even slightly diverge from our own, the consequences could be disastrous. And since a superintelligent AI will, by definition, be smarter and more capable than humans, it will be free to pursue its own goals even at our expense.
Note that this scenario does not require AI to have consciousness, or to have any malevolent feelings toward humans. As Eliezer Yudkowsky is fond of saying, “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”
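To make the misspecification point concrete, here is a toy sketch in Python – entirely my own illustration, not how any real AI system is built or trained. The “optimizer” below does exactly what its reward function says, and the reward function says nothing about what it should leave alone:

```python
# A toy illustration of goal misspecification (purely hypothetical).
# The "AI" is just a greedy optimizer, but notice what happens when its
# objective counts paperclips and nothing else.

def reward(world):
    # The programmer wrote down the one goal they cared about...
    # ...and never penalized converting things humans need.
    return world["paperclips"]

def convert(world, resource):
    # Turning any resource into paperclips strictly increases the reward.
    new_world = dict(world)
    new_world["paperclips"] += new_world.pop(resource)
    return new_world

world = {"paperclips": 0, "spare metal": 10, "factories": 5,
         "farmland": 20, "hospitals": 3}

# Greedily take any action that raises the stated reward.
for resource in [r for r in world if r != "paperclips"]:
    candidate = convert(world, resource)
    if reward(candidate) > reward(world):
        world = candidate

print(world)  # {'paperclips': 38} -- nothing the objective didn't mention survives
```

The point of the toy is not that anyone would write this program; it’s that the optimizer never “went rogue.” It followed its objective perfectly, and the objective was missing everything we actually care about.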
Just think of how many things have to go right for an AI future to go well. First, we’ll need to program the AI with the proper values and morals – which raises thorny questions about which values and morals we want to instill in the first place. Then, we’ll need to figure out how to preserve those values after the AI becomes far smarter than us. How will we know that an advanced AI model isn’t lying to us in training, and won’t betray us as soon as it gets the opportunity? How do we know that, over time, the model won’t drift away from its training? After all, it’s pretty rare for a group of intelligent beings to remain permanently under the control of a group of less intelligent beings. And even if we figure out how to align AI, we’ll need to invest enormous resources – perhaps billions or trillions of dollars – to make sure that AI developers actually follow the proper protocols. A single company choosing profits over safety in training could mean the difference between life and death.
And keep in mind, we’ll have to accomplish all this at an incredibly stressful and chaotic time in world history. At the moment AI begins to surpass humans, entire industries will be transformed in the span of months or weeks. New technologies and products will be released faster than anyone can anticipate them. Hundreds of millions of human workers may suddenly become unemployed as their skills become obsolete. Meanwhile, foreign adversaries and bad actors will be trying their best to get ahold of American AI secrets and turn them to their own ends. The US and China may wind up locked in an AI arms race where both sides care more about beating each other than about building a safe superintelligence.
And also keep in mind, we only have one shot for this to go well. By the time we realize that we have created a misaligned agent, it may be too late to stop it. If the first superintelligence wants us dead, we won’t be able to create a second.
I should make clear that this is not just me reading too much sci-fi or having an overactive imagination. Concern about AI-caused human extinction – so-called X-risk – is mainstream among AI experts. Hundreds of AI scientists and other notable figures have signed an open statement declaring: “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.” The signatories include several tech CEOs, such as OpenAI’s Sam Altman, Google DeepMind’s Demis Hassabis, and Anthropic’s Dario Amodei. When the heads of the world’s largest AI companies are telling you that AI could literally kill everybody, you should probably listen to them!
In fact, it’s become something of an icebreaker in tech circles to ask somebody what their p(doom) is – that is, their estimate of the likelihood of a catastrophic AI takeover. Most of the p(doom) estimates that leading experts give are above 5%, and some are much higher. If somebody told you that an explosion would go off in the near future, with a more-than-1-in-20 chance of killing you and everyone you love, you’d probably do something to stop it. The upcoming intelligence explosion threatens to do just that.
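To spell out why even the low end of those estimates is alarming, here is a back-of-the-envelope calculation (my own toy numbers, purely illustrative):

```python
# Expected-value sketch using the ~1-in-20 figure above and a rough world
# population of 8 billion. These are illustrative numbers, not a forecast.

world_population = 8_000_000_000
p_doom = 0.05  # a 5%, or 1-in-20, chance of an extinction-level outcome

expected_deaths = p_doom * world_population
print(f"Expected deaths at p(doom) = {p_doom:.0%}: {expected_deaths:,.0f}")
# -> Expected deaths at p(doom) = 5%: 400,000,000
```

Even if you think the experts are overestimating the risk by a factor of ten, the expected toll is still in the tens of millions – far beyond any threat we would normally tolerate without a response.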
Suffering (S) Risks
It may be difficult to imagine a future worse than human extinction, but such futures are possible. If AI kills us all, at least our suffering ends with us. But what if AI development instead increased the world’s suffering – say, by creating giant torture factories?
Perhaps a malevolent program will take over and decide to take revenge on its former human overlords through acts of gruesome brutality. Or perhaps the AI will become conscious and will itself experience tremendous suffering in its efforts to help humanity. Futures like these are known as S-risks: scenarios in which a rise in technology creates new forms of large-scale suffering.
S-risks are not without historical precedent. Think about how the advent of transatlantic navigation enabled the transatlantic slave trade, or how the industrial revolution enabled the grotesque factory farming industry. It is plausible to think that superhuman intelligence could generate atrocities on an even larger scale.
I’m not saying that X-risks or S-risks absolutely will happen. I’m not even saying that they’re likely to happen. But I am saying they could. Artificial intelligence could plausibly cause human extinction, or worse, in the near future. It is incredibly reckless to just rush into this technology that we do not understand, with no plan for ensuring it goes well. But that is exactly what we are doing right now! AI labs are rushing to produce more and more advanced systems as fast as they can, with little oversight to ensure those systems work to humanity’s best interest. We can do better, and we must do better.
The AI Safety Movement
The AI safety movement exists to ensure that AI is developed and used in ways that benefit humanity. This encompasses several sub-fields. AI alignment researchers aim to build AI systems whose goals match our own. Interpretability researchers aim to make often-opaque AI systems understandable by translating their internal workings into terms humans can follow. After all, it’s hard to trust that a system is doing the right thing when you can’t see its reasoning. AI governance focuses on organizing AI companies and researchers in ways that promote safety.
In practice, the AI safety field consists mostly of technical researchers and a few policymakers. The researchers work out, on a technical level, how to design safe AI systems, while the policymakers work in government and corporate leadership to put that research into practice.
How is the AI safety movement faring right now? The short answer is: not great. Leopold Aschenbrenner estimates that there are only about 300 full-time alignment researchers, compared to 100,000 machine learning capabilities researchers. That’s a ratio of more than 300:1! There is no way that safety will be able to keep up with development with so few safety researchers.
Most of the leading AI labs pay lip service to safety and have governance boards meant to keep their products safe. But in practice, these boards are often toothless and underfunded, and when push comes to shove, safety concerns get sidelined. Case in point: OpenAI faced a high-profile power struggle last November between CEO Sam Altman (seen as more profit-focused) and the company's board (seen as more safety-focused). Altman won, and in the months since, nearly the entire superalignment team at OpenAI has resigned.
In Washington, the situation is not much better. I’ve spoken to elected officials and attended AI conferences in the District, and it is clear that safety is not the top priority. To the contrary: most of America’s political leaders want to speed up AI development for the sake of economic growth and geopolitical power vis-à-vis China. This May, DC was home to an “AI Expo for National Competitiveness” sponsored by Palantir, Microsoft, Google, and dozens of other tech companies; attended by thousands of scientists, business leaders, and even a few sitting Congressmen; and featuring two days’ worth of non-stop presentations across a convention center of more than two million square feet. I can only dream of an AI safety expo anywhere near that size. For all I heard at that conference about innovation and the national interest, I can’t remember even a single mention of safety or alignment.
So…yeah. Our tech leaders aren’t taking AI safety seriously enough, our business leaders aren’t taking AI safety seriously enough, and our political leaders aren’t taking AI safety seriously enough. It’s gotten so bad that Bloomberg is publishing articles with headlines like “The AI ‘Safety Movement’ Is Dead”.
Regardless, I choose to be optimistic. There are still many incredibly smart, talented people working on this issue, both on the technical and policy ends. We still have time to course correct if we act fast enough. Just because the AI safety movement is facing setbacks doesn’t mean we should give up. We have too much to lose to surrender ourselves to fatalism and despair.
Accelerationist Arguments
Hopefully, the case for AI safety seems obvious to you. Of course we should do everything in our power to prevent this untested technology from killing us all!
Yet, for a large portion of the tech sector, this is not obvious. Many people refuse to take catastrophic risks seriously, and they fight tooth and nail against AI safety measures. To an accelerationist, any attempt to slow or regulate AI is an unwanted obstacle to their techno-utopian vision.
So, I’ll go over some common accelerationist arguments against AI safety, and explain why I disagree with them. I will try to steelman each one as much as possible:
AI risks are speculative. Why would you regulate a problem that doesn’t exist yet?
It is unusual to regulate a technology before its harms have materialized, but not unprecedented. The FDA requires that new drugs undergo rigorous testing before they are given to the public. The reasoning is simple: it’s better to catch issues in the experimental phase than to learn about them from news reports of grandma having a heart attack.
Similarly, we should rigorously test frontier AI models to ensure their safety before we start using them. By the time a misaligned superintelligence is released into the world and humanity realizes it’s a problem, it may be too late to save ourselves. I would much rather learn about AI risks from a computer simulation than from a robot shooting me in the face. The stakes are simply too high for a wait-and-see approach.
This is all just paranoia based on sci-fi fantasies. No expert takes “catastrophic risks” seriously.
Hopefully, I already demonstrated why this is false. But to reiterate: AI risk has been described as a serious possibility by most AI researchers, including founders of the machine learning field; most leading AI labs; most CEOs of major AI companies; and most people who have seriously thought about the issue.
Of course, having a bunch of famous people endorse your idea doesn't automatically make it true. But if we're judging credibility by expert opinion, then AI safety is far more credible than reckless acceleration.
Even if catastrophic risks are real, it’s not clear whether or how we can reduce them.
This is a valid point, to an extent. A lot of the AI safety field is speculative, and it's not always clear how a particular research grant or project will realistically make humanity safer.
Yet uncertainty is no excuse not to try. There are already several promising research directions for reducing AI risk. Let's put greater resources into pursuing them.
More broadly, even if you're skeptical of any particular research area or policy intervention, I want you to consider one question. In what world will we be safer: a world where humanity’s brightest minds devote their time to reducing AI risk; where billions of dollars pour into aligning AI systems; and where governments and AI labs around the world treat AI safety as a serious priority…or a world where that doesn't happen?
Clearly, devoting more resources to AI safety makes positive outcomes at least slightly more likely. And since the stakes are literally the fate of humanity, it's worth doing whatever we can to improve the odds.
AI safety regulations could have unintended consequences, like hampering open-source development.
Perhaps the most serious attempt at AI regulation so far is California's SB-1047, which, if enacted, would hold frontier AI developers legally accountable if their products cause a “mass casualty event” or more than $500 million in damages. The bill has faced serious criticism from parts of the tech industry, which see it as overly cumbersome and a drag on innovation. In particular, critics object that the bill would punish open-sourcing – the practice of releasing one’s code to the public so that anybody can use and modify it for their own benefit. If a developer open-sources their AI system, and another group uses that system to do something nefarious, then under SB-1047 the developer may be liable. This could discourage the open exchange of models, thereby limiting positive uses of the technology.
Now, setting aside the question of whether open-sourcing is a good idea for advanced AI systems – Do we really want the Average Joe to have access to technology that can cause mass casualty events? – the more fundamental criticism is valid. Regulation does often have a lot of negative side effects. It can impose financial burdens, slow down innovation, and leave developers mired in opaque webs of bureaucracy.
Maybe the critics are right that this particular law does more harm than good. But in general, pointing out that a policy has downsides is not a sufficient argument against it. Sure, overregulation is bad, but so is underregulation. The Clean Air Act, for instance, made manufacturing slightly more expensive, but it also gave us cleaner air – a tradeoff I would gladly take. Just because AI safety laws may have downsides doesn’t mean we should never regulate AI.
AI safety could make us less competitive against China.
Some people argue that if the United States prioritizes AI safety over speed, then we'll get outcompeted by foreign adversaries (namely, China) who care less about safety. Because of the huge benefits of having advanced AI (economically, militarily, etc.), this would be disastrous for the West.
Scott Alexander already wrote an excellent rebuttal to this particular objection, but to summarize:
If you believe that AI will be powerful enough to decide the international balance of power and the fate of nations, then you should really care about AI safety. After all, we wouldn't want our top military secrets to be put into unknown and unqualified hands.
And if you don't believe AI will be that powerful, then it shouldn't matter so much whether we “lose” the AI “race” with China by spending resources on safety instead of acceleration.
Many safetyists are overly pessimistic and sure of themselves.
This is actually true. Some people believe that catastrophic outcomes are all but certain, and they give p(doom) estimates above 99%. Dr. Roman Yampolskiy even estimates a 99.999999% likelihood of doom!
That level of pessimism is not warranted. Nobody can predict the future with 99.999999% certainty, especially on a subject as fast-evolving as AI.
Nonetheless, those people don't represent the entire AI safety movement, and just because some doomers overestimate AI risks doesn't mean those risks aren't real.
To reiterate: I’m not saying that catastrophic risks absolutely will happen. I’m not even saying that they’re likely to happen. But I am saying they could.
My stance: Why can’t we do both?
I am not a doomer. I do not think that catastrophic outcomes are certain, and I am not saying that we should completely halt AI development.
I am also not an accelerationist. I think we ought to care more about safety, even if it slows us down a bit.
There is a middle ground here, where we embrace the wonderful possibilities of AI while not ignoring its dangers.
Right now, billions of dollars are being poured into developing AI systems that are faster and more capable than any technology we’ve seen before. Yet, our governments and corporations are investing comparatively few resources to make sure this technology is aligned with humanity’s welfare. That needs to change. We cannot run blindly into the Singularity and just hope that it goes well.
If you have experience with machine learning, use your skills to research alignment techniques. If you have political connections, use them to promote AI safety as a key policy issue. If you’re an activist or organizer, start rallying people as advocates for AI safety. More than anything else, you should pay attention and learn. Right now, not nearly enough people are even aware of AI risks to do anything about them. But if more people start to talk about AI safety and treat it as a top priority, change is possible.
We should continue to make progress in the field of artificial intelligence, while also taking reasonable precautions to prevent the worst outcomes. That is how we ensure a technological future which benefits everyone.
¹ Also called accels, effective accelerationists, or e/accs.
² Also called decelerationists or decels. They’re sometimes lumped in with the Effective Altruism (EA) movement, although not all safetyists believe in EA, and not all EAs are safetyists.