When You Don't Know the Optimal Answer, the Marginal Answer Might Still Suffice
Making decisions under uncertainty
Let’s say you’re a fine art collector, and you want to know whether a particular painting is worth buying at a price of $5 million. Ideally, you would run your own independent calculation of the artwork’s value and then check whether that value is more or less than the $5 million price tag. If you estimate that the painting is really worth $8 million, for instance, then you should buy it at $5 million. In other words, you’re trying to answer the question, “What is this painting worth?” The problem is that this can be quite difficult. You might not know the artist very well, or what condition the painting is in, or what the resale market for this painting is like. You may not be able to form an accurate opinion of the painting’s total value.
The good news is that you don’t necessarily need to do that. Instead of asking, “What is this painting worth?” you can ask the slightly easier question, “Is this painting worth more or less than $5 million?” If you have reason to believe that the market is overvaluing the painting (e.g. because you suspect it’s a forgery), then you should not buy. If you have reason to believe that the market is undervaluing the painting (e.g. because the seller is out-of-touch and doesn’t realize how popular the artist is), then you should buy. Whether the painting is really worth $6 million or $10 million or $100 million is, in this moment, irrelevant to you. All that you need to know is that it’s worth more than $5 million, which is enough to justify buying.
Quantitative traders think like this all the time. Traders don’t always try to calculate the total value of the items they’re trading; such a calculation would be error-prone, not to mention time-consuming in an environment where they may have only minutes to make a trade. Instead, traders often think marginally. For a trader, the most relevant question is always, “Should I buy or sell at the current market price?” Let’s say that you have advance knowledge of a company scandal. You don’t know exactly how much this scandal will damage the company, but clearly it won’t be good. You have good reason to think that the company’s stock 24 hours from now will be at least slightly lower than it is today. So, on the margins, you should quickly sell stock in that company before the rest of the market finds out about the scandal.
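To make the contrast concrete, here’s a toy sketch in Python. Everything in it is a made-up illustration (the fair-value estimate, the price, the direction signal), not a real trading strategy; the point is only to show how much less information the marginal question demands.

```python
# Toy sketch: total valuation vs. marginal thinking, using the scandal example.
# All numbers and signals are hypothetical illustrations.

def total_approach(estimated_fair_value: float, market_price: float) -> str:
    """The hard question: estimate the absolute fair value, then compare it to the price."""
    return "sell" if estimated_fair_value < market_price else "buy"

def marginal_approach(expected_direction: int) -> str:
    """The easier question: which way will the price move from here?
    -1 = down (e.g. an unpriced scandal), +1 = up. The magnitude is irrelevant."""
    return "sell" if expected_direction < 0 else "buy"

# The total approach needs a number you probably can't estimate accurately...
print(total_approach(estimated_fair_value=87.50, market_price=100.0))  # "sell", but is 87.50 right?

# ...while the marginal approach needs only a direction you have real evidence for.
print(marginal_approach(expected_direction=-1))  # "sell": the scandal means "lower than today"
```

Both functions reach the same conclusion here, but the first requires a precise estimate you’ll rarely have, while the second requires only a sign backed by evidence.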
In other words, it’s useful to cultivate your marginal thinking. When it comes to making decisions under high time pressure or uncertainty, it won’t always be possible or even desirable to make a complete expected value calculation. Sometimes, the best way to make a good decision quickly is to rely on heuristics: Is this thing overpriced or underpriced by the market? Should I be doing more or less of this activity than I’m doing now? Should I be spending more or less on something than I am now?
This insight can also apply to the policy world. While politics doesn’t work quite as cleanly as an open market, many of the insights from markets still apply. Some policy goals trade off against other policy goals (e.g. the goal of expanding welfare spending trades off against the goal of reducing the national debt), and so politicians and legal systems have to decide which goals matter most. For any political goal, they can set an informal “value” on that goal. This value reflects how much political capital they’re willing to expend to promote that goal, as well as how many other goals they’re willing to sacrifice in its service.
Just like in the market, it can be difficult to ascertain the optimal value of a political goal (e.g. how much should we be willing to spend to live in a crime-free city?). But it might be easier to assess the marginal value (e.g. should our city be spending more or fewer resources on crime prevention than it’s spending today?).
When it comes to any policy goal, it’s worth asking: Is this goal overvalued or undervalued in the policy world? Should policymakers care more or less about this goal? Should laws do more or less to promote this goal, even if it comes at the expense of other goals?
One policy goal that I care deeply about is AI existential risk reduction. I believe that we are likely to witness the advent of superhuman artificial intelligence within the next decade, and I believe that when this happens, there is a non-trivial chance that this superintelligence will drive humanity extinct. I therefore think it is important for both government officials and private sector leaders to act to mitigate this risk. Of course, I recognize that there are tradeoffs involved. Promoting AI safety could involve more stringent AI regulations, which would harm the other noble goal of quickly harnessing AI’s benefits.
What is the optimal value of AI safety? Let’s imagine you have a button in front of you labeled “CREATE ASI NOW”. If you press this button, you will unleash an artificial superintelligence on the world. This will have a 95% chance of ushering in a techno-utopia where humanity lives in bliss and harmony, and a 5% chance of ushering in an apocalypse where all of humanity dies. Do you push the button? What if the odds of extinction were 10% instead of 5%? What if they were 30%? 50%? 80%?
Or how about this: Let’s imagine you have a button in front of you labeled “PAUSE ASI”. The ASI will be created no matter what, but if you push the button it will be delayed for 10 years, which will give AI safety researchers more time to do their work. So the choice becomes (ASI now with a 5% extinction risk) versus (ASI in 10 years with a 1% extinction risk). Do you push the button then?
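One way to see why these questions are so hard: consider a naive expected-utility framing. This is only a sketch, and every utility number in it (utopia normalized to +1, a delay cost of 0.1, the candidate extinction utilities) is a hypothetical placeholder rather than a value I endorse.

```python
# Naive expected-utility sketch of the two buttons. All utility values are
# hypothetical placeholders; the point is how sensitive the answer is to them.

def expected_utility(p_extinction: float, u_utopia: float, u_extinction: float) -> float:
    """Expected utility of creating ASI with a given extinction risk."""
    return (1 - p_extinction) * u_utopia + p_extinction * u_extinction

# "CREATE ASI NOW": 5% extinction risk, full utopia utility (+1.0).
# "PAUSE ASI":      1% extinction risk, utopia utility reduced to 0.9 by the 10-year delay.
for u_extinction in [-1, -10, -100, -1000]:
    now = expected_utility(0.05, u_utopia=1.0, u_extinction=u_extinction)
    paused = expected_utility(0.01, u_utopia=0.9, u_extinction=u_extinction)
    choice = "pause" if paused > now else "create now"
    print(f"extinction utility {u_extinction:>6}: now={now:+.3f}, paused={paused:+.3f} -> {choice}")
```

Under these made-up numbers, the choice flips somewhere between an extinction utility of -1 and -10, and nothing in the framework tells you which number is right. The whole decision hinges on exactly the parameter I can’t estimate.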
I don’t know how to answer these questions. I struggle to find a satisfying framework with which to assess the relative values of AI utopia versus AI extinction. Maybe I’ll develop my thoughts on this subject more in the future, but for the time being, I’m at an intellectual impasse.
Fortunately, I don’t actually need to know the optimal value of AI safety in order to make policy decisions. I just need to know the marginal value of AI safety. On the margins, should policymakers value AI safety more or less than they do today? I think the answer is pretty clearly “more”. Most politicians don’t even seem to grasp that AI existential risks are something to be taken seriously. Most of the public is only dimly aware of the issue. As I’ve detailed before, there are still lots of Pareto improvements left to make in the AI policy world: lots of low-hanging fruit that would improve AI safety without significantly harming other political goals. These are basic asks like “each leading AI developer should publish a safety and security protocol” and “there should be legal and financial protections for AI whistleblowers”.
Is it possible that the government could go too far and do too much to support AI safety? Maybe. If, at some distant point in the future, I start to believe that the government is doing too much to mitigate AI existential risks, then I’ll gladly change my tune. But that point seems extremely far away from where we are today, so for the time being, I’m on the side of more AI safety.
I can’t tell you the optimal value of AI safety as a political goal. But I am fairly confident that on the margins, policymakers should value AI safety more than they do today. We need more people promoting responsible AI policy and existential risk mitigation, and I am glad to lend my voice to that coalition.