News Release

Governance frameworks should address the prospect of AI systems that cannot be safely tested

Reports and Proceedings

American Association for the Advancement of Science (AAAS)

In this Policy Forum, Michael Cohen and colleagues highlight the unique risks presented by a particular class of artificial intelligence (AI) systems: reinforcement learning (RL) agents that plan more effectively than humans over long horizons. “Giving [such] an advanced AI system the objective to maximize its reward and, at some point, withholding reward from it, strongly incentivizes the AI system to take humans out of the loop,” write Cohen and colleagues. This incentive also arises for long-term planning agents (LTPAs) more generally, say the authors, and in ways empirical testing is unlikely to cover. It is thus critical to address extinction risk from these systems, say Cohen et al., and this will require new forms of government intervention. Although governments have expressed some concern about existential risks from AI and taken promising first steps in the U.S. and U.K, in particular, regulatory proposals to date do not adequately address this particular class of risk – losing control of advanced LTPAs. Even empirical safety testing – the prevailing regulatory approach for AI – is likely to be either dangerous or uninformative, for a sufficiently capable LTPA, say the authors. Accordingly, Cohen and colleagues propose that developers not be permitted to build sufficiently capable LTPAs, and that the resources required to build them be subject to stringent controls. When it comes to determining how capable is “sufficiently capable,” for an LTPA,  the authors offer insight to guide regulators and policymakers. They note they do not believe that existing AI systems exhibit existentially dangerous capabilities, nor do they exhibit several of the capabilities mentioned in President Biden’s recent executive order on AI, “and it is very difficult to predict when they could.” The authors note that although their proposal for governing LTPAs fills an important gap, “further institutional mechanisms will likely be needed to mitigate the risks posed by advanced artificial agents.”


Disclaimer: AAAS and EurekAlert! are not responsible for the accuracy of news releases posted to EurekAlert! by contributing institutions or for the use of any information through the EurekAlert system.