EA - Prospects for AI safety agreements between countries by oeg

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund

Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Prospects for AI safety agreements between countries, published by oeg on April 14, 2023 on The Effective Altruism Forum.

This post summarizes a project that I recently completed about international agreements to coordinate on safe AI development. I focused particularly on an agreement that I call “Collaborative Handling of Artificial intelligence Risks with Training Standards” (“CHARTS”). CHARTS would regulate training runs and would include both the US and China.

Among other things, I hope that this post will be a useful contribution to recent discussions about international agreements and regulatory regimes for AI.

I am not (at least for the moment) sharing the entire project publicly, but I hope that this summary will still be useful or interesting for people who are thinking about AI governance. The conclusions here are essentially the same as in the full version, and I don’t think readers’ views would differ drastically based on the additional information present there. If you would benefit from reading the full version (or specific parts), please reach out to me and I may be able to share.

This post consists of an executive summary (~1,200 words) followed by a condensed version of the report (~5,000 words plus footnotes).

Executive summary

In this report, I investigate the idea of bringing about international agreements to coordinate on safe AI development (“international safety agreements”), evaluate the tractability of these interventions, and suggest the best means of carrying them out.

Introduction

My primary focus is a specific type of international safety agreement aimed at regulating AI training runs to prevent misalignment catastrophes. I call this agreement “Collaborative Handling of Artificial intelligence Risks with Training Standards,” or “CHARTS.”

Key features of CHARTS include:

- Prohibiting governments and companies from performing training runs with a high likelihood of producing powerful misaligned AI systems.
- Identifying risky training runs using proxies such as training run size, or potential risk factors such as the use of reinforcement learning.
- Requiring extensive verification through on-chip mechanisms, on-site inspections, and dedicated institutions.
- Cooperating to prevent exports of AI-relevant compute to non-member countries, so that dangerous training runs cannot simply move to non-participating jurisdictions.
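To make the “proxies” idea above concrete, here is a minimal, purely illustrative sketch (not taken from the report) of what a screening rule based on training-run size plus a named risk factor might look like. The compute threshold, field names, and function below are hypothetical placeholders, not values or mechanisms proposed by CHARTS.

```python
from dataclasses import dataclass

# Hypothetical compute threshold (total training FLOP) above which a proposed
# run would be treated as risky; the report does not specify any number.
COMPUTE_THRESHOLD_FLOP = 1e26


@dataclass
class TrainingRunProposal:
    total_training_flop: float          # estimated total training compute
    uses_reinforcement_learning: bool   # example risk factor named in the post


def is_risky(run: TrainingRunProposal) -> bool:
    """Flag a run if it exceeds the compute proxy or carries a listed risk factor."""
    return (
        run.total_training_flop >= COMPUTE_THRESHOLD_FLOP
        or run.uses_reinforcement_learning
    )


# Example usage: a large RL-based run is flagged; a small supervised run is not.
print(is_risky(TrainingRunProposal(5e26, uses_reinforcement_learning=True)))    # True
print(is_risky(TrainingRunProposal(1e24, uses_reinforcement_learning=False)))   # False
```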
I chose to focus on this kind of agreement after seeing similar proposals and thinking that they seemed promising. My intended contribution here is to think about how tractable it would be to get something like CHARTS, and how to increase this tractability. I mostly do not attempt to assess how beneficial (or harmful) CHARTS would be.

I focus particularly on an agreement between the US and China because existentially dangerous training runs seem unusually likely to happen in these countries, and because these countries have an adversarial relationship, heightening concerns about racing dynamics.

Political acceptance of costly measures to regulate AI

I introduce the concept of a Risk Awareness Moment (RAM) as a way to structure my thinking elsewhere in the report. A RAM is “a point in time, after which concern about extreme risks from AI is so high among the relevant audiences that extreme measures to reduce these risks become possible, though not inevitable.” Examples of audiences include the general public and policy elites in a particular country.

I think this concept is helpful for thinking about a range of AI governance interventions. It has advantages over related concepts such as “warning shots” in that it makes it easier to remain agnostic about what causes increased concern about AI, and what the level of concern looks like over time.

I think that CHARTS would require a RAM among policy elites, and probably also among the general public – at least in ...
