EA - Two reasons we might be closer to solving alignment than it seems by Kat Woods

The Nonlinear Library: EA Forum - A podcast by The Nonlinear Fund



Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Two reasons we might be closer to solving alignment than it seems, published by Kat Woods on September 24, 2022 on The Effective Altruism Forum.

I was at an AI safety retreat recently and there seemed to be two categories of researchers:

1. Those who thought most AI safety research was useless
2. Those who thought all AI safety research was useless

This is a darkly humorous anecdote illustrating a larger pattern of intense pessimism I've noticed among a certain contingent of AI safety researchers.

I don't disagree with the more moderate version of this position. If things continue as they are, anywhere up to a 95% chance of doom seems defensible. What I disagree with is the degree of confidence. While we certainly shouldn't be confident that everything will turn out fine, we also shouldn't feel confident that it won't. This post could easily have been titled the same as Rob Bensinger's similar post: we shouldn't be maximally pessimistic about AI alignment.

The two main reasons for not being overly confident of doom are:

1. All of the arguments saying that it's hard to be confident that transformative AI (TAI) isn't just around the corner also apply to safety research progress.
2. It's still early days, and we've had about as much progress as you'd predict given that until recently only double-digit numbers of people were working on the problem.

The arguments that apply to TAI potentially being closer than we think also apply to alignment

It's really hard to predict research progress. In 'There's no fire alarm for artificial general intelligence', Eliezer Yudkowsky points out that historically, 'it is very often the case that key technological developments still seem decades away, five years before they show up' - even to scientists who are working directly on the problem. Wilbur Wright thought that heavier-than-air flight was fifty years away; two years later, he helped build the first heavier-than-air flyer. This is because it often feels the same when the technology is decades away and when it is a year away: in either case, you don't yet know how to solve the problem.

These arguments apply not only to TAI, but also to TAI alignment. Heavier-than-air flight felt like it was years away when it was actually around the corner. Similarly, researchers' sense that alignment is decades away - or even that it is impossible - is consistent with the possibility that we'll solve alignment next year.

AI safety researchers are more likely to be pessimistic about alignment than the general public because they are deeply embroiled in the weeds of the problem. They are viscerally aware, from firsthand experience, of the difficulty. They are the ones who have to feel the day-to-day confusion, frustration, and despair of bashing their heads against a problem and making inconsistent progress. But this is how it always feels to be on the cutting edge of research. If it felt easy and smooth, it wouldn't be the edge of our knowledge.

AI progress thus far has been highly discontinuous: there have been times of fast advancement interspersed with 'AI winters' where enthusiasm waned, and then several important advances in the last few months. The same could be true for AI safety - even if we're in a slump now, massive progress could be around the corner.
It's not surprising to see this little progress when we have so few people working on it

I understand why some people are in despair about the problem. Some have been working on alignment for decades and have still not figured it out. I can empathize. I've dedicated my life to trying to do good for the last twelve years and I'm still deeply uncertain whether I've even been net positive. It's hard to stay optimistic and motivated in that scenario. But let's take a step back: this is an extremely complex question, and we haven't attac...
