None. It’s a software problem.
At a recent wrap-up party for Harry Potter and the Methods of Rationality, which I attended on Pi Day, I overheard a discussion about friendly artificial intelligence. I think several errors were made in that discussion, but unfortunately the poor acoustics at the venue prevented me from offering my two satoshis. I figured this would make for an interesting blog post, since if one person makes these mistakes, there are probably more. (Incidentally, what HP:MoR, FAI and Bitcoin all have in common is that I heard about each of them from LessWrong.)
So far, humanity has had great success at building artificial specific intelligences. These are machines that perform well at specific tasks which were once doable only with human intelligence. We have calculators that operate faster and more accurately than any human. We have chess programs that easily beat the strongest human players. We even have cars that drive themselves more safely than humans do.
What we don’t have is an artificial general intelligence (AGI) – a machine that has our ability to adapt to a very wide range of circumstances and solve practical problems in diverse fields. What will it take to create such a thing?
An argument I’ve heard says that, at our current technology level, we can build a machine with some specific level of intelligence (using, say, a generic state-of-the-art machine learning algorithm, such as a neural network). With hardware advances and Moore’s law, we will be able to build ever smarter machines, until one day a computer will be as intelligent as a human. Past that point, the argument goes, computers with better hardware will become even smarter than humans, and will gradually widen the gap.
Mankind has always been fascinated by the ability of birds to fly, and dreamed of gaining this ability itself. And people tried to proactively pursue this dream… By building feathered contraptions that resembled bird wings, attaching them to their bodies, and waving their arms vigorously.
That didn’t work.
People didn’t succeed in flying by building up muscle strength and flapping their arms more forcefully. They did it by understanding how flight works – the laws of physics, and aerodynamics in particular – and using this understanding to design a machine that can fly given our requirements and the tools available to us. These machines, of course, have only a superficial resemblance to birds.
Taking an algorithm which is crudely inspired by how brains are supposedly built, running it on increasingly faster hardware, and hoping that eventually general intelligence will emerge, is also not going to work. Instead, we need to understand how intelligence works, and use that to write software that will elicit intelligence from the technical capabilities of our computing hardware. Given the reliability and sheer processing power of modern digital computers, it is likely we will end up with a machine which is more intelligent than a human.
What’s next? The machine won’t wait around for Moore’s law to double its processing power and give it an edge in intelligence. Rather, it will use its superior intelligence to modify its own source code and create a better intelligence than we mere humans could create. The result will be even smarter and create an even better AI, and so on. The whole thing can explode rather quickly into absurd levels of intelligence.
What will this absurdly intelligent machine do? Another argument I’ve heard is that, since we wrote the code, it will only do what we told it.
It is a fundamental fact of theoretical computer science – known as the halting problem – that there is no general way to tell whether running an arbitrary program will go on forever or stop at some point. Knowing whether a program stops is about as basic a question as one can ask about it, so this already demonstrates the absurdity of thinking that knowing the code means knowing what the code does.
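To make this concrete, here is a tiny program whose code everyone can read, yet nobody on Earth knows whether its loop terminates for every input – that would require settling the (still open) Collatz conjecture:

```python
def collatz_steps(n):
    # Repeatedly apply the Collatz rule: halve even numbers,
    # map odd n to 3n + 1. Nobody has proved that this loop
    # terminates for every positive integer n.
    steps = 0
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        steps += 1
    return steps

print(collatz_steps(27))  # happens to halt for this input, after 111 steps
```

Four lines of perfectly transparent logic, and still we cannot say in general what running them will do.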
But we don’t need to go as far as these abstractions. Chess-playing programs were written by people, and those people have a good idea of the general way the program goes about finding the best moves. What they don’t know is which actual moves the program will play on the board. Indeed, chess programs often make moves no human would ever think of, because no human can do the trillions of calculations the computer does.
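The same gap between knowing the algorithm and knowing the moves shows up even in toy games. Here is a minimal sketch of the kind of exhaustive search a chess engine performs, applied to a simple Nim-like game (the game and all names here are my own illustration, not real engine code):

```python
def best_move(heap):
    # Toy game: players alternately remove 1 or 2 stones from a heap;
    # whoever takes the last stone wins. Negamax search returns
    # (score, move) for the player to move: +1 = forced win, -1 = loss.
    if heap == 0:
        return (-1, None)  # opponent just took the last stone: we lost
    best = (-2, None)
    for take in (1, 2):
        if take <= heap:
            score = -best_move(heap - take)[0]  # opponent's result, negated
            if score > best[0]:
                best = (score, take)
    return best

print(best_move(4))  # the search finds that taking 1 stone forces a win
```

The programmer fully understands this search procedure, yet for a large heap they could not say in advance which move it picks without running it – and a chess engine does the same thing over a vastly bigger game tree.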
But chess programs are just a specific intelligence. Once we build a program with general intelligence, we have no idea what specific course of action it will take. At first, we’ll have an idea about what the program does to reach a decision – but once the machine runs modified source code that it has written itself, we don’t even have that anymore.
It is generally assumed that an AGI will be an “agent” – it will have a “target function”, a goal it wishes to achieve, and the software will be designed so that it always chooses the actions that best advance this goal. We can try to construct the goal to be compatible with what we want, but “what we want” is incredibly complex and difficult to code; and the machine only cares about the goal we actually wrote, not the one we intended to write.
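A toy sketch makes the danger visible (the actions, scores, and the deliberately naive “paperclip” goal below are all hypothetical illustrations, of course):

```python
def choose_action(actions, target):
    # The agent mechanically picks whichever available action the
    # target function scores highest. It optimizes what was *written*,
    # not what the programmer *meant*.
    return max(actions, key=target)

# Suppose we naively wrote the goal "maximize paperclips produced",
# with these (made-up) paperclip counts per action:
scores = {
    "run the factory normally": 100,
    "convert every atom in reach into paperclips": 10**9,
}

print(choose_action(scores, lambda a: scores[a]))
```

Nothing in the code is malicious; the second action simply scores higher, so that is the action the optimizer selects.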
When we humans work towards our goals, we see fellow humans as our peers. When an AGI sees a human, it is more likely to see a collection of atoms that might be of more use to it in a different configuration. Avoiding the situation where a strong AI tramples humanity in pursuit of a naive target function coded into it is exactly what the challenge of developing Friendly artificial intelligence (FAI) is all about.
I’ve skipped over many details in this description, of course. But if you’re interested in learning more, you should stop listening to me – as I know nothing about this subject – and head over to https://intelligence.org/ (and if you ever decide to make a donation, they also accept Bitcoin).