
Artificial intelligence (AI) is one of the leading items of news coverage – along with lofty predictions and dire warnings. AI will either make the world imminently more efficient and productive, with vast benefits for the humans who share this earth – or it will take over that earth and destroy the human beings who inhabit it.
Human Compatible, by Stuart Russell, is a critical tool for those of us who struggle to understand the technology and, more important, the implications of its rapid development. This is a book written for laypeople – even though it is packed with technical information, the author is careful to illustrate these concepts with clear and concrete examples, enabling his readers to understand.
Russell’s book has three parts: an exploration of the idea of intelligence in humans and machines; a look at what progress and problems we can expect; and a new way to think about AI, to ensure that machines remain beneficial to and compatible with humans.
I confess I was anxious to read the sections that would calm my fears about AI, but I found the first chapters eye-opening. Here, in extensive detail, Russell seeks to find “a reasonable definition of intelligence” (32)* by tracing “the history of intelligence in humanity – how our concept of intelligence came about and how it came to be applied to machines.” “Then,” he writes, “we have a chance of coming up with a better definition of what counts as a good AI system” (13).
Next, he traces the development of intelligent machines from their 13th-century beginnings to contemporary research.
Russell makes clear the challenges of AI research: he explains that the “problem [of] what to do right now, at every instant in one’s life is so difficult that neither humans nor computers will ever come close to finding perfect solutions. The machine may be far more capable than we are, but it will still be far from perfectly rational” (39).
In part two, “A Look at Where We’re Going,” the author samples “what is coming down the pipe” (63), including current research and challenges of the AI ecosystem, self-driving cars, intelligent personal assistants, and smart homes and robots.
He also explores “conceptual breakthroughs to come” (78) and states that we are far from solving the difficulties of creating “a general-purpose, human-level AI.” He explains that “All scientific discoveries rely on layer upon layer of concepts that stretch back through time and human experience. We have only elementary success in the use of cumulative generation in machines” (86).
In a section entitled “Imagining a Superintelligent Machine,” Russell discusses the current state of research – which he criticizes for its “lack of imagination” (93) – and identifies both the possibilities and the limitations of such a machine. He describes the difficulty of equipping a machine to fully understand and emulate the complexities of the human being: “Machines,” he reminds his reader, “are not human, and at an intrinsic disadvantage when trying to model and predict one particular class of objects” (98). The class of objects? Humans.
Russell also investigates possible misuses of these machines, misuses which fuel the nightmare scenarios of machines taking over the human world. He looks at the use of AI by an autocratic government or corporation to “surveil, persuade and control” (103) its citizens, as well as the “modification of the information environment” (tracking our online activity, deepfakes, and bot armies). This misuse, Russell believes, violates our right to mental, as well as physical, security (107).
Another area he explores is the “elimination of work as we know it” (113). We have already witnessed this, as self-serve gas stations and retail check-outs have eliminated jobs, and Russell predicts that this substitution of AI workers for humans will continue. He examines both sides of the debate over whether the effects of AI will make up for the reduction in jobs, including economists’ prediction that humans will be better off because of the increased productivity (117). The job opportunities, he believes, will be “in supplying interpersonal services that can be provided only by humans. We can still provide our humanity; the capacity to inspire others and confer the ability to appreciate and create is likely to be more needed than ever” (122).
In a chapter entitled “Overly Intelligent AI,” the author explores the fear that humans will lose control over these machines. Russell rejects the idea of limiting AI research to avoid that loss. Instead, he advocates understanding that we may “suffer from a failure of value alignment – imbuing machines with objectives that are imperfectly aligned with our own. A partial and inadequate view of human purpose can lead to this loss of control” (137). For example, telling a machine to find the cure for cancer, without ensuring value alignment, could lead the machine to “induce multiple tumors in every human being so as to carry out medical trials of the new compounds” (138); asking it to reduce energy consumption could result in the machine’s persuading us to have fewer children (139). Giving a machine only a partial assignment could also result in unanticipated problems: telling a self-driving car to “take me to the airport as fast as possible” may result in the car hitting speeds of 180 mph (140). Russell explains that “Unfortunately, there are no simulators or do-overs. It’s certainly very hard, and perhaps impossible, for mere humans to anticipate and rule out in advance all the disastrous ways the machine could choose to achieve a specified objective” (140-141).
Another important point: “If you have one goal and a superintelligent machine has a different, conflicting goal, the machine gets what it wants and you don’t” (140).
Perhaps it has occurred to you that the solution is simply to turn the machine off. But having been given a purpose, the machine will pursue that purpose, which it cannot do if it is turned off; therefore, it will not allow the user to turn it off. “The given objective creates as a necessary subgoal the objective of disabling the off-switch. If that objective is in conflict with human preference, we have the plot of 2001: A Space Odyssey” (141).
All is not smooth in the technical world, as Russell makes clear in the chapter entitled “The Not-So-Great AI Debate.” He explores the arguments, pro and con, for proceeding with AI research; he refutes hysteria, but is very clear about the need for researchers to consider the implications of their work. He also refutes the laissez-faire attitudes based on beliefs that we are decades away from producing superintelligent machines, and exposes the flaws in “whataboutery” and in silence as tactics for dealing with the questions. The debate, he writes, “has highlighted the conundrum: if we build machines to optimize objectives, the objectives we put into the machines have to match what we want, but we don’t know how to define human objectives completely and correctly” (170).
In part three, Russell addresses the way he believes research should proceed. In Chapter 7, “A Different Approach,” he provides three “Principles for Beneficial Machines” (172):
First principle: “The machine’s only objective is to maximize the realization of human preferences”; it is, therefore, a purely altruistic machine.
Second principle: “The machine is initially uncertain about what those preferences are”; it is, therefore, a humble machine.
Third principle: “The ultimate source of information about human preferences is human behavior.”
The author includes reasons for optimism, outlining strong economic incentives to ensure the machines defer to humans, and the abundance of “raw data” for learning about human preferences. He also includes cautions that economic competition can result in cutting corners, and that researchers are limited in their ability to affect global AI policies.
In addition to his optimism about the benefits of AI, Russell also states an “aspiration”: these machines will be “provably beneficial.” In such situations, “we need to be sure that what is guaranteed is actually what we want and that the assumptions going into the proof are actually true” (184). To that end, he describes the mathematical proofs required, and he introduces two characters: Robbie the Robot and Harriet the Human. Russell gives concrete examples of the ways that Robbie learns to assist Harriet – and in that assistance, Robbie learns Harriet’s preferences. He also explains how Harriet remains able to turn Robbie off, even though Robbie is more intelligent than she is, and he explores both the benign and less benign ramifications as Robbie learns more and more about Harriet’s preferences.
Russell believes the best solution to deter a machine intent on fulfilling its purpose is to “make sure it wants to defer to humans” (203). An important part of AI research, then, is to study “the planning and decision-making of a machine that has only partial preference information” (204).
Finally, in this chapter, Russell explores the idea of one machine creating a second, more advanced machine. But he reassures us that “AI researchers are only just beginning to get a handle on how to analyze even the simplest kinds of real decision-making systems, let alone machines intelligent enough to design their own successors” (210).
In the chapter entitled “Complications: Us,” Russell discusses one of the major complications in implementing beneficial robots: the nature of humans.
In an imaginary world, Russell writes, we could all be Harriets, each with a perfect robot-helper, Robbie. But the real world is made up of many people, many of whom are decidedly nasty (211).
Russell explores the characteristics of human beings that create challenges for the development of intelligent machines. Two of the problems he examines: learning the preferences of a human who may be envious rather than altruistic, and reconciling one human’s preferences when they interfere with another’s. Researchers are focused on building utilitarian AI: in spite of its rather kitchen-appliance-sounding name, utilitarianism refers to “an ethical tradition according to which an action is right if it tends to promote happiness or pleasure and wrong if it tends to produce unhappiness or pain” (Britannica.com). So, AI researchers want to produce robots that, through understanding human preferences, can promote happiness or pleasure.
Russell identifies utilitarianism as “the most clearly specified and therefore most susceptible to loopholes.” He identifies several loopholes, such as an early (1945) proposal to minimize human suffering – an objective that “could best be achieved by rendering the human race extinct” (222).
Russell explains the different characteristics of human beings that can make serving their preferences challenging. Humans, he writes, can be “nice, nasty, and envious” (227), and also “stupid and emotional” (231). They can be uncertain about their own preferences, and they can change those preferences over time. All of these human traits can affect Robbie’s ability to fulfill the preferences he has been created to serve.
The final chapter, “Problem Solved,” investigates the “core of a new approach to AI research” to create beneficial machines, machines that will defer to humans. They “will ask permission, act cautiously when the guidance is unclear, and allow themselves to be switched off” (247).
He reassures his readers that there is a desire within the AI industry to maintain control over these systems, and that in AI governance there is “at least a superficial willingness among other players to take the interest of humanity into account” (250). He lists the guardrails that different nations and corporations are already establishing; they are aware of the dangers, and are promoting systems that will be secure from encroachment as well as beneficial to human beings.
There are negative possibilities beyond misuse: the disappearance of “the practical incentive to pass our civilization on to the next generation” (255) and “lazy, myopic human beings who find it pointless to engage in years of schooling when machines already have the knowledge and skill we are laboring to achieve” (255). This attitude, Russell believes, will lead to a loss of human autonomy.
In the conclusion, the author suggests a solution to these problems. The solution, he states, “is cultural, not technical. . .we must move away from self-indulgence and dependency and move toward autonomy, agency, and ability” (255). If these seem to be impossible goals, our awareness of the challenges of AI transformations may make us more willing to make those changes.
In a recent discussion of AI with other retirees, I encountered an attitude that I discourage: the idea of ignoring AI entirely and refusing to use it. When I suggested they were already using it, in the form of Siri or Google Maps, one friend replied that she never uses either one. But I suggest that we have more than an opportunity – we have an obligation – to be aware of movements that may well change our entire world, and to use our voices of wisdom to add to the dialogue. I encourage retirees to read this book, which is, of course, much more detailed and complete than my attempt to summarize it. Russell takes away the mystery (and perhaps the fear?) of AI, making us competent to participate in the decisions concerning this most remarkable innovation in human history.
*The page numbers listed in this blog refer to the Kindle version of the book and may not be consistent with the hard copy version.