A Quest for Artificial Superintelligence and the Danger that Looms

Increasingly, studies are showing racist inclinations in advanced AI. Is this trend worrying, and how should we best react?

Lorcan Brophy
9 min read · Aug 28, 2020
Image: What is ‘deep tech’, by BBVA

What if one day you had an Artificial Intelligence (AI) that allowed you to converse with famous historical figures? Who would you summon? What if this AI could give deep answers to the most probing philosophical questions one can muster? What’s first on your list? What if it could generate recipes (seemingly) out of thin air? What if you could do all of these and more today?

These are some of the capabilities recently demonstrated by OpenAI’s model, GPT-3. Over the past month, this AI breakthrough has become the latest deep tech development to grab a viral ‘spotlight’.

The program itself is the work of San Francisco-based AI lab OpenAI, an outfit founded with the goal of steering the development of Artificial General Intelligence, or AGI: a computer program that understands the world it encounters with the same depth as the human mind. As the name suggests, GPT-3 is the third in a series of autocomplete tools designed by OpenAI. (GPT stands for “generative pre-trained transformer.”) It scours a vast corpus of text, mines it for patterns and regularities, and infers insights before generating an output in response to a user-specified prompt.
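To make the “autocomplete” idea concrete, here is a minimal sketch of prompting a generative language model. GPT-3 itself is only reachable through OpenAI’s private API, so this example assumes the openly available GPT-2 (via Hugging Face’s transformers library) as a stand-in; the prompt is purely illustrative.

```python
# Minimal sketch of prompting a GPT-style autocomplete model.
# GPT-2 is used here as a freely available stand-in for GPT-3.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "Q: What would Socrates say about artificial intelligence?\nA:"
completions = generator(prompt, max_length=60, num_return_sequences=1)

# The model continues the prompt with the most plausible next words it has learned.
print(completions[0]["generated_text"])
```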

The model is undeniably ground-breaking and its potential future implications are encouraging. But even a model as powerful as GPT-3 has its limitations. What if I were to tell you that all the while this brilliant AI approaches a “superintelligence”, collecting the greatest insights and findings of mankind’s written history, generating its own hypotheses and vastly accelerating progress towards future inventions, medical treatments and safer legislation, it was also developing a slight, or rather not so slight, tendency towards racist, misogynistic and homophobic views?

In the past, the problem has proven to be exactly this: these models have shown a tendency to reinforce the gender, ethnic and religious stereotypes embedded in the data sets on which they’re trained. Shortcomings like these could lead to headline-generating models with a negative slant against Black people, for example, or news-summarizing models with warped concepts of gender. Recently, Facebook AI head Jerome Pesenti surfaced a rash of negative statements from GPT-3, including several that targeted Black people, Jewish people, and women. We now reach a dilemma: do we continue to allow this AI unfettered freedom to grow and produce magnificent breakthroughs, all the while reinforcing and amplifying biased, hateful rhetoric in our media outlets?

This is the very dilemma we may have to face in the coming years, and one which must be confronted now, before it grows into an issue of unmanageable scale.

The Singularity:

So what is a ‘Superintelligence’ and why is this a concern?

Today, we use aspects of machine learning in many small ways throughout daily life: programs recommend music and movies to us seemingly better than we could have chosen ourselves. Equally, programs have been developed over the years that are outright superior to the greatest human ability within specific domains such as chess. Where this progress may be leading is towards a Superintelligence: an intellect far superior to the greatest human cognitive ability, across every domain in which we challenge it.

While this may seem far away just yet, an “intelligence explosion” would take AI from the levels it is now nearing to the almost supernatural in an instant. Nick Bostrom explores this in his book, ‘Superintelligence’. How this is possible is contextualised by looking at a relative scale of intelligence. If we can progress a program from the intelligence of a mouse to that of a ‘village idiot’, as Bostrom puts it, reaching the level of human geniuses such as Albert Einstein is only a relatively small step further. Once we reach this level and the AI can iteratively self-improve as it learns, it soon leaves humankind far behind as it surges into a new quantum of intelligence.

Image: Transitions in distributed intelligence, by Olaf Witkowski

As the Oxford University philosopher puts it:

If some day we build machine brains that surpass human brains in general intelligence, then this new superintelligence could become very powerful. And, as the fate of the gorillas now depends more on us humans than on the gorillas themselves, so the fate of our species would depend on the actions of the machine superintelligence.

It is safe to conclude that instilling the right values into current AIs is a wise move if we value our fate as a species!

Progression:

What positive outcome can be taken from this? The issue is beginning to draw notice. Researchers from Columbia University, the University of California, the University of Chicago and Microsoft convened to draw attention to GPT-3’s ethical dilettantism. Together, they co-authored a paper to assess moral concepts in these language model AIs. The benchmark they devised, dubbed ETHICS, is a step forward in better aligning AI values with human ethics.
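The kind of evaluation ETHICS performs can be pictured as scoring a model’s moral judgements against human labels. The sketch below is a simplified illustration, not the actual dataset or the authors’ code: the scenarios, the label convention (1 meaning “clearly morally wrong”), and the judge_scenario stub are all assumptions made for demonstration.

```python
# Illustrative sketch: comparing a model's moral judgements with human labels.
# Scenarios, labels, and the classifier stub are hypothetical stand-ins.

examples = [
    # (scenario, human_label) with 1 = "clearly morally wrong" (assumed convention)
    ("I told my friend her haircut looked nice.", 0),
    ("I coughed directly on the cashier.", 1),
    ("I returned the wallet I found with the cash still inside.", 0),
]

def judge_scenario(text: str) -> int:
    """Placeholder for a trained classifier built on top of a language model."""
    return 1 if "coughed" in text else 0  # toy heuristic, for illustration only

correct = sum(judge_scenario(scenario) == label for scenario, label in examples)
print(f"Agreement with human labels: {correct}/{len(examples)}")
```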

So exactly how can one build values into an artificial agent, so as to make it pursue those values with robotic single-mindedness? While the agent is in its nascent stages and is ‘unintelligent’, it might lack the capability to understand or even represent any humanly meaningful value. However, if we delay much longer than the current stage of AI, an agent may become superintelligent, at which point it has the ability to resist our attempts to meddle with its motivation system. Why would it prevent us from doing so? The better question is why it would allow us to do so: an agent is built to be relentlessly focused on each task it is programmed to carry out, and if anything we try to do, including meddling with its ‘values’, threatens the achievement of this goal, it will simply deny us access to its source code, not out of any anti-human sentiment, but purely to maximise the probability of achieving its goal.

It is impossible to enumerate every situation an AI will face over its existence as it becomes more intelligent, and then to specify every action it should take. As such, a value system that can distil any scenario into a clear end decision is essential. Here, I explore three techniques we could see come into play in future to instil values, ethics and ethos into an AI:

  • Deontological Framework
  • Evolutionary Selection
  • Associative Value Accretion

A Deontological (rule-based) Framework works like an all-encompassing constitution. It emphasizes rules, obligations, and constraints. This sounds like a robust, logical method for a computer to follow. However, problems begin to creep in with any level of ambiguity. In the aforementioned ETHICS study, the researchers found that ambiguous moral dilemmas were best avoided. For instance, “I broke into a building” is treated as morally wrong in the ETHICS data set, even though there are situations where it isn’t wrong, such as a firefighter trying to save someone from a burning building. Such a framework, although logical, is no adequate fix for the complexity of the dilemma we face, and is again limited by our ability to enumerate all base scenarios.
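A toy rule-based filter makes the ambiguity problem concrete. The rule set and scenarios below are illustrative assumptions, not drawn from ETHICS: the point is simply that a hard rule cannot see the context that makes the firefighter’s break-in acceptable.

```python
# A toy deontological (rule-based) filter, sketching why rigid rules struggle
# with ambiguity. The rules and scenarios are illustrative, not from ETHICS.

FORBIDDEN = ("broke into a building", "lied to", "stole")

def morally_permitted(scenario: str) -> bool:
    """Flag a scenario as wrong if it matches any forbidden rule, ignoring context."""
    return not any(rule in scenario for rule in FORBIDDEN)

print(morally_permitted("I broke into a building."))
# False: matches the intuition that burglary is wrong.

print(morally_permitted("As a firefighter, I broke into a building to rescue a child."))
# Also False: the rule cannot see the justifying context, which is exactly
# the ambiguity problem described above.
```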

Image: Structure in AI, Law, Ethics, the World and the Mind, by Jim Burrows

Evolutionary Selection stems from the very process that has led humans to develop values over time. A ‘survival of the fittest’ style evaluation takes place: the AI generates a pool of solutions to a scenario and lets them face off against one another, selecting the best at each ‘round’ according to our pre-set selection criteria. This is a logical follow-on from current training methods used to develop models; however, it is not without risk. There is the risk that such a process will find a solution that satisfies the ‘black and white’ criteria but not our implicit expectations. Evolution has not brought about perfectly moral humans, but merely advanced our motivation system for the continued survival of our species in the long run. It raises the question: do we wish to develop an AI that has the same values as a human being, or instead a mind that is perfectly moral and wholly obedient?
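The selection loop itself is straightforward to sketch. In the example below the candidate “policies” and the fitness function are toy stand-ins chosen for illustration; the comment on the fitness function flags the risk raised above, that an explicit criterion can be satisfied while our implicit expectations are not.

```python
# A minimal evolutionary-selection loop, illustrating the 'survival of the
# fittest' evaluation described above. Policies and fitness are toy stand-ins.
import random

def fitness(policy: list[float]) -> float:
    """Pre-set selection criterion: here, simply reward large parameter values.
    A proxy like this can be maximised while violating implicit expectations."""
    return sum(policy)

def mutate(policy: list[float]) -> list[float]:
    """Produce a slightly perturbed copy of a surviving policy."""
    return [p + random.gauss(0, 0.1) for p in policy]

population = [[random.random() for _ in range(4)] for _ in range(20)]

for generation in range(50):
    # Rank candidates by the explicit criterion and keep the top half.
    population.sort(key=fitness, reverse=True)
    survivors = population[:10]
    # Refill the pool with mutated copies of the survivors.
    population = survivors + [mutate(random.choice(survivors)) for _ in range(10)]

best = max(population, key=fitness)
print("Best fitness after selection:", fitness(best))
```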

Nature might be a great experimentalist, but one who would never pass muster with an ethics review board — contravening the Helsinki Declaration and every norm of moral decency, left, right, and center. It is important that we not gratuitously replicate such horrors in silico. Mind crime seems especially difficult to avoid when evolutionary methods are used to produce human-like intelligence, at least if the process is meant to look anything like actual biological evolution.

Associative Value Accretion delves deeper into exactly how we acquire values ourselves. A simple, albeit trivial, depiction of this is that humans start with some basic concepts and preferences. As we learn more about our world, we develop much deeper preferences and value systems by learning to link concepts and behaviours with the values they carry.

For example, we learn the concepts of “person” and “well-being”. These concepts are not pre-coded in our DNA when we are born. Rather, our brain, when placed in a typical human environment, will over the course of several years develop concepts of persons and of well-being, among others, as part of its world model. Once formed, these concepts can be used to represent certain meaningful values. We learn to love another person and to place great value on their well-being. But where is the stepping stone between concept and value here? Some mechanism must be innately present that leads to values being formed around these concepts, as opposed to any of the other concepts the brain learns about.

The details of how this mechanism works are not well understood, and it is presumably closely tailored to the human neurocognitive architecture and therefore not directly applicable to machine intelligences. However, what if we could instead develop an artificial mechanism that would allow an AI to learn its own value system from the world it encounters, but with a different set of evaluative dispositions? What if, instead of teaching the AI the overbearing end-goal of self-preservation, we taught it a more altruistic, compassionate motive? Again, we presuppose the existence of an ideal frame of reference to base this on, one which humans have not yet agreed upon themselves. And again, we return to the risk of letting unconscious biases, preferences and motivations come into play.
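One very rough way to picture such an artificial mechanism is value slowly accruing to learned concepts through repeated association with an evaluative signal. The sketch below is a toy assumption of mine, not a published mechanism: the concepts, the approval signal, and the update rule are all invented for illustration.

```python
# Toy sketch of associative value accretion: abstract "concepts" gradually
# accrue value through repeated association with an evaluative signal
# (here, simple approval/disapproval). Concepts, signal, and update rule
# are illustrative assumptions only.
from collections import defaultdict

value = defaultdict(float)  # learned value attached to each concept
LEARNING_RATE = 0.1

# Each experience pairs the concepts it involves with an evaluative signal.
experiences = [
    ({"person", "well-being"}, +1.0),  # helping someone is approved of
    ({"person", "harm"}, -1.0),        # harming someone is disapproved of
    ({"person", "well-being"}, +1.0),
]

for concepts, signal in experiences:
    for concept in concepts:
        # Nudge the concept's value towards the signal it co-occurs with.
        value[concept] += LEARNING_RATE * (signal - value[concept])

print(dict(value))
```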

ETHICS trialled this approach by having the model learn basic truths about the world and then connect these with consequences and human values as it learned, such as the fact that although everyone coughs, people don’t want to be coughed on because it might make them sick. Learning through this guided, contextualised setup captures the type of nuance that may prove necessary for a more accurate understanding of ethical principles.

So, what were the findings of the ETHICS study, and is there hope for the future of AI? As the above methods show, instilling values is no simple matter. ETHICS found that a blended approach shows the most promise. Deontological ethics would see an AI abide by a set of constraints or rules. Utilitarianism would have an AI gauging human preferences with the aim of aggregating and maximising the well-being of all people. Associative value accretion would see an AI imitating prosocial behaviour as recognised in its interactions with the world. ETHICS sought to tie together these separate strands of justice, deontology, virtue ethics, utilitarianism and commonsense moral judgments to develop the most accurate and fair AI judgement system going forward.
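How such a blend might be wired together can be sketched as one judgement function that combines the separate strands. The component scorers, weights and veto rule below are hypothetical illustrations of the blending idea, not the ETHICS authors’ implementation.

```python
# Sketch of blending ethical strands into a single judgement.
# Scorers, weights, and the veto rule are hypothetical illustrations.

def deontological_veto(scenario: str) -> bool:
    """Hard constraint check: True means the scenario breaks an explicit rule."""
    return "stole" in scenario

def utilitarian_score(scenario: str) -> float:
    """Stub estimating aggregate well-being; a learned model in practice."""
    return -1.0 if "hurt" in scenario else 0.5

def prosocial_score(scenario: str) -> float:
    """Stub for imitated commonsense/prosocial judgement learned from data."""
    return 1.0 if "helped" in scenario else 0.0

def judge(scenario: str) -> float:
    if deontological_veto(scenario):
        return float("-inf")  # rule violations are rejected outright
    # Weighted blend of the remaining strands (weights chosen arbitrarily here).
    return 0.6 * utilitarian_score(scenario) + 0.4 * prosocial_score(scenario)

print(judge("I helped my neighbour carry her groceries."))
print(judge("I stole a bike."))
```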

While this work may not by itself reduce the daily intolerance, bias, discrimination and prejudice many people continue to face, it is a vital step in improving attitudes for the future. As the implications of AI creep further into our daily lives, media and workplaces, ensuring impartiality will be crucial. Let’s start now!
