Sense from Chaos - Crossing the Semantic Gap

March 23rd, 2016

Edsger Dijkstra once said:

"The question of whether machines can think is about as relevant as the question of whether submarines can swim."

This was the view most AI researchers held back in the 1960s and 1970s. Back then, many thought that general-purpose AI could be achieved purely through symbolic manipulation. That is, it was thought that we could build machines that, through purely logical reasoning, would derive a sufficient understanding of the world to reach and even exceed human intelligence. This kind of vision of AI is illustrated in classic sci-fi novels, and embodied in the infamous HAL 9000 computer of 2001: A Space Odyssey.

The quest to build symbolic AI led to much research, and impressive early successes. These successes led to great optimism, and the belief that computers would be able to effectively handle human language, machine translation and vehicle driving before 1980. Symbolic reasoning is well suited to reasoning about mathematics, or small closed systems with very well-defined properties, such as the game of chess. Unfortunately, multiple dead ends were quickly reached. It was found, as philosophers had predicted, that you could hardly build a machine capable of reasoning about the real world through purely symbolic means. The problem was one of grounding.

Symbols in a vacuum don't mean anything. You can create an ontology in which you define the concept of a chair, and you can put logical statements in this ontology stating that people use chairs for sitting on, and that chairs are movable objects that behave according to Newtonian laws of physics. However, if you're trying to build a robot that can see and manipulate chairs, your ontology is basically worthless. You have two HD cameras providing you with 20 million pixels of raw data at 60 frames a second, and no logical statements are sufficient to help you tell where a chair might be in all that noise. Your machine has the concept of a chair, but this concept exists only in some Platonic realm that is completely divorced from the real world.

Ultimately, the grandiose predictions of early AI researchers proved much too ambitious, because the real world was much too difficult to cope with, or at least, much more difficult to cope with than mathematicians had hoped. This led to the first AI winter, with funding rapidly drying up for what is now known as GOFAI (Good Old Fashioned AI) research. For at least two decades after this, AI became kind of a dirty word in computer science circles. Those who still pursued AI-related research renamed their field machine learning, so as to avoid any association with the overhyped goal that we might one day build machines with human-level intelligence. From then on, machine learning researchers left behind the lofty goals of AI, and instead focused on the basics: small narrowly-defined learning tasks where they knew they could make headway.

Neural networks are not new, but it's only in the last few years that they have started to really shine, with deep learning. Advances in algorithms, access to large datasets and the unprecedented availability of computational resources have made it possible to scale this machine learning technique to networks with many layers of depth. Recently, some impressive feats have been achieved with deep neural networks, including object classification that exceeds human performance, and of course the much-discussed victory of the computer program AlphaGo over human Go champion Lee Sedol. Some amazing projects such as the neural artistic style transfer and deep networks that learn to synthesize new images have also surfaced.

At the moment, universities are seeing a huge increase in student interest in deep neural networks, with classes sometimes tripling in size over previous years. Governments and industry alike are pouring billions into deep learning research. It's undeniable: we're in the middle of a machine learning boom. The optimism is so great that well-known researchers such as Yann LeCun and Yoshua Bengio are even daring to use the word AI again. There's a lot of excitement (and some fear) about what deep neural networks will bring. People are becoming scared about robots taking human jobs, and the question of when computers will reach human-level intelligence is being asked.

To me, it really is a question of when. Provided that humanity doesn't sink into a post-apocalyptic dark age, there is no doubt in my mind that machines can and will reach human intelligence. I just don't think that we'll get to Artificial General Intelligence (AGI) in the way that most people think we will. Many seem to believe that we just haven't come up with the right algorithm for human-equivalent intelligence, or that we just don't have enough computational power. Clearly, to reach human-level intelligence, what we need is a deep neural network with a number of connections equivalent to that present in the human brain, right? I don't think so. I don't think that's what we're missing.

In my view, the human brain is not running one single algorithm. The human brain is not just a huge wad of neurons, a huge uniform neural network. The human brain is made of many different specialized components that do different things, connected together by a variety of pathways. Deep neural networks are awesome, they're an amazing achievement, but they're just one piece of the puzzle. What deep neural networks do is solve the perceptual problem. Deep learning allows us to do what the human visual cortex does. You get a million pixels of color information as input, and you turn this raw mass of data into a few classes of output. In short, with deep neural networks, we can turn real-world data into symbols.
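To make the idea of turning raw data into symbols a little more concrete, here is a minimal sketch in Python. It is not a deep network: it is a single linear layer followed by a softmax, with weights I made up by hand, whereas a real network stacks many layers and learns its weights from data. The label list, the four-number "image" and the function names are all hypothetical, chosen only for illustration. The point is just the shape of the computation: numbers go in, a discrete symbol comes out.

```python
import math
from typing import List, Tuple

# Hypothetical symbol vocabulary; a real classifier would have thousands.
LABELS = ["chair", "table", "cat"]

# Made-up weights, one row per label. In a real network these are learned.
WEIGHTS = [
    [1.0, -1.0, 0.0, 0.5],   # chair
    [-0.5, 1.0, 0.5, 0.0],   # table
    [0.0, 0.0, 1.0, -1.0],   # cat
]
BIASES = [0.0, 0.0, 0.0]


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def to_symbol(pixels: List[float]) -> Tuple[str, float]:
    """Map a vector of raw input values to a (label, probability) pair."""
    scores = [
        sum(w * p for w, p in zip(row, pixels)) + b
        for row, b in zip(WEIGHTS, BIASES)
    ]
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs[best]


# A stand-in for an image: four numbers instead of a million pixels.
label, confidence = to_symbol([0.9, 0.1, 0.2, 0.8])
print(label, round(confidence, 2))
```

Once the output is a label like "chair", a symbolic system downstream has something it can actually reason about.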

There is no doubt in my mind that an AGI has to contain some sort of connectionist component, such as a neural network, within it. To make useful AI programs, however, the logical thing to do seems to be to assemble multiple specialized pieces together. In my view, AlphaGo is a beautiful illustration of this. It melds together multiple deep neural networks, which are used to do things such as assigning a value to different Go board configurations, along with a Monte Carlo tree search algorithm for looking at possible future moves. AlphaGo is very much a marriage of GOFAI techniques together with the power of deep neural networks. Deep networks make sense of complex board configurations without the need for hand-written logical rules as to how individual Go stones should be counted. Deep networks do the more intuitive perceptual work, and good old fashioned tree search does the logical reasoning based on this data.
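Here is a toy sketch, in Python, of that division of labor between search and evaluation. To be clear, this is not AlphaGo's algorithm: it is a plain depth-limited negamax search on a tiny game of Nim (take 1 to 3 stones, whoever takes the last stone wins), and the `intuition` function is a hand-written stand-in for what would be a learned value network. All names here are hypothetical. What it shows is the structure: the search handles the explicit "if I do this, then they do that" reasoning, and when it runs out of depth it asks the evaluator how good the position looks.

```python
from typing import Callable, List

ValueFn = Callable[[int], float]


def legal_moves(stones: int) -> List[int]:
    """Toy Nim: a player may take 1, 2 or 3 stones."""
    return [n for n in (1, 2, 3) if n <= stones]


def intuition(stones: int) -> float:
    """Stand-in for a learned value network (an assumption of this sketch).

    Returns an estimate in [-1, 1] of how good the position is for the
    player about to move. A real system would compute this with a network.
    """
    return -1.0 if stones % 4 == 0 else 1.0


def search(stones: int, depth: int, value_fn: ValueFn) -> float:
    """Depth-limited negamax: value of the position for the player to move."""
    if stones == 0:
        return -1.0  # the opponent just took the last stone, so we lost
    if depth == 0:
        return value_fn(stones)  # out of lookahead: fall back on intuition
    # Our value is the best of the negated values of the opponent's positions.
    return max(-search(stones - m, depth - 1, value_fn)
               for m in legal_moves(stones))


def best_move(stones: int, depth: int, value_fn: ValueFn) -> int:
    """Pick the move whose resulting position is worst for the opponent."""
    return max(legal_moves(stones),
               key=lambda m: -search(stones - m, depth - 1, value_fn))


print(best_move(10, 2, intuition))
```

Swapping `intuition` for a better evaluator improves play without touching the search code, and deepening the search improves play without touching the evaluator. That separation is what makes it natural to assemble the two kinds of component.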

Deep neural networks bridge the semantic gap between classical computer systems, where symbolic entities are defined in absolute terms in databases, and the fuzziness of the real world, where exceptions are the norm, and nothing can be entirely captured by absolute rules. If I had to guess, I would speculate that robots of the future, as they become more intelligent, are going to feature increasingly complex AI architectures made of multiple specialized components. There will be deep networks that handle vision and hearing, and that perceive human faces and facial expressions. There will also be modules that do planning, navigation, logical/ontological reasoning and reasoning by analogy. All of this AI technology that we've been developing for the last 60 years is going to need to come together into a whole. That is one of the next big challenges. Making AGI happen won't just be scientific work, it will also pose a plethora of engineering challenges.

I think it's possible to build useful robots and human-equivalent AI without understanding the human brain in depth. However, I think that when we do build such AI, we're necessarily going to converge towards an architecture that does many things in ways that are actually quite similar to what happens, in computational terms, inside the human brain. Submarines may not swim, but in order to move forward, they still have to displace water. It's said that humans can only keep about 7 items of data in their working memory. It wouldn't surprise me all that much if, one day, neuroscientists manage to map out the human frontal cortex, and they discover that in this region of the brain, there are neurons laid out in a system that implements what can essentially be thought of as a small set of general-purpose registers, each able to store a neural vector representing a semantic concept. In other words, our ability to reason by analogy and to manipulate abstract concepts in our mind is probably embodied by neural machinery that serves to perform symbolic manipulations.