The past decade has seen the rise of GPGPUs. We’re leveraging the tremendous computational power of graphics cards to accelerate computationally intensive applications such as machine learning, video compression and sorting. Unfortunately, GPGPU is somewhat slow to gain adoption. There are multiple issues involved, such as the need for special GPU/numerical programming languages, complex drivers, vendor-specific differences, and the overhead of having to shuffle data in and out of a separate memory hierarchy.
I was recently reading a blog post claiming that matrix multiplication (GEMM) is the most expensive operation in deep learning, taking up to 95% of the execution time. This got me thinking that maybe GPGPUs are simply not ideal for most applications. Maybe future CPUs should begin to include numerical coprocessors. Sure, we already have SIMD, but the way it’s implemented on x86 CPUs is awkward and relatively inefficient, forcing you to deal with multithreading, prefetching, and SIMD registers of small fixed sizes. Every few years, Intel adds support for new instructions with new SIMD register sizes, rendering your code outdated (yuck). To do SIMD well, you basically write or generate code specialized for a specific CPU model, and even then, it’s just not that fast.
I believe that the Cray 1, which came out in 1975, had the right idea. You write a small loop (kernel) and let the CPU handle the memory traffic and looping. What I’m thinking of is essentially a CPU core which is optimized to run parallel-for instructions on a reduced numerical instruction set. Imagine having a specialized CPU core that shares the same memory space as other CPU cores and can handle prefetching, parallelization and low-level optimizations according to its capabilities, without you needing to change your code. Imagine not needing a driver or threads to make use of this core. Imagine how fast matrix multiplication or cross product could be if you had native hardware support for it.
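To make the programming model a bit more concrete, here is a minimal Python sketch of what writing a kernel for such a core might look like from the programmer's side. The `parallel_for` helper and the kernel signature are hypothetical names made up for illustration. A real numerical core would presumably run the iterations in hardware, whereas this stand-in just runs them sequentially.

```python
from typing import Callable, List

Matrix = List[List[float]]


def parallel_for(count: int, kernel: Callable[[int], None]) -> None:
    """Hypothetical stand-in for a hardware parallel-for instruction.

    The iterations are independent, so hardware would be free to run them
    in any order or all at once. Here they simply run one after another.
    """
    for index in range(count):
        kernel(index)


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices using one kernel invocation per output cell."""
    rows, inner, cols = len(a), len(b), len(b[0])
    out = [[0.0] * cols for _ in range(rows)]

    def kernel(index: int) -> None:
        # Each invocation writes exactly one output cell, so no two
        # iterations ever touch the same memory location.
        i, j = divmod(index, cols)
        out[i][j] = sum(a[i][k] * b[k][j] for k in range(inner))

    parallel_for(rows * cols, kernel)
    return out
```

The point of the sketch is that the code only states what each iteration does. How the iterations are scheduled, prefetched and vectorized is left entirely to whatever executes `parallel_for`.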
It’s now been over 50 years since the birth of the AI field, and we still have no androids or sentient computers. Some believe it’s because Artificial General Intelligence (AGI) is simply unachievable, others say it’s because we just don’t have the computing capacity required yet, and then some think it’s because we’re lacking a fundamental breakthrough: we just haven’t grokked the “general-purpose intelligence algorithm”. The latter view is reflected in Jeff Hawkins’s book On Intelligence.
I’m currently reading Nick Bostrom’s book Superintelligence (which I highly recommend), and it’s got me thinking about intelligent machines, a topic that’s fascinated me since childhood. My own opinion is that strong AI is almost inevitable. Assuming that humanity doesn’t destroy itself or fall into a new dark age, sentient computers will happen; it’s only a matter of time. I’m of course hesitant to put a time scale on this prediction, since such predictions have so often been wrong, but I’m inclined to believe that it will happen within this century.
I personally don’t believe that computing capacity has ever been the problem. We’ve had impressively powerful supercomputers for decades, we just haven’t had very effective machine learning algorithms. The fact is that until recently, nobody knew how to train a neural network with 5 layers of depth short of brute forcing the problem with genetic algorithms, which is obviously not what human brains do. Now, with deep learning, we’re finally starting to see some interesting progress in machine learning, with systems outperforming human beings in image recognition tasks.
Still, I don’t believe that deep learning is the one breakthrough that will lead to strong AI. Deep learning is impressive and cool, but it isn’t strong AI, it’s simply much better machine learning than what we’ve had until now. I simply don’t think strong AI will come from one single breakthrough. We’re not going to suddenly grok the one algorithm for AGI. The people who believe this, in my opinion, fail to appreciate the complexity of the human brain. Our brains are made of a multitude of components specialized to perform different tasks effectively. Hence, it would be surprising if we could obtain human-level intelligence from one single algorithm.
The obsession with cracking the AGI algorithm, in my opinion, stems from the obsession some computer scientists have with mathematical elegance and the study of algorithms that fit within a single page of text. Algorithms short and simple enough that emergent mathematical properties can easily be proved or disproved. The human brain, however, is the result of an evolutionary process that occurred over hundreds of millions of years, and is too complex to be described by an algorithm that fits within a single page. The reason we haven’t created AGI, in my opinion, is that several component parts need to be put together in non-trivial ways. At this stage, machine learning, as a field, has largely been figuring out how these individual components can be built and optimized.
I think there will come a point where we begin to put different machine learning components, such as image classifiers, planners, databases, speech recognizers and theorem provers together into a coherent whole, and begin to see something resembling AGI. I don’t think, however, that some computer will suddenly reach sentience at the flick of a switch. Engineering is similar to natural selection in that it’s often a process of iterative refinement. It’s now possible to fly anywhere on the globe in a jet-propelled airplane, but the airplanes of today are very different from the Wright brothers’ prototype. There’s no reason to think AI is any different.
The first AGIs probably won’t be “superintelligent”. It’s likely that they will initially lag behind humans in several domains. They may show great intelligence in some areas while failing to grasp key concepts about the world which seem obvious to us. One can simply think of dogs. They’re sufficiently intelligent to be useful to us, but they can’t grasp everything humans do. Provided that we do eventually create AGI, these AIs will likely reach and surpass human intelligence, but this process could take years or even decades of iterative refinement.
An interesting question, in my opinion, is whether the first AGIs will exist as agents on the internet, or whether they will come in the form of embodied robots. In the near future, I predict that we will see an increasing presence of intelligent agents online, the rise of the smart web. These agents will be used to automatically extract semantic information from web content, by doing things such as tagging videos or text with metainformation, and perhaps predicting the behavior of human individuals. This, however, is not AGI. There’s an argument to be made that robots may bring forth AGI simply because the need is there. We expect robots to effectively deal with the complexities of the real world as it is, which seems to require human-level intelligence.
In the 1970s, there was great enthusiasm about Artificial Intelligence (AI). Projects such as SHRDLU and Shakey the robot generated a huge amount of hype, and had people believing that computers would reach human-level intelligence within only a decade or two. Unfortunately, cracking the general intelligence problem turned out to be more difficult than anticipated. The AI field stagnated for years, and after repeated failures, funding for AI research was cut drastically. The term artificial intelligence became somewhat of a dirty word in computer science circles. In the mid-1980s, the field began a slow return with the arrival of artificial neural networks. To dissociate itself from needless hype, AI was rebranded as Machine Learning (ML), with much more modest goals such as producing programs capable of classifying objects in an image or translating human speech to text.
Recently, deep learning, a technique that allows training of deep multi-layered neural networks, has shown much promise. It has been able to outperform every other technique when it comes to image recognition, even outperforming humans. Deep learning has also helped us make some headway in natural language processing and machine translation. I’m going to take a break right here and tell you that this post is not about “the singularity”, Kurzweil synthesizers, jacking ethernet cables into your neck or how terminators are coming to your house with BDSM gear and cans of WD-40. What I want to talk about is not science-fiction, but the kinds of changes that are likely to happen in the next 5 to 10 years.
I believe the internet is about to change, a lot, in ways that are difficult to predict even for experts in the field. Deep learning and the widespread availability of extremely powerful computing clusters will make it possible to implement services and APIs (software agents) with increasing degrees of intelligence. As I write this, machine learning labs are turning out graduates who are surely eager to capitalize on their expertise by creating useful services that can be marketed to search engines, advertising agencies and social media. There are huge financial gains to be made, which makes these changes almost inevitable.
The first, most obvious change we’re likely to see in the next few years is that there’s going to be an increasing prevalence of facial and object recognition on the internet. If someone takes a picture of you and it gets uploaded somewhere, Google and Facebook will know about it. Progress will not stop at facial recognition, however. Computers will slowly begin to understand some of the content on the web on a semantic level. YouTube will soon know which people are in your videos, where the video is being filmed and what objects are in each shot. Eventually, it will also be able to transcribe what you say in textual form and, well, understand the actions people in videos are making at a basic level (e.g.: at the 5 minute mark Johnny kisses his girlfriend Lucy on the cheek, Lucy smiles and appears happy).
The level of understanding that intelligent agents possess is only going to increase over time. Again, I think it’s largely a matter of economics. There is so much profit to be made in creating smart agents that understand some aspects of the world, the incentives are so large, that it’s almost inevitable that companies will begin throwing hundreds of billions of dollars at the machine learning field, and progress is going to happen very rapidly. Soon enough, Google and Facebook will use GPS tracking and image recognition to understand what your average day looks like and predict your behavior. They might begin to do what can only be described as “psychological modelling”, and gain an understanding of you that, in some ways, is better than your own, at least as far as predicting your behavior goes. This is the threshold at which things become somewhat worrisome.
Extrapolating further, we can imagine smart agents capable of combining information from your various online profiles, chat conversations, photos and a variety of other data sources to gain a fairly good understanding of what is happening in your life. Without sentience ever coming into the picture, these intelligent agents are going to be able to make connections individual human beings can’t realistically make. Google may be able to infer with some percentage of certainty whether or not you skipped breakfast, took your Wellbutrin, indulged in a few too many drinks at the office party and cheated on your partner. With access to a firehose of big data, they could also try to predict who your next boyfriend/girlfriend might be among some set of people you know. With targeted ads and very subtle alterations of your daily life experience, they could even influence who that someone will be in some cases.
It’s easy to imagine a plethora of much darker scenarios. What if some intelligent agent concluded some financial interests might be better off if you were dead, and then tried to subtly push you in the direction of depression, suicide or drug addiction? You might think that this is all very far-fetched and dark, that none of the big companies would ever do such evil, that the law will surely protect us, or that my predictions of technical advancement are simply way too optimistic. Perhaps you’re right. It seems to me though that if you create powerful technological tools and there is sufficient economic motive to use them for a given purpose, somebody will. There is surely some financial incentive to steer your life in certain directions.
What would the internet be like if spammers had access to intelligent agents capable of generating believable and unique content? There are no humanoid robots walking our streets, but I believe it might be wise to start thinking about the implications of upcoming technological changes now. Though history has shown time and again that humans tend to be more reactive than proactive.
On the upside, the smart web means some good things in the realm of education. Imagine what the world would be like if you could ask Wikipedia questions verbally, and get tailored answers, maybe even custom diagrams and visual explanations generated based on your queries. Imagine if all the depressed people in the world could get access to Cognitive Behavioral Therapy (CBT) for free from the comfort of their homes. One thing is absolutely certain, we’re living in interesting times.
The HyperText Markup Language (HTML) was invented by physicist Tim Berners-Lee of CERN, and its first publicly available specification surfaced on the internet in 1991. It was intended as a simple format for people to share enriched static documents or pages that could refer to one another through hyperlinks. The original design was relatively simple, with just 18 types of elements. As of today, the HTML specification in PDF format is 1156 pages long.
As of today, Chromium is about 17 million lines of code, comments excluded. The sheer complexity of HTML+JS+CSS is enormous. This is a problem because it means there are many areas where implementations can differ. What I mean is that your webpage is increasingly likely to break or not behave the same on different web browsers. The fact that HTML is a fast-moving target and a “living standard” certainly won’t help. What recently dawned on me is that HTML is too complex for any individual or corporation to ever implement a new web browser from scratch.
A few years ago, I started playing with new HTML APIs to produce audio output. I was one of the first to play with Mozilla’s Audio Data API. This API allowed you to generate raw audio samples and send them to an audio output, among other things. Unfortunately, Chrome never implemented this API. They opted instead to create the competing Web Audio API. Since Web Audio has the backing of the W3C, it became obvious it would win out, and the Audio Data API would die. It took about 3 years for Firefox to finally implement Web Audio. I’m still not sure if Safari and IE support it properly.
The victory of Web Audio over Audio Data frustrated me a bit, because I really thought Mozilla’s Audio Data API had the right approach: it was simple and low-level. You generated samples and wrote them out for output. The Web Audio API, in comparison, is essentially a modular synthesizer, with many different types of audio processing nodes you can interconnect. It’s way more complex than the Audio Data API. The Web Audio API is most likely implemented in C++ and interfaced with JS through several dozen types of HTML DOM nodes.
The Web Audio API is stupid for two simple reasons. The first is that you really could implement your own modular synthesizer in JS (please refrain from telling me that JS is too slow for that as I’ve already implemented one). Letting people implement their own modular synthesizer as part of a library is more flexible and more portable. The second reason Web Audio is stupid is that implementing many kinds of pre-made synthesizer nodes in C++ and interfacing them through the DOM almost guarantees that Web Audio will not sound the same on different web browsers.
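As a rough illustration of the library approach, here is a minimal sketch of a two-node modular synthesizer built on nothing but raw sample generation. It is written in Python rather than JS to keep it short, and the `SineOscillator` and `Gain` classes and their `render` method are hypothetical names invented for this example. They are not part of the Audio Data API or the Web Audio API.

```python
import math
from typing import List


class SineOscillator:
    """Source node: produces a sine wave at a fixed frequency."""

    def __init__(self, frequency: float, sample_rate: int) -> None:
        # Phase increment per sample, in radians.
        self.step = 2.0 * math.pi * frequency / sample_rate
        self.phase = 0.0

    def render(self, count: int) -> List[float]:
        samples = []
        for _ in range(count):
            samples.append(math.sin(self.phase))
            self.phase += self.step
        return samples


class Gain:
    """Processing node: scales the output of its input node."""

    def __init__(self, source, amount: float) -> None:
        self.source = source
        self.amount = amount

    def render(self, count: int) -> List[float]:
        return [s * self.amount for s in self.source.render(count)]


# Connect oscillator -> gain, then pull a buffer of raw samples that could
# be handed to any low-level "write these samples out" audio interface.
graph = Gain(SineOscillator(frequency=440.0, sample_rate=44100), amount=0.5)
buffer = graph.render(512)
```

Because every node here is ordinary library code producing plain numbers, the same graph yields the same samples wherever it runs. The only thing the platform has to provide is a way to write a buffer of samples to the audio output.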
In my opinion, the problem with HTML is the growing complexity and feature redundancy. The web is moving very fast, too fast maybe, probably in part because it’s a hugely competitive market. Everyone wants HTML to support the latest and coolest feature they need in the way they like best. Design by committee ensures that bloat accumulates. I think that one day, HTML is bound to crumble under its own weight, and something more minimalistic will take its place. Something designed from the ground up for web applications instead of static webpages. The app stores used on cellphones may be a hint of things to come.
Intelligence is a hard thing to define. The running joke in AI circles is that every time a computer manages to do something we previously thought required human-level intelligence, we raise the bar and define this something to be what intelligence is not. A good example of this is the game of chess. Not long ago, many thought that computers could never equal chess grandmasters. They believed being a chess grandmaster required a powerful intuition, a kind of magic that only humans could ever possess. In 1997, the efforts of the Deep Blue team at IBM demonstrated that this belief was quite mistaken.
Today, chess-playing computers are usually not thought of as intelligent machines. We tend to believe intelligence is something only humans are capable of. We see human intelligence as fundamentally different from what animals and computers do. Animals may not be able to build computers or write poems, but a quick examination of the world around us reveals that many animals have emotions, complex social dynamics, puzzle-solving abilities, the capability to fashion and use tools as well as some degree of self-awareness.
I don’t view intelligence as exclusively human. Fundamentally, I believe that intelligence is the capability of a system (biological or computerized) to dynamically adapt to its environment. One thing that greatly enhances the survivability of lifeforms is to be able to dynamically adapt their behavior in response to changing environmental conditions. Humans are an extreme example of this: our brains and nervous systems allow us to see with our eyes and interpret images, to move very rapidly, to understand our environment and devise elaborate plans going far into the future.
I believe that intelligence exists on a spectrum, or in varying degrees. Cats have a fairly powerful understanding of their environment too. They understand their own movements very well. They can predict the behavior of physical objects and their own body as a part of the physical world. Bees can see flowers, avoid obstacles and navigate based on light polarization patterns in the sky. Plants change the direction of their growth so they can more efficiently capture sunlight. Jellyfish have no brains, but they have a network of nerves which controls their swimming according to factors such as the day/night cycle.
At an even lower level, paramecia are unicellular organisms which can move around in water using cilia. Being just one cell, they have no neurons. Nevertheless, they can sense chemicals in the water surrounding them, detect obstacles, eat smaller prey organisms, reproduce sexually and have even shown some ability to learn. What is obvious is that even though paramecia have no brains or neurons, they contain chemical and genetic machinery that implements some kind of algorithmic process which changes their behavior based on environmental conditions.
You might be surprised to find out that some viruses are able to navigate inside host cells and express different genes (change their behavior) based on where they are inside these host cells. Viruses are thought of as non-lifeforms by many scientists, but they are definitely able to adapt. In fact, without even getting into dynamic changes in the expression of viral DNA, one could say that the fast mutation of viruses constitutes a form of adaptation in itself. Viruses reproduce so fast and in such large numbers, that they are able to do a brute-force search for new attack vectors on host organisms.
At the root of it all, natural selection is an optimization process that helps lifeforms adapt to their environment. One could argue that natural selection, a force of nature, is a kind of intelligence in and of itself. There is no intelligent design, but nature finds a way to make things more efficient through trial and error. I would argue that in a way, natural selection is a process that has been optimizing itself to help life spread and adapt more effectively. In the beginning, there were only unicellular organisms, and they reproduced through mutation only. Mutations are a very slow and inefficient way to adapt. Then came sexual reproduction. This allowed genes from multiple viable individuals to be recombined, increasing the rate at which viable genetic combinations can be explored, thus increasing adaptability.
Multicellular organisms arose. This gave rise to inter-cellular communication and nervous systems which allowed rapid movements. Nervous systems gave rise to centralized brains which could perform computations of ever-increasing complexity. Eventually, organisms came which had such complete models of their environment that they understood their own existence and wondered about the origin of life itself. Today, humans are intelligent enough that they are beginning to understand their own DNA. Soon, humans will be using their intelligence to make conscious choices about their own evolutionary destiny. This will allow us, as a species, to make evolutionary leaps that no other species ever has.
Over a month ago, I received reviews for the basic block versioning paper I submitted to ECOOP 2015. The reviews were again mixed, ranging from great to terrible. After 3 previous rejections at other conferences, I’d lowered my expectations and thought we were most likely in for a fourth refusal. To be honest, I was somewhat angry. The worst review we had received was unfair and harsh. Whoever had written it seemed to have a fairly poor understanding of JIT compilation, but that didn’t stop them from aggressively criticizing our work.
Fortunately for me, ECOOP offers a chance for submitters to respond to peer reviews, and I was able to channel my anger into motivation to write a very strong and detailed rebuttal. I still didn’t think our paper would make it, but at the very least, I thought, I gave it my all. I was pleasantly surprised when I learned earlier this month that the paper was accepted! Two reviewers increased the score they gave us, tipping the balance in our favor.
This is an important turning point for me, because it means the key contribution of my research work (the idea of basic block versioning) is getting published. I’m finally getting some official recognition from my academic peers for the past four years of hard work I poured into my research. Getting at least one publication from my PhD work was a personal goal I’d set for myself.
Now that this goal is reached, this paper offers me something to build upon. Publishing about additional contributions from my PhD work should be a fair bit easier now that I have my foot in the academic door. I’m already working on a new paper about typed object shapes, and I’ll be able to cite the ECOOP paper. I’ll be able to present further contributions as improvements over the work I already published, instead of trying to explain them in isolation.
And hey, I get to go give a talk in Prague! Sometimes I think being an academic isn’t all that bad.
In December, I submitted a paper about my work on basic block versioning to ECOOP 2015. Basic block versioning (BBV for short) is a technique to specialize machine code in order to eliminate type checks. It works by generating type-specialized versions of basic blocks on the fly based on the types observed during the execution of a program. It’s a way to type-specialize code without doing type analysis per se. The technique, as described in the ECOOP paper, works at the intraprocedural level (inside function bodies) and only deals with primitive types (i.e. integer, string, object, boolean). My goal with this paper was to explain basic block versioning in its simplest possible form and avoid complicating the explanation by having to delve into the details of various enhancements to the technique.
I still don’t know if the paper will be accepted or not, but in the meantime, my research has been advancing. I extended BBV to perform overflow check elimination on loop increments, which turned out to be a relatively trivial extension to make. I also implemented what I refer to as typed shapes, which is a mechanism to version code based on object shapes and property types that I feel meshes quite well with BBV. Most recently, I’ve been working on extending BBV to work interprocedurally. That is, allowing type information to cross function call boundaries. I see this last addition as the missing piece of the puzzle which completes the BBV approach and makes it into a viable complete alternative to traditional fixed point interprocedural type analyses.
My implementation of interprocedural BBV can be divided into at least four components:
- Generating type-specialized function entry points (passing types to callees)
- Type-specialized call continuations (passing return types back to callers)
- Shapes-changed “dirty” flag propagation to call continuations (shape mutation/aliasing info)
- Threading the global object shape through function calls
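The first two components can be sketched with a small Python model. As before, this is an illustrative assumption-laden toy rather than the actual implementation: the callee `double`, the call site name `site0`, the type tags and all helper names are hypothetical. The last two components (dirty flag propagation and global object shape threading) are not modelled here.

```python
from typing import Callable, Dict, Tuple

TypeTag = str  # illustrative tags such as "int" or "unknown"

# One entry point per (function name, argument types), with its return type.
entry_points: Dict[Tuple[str, Tuple[TypeTag, ...]], Tuple[Callable, TypeTag]] = {}
# One continuation per (call site, return type).
continuations: Dict[Tuple[str, TypeTag], Callable] = {}


def entry_for_double(arg_types: Tuple[TypeTag, ...]) -> Tuple[Callable, TypeTag]:
    """Return (code, return type) for `double(x)` given the caller's types."""
    key = ("double", arg_types)
    if key not in entry_points:
        if arg_types == ("int",):
            # The caller proved x is an int, so the callee skips its test.
            entry_points[key] = (lambda x: x * 2, "int")
        else:
            def generic(x):
                if not isinstance(x, int):
                    raise TypeError("expected int")
                return x * 2
            entry_points[key] = (generic, "int")
    return entry_points[key]


def continuation_for(call_site: str, return_type: TypeTag) -> Callable:
    """Return the code that runs after the call, specialized on its result."""
    key = (call_site, return_type)
    if key not in continuations:
        if return_type == "int":
            continuations[key] = lambda result: result + 1  # untested add
        else:
            def generic(result):
                if not isinstance(result, int):
                    raise TypeError("expected int")
                return result + 1
            continuations[key] = generic
    return continuations[key]


def call_double_then_increment(x, x_type: TypeTag):
    """Model the call site `double(x) + 1` with types flowing both ways."""
    code, return_type = entry_for_double((x_type,))
    return continuation_for("site0", return_type)(code(x))
```

In this model, argument types flow into the callee by selecting an entry point, and the return type flows back by selecting a continuation, so the code after the call never has to re-test a value whose type the callee already established.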
So far, I have the first two parts working on my development branch with all tests passing and all benchmarks working. The results are quite positive so far. I’ve been able to achieve a reduction in the number of type tag tests of 98% on average over 26 benchmarks. On many of the simpler benchmarks, there are essentially no type tests executed during the entire execution.
The results so far are impressive enough that I believe BBV may actually be eliminating more type tests than what is achievable with traditional type analyses. This is assuming that such analyses eliminate redundant type tests without versioning or duplicating code. I intend to explore this further as part of my PhD dissertation. I would like to be able to show that BBV is not only fast but actually more capable than iterative fixed-point type analyses.