It’s an opinion that often comes up in development circles. We’re in the middle of a sort of language boom right now, and every time someone comes along and announces a new language, other people suggest that we should all stop creating new programming languages, because there’s already too many out there. Common complaints include the amount of effort needed to constantly learn new languages, the fragmentation caused by the constant introduction of new languages and the lack of innovation in these.
If you feel that new languages require too much effort to learn, the first thing I have to say is: tough luck. Programming is very much about constantly learning new things and being able to learn on your own. Once you’ve mastered a few programming languages, you should begin to see repeating patterns and find that the skills you’ve already acquired are very transferable. For instance, I’ve never written a single line of Go, but I’ve used C, C++ and D. I’m sure I could get started writing Go code within a few hours, and become reasonably proficient within a week. Go doesn’t scare me. Another important thing to realize is that not knowing a programming language isn’t necessarily a flaw you have to correct. Case in point: there are still many job postings out there for COBOL programmers. It’s perfectly acceptable to specialize in one or a few languages of your choosing.
As for the issue of fragmentation, I think there’s truth to it, but it’s not as bad as people imagine. There are only a few languages out there which are truly mainstream. These have existed for years, and if you know just 3 out of the top 10, you’ll be well equipped to realize personal projects or land yourself a development job somewhere. The new programming languages that come out, for the most part, remain relatively fringe and are used mostly by hobbyists. Most of them will likely die out. I’ve written my PhD project in D, and I’ve found that surprisingly few programmers have ever written code in that language. In fact, none of the programmers I’ve met in person outside of DConf had ever used D.
The biggest kind of fragmentation problem, in my opinion, is self-inflicted. I’ve spoken to people at various companies who told me that their architecture was made of some mishmash of parts written in five or six different languages. That seems like an issue to me, if only because more languages means more dependencies, more breakage, more interoperability problems and more code maintenance issues. That’s not the fault of all these languages for existing though, it’s simply bad planning. The administration there let some capricious programmers get their wish and include new code written in their pet language in the system, with little regard for the added complexity this language soup would introduce.
There’s some argument to be made that many of the new languages coming out lack originality. At the moment, most of them are statically typed and compiled ahead of time and most of them have similar syntax. It’s true that there isn’t a lot of innovation overall, but I don’t think that’s a reason to stop creating new programming languages. The design space is huge, infinite in fact, and we’re only beginning to explore it, in my opinion. Consider that even to this day, all the commonly used languages are based on the editing of text files. Also remember that even the languages which don’t succeed on a large scale, such as LISP, can have a tremendous impact on other languages down the road. Imagine a world where there were only two programming languages in existence: COBOL and Fortran. Is that a world you would like to live in? I would argue that there is a need for new languages to solve new problems.
Earlier this year, my first paper about Basic Block Versioning (BBV) was accepted at ECOOP 2015. BBV is a JIT code generation technique I’ve been working on which is very effective for eliminating redundant dynamic type tests. In this first paper, we (my advisor and I) were able to show that our technique eliminates 71% of dynamic type tests across our set of benchmarks, resulting in significant performance improvements. Last week, I submitted a paper about an interprocedural extension to BBV. This extends on the original work we did by generalizing the technique to propagate type information through function calls, about function parameters and return types as well.
The improvements are quite striking. We’re now able to eliminate 94.3% of dynamic type tests on average, and we eliminate more than 80% of type tests on every benchmark. To put things in perspective, I decided to compare this result with what’s achievable using a static type analysis. I devised a scheme to give me an upper bound on the number of type tests a static analysis could possibly eliminate. First, execute all the benchmarks and record the result of each type tests. Then, re-execute the benchmarks with the type tests that always evaluate to the same result removed. This is equivalent to using a static type analysis with access to “perfect” information about which type tests are going to be redundant. The resuls of this experiment are shown in the graph below:
I was very pleased when I first saw these results. The “perfect” static analysis eliminates, on average, 91.7% of type tests, which is less than what we achieve with interprocedural BBV. You might be wondering how this is possible, how BBV can possibly eliminate more type tests than what should be an upper bound on the number of type tests that can be eliminated. The main point is that the analysis is just an oracle that tells us whether any given type test is going to be redundant and safe to eliminate or not. In contrast, BBV has the power to selectively duplicate sections of code, which makes it possible to eliminate even more type tests.
The main reason that code duplication (or versioning), is useful, is that it allows BBV to separate out contextual information which wasn’t present in the original untransformed program. If you want a simple example, think of a loop where some variable x is an integer in the first iteration, and then becomes a string in every subsequent iteration. A traditional type analysis will see that this variable could be either an integer or a string, and conclude that we don’t know what type this variable will have at run time. In contrast, BBV might be able to unroll the first iteration of the loop, and know that it will be integer in this first iteration, and a string in every other iteration, thus eliminating all type tests on the variable x.
There are interesting implications to the results obtained with BBV, in my opinion. I was recently talking to someone at Facebook about HHVM, their virtual machine for PHP and Hack. They were telling me that Hack has gradual typing, and that this information isn’t yet used to optimize code in HHVM. Based on the results I got, I would say that there is probably no real need to type-annotate code in order to improve performance. Gradual typing can be useful for documentation, tools and safety, but it doesn’t really offer performance advantages. BBV can already eliminate over 94% of type tests on average, and this is only the beginning, there are still many easy ways to further improve upon these results. Alternatively, flipping what I just said on its head, if a JIT compiler can known the types of most variables at code generation time, the type assertions introduced by gradual typing can be implemented at little to no cost.
I recently blogged about my thoughts on the medium-term future of the internet, and the imminent coming of the Smart Web. There’s been a huge amount of progress in machine learning in the last five years, largely due to breakthroughs in deep learning. You might not be directly aware of it, but we’re at the beginning of a machine learning boom right now, a neural network renaissance. Google and Facebook are pouring huge amounts of money into deep learning. In the next few years, we’re going to see the fruits of these investments. Self-driving cars, automatic closed captions and more accurate machine translation come to mind, but I would argue that the ramifications are going to quickly expand much beyond this. If you think computers and the internet have changed the world in the last 20 years, you should really brace yourself for what’s coming, because really, that was just a warm up.
A few day ago, I interviewed at a web advertisement company in New York. Let’s call them Cloud7. They explained to me that they do Real-Time Bidding (RTB). According to them, every major internet ad provider does this now. When you click on a link and start loading a webpage, the ad provider gets blobs of data providing them with a rough idea of who you are (age, sex, income bracket), the websites you’ve been to, what you’ve been shopping for, etc. Many advertisers, wanting to sell you their products, then get to bid some amount (cents, fractions of cents) to buy ad spaces on the page you’re loading. Multiple ad auctions are over in tens of milliseconds before the page is done loading. If you’re wealthy and you’ve been visiting many car websites recently, then car vendors might be willing to outbid everyone to show you car ads, because they stand to make much more money selling you a car then a shoe company would selling you shoes.
You’ll be interested to know that the web advertisement world is already set up so that the information ad providers like Cloud7 receive about you is in part supplied by outfits referred to as third party data providers. There is already, as of now, a market in place for APIs that can produce information about visitors to a webpage. Information about you is already automatically gathered by multiple entities, traded for a monetary value and used to better pick the ads you see. The technology is somewhat primitive right now, but it’s improving constantly. Improvements in ad targeting can translate in huge revenue increases, so there’s a clear incentive to make these systems smarter. There’s a huge incentive to gather a richer set of information about you, and the market to buy and sell that information is already in place.
What deep learning will allow us to do is to bridge the semantic gap between the fuzzy thing that is the real world, and the symbolic world computer programs operate in. Simply put, machines will soon have much more understanding of the world than they currently do. A few years from now, you’ll take a picture of your friend Sarah eating an ice cream cone, and some machine in the cloud will recognize Sarah in the said picture. It will know that she’s eating ice cream, probably chocolate flavored by the color of it. Facial expression recognition will make it possible to see that she looks excited with a hint of insecurity. Combining information from multiple third party data providers, it won’t be too difficult to infer that you and Sarah are on your third date together. Looking at browsing history and social network profiles, it might be possible to have a pretty good idea how you two feel about each other, and whether this relationship is going to flourish or perish. What you yourself don’t know is that Sarah wanted to impress you so much, she went out and bought a new dress to wear on this date during her lunch break. Odds are you two will see each other again.
Why would Google or Facebook care about your date with Sarah, and your feelings for each other? Because that information can be useful and valuable in the right hands, which makes that information worth money. You might be more interested in having meals at fancy restaurants near her work in the next few weeks, or in buying that PlayStation 5 game she’s been talking about. Personal lubricant, scented candle and fluffy handcuff manufacturers think you might be more interested in their products than before. I don’t think this is so far-fetched. Google, Facebook, Amazon and every consumer-facing company out there want your money. The better they understand you, your life, and the world, the better chance they have at successfully getting you to hand them your cash. They might actually make your internet experience way more fun in the process. At the very least, the ads you see are going to be increasingly smart and relevant, which isn’t necessarily a bad thing.
Unfortunately, not everyone has “Don’t Be Evil” as their company motto. There’s another group of businesspeople, besides advertisers, which stands to profit hugely from machine learning. The people I’m talking about are scammers. Deep learning can be used to recognize people and objects, extract semantic information out of pictures, videos and tweets, but that’s not all it’s useful for. As illustrated in this amazing blog post, neural networks can also be used to generate content. Soon enough, scammers might be able to automatically produce content that begins to look eerily real. I don’t think it’s that far-fetched to think that your writing style could be imitated, complete with accurate details of your life thrown in. What if there was a program that could generate fake naked pictures of you and e-mail them to people you know? Worse, what if it were possible for a piece of software to call people you know and impersonate your voice on the phone? Sure, the machine doing the calling isn’t self-aware, but if it can have some rudimentary understanding of what people say to it and follow some kind of script, that might still be enough to cause a lot of trouble.
I found an interesting talk on YouTube titled “What Killed Smalltalk could Kill Ruby Too”:
Robert Martin makes the case that one of the big weaknesses of Smalltalk is that it was just “too easy to make a mess”. Smalltalk was highly dynamic, and encouraged people to “monkey patch” things and do quick fixes/hacks. He also makes the point that Smalltalk just “didn’t play well with others”. When you think about it, Smalltalk had its own source control, IDE and GUI built into live images, living alongside your program. Smalltalk isn’t just a language, it’s an operating system and a way of life. It’s conflating things that would maybe be best left separate.
It seems to me that in some key areas, the Smalltalk creators placed their own radical ideas above everything else. They chose idealism over pragmatism. Smalltalk was a language created with a grandiose vision. It had some deeply rooted principles which didn’t necessarily work so well in practice, such as the idea that everything had to be an object, that the object metaphor should be applied everywhere, one size fits all. At the end of the day, programmers want to get things done and be productive. If the language design or implementation gets in the way of getting things done, people will leave. Pragmatism is key for a programming language to succeed.
Smalltalk was also designed with the idea that it should be easy to learn and intuitive. This has led its creators to have a heavy focus on graphical user interfaces. I watched an introduction to Self on YouTube (Self is a direct descendent of Smalltalk) and saw the heavy emphasis on interacting with objects through UIs. The user interfaces showcased in this video are, in my opinion, horribly complex and unintuitive. Pretty much all of the interactions done through the UI would have been simpler and easier to understand if they had been done by writing one or two lines of code instead!
When you sit down and think about it for one second, you have to realize that programming doesn’t fundamentally have anything to do with graphical user interfaces. Yes, you can use programming code to create GUIs, but there is no reason that programming should have to involve GUIs and be tied to them. The metaphor of writing code has been extremely successful since the very beginning, and it probably makes more sense to the mathematical mind of a skilled programmer. Not everything has to have a visual metaphor. This is again a case of pushing some idealistic principle too far, in my opinion.
I believe that a lack of pragmatism is something that has killed many languages. Not just Smalltalk, but Scheme too. My first experience with Scheme involved trying and failing to install multiple Scheme distributions because I couldn’t get all the dependencies to work. Then, finally getting a Scheme compiler installed, and struggling to implement simple routines to parse text files, because Scheme doesn’t include the most basic string routines. The Scheme compiler I’d selected bragged that the code it produced was highly optimized, but once I finally managed to write my own string routines, I compiled my program, ran it, and it was dog slow. Parsing a one-megabyte CSV spreadsheet took over a minute. I ended up rewriting the code in Python. Why don’t more people code in Scheme? Because they try to realize their ideas in Scheme, and it just doesn’t quite work out.
I’ve recently been thinking about the topic of originality. You’ll often hear people say that “it’s all been done before” and “what’s old is new again”. The world population has recently passed the 7 billion mark. According to some estimates, there have been up to 120 billion human beings alive since the dawn of humanity. In a world so big, it’s hard to believe you’re unique. It’s easy to feel irrelevant and worthless. Some philosophers have even tried to make the argument that all possible thoughts have been thought of before, leaving you no chance of ever coming up with anything original. After all, human beings have existed for hundreds of thousands of years, and if there’s been 120 billion of us so far, there’s been a lot of thinking going on.
I think the best way to answer this question is with a thought experiment. We don’t know enough about neuroscience to exactly define what a “thought” constitutes. I’ll make some simplifying assumptions to give us some chance to grasp at this problem.
Let’s imagine that:
- Thoughts are patterns of neural firings in a small cluster of 512 neurons in your brain.
- Every human being has this same neural cluster.
- The wiring of the thought cluster entirely fixed, and identical in every individual, not affected by environment or genetics.
- Neurons in the thought cluster fire in a synchronized manner, 1000 times per second
In this imagined view, each thought is representable by a boolean vector of 512 bits, and any brain can have up to 1000 thoughts per second. In our imagined, simplified world, there are (2^512) ~= 1.34×10^154 possible thoughts in total.
Using some back of the envelope math, assuming there have been 120 billion human beings alive so far, each living for 100 years, each having up to 1000 possible thoughts per second, this gives us:
1000 * (365 * 24 * 60 * 60) * 100 ~= 3.2 * 10^12 thoughts per human being over a 100 year lifespan.
Hence (120 * 10^9) * (3.2 * 10^12) = 3.84×10^23 thoughts happened so far, out of 1.34×10^154 possible thoughts.
You might be wondering what the point of this was. My example is obviously ridiculous. Human thoughts likely are not patterns of firings in a cluster of 512 neurons. We have tens of billions of neurons in our brains, each with thousands of synapses, and our neurons do not fire according to a synchronous clock like a modern silicon chip. Furthermore, each brain’s connectivity is uniquely affected by a combination of both environment and genetics, and hence, no two people have exactly the same neurons and synapses in the same place.
The point is that the estimate of 1.34×10^154 possible thoughts is probably off by one hundred orders of magnitude. However, the estimate of 3.2 * 10^10 thoughts per year per human being may actually be generous. Hence, I surmise that not every possible thought has been thought. Far from it. The universe will likely dissipate before that has any chance of happening.