This is my bet: the age of dynamic languages is over. There will be no new successful ones.
Like him, I’ve noticed that although an enormous number of new programming languages have come out recently, the overwhelming majority of them are statically typed. Elben and others argue that this is because static languages are better equipped to deal with larger projects, have better tooling, and are preferred by programmers.
Dynamic languages are at a disadvantage. Most of the mainstream ones out there today were designed by amateurs: people with no formal CS background, or at least no adequate background in compiler construction. They were designed with little regard for performance, and with an impractical mash of features that often work poorly together. Most of the dynamic languages you know are simply poorly crafted. This has resulted in some backlash. I think it’s pretty clear that there’s some amount of prejudice when it comes to dynamic languages.
In universities, computer science professors generally want little to do with dynamic languages. Compiler design courses focus on statically typed, compiled languages. Type theory courses will teach you about Hindley-Milner type inference, but will leave you ill-equipped to understand dynamic typing. Students coming out of your average university compiler and PLT classes have no idea about the challenges involved in creating a JIT compiler, and know little about dynamic typing. It’s no big surprise that these students would go on to create statically typed programming languages.
There might be another factor at play here. Dynamic languages such as PHP, JS, Python and Ruby, in addition to being relatively poorly designed, are the languages that powered the explosion of the web. Nowadays, much (most?) of the programming happening in the world is web development. Much of this work is done by people with no formal CS background. The result is that you have millions of people with less formal education writing code in less well designed languages. JS, Python, PHP and Ruby, and by extension all dynamic languages, are seen by many academics as the programming languages of the riffraff, the unwashed masses, or maybe simply programming languages for relatively ignorant beginners.
Have static languages won? It seems to me that what people really like about static languages is IDE support for things like simple refactorings and autocompletion, and program analysis that can provide some guarantees and find certain classes of bugs without having to run programs on every possible combination of inputs. It’s perfectly legitimate for programmers to want these things. They help alleviate the cognitive burden of working with large (and small) codebases. But these advantages aren’t inherently advantages of statically typed programming languages. I would argue that Smalltalk had (has) some amazingly powerful tools that go way beyond what the Eclipse IDE could ever give you.
I believe dynamic languages are here to stay. They can be very nimble, in ways that statically typed languages might never be able to match. We’re at a point in time where static typing dominates mainstream thought in the programming world, but that doesn’t mean dynamic languages are dead. So long as dynamic languages do offer advantages, either in terms of expressiveness or ease of learning, they will still be around. You should remember that, in the end, there is no war between static and dynamic languages. There are only tools and tradeoffs.
I will conclude by saying that in my view, programming languages are constantly evolving and influencing each other in what seems like an organic process. Features that are viewed as good ideas tend to spread from one language to many others. Think about closures, for instance. The functional crowd has been in love with closures since the days of LISP, and now, almost all mainstream programming languages have closures. Going back to Elben Shira’s post, he states:
We will see a flourishing of languages that feel like you’re writing in a Clojure, but typed. Included will be a suite of powerful tools that we’ve never seen before, tools so convincing that only ascetics will ignore.
I’ve written, back in 2012, about my belief that static and dynamic typing can essentially be combined. You can have statically compiled languages that use type inference to realize what is effectively dynamic typing. That is, the compiler inserts union types where appropriate, and does so automatically. The Crystal programming language is a realization of this idea. This isn’t static languages winning a war over dynamic languages though. It’s the influence of dynamic languages bleeding into static languages. The Crystal language developers are entirely honest about the fact that this language is based on Ruby. Their aim is to build a language which captures much of the flexibility of Ruby’s dynamic typing and also provides you with static guarantees.
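As a toy illustration of the idea (this is my own sketch, not Crystal’s actual inference algorithm), a compiler can type a variable as the union of the types assigned to it along every control-flow path, and only keep a runtime type dispatch where that union has more than one member:

```python
# Toy sketch of compiler-style union typing (illustrative only, not
# Crystal's implementation): the inferred type of a variable is the
# union of the types flowing into it from each branch.

def union_of(*branch_values):
    """Union the types assigned to a variable across branches."""
    return {type(v).__name__ for v in branch_values}

# x = 1 on one branch, x = "one" on the other:
# the compiler would type x as something like Int32 | String.
print(union_of(1, "one"))   # {'int', 'str'} -> runtime dispatch needed

# Where the union collapses to a single type, no type test is needed
# and the code can be compiled exactly as in a static language.
print(union_of(1, 2))       # {'int'} -> no dispatch
```

The point of the sketch is that dynamic-looking code only pays a dispatch cost at the program points where the inferred union is genuinely ambiguous.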
About a year ago, I was considering postdoc options. Two university professors had invited me to join their research groups. I wasn’t quite sure what to do. I like research, and I like the work I’ve been doing as part of my PhD. On the other hand, I’ve been growing increasingly frustrated with academia, and more specifically, with the publication game. I’ve had papers rejected several times now. More than once, reviewers who were clearly associated with competing research projects (and did not try to disguise this fact) have shot down my work with unfair, intellectually dishonest and sometimes hostile criticism. In general, I’ve come to feel that, at least in my sub-field, the exploration of new ideas is discouraged; I’m not being judged on the quality of my work, but on how well I play the publication game.
The end of my PhD is just a few months away now, and I had to make a choice. I was tempted to continue my research on basic block versioning, but the prospect of working very hard and maybe not being able to publish a single paper made me uncomfortable. Another issue is that the two professors who approached me for postdocs made it clear that I should apply for a postdoc scholarship: they didn’t have enough money to pay me, and if I didn’t get the scholarship, I couldn’t do a postdoc. They were also pressuring me to decide as fast as possible; I came to understand that postdoc positions are limited and the environment is very competitive.
Recently, I attended a conference and got to meet a researcher who’s pretty well known in my field. He’s someone I really look up to, someone whose name I’d seen on several papers that have shaped the development of my own research. We had dinner together a few times during the conference, and discussed various topics. One of the things that really struck me, though, is that this guy is in the process of hopping from postdoc to postdoc. He’s struggling to publish his research, getting many of his papers shot down, and having difficulty finding a position as a university professor. He’s a much better academic than I am, and he’s still struggling.
In the meantime, I never really looked for a job, but I’ve been approached by IBM, Facebook, Microsoft, Google, Apple, Twitter, Amazon, Autodesk, AppNexus, Two Sigma, Reservoir Labs, D-Wave, and a few startups. It’s been a stark contrast. On the one hand, academia is offering me a chance to maybe do a postdoc, but only if I’m deemed good enough by the people who judge scholarship applications, and I have to decide now. On the other hand, industry people are bending over backwards to try and get me to come talk to them. I decided to go out and try interviewing for some of these companies, and over a month ago, I made my decision. I signed a generous offer from a company in the Bay Area.
I have no illusions that industry is some amazing utopia. I’m sure it will take me some time to adapt, and that I’ll miss some of the perks of being an academic. I know I’ll also miss Montreal, the city where I was born and raised. Still, I’ve been in university for over 11 years now, and I really think it’s time for me to try something different. I think that if I continued on the academic path, I’d be headed for stagnation and a burnout. Industry, in contrast, seems full of opportunities to explore. And hey, it won’t hurt that I’ll be making over six times what I get as a PhD student. For the last two years, I’ve been renting a tiny bedroom with a window on a noisy street, and sleeping on an uncomfortable futon that’s hurting my back. One of the first things I’m buying when I make it to California is a queen-sized bed, and the best mattress that money can buy.
The last three years have seen a slew of impressive deep learning feats. Machines are finally able to effectively extract meaning from the fuzzy thing that is the real world. This is already enabling all kinds of things, from smarter websites to self-driving cars. I’m very enthusiastic about this rapid progress. I think it’s becoming very clear that in the next 20 years, the world will see an increasing robotic presence. I don’t mean to say that androids will instantly become ubiquitous. We’re obviously not there, both in terms of AI capabilities, and in terms of being able to produce lightweight, affordable and nimble robots. What I mean to say is that many of us have already accepted self-driving cars as an inevitability, since there are already working prototypes. The question is then, why stop there?
How much of a stretch is it to go from a self-driving car to a street-sweeping robot? What about an automated garbage-disposal truck with a robot arm that picks up garbage bags without human workers? What about an automated lawnmower you can control with your smartphone? It’s obviously going to take some time for these things to be developed, but I think we can all agree that the automated garbage-disposal truck is not that far-fetched. How much more of a stretch is it to go from such a robot to one that restocks store shelves? Those who have warned us of robots taking human jobs are most likely right.
In the domestic realm, there is already a Berkeley prototype of a robot that can fold laundry. I think one of the biggest hurdles there is going to be the cost. The economic reality is that going from a university prototype to a useful product requires several years and a large monetary investment in R&D. Furthermore, no investors are going to fund the development of such products if the number of people who can afford to buy them is too small to make a profit. This means that even though we’re very close to being technologically capable of building domestic robots, it’s going to take some time before they reach the market.
Still, I think at this point, it’s only a matter of time. Robots are going to become increasingly present in the world around us, and this will likely generate a feedback cycle. The range of robotic capabilities will expand, the cost of components will go down, and as robots enter the global mindshare, people will become increasingly likely to want to apply robotics to various tasks. We’re about to see a robotic revolution. At this point, it seems inevitable.
It’s an opinion that often comes up in development circles. We’re in the middle of a sort of language boom right now, and every time someone comes along and announces a new language, other people suggest that we should all stop creating new programming languages, because there are already too many out there. Common complaints include the amount of effort needed to constantly learn new languages, the fragmentation caused by the constant introduction of new languages, and the lack of innovation in these new languages.
If you feel that new languages require too much effort to learn, the first thing I have to say is: tough luck. Programming is very much about constantly learning new things and being able to learn on your own. Once you’ve mastered a few programming languages, you should begin to see repeating patterns and find that the skills you’ve already acquired are very transferable. For instance, I’ve never written a single line of Go, but I’ve used C, C++ and D. I’m sure I could get started writing Go code within a few hours, and become reasonably proficient within a week. Go doesn’t scare me. Another important thing to realize is that not knowing a programming language isn’t necessarily a flaw you have to correct. Case in point: there are still many job postings out there for COBOL programmers. It’s perfectly acceptable to specialize in one or a few languages of your choosing.
As for the issue of fragmentation, I think there’s truth to it, but it’s not as bad as people imagine. There are only a few languages out there which are truly mainstream. These have existed for years, and if you know just 3 out of the top 10, you’ll be well equipped to realize personal projects or land yourself a development job somewhere. The new programming languages that come out, for the most part, remain relatively fringe and are used mostly by hobbyists. Most of them will likely die out. I’ve written my PhD project in D, and I’ve found that surprisingly few programmers have ever written code in that language. In fact, none of the programmers I’ve met in person outside of DConf had ever used D.
The biggest kind of fragmentation problem, in my opinion, is self-inflicted. I’ve spoken to people at various companies who told me that their architecture was made of some mishmash of parts written in five or six different languages. That seems like an issue to me, if only because more languages means more dependencies, more breakage, more interoperability problems and more code maintenance issues. That’s not the fault of all these languages for existing, though; it’s simply bad planning. The administration there let some capricious programmers get their wish and include new code written in their pet language in the system, with little regard for the added complexity this language soup would introduce.
There’s some argument to be made that many of the new languages coming out lack originality. At the moment, most of them are statically typed and compiled ahead of time and most of them have similar syntax. It’s true that there isn’t a lot of innovation overall, but I don’t think that’s a reason to stop creating new programming languages. The design space is huge, infinite in fact, and we’re only beginning to explore it, in my opinion. Consider that even to this day, all the commonly used languages are based on the editing of text files. Also remember that even the languages which don’t succeed on a large scale, such as LISP, can have a tremendous impact on other languages down the road. Imagine a world where there were only two programming languages in existence: COBOL and Fortran. Is that a world you would like to live in? I would argue that there is a need for new languages to solve new problems.
Earlier this year, my first paper about Basic Block Versioning (BBV) was accepted at ECOOP 2015. BBV is a JIT code generation technique I’ve been working on which is very effective for eliminating redundant dynamic type tests. In this first paper, we (my advisor and I) were able to show that our technique eliminates 71% of dynamic type tests across our set of benchmarks, resulting in significant performance improvements. Last week, I submitted a paper about an interprocedural extension to BBV. This extends the original work by generalizing the technique to propagate type information through function calls, covering function parameters and return types as well.
The improvements are quite striking. We’re now able to eliminate 94.3% of dynamic type tests on average, and we eliminate more than 80% of type tests on every benchmark. To put things in perspective, I decided to compare this result with what’s achievable using a static type analysis. I devised a scheme to give me an upper bound on the number of type tests a static analysis could possibly eliminate. First, execute all the benchmarks and record the result of each type test. Then, re-execute the benchmarks with the type tests that always evaluate to the same result removed. This is equivalent to using a static type analysis with access to “perfect” information about which type tests are going to be redundant. The results of this experiment are shown in the graph below:
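The two-pass scheme above can be sketched in a few lines (the names here, like `type_test` and `run_benchmark`, are illustrative, not the actual Higgs instrumentation): instrument every type test to log its outcome, then mark as removable the test sites that always produced the same result.

```python
# Hedged sketch of the upper-bound experiment: pass 1 records the
# outcome of every dynamic type test; a site whose outcome never
# varies is one a "perfect" static analysis could have eliminated.
from collections import defaultdict

TEST_LOG = defaultdict(set)  # test site id -> set of observed outcomes

def type_test(site_id, value, expected_type):
    """An instrumented dynamic type test (illustrative)."""
    result = isinstance(value, expected_type)
    TEST_LOG[site_id].add(result)
    return result

def run_benchmark():
    """Stand-in for a real benchmark run."""
    for v in [1, 2, "three", 4]:
        if type_test("site_a", v, int):   # outcome varies: must stay
            pass
        type_test("site_b", v, object)    # always True: removable

run_benchmark()
removable = [s for s, outcomes in TEST_LOG.items() if len(outcomes) == 1]
print(removable)  # only site_b always evaluated the same way
```

In pass 2, the benchmarks would be re-run with the `removable` sites stripped out, giving the “perfect analysis” numbers in the graph.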
I was very pleased when I first saw these results. The “perfect” static analysis eliminates, on average, 91.7% of type tests, which is less than what we achieve with interprocedural BBV. You might be wondering how this is possible, how BBV can possibly eliminate more type tests than what should be an upper bound on the number of type tests that can be eliminated. The main point is that the analysis is just an oracle that tells us whether any given type test is going to be redundant and safe to eliminate or not. In contrast, BBV has the power to selectively duplicate sections of code, which makes it possible to eliminate even more type tests.
The main reason that code duplication (or versioning) is useful is that it allows BBV to separate out contextual information which wasn’t present in the original, untransformed program. If you want a simple example, think of a loop where some variable x is an integer in the first iteration, and then becomes a string in every subsequent iteration. A traditional type analysis will see that this variable could be either an integer or a string, and conclude that we don’t know what type this variable will have at run time. In contrast, BBV might be able to unroll the first iteration of the loop, and know that x will be an integer in this first iteration, and a string in every other iteration, thus eliminating all type tests on the variable x.
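The loop example can be made concrete with a small counting sketch (my own illustration, not the actual JIT machinery): under the analysis-only view every iteration keeps its type test, while peeling the first iteration leaves each version with a single known type for x and no tests at all.

```python
# Illustrative sketch: count dynamic type tests on the loop where x is
# an int on the first iteration and a str on every later one.
tests_executed = 0

def type_is(value, ty):
    """A counted dynamic type test."""
    global tests_executed
    tests_executed += 1
    return isinstance(value, ty)

values = [1] + ["s"] * 9  # x: int on iteration 1, str afterwards

# Analysis-only view: x may be int or str at this program point, so the
# type test must stay in place on every iteration.
for x in values:
    if type_is(x, int):
        pass  # integer path
    else:
        pass  # string path
analysis_tests = tests_executed

# Versioned view: peel the first iteration. Each version then sees a
# single incoming type for x, so no dynamic test is needed at all.
tests_executed = 0
first, rest = values[0], values[1:]
# version 1 (x known to be int): integer path, no test emitted
# version 2 (x known to be str): string path, no test emitted
for x in rest:
    pass
versioned_tests = tests_executed

print(analysis_tests, versioned_tests)  # 10 vs 0
```

This is exactly how versioning can beat the “perfect” analysis: the oracle can only delete tests that are redundant everywhere, while versioning manufactures contexts in which they become redundant.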
There are interesting implications to the results obtained with BBV, in my opinion. I was recently talking to someone at Facebook about HHVM, their virtual machine for PHP and Hack. They were telling me that Hack has gradual typing, and that this information isn’t yet used to optimize code in HHVM. Based on the results I got, I would say that there is probably no real need to type-annotate code in order to improve performance. Gradual typing can be useful for documentation, tools and safety, but it doesn’t really offer performance advantages. BBV can already eliminate over 94% of type tests on average, and this is only the beginning: there are still many easy ways to further improve upon these results. Alternatively, flipping what I just said on its head, if a JIT compiler can know the types of most variables at code generation time, the type assertions introduced by gradual typing can be implemented at little to no cost.
I recently blogged about my thoughts on the medium-term future of the internet, and the imminent coming of the Smart Web. There’s been a huge amount of progress in machine learning in the last five years, largely due to breakthroughs in deep learning. You might not be directly aware of it, but we’re at the beginning of a machine learning boom right now, a neural network renaissance. Google and Facebook are pouring huge amounts of money into deep learning. In the next few years, we’re going to see the fruits of these investments. Self-driving cars, automatic closed captions and more accurate machine translation come to mind, but I would argue that the ramifications are going to quickly expand much beyond this. If you think computers and the internet have changed the world in the last 20 years, you should really brace yourself for what’s coming, because really, that was just a warm up.
A few days ago, I interviewed at a web advertisement company in New York. Let’s call them Cloud7. They explained to me that they do Real-Time Bidding (RTB). According to them, every major internet ad provider does this now. When you click on a link and start loading a webpage, the ad provider gets blobs of data providing them with a rough idea of who you are (age, sex, income bracket), the websites you’ve been to, what you’ve been shopping for, etc. Many advertisers, wanting to sell you their products, then get to bid some amount (cents, fractions of cents) to buy ad spaces on the page you’re loading. Multiple ad auctions are over in tens of milliseconds, before the page is done loading. If you’re wealthy and you’ve been visiting many car websites recently, then car vendors might be willing to outbid everyone to show you car ads, because they stand to make much more money selling you a car than a shoe company would selling you shoes.
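The core of the auction mechanics described above fits in a few lines. This is a hedged sketch, not Cloud7’s system: real exchanges use protocols like OpenRTB, and I’m assuming a second-price auction here (the winner pays the runner-up’s bid), which is one common design.

```python
# Minimal sketch of one RTB auction (illustrative; second-price
# auction assumed, bids in cents or fractions of cents).

def run_auction(bids):
    """bids: {advertiser: bid}. Highest bidder wins the ad slot and
    pays the second-highest bid."""
    if len(bids) < 2:
        raise ValueError("need at least two bidders")
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    winner = ranked[0][0]
    price_paid = ranked[1][1]  # second-highest bid
    return winner, price_paid

# A wealthy visitor who has browsed car sites: the car vendor values
# the impression far more than anyone else and outbids them all.
bids = {"car_vendor": 42, "shoe_shop": 3, "generic_brand": 5}
winner, price = run_auction(bids)
print(winner, price)  # car_vendor 5
```

A second-price design is popular because it lets advertisers bid their true valuation; the entire call, including the network round trips to each bidder, has to complete within the page-load budget of a few tens of milliseconds.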
You’ll be interested to know that the web advertisement world is already set up so that the information ad providers like Cloud7 receive about you is in part supplied by outfits referred to as third party data providers. There is already, as of now, a market in place for APIs that can produce information about visitors to a webpage. Information about you is already automatically gathered by multiple entities, traded for a monetary value and used to better pick the ads you see. The technology is somewhat primitive right now, but it’s improving constantly. Improvements in ad targeting can translate into huge revenue increases, so there’s a clear incentive to make these systems smarter. There’s a huge incentive to gather a richer set of information about you, and the market to buy and sell that information is already in place.
What deep learning will allow us to do is to bridge the semantic gap between the fuzzy thing that is the real world, and the symbolic world computer programs operate in. Simply put, machines will soon have much more understanding of the world than they currently do. A few years from now, you’ll take a picture of your friend Sarah eating an ice cream cone, and some machine in the cloud will recognize Sarah in the picture. It will know that she’s eating ice cream, probably chocolate flavored, judging by the color. Facial expression recognition will make it possible to see that she looks excited with a hint of insecurity. Combining information from multiple third party data providers, it won’t be too difficult to infer that you and Sarah are on your third date together. Looking at browsing history and social network profiles, it might be possible to have a pretty good idea how you two feel about each other, and whether this relationship is going to flourish or perish. What you yourself don’t know is that Sarah wanted to impress you so much, she went out and bought a new dress during her lunch break to wear on this date. Odds are you two will see each other again.
Why would Google or Facebook care about your date with Sarah, and your feelings for each other? Because that information can be useful and valuable in the right hands, which makes that information worth money. You might be more interested in having meals at fancy restaurants near her work in the next few weeks, or in buying that PlayStation 5 game she’s been talking about. Personal lubricant, scented candle and fluffy handcuff manufacturers think you might be more interested in their products than before. I don’t think this is so far-fetched. Google, Facebook, Amazon and every consumer-facing company out there want your money. The better they understand you, your life, and the world, the better chance they have at successfully getting you to hand them your cash. They might actually make your internet experience way more fun in the process. At the very least, the ads you see are going to be increasingly smart and relevant, which isn’t necessarily a bad thing.
Unfortunately, not everyone has “Don’t Be Evil” as their company motto. There’s another group of businesspeople, besides advertisers, which stands to profit hugely from machine learning. The people I’m talking about are scammers. Deep learning can be used to recognize people and objects, extract semantic information out of pictures, videos and tweets, but that’s not all it’s useful for. As illustrated in this amazing blog post, neural networks can also be used to generate content. Soon enough, scammers might be able to automatically produce content that begins to look eerily real. I don’t think it’s that far-fetched to think that your writing style could be imitated, complete with accurate details of your life thrown in. What if there was a program that could generate fake naked pictures of you and e-mail them to people you know? Worse, what if it were possible for a piece of software to call people you know and impersonate your voice on the phone? Sure, the machine doing the calling isn’t self-aware, but if it can have some rudimentary understanding of what people say to it and follow some kind of script, that might still be enough to cause a lot of trouble.