I found an interesting talk on YouTube titled “What Killed Smalltalk could Kill Ruby Too”.
Robert Martin makes the case that one of the big weaknesses of Smalltalk is that it was just “too easy to make a mess”. Smalltalk was highly dynamic, and encouraged people to “monkey patch” things and do quick fixes/hacks. He also makes the point that Smalltalk just “didn’t play well with others”. When you think about it, Smalltalk had its own source control, IDE and GUI built into live images, living alongside your program. Smalltalk isn’t just a language, it’s an operating system and a way of life. It’s conflating things that would maybe be best left separate.
It seems to me that in some key areas, the Smalltalk creators placed their own radical ideas above everything else. They chose idealism over pragmatism. Smalltalk was a language created with a grandiose vision. It had some deeply rooted principles which didn’t necessarily work so well in practice, such as the idea that everything had to be an object, that the object metaphor should be applied everywhere, one size fits all. At the end of the day, programmers want to get things done and be productive. If the language design or implementation gets in the way of getting things done, people will leave. Pragmatism is key for a programming language to succeed.
Smalltalk was also designed with the idea that it should be easy to learn and intuitive. This led its creators to focus heavily on graphical user interfaces. I watched an introduction to Self on YouTube (Self is a direct descendant of Smalltalk) and saw the heavy emphasis on interacting with objects through UIs. The user interfaces showcased in this video are, in my opinion, horribly complex and unintuitive. Pretty much all of the interactions done through the UI would have been simpler and easier to understand if they had been done by writing one or two lines of code instead!
When you sit down and think about it for one second, you have to realize that programming doesn’t fundamentally have anything to do with graphical user interfaces. Yes, you can use programming code to create GUIs, but there is no reason that programming should have to involve GUIs and be tied to them. The metaphor of writing code has been extremely successful since the very beginning, and it probably makes more sense to the mathematical mind of a skilled programmer. Not everything has to have a visual metaphor. This is again a case of pushing some idealistic principle too far, in my opinion.
I believe that a lack of pragmatism is something that has killed many languages. Not just Smalltalk, but Scheme too. My first experience with Scheme involved trying and failing to install multiple Scheme distributions because I couldn’t get all the dependencies to work. Then, finally getting a Scheme compiler installed, and struggling to implement simple routines to parse text files, because Scheme doesn’t include the most basic string routines. The Scheme compiler I’d selected bragged that the code it produced was highly optimized, but once I finally managed to write my own string routines, I compiled my program, ran it, and it was dog slow. Parsing a one-megabyte CSV spreadsheet took over a minute. I ended up rewriting the code in Python. Why don’t more people code in Scheme? Because they try to realize their ideas in Scheme, and it just doesn’t quite work out.
I’ve recently been thinking about the topic of originality. You’ll often hear people say that “it’s all been done before” and “what’s old is new again”. The world population has recently passed the 7 billion mark. According to some estimates, there have been up to 120 billion human beings alive since the dawn of humanity. In a world so big, it’s hard to believe you’re unique. It’s easy to feel irrelevant and worthless. Some philosophers have even tried to make the argument that all possible thoughts have been thought of before, leaving you no chance of ever coming up with anything original. After all, human beings have existed for hundreds of thousands of years, and if there’s been 120 billion of us so far, there’s been a lot of thinking going on.
I think the best way to answer this question is with a thought experiment. We don’t know enough about neuroscience to exactly define what a “thought” constitutes. I’ll make some simplifying assumptions to give us some chance to grasp at this problem.
Let’s imagine that:
- Thoughts are patterns of neural firings in a small cluster of 512 neurons in your brain.
- Every human being has this same neural cluster.
- The wiring of the thought cluster is entirely fixed and identical in every individual, not affected by environment or genetics.
- Neurons in the thought cluster fire in a synchronized manner, 1000 times per second.
In this imagined view, each thought is representable by a boolean vector of 512 bits, and any brain can have up to 1000 thoughts per second. In our imagined, simplified world, there are (2^512) ~= 1.34×10^154 possible thoughts in total.
Using some back of the envelope math, assuming there have been 120 billion human beings alive so far, each living for 100 years, each having up to 1000 possible thoughts per second, this gives us:
1000 * (365 * 24 * 60 * 60) * 100 ~= 3.2 * 10^12 thoughts per human being over a 100 year lifespan.
Hence (120 * 10^9) * (3.2 * 10^12) = 3.84×10^23 thoughts happened so far, out of 1.34×10^154 possible thoughts.
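For anyone who wants to check the arithmetic, here is a minimal Python sketch of the calculation above. The variable names are made up for illustration, and the inputs are only the simplifying assumptions of the thought experiment (512 neurons, 1000 firings per second, 100-year lifespans, 120 billion humans), not real neuroscience.

```python
# Back-of-the-envelope check of the thought experiment.
# All inputs are the simplifying assumptions from the text, not measured facts.
NEURONS = 512                  # size of the imagined "thought cluster"
FIRINGS_PER_SECOND = 1000      # synchronized firings, i.e. thoughts per second
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
LIFESPAN_YEARS = 100
HUMANS_EVER = 120 * 10**9      # upper estimate of humans who have ever lived

# Each thought is a 512-bit boolean vector, so there are 2^512 of them.
possible_thoughts = 2 ** NEURONS

thoughts_per_year = FIRINGS_PER_SECOND * SECONDS_PER_YEAR
thoughts_per_lifetime = thoughts_per_year * LIFESPAN_YEARS
thoughts_so_far = HUMANS_EVER * thoughts_per_lifetime

# Upper bound on the fraction of the thought space visited so far
# (it assumes no thought was ever repeated).
fraction_explored = thoughts_so_far / possible_thoughts

print(f"possible thoughts:     {possible_thoughts:.2e}")
print(f"thoughts per year:     {thoughts_per_year:.2e}")
print(f"thoughts per lifetime: {thoughts_per_lifetime:.2e}")
print(f"thoughts so far:       {thoughts_so_far:.2e}")
print(f"fraction explored:     {fraction_explored:.2e}")
```

Without rounding the per-lifetime figure up to 3.2 * 10^12, the total comes out closer to 3.78×10^23 than 3.84×10^23, which makes no difference to the conclusion.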
You might be wondering what the point of this was. My example is obviously ridiculous. Human thoughts likely are not patterns of firings in a cluster of 512 neurons. We have tens of billions of neurons in our brains, each with thousands of synapses, and our neurons do not fire according to a synchronous clock like a modern silicon chip. Furthermore, each brain’s connectivity is uniquely affected by a combination of both environment and genetics, and hence, no two people have exactly the same neurons and synapses in the same place.
The point is that the estimate of 1.34×10^154 possible thoughts is probably too low by one hundred orders of magnitude. However, the estimate of 3.2 * 10^10 thoughts per year per human being may actually be generous. Hence, I surmise that not every possible thought has been thought. Far from it. The universe will likely dissipate before that has any chance of happening.
I just got informed that my second paper on basic block versioning, an extension of my previous work, has been rejected. Most academics don’t really talk about these things. You probably shouldn’t publicly say that your paper has been rejected, because you want to project some kind of image of never-ending flawless success. The calculated, business-like, aseptic thing to do is to keep quiet, rework your paper, submit it somewhere else, rinse and repeat.
I’m talking about it. I need to let out some steam, express my frustrations a little bit. If that’s a bad career move, well, so be it. I don’t want to spend my life hiding behind a façade, pretending I’m perfect and always cheerful. Living life without ever expressing yourself is a fast path to depression, if you ask me. At the moment, I’m both frustrated and sad. I’ve spent months working on this paper. It was a good paper. Somehow though, it wasn’t good enough. It didn’t make the cut. Better luck next time. Call me cynical, but it is a little depressing considering this conference has an acceptance rate of about 45%. Damn.
I’ve worked with a conference’s program committee before. I’ve had to evaluate a paper about a programming language that consisted of a hand-written AST encoded in XML, with no tool support. I think the paper was 8 pages long. They were pitching this as a revolutionary new idea. This was back in the day of the XML-all-the-things craze. Are you telling me that my latest submission is in the same category as the XML one? I guess when it comes to computer science conferences, you’re either a zero or a one. There is no middle ground. Your idea is either deserving of publication, or piped into /dev/null.
The perverse thing is that this constant stream of rejection discourages exploration. As an academic, you really want your papers to get accepted. Your funding and ultimately your academic career depend on it. I’ve already started to adapt the way that I work. When I started my PhD, I had no idea how the paper game was played. Now, when I have a new idea for my research, I have to ask myself: is this publishable? It’s really interesting, it has a lot of potential, but is it publishable?
The reason why conferences have limited acceptance rates dates back to the days when conference papers were published in books called “proceedings”, which were purchased, printed and shipped in the mail. You couldn’t accept every paper; it wasn’t physically or financially possible. Nowadays, it’s estimated that Google’s server farms have a total storage capacity of multiple exabytes. Conceivably, we could make all submissions to all conferences available on conference websites.
Why do so many computer science papers come without any source code? Because the current practices in our field discourage replication and encourage “massaging” of results. In the spirit of transparency, we could make all submissions available, along with all of the reviewer comments. Maybe we don’t want all papers to be on the same footing. Maybe your paper would get ranked into class A, B, C or D, maybe you’d get some score on a 5 or 10 point scale. Certainly, not everyone could realistically be invited to come and give a talk. Still, is there really a need to silently discard 50 to 90% of all submissions to a conference?
It feels like censorship. When a paper is rejected, it strongly discourages further exploration of that research avenue. You’re telling me that my idea doesn’t deserve to be seen. Worse, you’re giving my academic competitors a chance to beat me to the punch. Science is about proving and disproving things, but it’s also about playing with ideas. In the world of computer science conferences, there’s very little room for disproving anything, and even less room for playing with ideas. We don’t have time for that. The next conference deadline is coming up real soon, and we have funding applications to write. Peer reviews can become peer pressure, a civilized form of hazing.
Fortunately, my paper is already online on arXiv. It’s timestamped. It’s out there. I don’t know if I’ll have time to publish this paper at an academic conference before the end of my PhD: I’m being pressed to finish as soon as possible, and to submit more papers. If it gets rejected one or two or three more times, it might never get into any conference. I can at least take some comfort in the idea that some of my research was published, and my latest work is out there. It might inspire someone to explore a similar research direction.
My personal opinion is that academic research in compilers is dying. It’s going to go the way of operating systems research. Why? Because there’s too much infrastructure to build. It takes too long. It’s just not practical to publish about. These days, the game-changing, innovative work in compilers is largely happening in the industry, and it’s being done by people who left academia.
This week I am in Prague, at the European Conference on Object-Oriented Programming (ECOOP), to present my research on Basic Block Versioning. Getting to ECOOP was fairly stressful. I was flying overnight but can never manage to sleep on airplanes. Sleep-deprived, I had to run like mad in an attempt to make an impossible connection in Paris. The Charles de Gaulle airport is organized in such a way that I had to wait for two shuttle buses and go through security twice. Fortunately, the Paris-Prague flight was slightly delayed, and I barely made the connection, but my checked luggage did not.
I presented my paper Wednesday afternoon. The talk went very smoothly and the audience questions were rather friendly. The paper is now available online from the publisher if you’re interested in reading it. I was very happy to see that my talk and all others were filmed. The video is not yet available, but I have uploaded the slides. In addition to giving a talk, I also presented a poster explaining the main aspects of my paper. I was pleasantly surprised when they informed me that I had won the distinguished poster award.
There are many interesting people here, including VM engineers from Mozilla and Google, Brendan Eich and Bjarne Stroustrup. I had the privilege of visiting touristic sites, sharing a meal and discussing VM design with Carl Friedrich Bolz (of PyPy fame) and Sam Tobin-Hochstadt. My main regret is that I’ve had a very difficult time adapting to the local time zone. I’m sleeping poorly at night and crashing every afternoon. This has resulted in me missing many interesting talks. I’m looking forward to the recorded videos being uploaded. The VM and language design talks from Curry On are of particular interest to me.
I haven’t written much about my progress on Higgs recently. This is because I’ve been busy writing and submitting conference papers. My ECOOP paper about basic block versioning has been accepted, and I just submitted a new paper about typed object shapes to DLS 2015. In July, I’ll be presenting my research at ECOOP in Prague, and then I’ll begin work on a third paper about interprocedural basic block versioning, which will complete my PhD work.
The unfortunate thing, for me, is that coding and working on Higgs is the part of my research that I enjoy the most (what can I say, I like building systems), and I haven’t had time to do much coding recently. Fortunately, there are a few more improvements to Higgs that I’ll implement for my next paper. I’ll be improving the calling convention, which should increase the performance of function calls quite a bit. I also have plans to improve the performance of global variable accesses even further.
The past decade has seen the rise of GPGPUs. We’re leveraging the tremendous computational power of graphics cards to accelerate computationally intensive applications such as machine learning, video compression and sorting. Unfortunately, GPGPU is somewhat slow to gain adoption. There are multiple issues involved, such as the need for special GPU/numerical programming languages, complex drivers, vendor-specific differences, and the overhead of having to shuffle data in and out of a separate memory hierarchy.
I was recently reading a blog post claiming that matrix multiplication (GEMM) is the most expensive operation in deep learning, taking up to 95% of the execution time. This got me thinking that maybe GPGPUs are simply not ideal for most applications. Maybe future CPUs should begin to include numerical coprocessors. Sure, we already have SIMD, but the way it’s implemented on x86 CPUs is awkward and relatively inefficient, forcing you to deal with multithreading, prefetching, and SIMD registers of small fixed sizes. Every few years, Intel adds support for new instructions with new SIMD register sizes, rendering your code outdated (yuck). To do SIMD well, you basically write or generate code specialized for a specific CPU model, and even then, it’s just not that fast.
I believe that the Cray 1, which came out in 1975, had the right idea. You write a small loop (kernel) and let the CPU handle the memory traffic and looping. What I’m thinking of is essentially a CPU core which is optimized to run parallel-for instructions on a reduced numerical instruction set. Imagine having a specialized CPU core that shares the same memory space as other CPU cores and can handle prefetching, parallelization and low-level optimizations according to its capabilities, without you needing to change your code. Imagine not needing a driver or threads to make use of this core. Imagine how fast matrix multiplication or cross product could be if you had native hardware support for it.
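To make the idea of a small loop kernel a bit more concrete, here is a rough sketch in plain Python, using matrix multiplication as the example. The function name `matmul_kernel` is hypothetical, and nothing here models real hardware or any existing instruction set. It only shows the shape of the code: the programmer writes the inner computation, and the iterations of the outer loops are independent, which is the part an imagined parallel-for core could schedule, prefetch and parallelize on its own.

```python
def matmul_kernel(a, b):
    """Naive matrix multiply: c[i][j] = sum over p of a[i][p] * b[p][j].

    Illustrative sketch only. `a` is an n-by-k list of lists and `b` is a
    k-by-m list of lists; the result is a new n-by-m list of lists.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0.0] * m for _ in range(n)]
    # Every (i, j) pair writes a distinct output cell and reads only inputs,
    # so these two loops are what a hypothetical parallel-for would distribute.
    for i in range(n):
        for j in range(m):
            acc = 0.0
            # The "kernel": a short dot product over row i of a and column j of b.
            for p in range(k):
                acc += a[i][p] * b[p][j]
            c[i][j] = acc
    return c


if __name__ == "__main__":
    print(matmul_kernel([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

In pure Python this is of course slow; the point is only that the source code says nothing about threads, SIMD widths or prefetching, which is exactly what would be left to the specialized core.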