Zeta’s JITterpreter

About six weeks ago, I made ZetaVM open source and announced it on this blog. This is a compiler/VM project that I had been quietly working on for about 18 months. The project now has 273 stars on GitHub. This is both exciting and scary, because so much of what I want to do with this project is not yet built. I really want to show this project to people, but I also find myself scared that people may come, see how immature/incomplete the project is at this stage, and never come back. In any case, the project is open source now, and I think it would be good for me to write about it and explain some of the design ideas behind it, if only to document it for current and potential future collaborators.

One of the main goals of ZetaVM is to be very approachable, and enable more people to create their own programming languages, or to experiment with language design. With that goal in mind, I’ve designed a textual, stack-based bytecode format that resembles JSON, but allows for cycles (objects can reference one-another). Functions, basic blocks, and instructions in this bytecode format are all described as a graph of JS-like objects. This is very user-friendly. Anyone can write a Python script that outputs code in this format and run the output in ZetaVM. It’s also very powerful, in a LISP-y code-is-data kind of way: you can generate and introspect bytecode at run time, it’s just made of plain objects and values that the VM can manipulate. ZetaVM has first-class bytecode.
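To make this more concrete, here is a minimal sketch of such a generator script, written in Python. The structure and field names (num_params, entry, instrs, op) are invented for illustration and are not ZetaVM's actual serialization format; real Zeta bytecode can also contain cycles, which would require named object references rather than a plain json.dumps:

# Illustrative only: build a tiny function as a graph of JS-like objects.
# The field names below are made up and do not match ZetaVM's real format.
import json

entry_block = {
    "instrs": [
        {"op": "push", "val": 1},
        {"op": "push", "val": 2},
        {"op": "add"},
        {"op": "ret"},
    ]
}

example_fn = {
    "num_params": 0,
    "entry": entry_block,
}

print(json.dumps(example_fn, indent=2))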

The downside of all this, as you can imagine, is that it inherently has to be absolutely dog slow. Or does it? The first version of the Zeta interpreter traversed the graph of bytecode objects the naive way, and it was indeed dog slow. I’ve since written a new interpreter which removes the overhead of the object-based bytecode by dynamically translating it to an internal representation (think dynamic binary translation). The internal IR is compact and flat; executing it involves no pointer hopping.
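As a rough illustration of why the flat form matters (with invented opcodes, not Zeta's actual internal IR), once the object graph has been translated, the interpreter's hot loop just walks an array of compact instructions instead of chasing pointers through bytecode objects:

# Illustrative sketch: executing a flat internal IR (invented opcodes).
def run(code, pc=0):
    stack = []
    while True:
        op, arg = code[pc]
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "ret":
            return stack.pop()
        pc += 1

# The flat IR is just a list of (opcode, operand) pairs
print(run([("push", 1), ("push", 2), ("add", None), ("ret", None)]))  # prints 3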

The new interpreter generates code on the fly, which means it is, by definition, a Just-In-Time (JIT) compiler. Its architecture is based on Basic Block Versioning (BBV), a compilation technique I developed during my PhD (talks and papers on the topic are linked here if you’re curious). BBV has the nice property that it generates code lazily, and the generated code naturally ends up compact and fairly linear. This is not done yet, but BBV also makes it possible to specialize code to eliminate dynamic type checks very effectively, and to perform various optimizations on the fly.

You might be wondering why I’m bothering with an interpreter, instead of just writing a JIT that generates machine code. One of the motivating factors is that Zeta is still at a very early stage, and I think that an interpreter is a better choice for prototyping things. Another factor is that it occurred to me that I could potentially make Zeta more portable by having the interpreter do much of the compilation and optimization work. The interpreter can do type-specialization, inlining and various code simplifications.

The interpreter will be designed in such a way that the internal IR it produces is optimized and in many ways very close to machine code. It should then be possible to add a thin JIT layer on top to generate actual machine code. The resulting JIT will hopefully be much simpler and easier to maintain than if one were compiling directly from the raw object-based bytecode format. Another benefit of this design is that all of the optimizations the interpreter performs will not be tied to the specifics of x86 or other architectures; they will remain portable.

At the moment, the new interpreter is at a point where it lazily compiles code into the flat internal format, but performs no other optimization. This was enough to get a 7x performance improvement over the naive interpreter, but the current system is still quite a bit below the performance level of the Python interpreter, and there is definite room for improvement. Some of the first optimizations I would like to introduce are the elimination of redundant branching instructions, and the use of inline caching to speed up function calls.

The elimination of redundant branches is fairly easy to do with BBV. Code is lazily compiled, and appended linearly into a buffer. When generating code for a branch, if the block we are jumping to is just about to be compiled, then the branch is redundant. BBV will naturally tend to generate code that flows linearly along hot code paths and branches out-of-line for infrequent code paths. That is, the default path often comes right next and requires no branching.
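Here is a minimal sketch of that branch elision, in Python with invented names (this is not ZetaVM's actual code generator). Blocks are appended to a flat buffer as they are first reached, and a jump to the block that is about to be compiled next is simply not emitted, since execution will fall through to it:

# Illustrative sketch, not ZetaVM's actual code generator.
code_buf = []  # flat buffer the internal IR is appended to

def gen_branch(target_block, next_to_compile):
    # If the target is the block we are about to compile, it will be appended
    # immediately after this point, so execution falls through and no jump
    # instruction is needed.
    if target_block is next_to_compile:
        return
    code_buf.append(("jump", target_block))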

Inline caching is a classic technique that was pioneered by the Smalltalk and SELF VMs. It’s used, among other things, to eliminate dynamic property lookups when polymorphic function calls are performed (see this excellent blog post for more information). Currently, when performing a function call, ZetaVM needs to read multiple properties on function objects. For instance, it needs to find out how many arguments the function has, and what its entry basic block is. These property lookups are dynamic, and relatively slow. The end result is that the call instruction is very slow compared to other instructions.

Most call instructions will end up always calling the same function. Hence, the dynamic overhead of function calls can largely be eliminated by caching the identity of the function being called by a given call instruction. That is, one can cache the number of arguments and the entry basic block associated with a function object the first time a call instruction is run, and then reuse this information when calling the function again, provided we keep calling the same function. This information will be cached in line, in the instruction stream, right after the call instruction opcode, hence the name inline caching. I anticipate that with inline caching, function calls in Zeta can be made several times faster.
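Here is a rough sketch of what such an inline cache could look like, again in Python with invented names rather than ZetaVM's actual data structures. The cached information lives with the call instruction and stays valid as long as the same function keeps being called from that call site:

# Illustrative sketch of an inline cache attached to a call instruction.
class CallInstr:
    def __init__(self, num_args):
        self.num_args = num_args
        # Inline cache fields, filled in the first time the call executes
        self.cached_fn = None
        self.cached_num_params = None
        self.cached_entry_block = None

def exec_call(instr, fn_obj):
    if fn_obj is not instr.cached_fn:
        # Cache miss: do the slow dynamic property lookups once and
        # remember the results for this call site
        instr.cached_fn = fn_obj
        instr.cached_num_params = fn_obj["num_params"]
        instr.cached_entry_block = fn_obj["entry"]
    # Fast path: reuse the cached information on every subsequent call
    assert instr.num_args == instr.cached_num_params
    return instr.cached_entry_block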

ZetaVM, my new compiler project

Like many of you I’m sure, I’ve wanted to create my own programming language for a long time. I think it’s a common aspiration for many programmers, to create a language with all the features we love and none of the ones we hate. Our own “ultimate programming language”, created in our own image. What I’ve come to realize over time, however, is that I’m actually not quite sure what should go into my ultimate programming language. I have a lot of ideas, but they don’t necessarily fit into a coherent whole. It’s also the case that my ideas about what the “ultimate programming language” should be like keep changing as I gain more programming experience and get exposed to new ideas.

My PhD was in compiler design, and this is something I truly enjoy playing with. As such, I’ve decided that my next programming language project wouldn’t be a programming language per se, but actually a compiler, a platform to create new programming languages. I’m doing this in part because I enjoy it, and it’s something I feel confident I’m good at, but also because I think I can build a platform that will make it much easier for myself and others to do programming language design and experimentation. One of ZetaVM’s main design goals is to make creating new programming languages very accessible. It will make it possible for anyone who’s mastered a language such as Python to create a language of their own, in less than 2000 lines of code. Not only that, but ZetaVM will instantly make your language decently fast. It will have JIT optimizations suited to languages such as Python/Lua/JavaScript, and will instantly give you fast objects, type-specialization, etc.

ZetaVM is a virtual machine for dynamic programming languages. It will provide native support for dynamic typing and the most common data types found in Python/Lua/JS/Ruby, such as strings, extensible objects, and extensible arrays. What makes it particularly easy to get your own language running on this VM is that Zeta’s Intermediate Representation (IR) is representable as a textual format similar to JSON. This makes it fairly trivial for you to write, say, a Python parser for your new language, and generate Zeta IR in textual form as output. You don’t have to worry about implementing dynamic typing, or register allocation, or garbage collection, or arrays and objects; all of that is done for you. I’ve created a simple language called Plush (JS and Lua’s bastard child), which demonstrates how this can be done, and serves to help me bootstrap and test the system.

Beyond making it easy for myself and others to create programming languages, Zeta will be a platform for me to try some bold new ideas in the design space of programming languages and JIT compilers. I would like to try and tackle one of the biggest issues plaguing programming languages today, which is that of code rot. My goal is to eventually freeze the IR and APIs provided by Zeta, so that code that runs on Zeta today might have a chance of still working in 20 years, without any changes. This goal is ambitious, but I have some ideas which I believe might make it work.

Finally, one big disclaimer I should give is that Zeta is still a young and immature project. In its current state, Zeta is experimental and will have many breaking changes, as most new languages/platforms do. Zeta also currently only has a naive interpreter which walks the object-based IR and is dog slow, at about 200K instructions per second. I’m currently working on an interpreter that will compile the object-based IR into a lower-level internal IR. This interpreter will use Basic Block Versioning (BBV) and self-modifying code. I believe it should realistically be able to reach speeds of 100 MIPS within the coming months. My plan after that is to build a lightweight JIT which will sit on top of the optimizing interpreter and compile the internal IR to machine code.

Technodiversity

The year is 2048. Migration to IPv6 has just been completed. Every object in the world, including your coffee mug and the chair you’re sitting on, comprises a manycore RISC-V cluster, running Linux, with its own unique IP address. Haskell, because of its mathematically provable superiority, has come to supplant every other programming language. Writing code in any language other than Haskell is now a misdemeanor, punishable by up to 64 days of bandwidth throttling and a fine of up to 0.125 BTC.

Wouldn’t things be much simpler if every computer system was built on the same platform, ran the same operating system, and every program was written in the same programming language? I think that in many ways, this is the wet dream of many programmers. No more dealing with cross-language boundaries, portability issues, and multiple incompatible implementations of subpar standards. Things could be much simpler and more efficient than they are now.

The biggest problem, however, is that in a world where every computer system runs the same software, the same version of the same operating system, on the same hardware, every computer system has exactly the same bugs and security weaknesses. Given that some security flaw exists, a single computer virus could potentially contaminate every computer system in the world in a very short amount of time. In today’s world, this Hollywood-like doomsday scenario seems extremely implausible. The real world is too messy, or too diverse, for it to be practical to build a single virus or worm that could infect every system.

In a lot of ways, the chaos of the technological world resembles that of nature. Like animal species competing for survival, various technologies and standards compete for funding and mindshare. In nature, species specialize to exploit new environments. Diversity, in the technological world, exists in part because specialization makes systems more efficient, which allows the exploitation of new market niches. In nature, genetic diversity, or biodiversity, makes it near-impossible for a single virus to wipe out the entirety of life on earth.

Today’s technological world is definitely messy, but we can take comfort in the fact that competition really does foster innovation. The technological diversity that exists right now is a sign that we live in a thriving ecosystem, as opposed to one that is stagnant. With self-driving cars and a variety of home automation devices on the horizon, we can also take comfort in the idea that technodiversity may actually be keeping us safe.


The Brain’s Registers

In the last few years, there’s been a lot of impressive work done with neural networks. One of the most interesting things I’ve seen is Word2vec, a technique to compute word embeddings, that is, to map English words to large vectors of neural activations. This is interesting because the resulting latent space has useful properties. Concepts that are semantically close end up close together in the vector space, and it’s possible to do arithmetic on word vectors, giving us results such as ‘queen – woman + man ≈ king’.
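For instance, with a set of pretrained vectors loaded into gensim (the file name below is just a placeholder for whatever pretrained vectors you have on hand), the analogy above can be queried in a couple of lines:

# Assumes a pretrained word2vec vectors file is available locally.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format("word_vectors.bin", binary=True)

# queen - woman + man ~= king
print(vectors.most_similar(positive=["queen", "man"], negative=["woman"], topn=1))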

The ability to do approximate arithmetic on word vectors is nifty, but what I think is most interesting is that Word2vec shows that we can translate fuzzy human concepts such as English words or sentences into an encoding from which computers can extract semantic information. There are now machine translation models based on recurrent neural networks which take a sentence in a language X, generate an activation vector, and then feed that vector into another recurrent neural network trained to generate text in another language Y. These machine translation models have reached the point where they are competitive with traditional statistical models.

The ability to create an intermediate representation for machine translation is quite powerful, but vectors of neural activations can be used to encode semantic information contained in many things besides written language, and also to transfer information from one modality to another. Researchers at Google have also shown that you can generate captions from images by first encoding images into semantic vectors, and then feeding those vectors into a recurrent neural network which generates text. Hence, a vector of neural activations can be used to represent semantic information, meaning, maybe even concepts and ideas in an abstract space that is agnostic to input and output modalities.

I hope this doesn’t seem too far-fetched, but I’m going to suggest that it’s highly likely that such semantic vectors are used to convey meaning and transfer information between different representations inside the human brain, particularly in the areas of the prefrontal cortex (PFC) that implement higher-level thinking. I’m going to go even further, and make the prediction that the PFC implements something analogous to the registers found in a microprocessor. Semantic registers which are used to buffer, store and transfer vectors of neural activations that encode meaning, concepts or ideas.

There is obviously still a lot we don’t know about the human brain, or the way in which it implements higher-level thinking, and the neurons in artificial neural networks are very loose approximations of their biological cousins. Still, bees, bats and birds use wings to fly. These are different kinds of wings, with different properties, which have evolved independently, but they are all wings nonetheless. All three have come to evolve a similar solution to the problem of flight because this solution is natural and efficient. In the same vein, I think you could make the argument that a register is a natural concept when it comes to shuffling and operating on data.

The registers in a microprocessor hold small pieces of temporary data. They store vectors of bits and typically have a few input and output ports. They implement simple operations, such as the ability to reset/erase the contents stored, and the ability to open or close the flow of information to/from various components inside the processor. In digital hardware, opening and closing of data ports is accomplished by gating signals with transistors. In biology, the gating of signals is accomplished by inhibitory neurons.

According to Wikipedia, working memory is “a cognitive system with a limited capacity that is responsible for the transient holding, processing, and manipulation of information”. It’s believed that human beings are limited to being able to hold only about seven items at a time in working memory. Could it be that this temporary store of information is the human brain’s equivalent of a CPU’s register file, built out of a relatively small set of neural registers, with inhibitory neurons gating their outputs? I believe this could begin to explain our amazing (and yet limited) ability to shuffle and operate on concepts and ideas stored in our working memory.

In a typical CPU, the register file is connected to components such as an Arithmetic and Logic Unit (ALU). The ALU can read data from one or more registers, and perform operations such as addition, subtraction and multiplication between the values read, and store the result back into registers. Values can also be compared and the result of such comparisons used to decide which action the CPU should take next. Human beings are not born with the ability to compute signed integer multiplications, but there are certain high-level thinking faculties which may be innate, such as the ability to reason by analogy.

One could imagine that, somewhere in the PFC, there may be a dedicated component which is able to load semantic registers and draw analogies between semantic vectors, to answer the question of whether A is to B as C is to D, and perform some action based on the result. It could be that the part of our brain that does high-level thinking contains a number of operators that connect to one or more semantic registers and perform various operations on the concepts represented. These would be our brain’s ALU: the set of primitives that enable our high-level cognitive abilities.

As Edsger Dijkstra famously said, “The question of whether machines can think is about as relevant as the question of whether submarines can swim.” Maybe my attempt to make sense of higher-level human thinking by drawing analogies with computing hardware is misguided. Maybe working as a compiler engineer and looking at machine code for too long has warped my own cognition beyond repair. The ideas I’m proposing here are largely speculative, but maybe I’m right, and maybe, within a decade or two, someone will begin to map out the circuits that constitute the brain’s registers and operators.


Method Call Syntax in a JS-like Language

As a side-project, I’m working on a small JavaScript-like programming language called Plush. Like JavaScript, this language is going to have objects and method calls. In this blog post, I’m going to discuss some seemingly trivial aspects of method call syntax and semantics which illustrate why programming language design is difficult, and solicit your feedback as to potential solutions.

Like JavaScript, Plush has object literals which can be defined with the curly brace syntax. It will also have function objects, which are in fact closures. In JavaScript, you can assign a function to an object property, and it becomes callable as a method, like so:

o = {}; // empty object
o.m = function (x) {…};
o.m(2); // call the function in property m as a method

This all seems fine and dandy, but there are some issues with the way JavaScript implements this. The first is that method call syntax introduces some “hidden” semantics. When you call a JS function as a method, there is a hidden this argument (a reference to the object the method is called on) which gets passed to the function. This is invisible to the programmer. Hence:

// The statement below:
o.m(2);

// Is not equivalent to these statements:
f = o.m;
f(2);

// But it is equivalent to these two statements:
f = o.m;
f.call(o, 2);

The passing of the hidden this argument is a little bit annoying because it breaks the programmer’s intuition in a small way. JS syntax makes method calls look like regular function calls, but there is a difference. Sure, it’s not a big deal, but I ran into this issue while implementing IO functions in Plush. I wanted to implement some C++ “host” functions which would allow the language to interface with the outside world, like so:

io = import "core.io"; // import the IO module

io.print("hello world!"); // here 'print' is a host function written in C++

Maybe you see where this is going. If I implement method calls as JavaScript does, then every function, including host functions, needs to take a hidden this argument, even if it doesn’t need it. I can work around this by making a special case where host functions called as methods behave differently, but then I potentially have to add an extra dynamic check at every function call. This gets annoying. I can also work around the issue by wrapping my C++ host functions in Plush functions which handle the unneeded this argument.

The second issue I ran into involves the way the this argument is handled. JavaScript has this problem where if you declare a closure inside a method, you can’t directly access the parent function’s this argument, as described in this blog post. This annoying quirk of the language was “solved” by the introduction of Function.prototype.bind. The JS workaround seemed like an ugly kludge to me, and so I thought: why not just do it like Python? Why not make the this argument explicit, and have programmers declare and name the this value, in the same way that Python forces you to declare the self argument in methods?

Upon first examination, making the programmer explicitly declare the this argument seems like a fairly good idea. However, it does have the annoying consequence that an argument that was previously hidden is now a positional argument. Consider the following scenario:

// An IO module is imported, this produces an object containing IO functions
io = import "core.io";

// The print method is “extracted”, so that we can call it with a shorthand name
print = io.print;

// This works fine. It passes a hidden this argument value to the print method
io.print(2);

// ERROR: we are not passing a this argument value
print(2);

In the case of modules, it’s clear that the print function shouldn’t even need a this argument value. I pondered this problem, and had the idea that possibly, method calls could have a different syntax from property accesses. The “arrow syntax” would make explicit the passing of the this argument:

// Call of the method m of object o
// The “arrow syntax” is specifically for method calls, and passes a this argument
o->m(2);

// Equivalent function call, passing an explicit this argument
o.m(o, 2);

io = import "core.io";

// Regular function call, print does not take a this argument, it is not a method
io.print(2);

// This works correctly, as one might expect
print = io.print;
print(2);

The solution I came up with is not perfect; it raises some potential problems. For one, with a special arrow syntax for method calls, it becomes possible to call object properties using both regular function calls and arrow-style method calls which pass a this argument. Experience tells me that if both styles are possible, people are going to use both, which could be messy. For example, what do you do if you have a set of methods which do not need a this argument? Do you declare one anyway? Would you end up with a mix of regular function calls and method calls on the same object?

So, I’ve been using gedit this whole time

This is my coming out. I’m 31. I’ve been programming since I was 16, but all this time, I’ve never learned how to use a “proper” text editor.

It’s probably because as a teenager, I grew up with Windows 98, and on that platform, back then, command-line tools were very much second-class citizens. In university, I was introduced to Linux and somehow, over the years, I became hooked. I spent a total of 11 years in university, and over that time, I probably wrote over 400K lines of code. Most of it on Linux, and most of it using a text editor called gedit.

It’s the GNOME text editor. Its noteworthy features are syntax highlighting, and the ability to have multiple tabs open at once. It’s a fairly shitty text editor. It will freeze up if you try to open a text file larger than a few hundred kilobytes, or one with lines that are too long for it to properly digest. If you were to think of emacs as a sushi meal prepared by an experienced chef, then you could think of gedit as a microwaved bowl of ramen with a half cup of sriracha sauce dumped on it. I had the ramen right here at home, no cash on hand, and well, I was hungry.

I think the problem is, in university, there was never any class that was really about teaching us how to use tools properly and efficiently. I learned about programming languages, operating systems, compiler design, and all those wonderful things, but all this time, I was never given the opportunity to sit down and think about the tools I was using. The craftsmanship of programming, so to speak. I started using gedit because it was the default on the Linux distro my university ran at the time. I stuck with it out of habit.

During my undergrad, we were given assignments with tight deadlines. We worked our asses off. Some semesters, I remember having as much as 40 hours of homework per week. Needless to say, there was a lot of pressure. Pressure to get things done quickly. Given tight assignment deadlines, I didn’t really feel like spending 10 or 20 hours familiarizing myself with vi or emacs, or dealing with people who would tell me to go RTFM (Read the Fucking Manual). I went with gedit because it was intuitive and comfortable, as shitty as it was.

At my current workplace, we run MacOS, and well, while there is a port of gedit (not even kidding), it’s a fairly miserable experience. The MacOS port of gedit is like one of those freak creatures in a horror movie that begs you to pkill -9 it. Not knowing where to run, I started using GitHub’s Atom editor. It’s alright, but like gedit, it has its annoying quirks.

I’m 31, and I can’t help but want to take a step back. I don’t know how to use vim, but I think I’d like to learn. I can appreciate that a lot of thought was put into its design. It’s not a gimmicky gadget, it’s a powerful tool: a programmer’s text editor. I’m trying to convince myself that the time investment is worth it. At the very least, I can appreciate that vim is much more cross-platform, and stable across time, than any other text editor I’ve ever put up with.

My Experience with the ESP8266 – Making an LED Strip I can Control from my Shell

IoT is one of those buzzwords that’s been thrown around so much that it’s become largely synonymous with disappointing marketing hype. Still, home automation, in principle at least, has a lot of potential. The ESP8266 chip came out about two years ago, and it’s been drawing a lot of interest from the “maker” community. This inexpensive chip, for those who don’t know, incorporates a wifi module, flash memory, and a small 32-bit Xtensa core in a tiny package. This has had many people excited, because it means all kinds of electronic projects can be connected to wifi networks for only a few dollars in hardware cost.

I’ve known about the ESP8266 for a while, but until now, it wasn’t so interesting. Early versions of the chip had only a handful of I/O pins. You also needed to install some clunky SDK provided by the vendor to program it, and the instructions were not straightforward. Think tons of dependencies and lots of boilerplate code. Thankfully, this isn’t the case anymore. It’s now possible to get a NodeMCU module on eBay for less than $3.50 shipped, or $7 on Amazon. This module has many I/O pins, its own 3.3V voltage regulator, its own USB interface for programming, and best of all, is programmable with the Arduino IDE.

I’ve recently completed two projects with the ESP8266. One of them is a wifi-enabled power outlet that can be remotely switched on and off. The other is an RGB LED strip whose color can be changed remotely. I work with a Linux machine at home, and needed to update my udev rules in order for the USB interface of my NodeMCU modules to be recognized so that I could program them. Besides that, the whole process has been almost seamless.

The components I used to interface a NodeMCU board with a 12V RGB LED strip

There is a simple “Hello World!” web server example that comes with the Arduino core package, and this example is less than 100 lines long. Making a wifi-enabled LED strip, on the software side, was a simple matter of parsing web request arguments for red, green and blue color values, and translating these into PWM intensity values to control output pins. Hardware-wise, I connected three 2N2222 transistors to the D0-D2 output pins, which are able to handle the current required to drive my one meter LED strip. The real beauty of it though, is that I can control the LED strip with shell commands, by issuing HTTP requests, which opens up the realm of scripting home automation:

# Make it red!
wget -O- "192.168.0.56/?r=255&g=0&b=0"
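As a small sketch of that kind of scripting, here is a Python script that cycles the strip through a few colors over HTTP (the IP address and query parameters match the example above; adjust them for your own setup):

# Cycle the LED strip through a few colors via HTTP requests.
# Assumes the ESP8266 answers at 192.168.0.56, as in the wget example above.
import time
import urllib.request

colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]

for r, g, b in colors:
    url = "http://192.168.0.56/?r={}&g={}&b={}".format(r, g, b)
    urllib.request.urlopen(url)
    time.sleep(1)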


I intend to play with more of these in the future. There are now LED strips with individually-addressable LEDs, which seems like a lot of fun. I would also like to make a box with temperature, light and motion sensors that I can query remotely. If you’re interested in working with the ESP8266, I personally recommend buying a NodeMCU instead of a bare module. It will make your life much, much simpler.