Progress on the Higgs JIT

March 31st, 2013

I've been busy working on various aspects of the new Higgs JIT compiler in the last month or so. Among other things, I decided to expand it into a proper tracing JIT compiler complete with support for side-exit traces (or sub-traces as I like to call them). This brought some performance improvements. I'm also happy to report that the JIT compiler appears quite stable and is enabled by default. At this point, the Higgs JIT is still quite simple and primitive compared to commercial JavaScript VMs, but nevertheless provides marked speedups relative to the interpreter on every benchmark I tested it on.

I'm interested in finding the source code for the latest version of the Mozilla TraceMonkey shell (as well as instructions on how to build it) so that I can compare the performance of Higgs to another JavaScript tracing JIT. TraceMonkey is no longer maintained by Mozilla and is no longer in their main source code repository.

One important optimization I added to Higgs was to implement a stack frame initialization analysis. Previously, Higgs initialized every stack slot as soon as a function was called to make sure that stack frames were always in a consistent state when the garbage collector (GC) was called. As you can imagine, this is rather slow for large stack frames, and it's often a complete waste of time if the variables are going to be initialized later on. Now, a simple analysis actually checks which variables have been written to (and so may contain references to heap objects) and the GC uses this information when performing a collection. This significantly reduces the amount of work that has to be done during function calls.
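To make the idea concrete, here is a minimal sketch of such an analysis in Python. It is not the actual Higgs code: the `Instr` type, its fields and the opcode names are all made up for illustration, and it only handles straight-line code (with branches, the sets would have to be merged where control flow joins). For each instruction that may trigger a collection, it records which stack slots have been written before that point.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Set


@dataclass
class Instr:
    """Hypothetical IR instruction, for illustration only."""
    op: str                      # made-up opcode name
    dest: Optional[int] = None   # stack slot written by the instruction, if any
    may_gc: bool = False         # True if the instruction can trigger a collection


def initialized_slots_at_gc_points(code: List[Instr]) -> Dict[int, FrozenSet[int]]:
    """Map the index of each GC-capable instruction to the slots written before it."""
    written: Set[int] = set()
    result: Dict[int, FrozenSet[int]] = {}
    for idx, instr in enumerate(code):
        if instr.may_gc:
            # Record the state before this instruction's own write:
            # the collection may run before the result is stored.
            result[idx] = frozenset(written)
        if instr.dest is not None:
            written.add(instr.dest)
    return result


# Example function body using made-up opcodes.
example = [
    Instr("load_const", dest=0),
    Instr("alloc_object", dest=1, may_gc=True),
    Instr("add", dest=2),
    Instr("call", dest=3, may_gc=True),
]

gc_maps = initialized_slots_at_gc_points(example)
# gc_maps[1] == {0} and gc_maps[3] == {0, 1, 2}
```

With a table like this, the GC only needs to look at the slots listed for the instruction where the collection happened, and the function prologue no longer has to initialize every slot up front.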

An important limitation of the current JIT compiler is that it directly translates the interpreter's intermediate representation into a linear list of x86 machine instructions. This is an issue because the machine code produced is of fairly poor quality and would need to be further optimized, but contains little semantic information to facilitate optimization. One of the things I plan to do is implement a liveness analysis to facilitate basic register allocation and eliminate redundant writes to the interpreter stack. Unfortunately, I'm not sure what the best data structure is (in terms of memory usage) for storing liveness information and associating it with a function's intermediate representation (if anyone has advice, please comment).
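As a rough illustration of what a liveness analysis computes, here is a small Python sketch. It is only one possible approach, not what Higgs does or will do: the instruction format (a destination slot plus a list of source slots) is hypothetical, the code is assumed to be straight-line like a single trace, and liveness is stored as one integer bitmask per instruction, which is just one candidate for a compact representation.

```python
from typing import List, Optional, Tuple

# Hypothetical IR: (destination slot or None, list of source slots).
Instr = Tuple[Optional[int], List[int]]


def live_after(code: List[Instr], live_out: int = 0) -> List[int]:
    """Backward pass over straight-line code.

    Entry i of the result is a bitmask of the slots live right after
    instruction i. live_out is the bitmask of slots live at the end.
    """
    live = live_out
    result = [0] * len(code)
    for i in range(len(code) - 1, -1, -1):
        dest, srcs = code[i]
        result[i] = live
        if dest is not None:
            live &= ~(1 << dest)   # a write kills the previous value
        for s in srcs:
            live |= 1 << s         # a read makes the slot live before it
    return result


def dead_writes(code: List[Instr], live_out: int = 0) -> List[int]:
    """Indices of instructions whose written slot is never read afterwards."""
    after = live_after(code, live_out)
    return [
        i for i, (dest, _) in enumerate(code)
        if dest is not None and not (after[i] >> dest) & 1
    ]


# Example: slot 3 is written but never read, and slot 2 is live at the end.
trace = [
    (0, []),    # s0 = constant
    (1, [0]),   # s1 = op(s0)
    (3, [0]),   # s3 = op(s0), never read afterwards
    (2, [1]),   # s2 = op(s1)
]

masks = live_after(trace, live_out=1 << 2)   # [0b001, 0b011, 0b010, 0b100]
redundant = dead_writes(trace, live_out=1 << 2)   # [2]
```

A write flagged this way is a candidate for elimination only if the instruction has no other side effects. The same per-instruction masks could also feed a basic register allocator, since two slots that are never live at the same time can share a register.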

Tom, a friend of mine, recently started helping with the Higgs project. Among other things, he took the time to track down and help me fix bugs that were causing crashes in various benchmarks. This process involved fixing many inaccuracies in the Higgs implementation of the JS spec. We're now at the point where all of the V8 and SunSpider benchmarks are working correctly, except for those using regular expressions. Tom has also started implementing a Foreign Function Interface (FFI) system for Higgs. Once completed, this should allow us to add support for APIs such as OpenGL and POSIX sockets. This might make Higgs genuinely useful outside of academic experiments.