Metaprogramming for JavaScript

November 1st, 2011

I've mentioned before that one of the things I find missing from JavaScript is a macro system. More generally, it lacks a "clean" way to do metaprogramming. What I've done in Tachyon, in the cases where I needed to generate many similar pieces of code based on a template structure, is to generate code by concatenating strings, and then have the host VM compile the result. In JavaScript, you can simply pass a string to the Function constructor, which will try to parse and compile it as the body of a new function (you can also do this using eval). This is powerful in that you can generate arbitrary code on the fly, but code generated by string concatenation can be error-prone and hard to maintain. It makes me miss C++ templates, and pales in comparison to Scheme's macro system.
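To make the string-concatenation approach concrete, here is a minimal sketch. The makeGetter helper and the array-backed memory layout are hypothetical, made up for illustration, and are not Tachyon's actual code; only the Function constructor itself is standard JavaScript.

```javascript
// Hypothetical helper: builds an accessor for one field of an array-backed
// object layout by concatenating source text and compiling it.
function makeGetter(offset) {
    // The offset is baked into the generated source as a constant.
    var body = "return mem[base + " + offset + "];";

    // new Function(param1, param2, body) parses and compiles the body string.
    return new Function("mem", "base", body);
}

var getSize = makeGetter(1);
var mem = [0, 0, 10, 42, 0];

// Reads mem[2 + 1]
console.log(getSize(mem, 2));
```

Even at this size, a mistake in the generated text (say, a missing bracket) only shows up when the string is compiled, which is part of what makes the approach fragile.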

For my compiler writing purposes, I would like to have a way to generate code based on some sort of template, which would be instantiated based on a set of named parameters (i.e. a dictionary of parameters). I would like to be able to:

Replace type declarations in a file (as C++ templates allow).
Replace named constants in a file (as can be done in C with #define).
Enable or disable segments of code (as can be done in C with #ifdef).
Unroll loops ahead of time and parameterize each iteration individually.
Fill in lists of arguments and the like based on parameters.
Reorder named segments of code according to a priority system.

Some use cases I envision for this are things I already do using string concatenation, such as the generation of methods for object memory layouts and the generation of Foreign Function Interface (FFI) proxy functions. Other use cases would include auto-specializing primitive functions for optimization purposes.

A fairly obvious way to implement this metaprogramming system is to create a specialized text preprocessor: essentially, a function that can instantiate template files based on a dictionary of parameters. This has the advantage of being easy to implement and use, and the disadvantages of being unhygienic and restrictive, since it's not much more advanced than C's preprocessor.
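A rough sketch of what such a preprocessor could look like follows. Everything in it is an assumption made for illustration: the instantiateTemplate name, the $NAME substitution syntax and the #if/#endif line directives are not an existing Tachyon facility.

```javascript
// Hypothetical template instantiator: substitutes $NAME placeholders and
// keeps or drops lines between "#if NAME" and "#endif" directives.
function instantiateTemplate(template, params) {
    function has(name) {
        return Object.prototype.hasOwnProperty.call(params, name);
    }

    var out = [];
    var enabled = [true]; // stack: is the current nesting level active?

    template.split("\n").forEach(function (line) {
        var m = /^\s*#if\s+(\w+)\s*$/.exec(line);
        if (m) {
            var active = enabled[enabled.length - 1];
            enabled.push(active && has(m[1]) && Boolean(params[m[1]]));
            return;
        }
        if (/^\s*#endif\s*$/.test(line)) {
            if (enabled.length === 1) throw new Error("unmatched #endif");
            enabled.pop();
            return;
        }
        if (!enabled[enabled.length - 1]) return; // inside a disabled segment

        out.push(line.replace(/\$(\w+)/g, function (match, name) {
            if (!has(name)) throw new Error("missing parameter: " + name);
            return String(params[name]);
        }));
    });

    if (enabled.length !== 1) throw new Error("unterminated #if");
    return out.join("\n");
}

var template = [
    "function get_$FIELD(obj) {",
    "#if DEBUG",
    "    if (!obj) throw new Error('null object');",
    "#endif",
    "    return obj[$OFFSET];",
    "}"
].join("\n");

var src = instantiateTemplate(template, { FIELD: "size", OFFSET: 1, DEBUG: false });
console.log(src);
```

The sketch also shows the weakness of the approach: the substitution is purely textual, so it knows nothing about scoping, and it would mangle any legitimate identifier that happens to start with $.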

My research advisor has instead suggested that I implement some sort of Abstract Syntax Tree (AST) manipulation system, which would allow naming AST segments and splicing them. This is more in line with what Scheme does. The issues I see with this are that I'm not really sure how to proceed, I find the current AST rather cumbersome to manipulate, and such a system would require tighter coupling with our parser. Ideally, grammar extensions would be implemented for AST splicing, but that's also rather tricky, and perhaps best avoided at this point in Tachyon's development. Implementing an AST splicer without them would require us to disguise our splicing syntax as regular JavaScript so as to avoid having to extend the grammar.
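One way to picture splicing syntax disguised as regular JavaScript is to reserve identifiers starting with $ as placeholders, since $lhs is already a valid identifier and parses without any grammar change. The sketch below assumes a simple object-based AST with type and name fields, which is not Tachyon's actual AST format, and the splice function is hypothetical.

```javascript
// Hypothetical splicer: returns a copy of a template AST in which every
// Identifier node named "$key" is replaced by the AST bound to "key".
function splice(node, bindings) {
    if (Array.isArray(node)) {
        return node.map(function (child) { return splice(child, bindings); });
    }
    if (node === null || typeof node !== "object") {
        return node; // strings, numbers, etc. are copied as-is
    }
    if (node.type === "Identifier" && node.name.charAt(0) === "$") {
        var key = node.name.slice(1);
        if (!Object.prototype.hasOwnProperty.call(bindings, key)) {
            throw new Error("no AST bound to placeholder: " + node.name);
        }
        return bindings[key];
    }
    var copy = {};
    Object.keys(node).forEach(function (field) {
        copy[field] = splice(node[field], bindings);
    });
    return copy;
}

// Template AST for the expression "$lhs + $rhs", as a parser might produce it.
var sumTemplate = {
    type: "BinaryExpression",
    operator: "+",
    left: { type: "Identifier", name: "$lhs" },
    right: { type: "Identifier", name: "$rhs" }
};

var sum = splice(sumTemplate, {
    lhs: { type: "Literal", value: 1 },
    rhs: { type: "Identifier", name: "x" }
});
```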

Another thing to consider is that I'd rather not reinvent the wheel twice. I've already said that I'd eventually like to implement some sort of macro system so we can properly support things like assertions. I'd rather have my template system be usable in the macro system or vice versa, instead of having two completely independent systems with similar purposes. A macro system, for our purposes, could simply be something that allows us to associate names in the parser with generator functions. When a macro name is found in the source, the parser calls the associated generator function with the AST segments corresponding to the function call arguments. The generator function is then free to manipulate those as it sees fit, and produce another piece of AST or source as output, to be spliced at the call site by the parser.
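The macro mechanism described above can be sketched as a table mapping names to generator functions, plus an expansion pass over the AST. The node shapes, defineMacro and expand are all assumptions made for illustration; in the real system the parser would do the lookup while parsing, whereas this sketch runs on an already-built tree and does not re-expand generator output.

```javascript
// Hypothetical macro table: macro name -> generator function.
var macros = {};

function defineMacro(name, generator) {
    macros[name] = generator;
}

// Walks an AST bottom-up; when a call to a registered macro name is found,
// the generator receives the argument ASTs and its result replaces the call.
function expand(node) {
    if (Array.isArray(node)) return node.map(expand);
    if (node === null || typeof node !== "object") return node;

    var copy = {};
    Object.keys(node).forEach(function (field) {
        copy[field] = expand(node[field]);
    });

    if (copy.type === "CallExpression" &&
        copy.callee && copy.callee.type === "Identifier" &&
        Object.prototype.hasOwnProperty.call(macros, copy.callee.name)) {
        return macros[copy.callee.name].apply(null, copy.arguments);
    }
    return copy;
}

// assert(cond) expands to: cond || error("assertion failed")
defineMacro("assert", function (cond) {
    return {
        type: "LogicalExpression",
        operator: "||",
        left: cond,
        right: {
            type: "CallExpression",
            callee: { type: "Identifier", name: "error" },
            arguments: [{ type: "Literal", value: "assertion failed" }]
        }
    };
});

// AST for the statement "assert(x);"
var program = {
    type: "ExpressionStatement",
    expression: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "assert" },
        arguments: [{ type: "Identifier", name: "x" }]
    }
};

var expanded = expand(program);
```

Because the generator is an ordinary function, an assert macro like this one could also be made to expand to nothing in a release build.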

The consideration that this metaprogramming framework might fit into a macro system leads me to believe that perhaps an AST splicing system is the better choice, from a technical standpoint. This requires some extensions to our parser, however, such as the ability to have named AST nodes which our macro system can identify. Another requirement is the implementation of some AST splicing operators. Finally, I would like to examine the way in which AST nodes are currently manipulated, and simplify the API, so that manipulating ASTs becomes less cumbersome.