The Indirection Problem

April 20th, 2012

In this post I'm going to talk about a performance disadvantage inherent to most (if not all) current implementations of dynamic languages and, more generally, of garbage-collected languages. The problem affects JavaScript, but also Python, Lua, Scheme and even Java to some extent. Surprisingly, it is seldom talked about. When you ask someone why JavaScript code is slower than C, they will mention dynamic typing, run-time checks, number types and other sources of overhead, but they will rarely mention indirection.

What do I mean by indirection? Why is it a problem? In JavaScript (and most other dynamic languages), objects, arrays, functions and strings are reference types rather than value types. Concretely speaking, all JavaScript objects are allocated on the heap and manipulated through pointers. Accessing an object property therefore requires going through one or more levels of indirection. You might think that this isn't such a big problem. Most objects in C/C++ are also heap-allocated, after all. There is one big difference, however: in C and C++, the programmer can intentionally nest (inline) objects inside of each other. In JavaScript, the programmer has no way to request this.

As a simple example, suppose you wanted to define and instantiate an object representing a car and its 4 wheels, with each wheel having angles of direction and rotation. In C, you could do the following:

struct Car
{
    char makeModel[128]; // String for the car's make and model
    struct Wheel
    {
        double direction; // Wheel direction angle
        double rotation;  // Current angle of rotation
    } wheels[4];
} car;

In JavaScript, this could be done as follows:

// Car object literal
var car =
{
    makeModel: 'make and model string',
    wheels: [
        { direction: 0, rotation: 0 },
        { direction: 0, rotation: 0 },
        { direction: 0, rotation: 0 },
        { direction: 0, rotation: 0 }
    ]
};

Both of these accomplish the same goal. In either language, one can write car.wheels[0].direction = 0.32, for example. In practice, however, the two are implemented quite differently:

In C, the car object is explicitly allocated on the stack (assuming the declaration appears inside a function; at file scope it would live in static storage instead). In JavaScript, it is implicitly allocated on the heap.
The C object, including the wheel objects and the rotation angles, is one contiguous piece of memory (192 bytes with 8-byte doubles: 128 for the string plus 4 × 16 for the wheels). This is not the case in JavaScript. Modern JS engines will allocate a car object, a string object, an array object to store the wheels, and individual objects for each wheel. The JS car object only contains pointers.
In C, the rotation angles are contained within the wheel objects, in the same contiguous piece of memory. Under Google V8, this would not be so: the rotation angles, if they are floating-point, would each be stored in a heap object of their own.

These differences have a performance impact. In C, accessing a wheel rotation angle only requires reading memory at some fixed offset from the stack pointer. There is only one level of indirection. In JavaScript, there is indirection from the car variable to the car object, from the car object to the wheel array, from the wheel array to the wheel objects, and potentially from the wheel objects to the rotation angles. Concretely, this means:

Allocating all of the needed JavaScript objects will be slower. There may need to be multiple calls to the allocator. Object headers and pointers may need to be initialized.
Accessing object properties will also be slower in JavaScript. This is not so relevant in this toy example, but you could imagine that if we were updating a list of car objects in a game engine loop there could be a significant impact on performance.
The JavaScript representation will use more memory. In addition to storing the wheel objects, we have to store a header for each of these objects and pointers to them. Memory alignment constraints might waste additional memory as well.
Because the JavaScript representation is larger, it will not fit in as few cache lines as the C representation. This can cause further performance losses when dealing with large sets of objects.
In Google V8, incrementing a wheel rotation angle would cause a new floating-point object to be allocated.
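
To make the chain of pointer hops concrete, here is a minimal sketch of the game-loop scenario mentioned above, with each hidden load spelled out on its own line. The names makeCar and rotateWheels are hypothetical, and the comments describe what an engine conceptually has to do, not the exact machine code any particular JIT emits.

```javascript
// Hypothetical helper for illustration: builds a car like the literal above.
function makeCar()
{
    return {
        makeModel: 'make and model string',
        wheels: [
            { direction: 0, rotation: 0 },
            { direction: 0, rotation: 0 },
            { direction: 0, rotation: 0 },
            { direction: 0, rotation: 0 }
        ]
    };
}

// Hypothetical game-loop step: advance every wheel of every car by delta.
function rotateWheels(cars, delta)
{
    for (var i = 0; i < cars.length; ++i)
    {
        var car = cars[i];          // hop 1: cars array -> car object
        var wheels = car.wheels;    // hop 2: car object -> wheels array
        for (var j = 0; j < wheels.length; ++j)
        {
            var wheel = wheels[j];  // hop 3: wheels array -> wheel object
            // hop 4 (engine-dependent): wheel object -> boxed number.
            // In an engine that boxes doubles, this update may also
            // allocate a fresh number object.
            wheel.rotation += delta;
        }
    }
}

var cars = [makeCar(), makeCar()];
rotateWheels(cars, 0.5);
console.log(cars[1].wheels[3].rotation); // prints 0.5
```

Each hop is a load whose address depends on the result of the previous one, so none of them can start before the one before it has completed.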

I'm well aware that there are ways to deal with some of these issues in JavaScript. Perhaps the objects could be represented differently (we could have avoided the array). Perhaps we could have used the new Float64Array typed arrays for the rotation angles. The point is simply that the intuitive JavaScript representation often performs worse, for the reasons I've outlined. The programmer can do some things to recoup part of the performance, but this often comes at the cost of readability. Ideally, one would want the (JIT) compiler to be able to mitigate or eliminate these performance issues without the code having to be changed.
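
As a rough sketch of the Float64Array workaround, the wheel data for a whole fleet of cars could be packed into one typed array. Everything below (the WheelData constructor, its methods, and the field layout) is a hypothetical design made up for illustration, not an established API.

```javascript
// Assumed layout: each car has 4 wheels, each wheel stores 2 doubles.
var WHEELS_PER_CAR = 4;
var FIELDS_PER_WHEEL = 2;
var DIRECTION = 0;  // field offset of the direction angle
var ROTATION = 1;   // field offset of the rotation angle

// Hypothetical container: one flat typed array for all wheels of all cars.
function WheelData(numCars)
{
    this.data = new Float64Array(numCars * WHEELS_PER_CAR * FIELDS_PER_WHEEL);
}

// Position of one field of one wheel inside the flat array.
WheelData.prototype.index = function (car, wheel, field)
{
    return (car * WHEELS_PER_CAR + wheel) * FIELDS_PER_WHEEL + field;
};

WheelData.prototype.get = function (car, wheel, field)
{
    return this.data[this.index(car, wheel, field)];
};

WheelData.prototype.set = function (car, wheel, field, value)
{
    this.data[this.index(car, wheel, field)] = value;
};

// Advance every rotation angle by delta, striding over the direction fields.
WheelData.prototype.rotateAll = function (delta)
{
    for (var i = ROTATION; i < this.data.length; i += FIELDS_PER_WHEEL)
        this.data[i] += delta;
};

var wheelData = new WheelData(2);
wheelData.set(0, 0, DIRECTION, 0.32);
wheelData.rotateAll(0.5);
console.log(wheelData.get(0, 0, DIRECTION)); // prints 0.32
console.log(wheelData.get(1, 3, ROTATION));  // prints 0.5
```

The doubles now sit next to each other in a single buffer and updating one does not require boxing, but car.wheels[0].direction has turned into index arithmetic, which is exactly the readability cost mentioned above.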

One of the main reasons I wrote this post is that I know compilers can do more in this regard, and more would probably be done if compiler writers were more aware of these issues. The Java VM can do escape analysis and automatically allocate some objects on the stack when this is found to be safe. Papers have also been published on automatic object inlining, that is, automatically allocating child objects inside of their parent objects when possible. Such optimizations can help eliminate indirection and memory usage overhead. However, these techniques are not widespread in modern JIT compilers, and further research could refine them.
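
As a rough illustration of what object inlining aims for, here is the same transformation done by hand on a two-wheel version of the car. This is only a source-level analogy with hypothetical field names (wheel0Rotation and so on): an actual compiler would change the in-memory layout while leaving the source untouched, and nothing here forces an engine to store the numbers unboxed.

```javascript
// Nested form, as in the car literal: car -> wheels array -> wheel object.
var nestedCar = {
    wheels: [
        { direction: 0, rotation: 0 },
        { direction: 0, rotation: 0 }
    ]
};

// Hand-inlined form (hypothetical field names): the wheel fields live
// directly in the parent object, so there is no array or wheel object
// to traverse.
var inlinedCar = {
    wheel0Direction: 0, wheel0Rotation: 0,
    wheel1Direction: 0, wheel1Rotation: 0
};

// Conceptually three dependent loads: .wheels, then [1], then .rotation.
nestedCar.wheels[1].rotation += 0.25;

// A single property access on the parent object.
inlinedCar.wheel1Rotation += 0.25;

console.log(nestedCar.wheels[1].rotation === inlinedCar.wheel1Rotation); // prints true
```

Both forms hold the same data. The appeal of doing this automatically is that the programmer keeps writing the nested version while the compiler picks the flattened layout whenever it can prove that doing so is safe.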