
Method Call Syntax in a JS-like Language

November 22, 2016

As a side-project, I’m working on a small JavaScript-like programming language called Plush. Like JavaScript, this language is going to have objects and method calls. In this blog post, I’m going to discuss some seemingly trivial aspects of method call syntax and semantics which illustrate why programming language design is difficult, and solicit your feedback as to potential solutions.

Like JavaScript, Plush has object literals which can be defined with the curly brace syntax. It will also have function objects, which are in fact closures. In JavaScript, you can assign a function to an object property, and it becomes callable as a method, like so:

o = {}; // empty object
o.m = function (x) {…};
o.m(2); // call the function in property m as a method

This all seems fine and dandy, but there are some issues with the way JavaScript implements this. The first is that method call syntax introduces some “hidden” semantics. When you call a JS function as a method, there is a hidden this argument (a reference to the object the method is called on) which gets passed to functions. This is invisible to the programmer. Hence:

// The statement below:
o.m(2);

// Is not equivalent to these statements:
f = o.m;
f(2);

// But it is equivalent to these two statements:
f = o.m;
f.call(o, 2);
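
To make the difference concrete, here is a small runnable sketch in plain JavaScript. The `counter` object and its `add` method are made up for illustration, and the method opts into strict mode so that a missing this is undefined rather than the global object.

```javascript
// Hypothetical object used only for illustration
const counter = {
    total: 0,
    add: function (x) {
        'use strict'; // without strict mode, a missing 'this' would be the global object
        this.total += x;
        return this.total;
    }
};

// Method call: the hidden 'this' argument is counter
const viaMethod = counter.add(2);

// Extracting the function drops the hidden 'this' argument
const f = counter.add;
let plainCallError = null;
try {
    f(2);
} catch (e) {
    plainCallError = e; // 'this' is undefined, so reading this.total throws
}

// Passing 'this' explicitly restores the method call behaviour
const viaCall = f.call(counter, 2);
```

Only the first and last calls update `counter.total`; the plain call fails because nothing supplied a this value.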

The passing of the hidden this argument is a little bit annoying because it breaks the programmer’s intuition in a small way. JS syntax makes method calls look like regular function calls, but there is a difference. Sure, it’s not a big deal, but I ran into this issue while implementing IO functions in Plush. I wanted to implement some C++ “host” functions which would allow the language to interface with the outside, as such:

io = import "core.io"; // import the IO module

io.print("hello world!"); // here 'print' is a host function written in C++

Maybe you see where this is going. If I implement method calls as JavaScript does, then every function, including host functions, needs to take a hidden this argument, even if it doesn’t need it. I could work around this by making a special case where host functions called as methods behave differently, but then I potentially have to add an extra dynamic check at every function call, which gets annoying. Alternatively, I can wrap my C++ host functions in Plush functions which handle the unneeded this argument.
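
The wrapping workaround could look roughly like the following, sketched in JavaScript rather than Plush. Here `hostPrint` stands in for a C++ host function and `wrapHost` is a hypothetical helper; neither is part of Plush. The sketch also assumes a calling convention in which the this value arrives as an ordinary first argument.

```javascript
// Stand-in for a C++ host function: it takes no 'this' argument.
// Output is collected in an array so the effect can be inspected.
const printed = [];
function hostPrint(text) {
    printed.push(String(text));
}

// Hypothetical helper: adapts a host function to a calling convention
// where the receiver is passed first, and simply discards the receiver
function wrapHost(hostFn) {
    return function (self, ...args) {
        return hostFn(...args);
    };
}

// Module object whose 'print' follows the method calling convention
const io = { print: wrapHost(hostPrint) };

// What a method call such as io.print("hello world!") would expand to
io.print(io, 'hello world!');
```

The cost of this approach is one extra call per host function invocation, but call sites need no dynamic check.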

The second issue I ran into involves the way the this argument is handled. JavaScript has this problem where if you declare a closure inside a method, you can’t directly access the parent function’s this argument, as described in this blog post. This annoying quirk of the language was “solved” by the introduction of Function.prototype.bind. The JS workaround seemed like an ugly kludge to me, and so I thought “why not just do it like Python?” Why not make the this argument explicit? Make programmers declare and name the this value, in the same way that Python forces you to declare the self argument in methods.
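
As a quick illustration of the closure problem and the bind workaround, consider this JavaScript sketch. The `button` object and its handler methods are hypothetical names chosen for the example.

```javascript
// Hypothetical object used only for illustration
const button = {
    label: 'OK',

    // The inner closure receives its own 'this', not the method's
    brokenHandler: function () {
        return function () {
            'use strict'; // a plain call leaves 'this' undefined in strict mode
            return this === undefined ? 'no this' : this.label;
        };
    },

    // Function.prototype.bind pins the closure's 'this' to the object
    boundHandler: function () {
        return function () {
            return this.label;
        }.bind(this);
    }
};

const broken = button.brokenHandler()(); // closure called without a receiver
const bound = button.boundHandler()();   // closure called with the bound receiver
```

Here `broken` ends up as the string 'no this', while `bound` is the button's label.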

Upon first examination, making the programmer explicitly declare the this argument seems like a fairly good idea. However, it does have the annoying consequence that an argument that was previously hidden is now a positional argument. Consider the following scenario:

// An IO module is imported, this produces an object containing IO functions
io = import "core.io";

// The print method is "extracted", so that we can call it with a shorthand name
print = io.print;

// This works fine. It passes a hidden this argument value to the print method
io.print(2);

// ERROR: we are not passing a this argument value
print(2);
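
The same situation can be mimicked in JavaScript by writing functions that take the receiver as an ordinary first parameter. This only emulates the semantics being considered for Plush: the `io` object and the `methodCall` helper are hypothetical, and `print` returns what it received instead of printing so the argument shift is visible.

```javascript
// Module-like object whose 'print' declares its receiver explicitly
const io = {
    print: function (self, value) {
        // Report which arguments arrived in which slot
        return { self: self, value: value };
    }
};

// Hypothetical stand-in for method call syntax: passes the object
// itself as the first positional argument
function methodCall(obj, name, ...args) {
    return obj[name](obj, ...args);
}

// Like io.print(2): the receiver fills the 'self' slot
const viaMethod = methodCall(io, 'print', 2);

// Extracted function: the value 2 now lands in the 'self' slot
const print = io.print;
const viaPlain = print(2);
```

JavaScript silently shifts the arguments in the second call, whereas a language that checks argument counts would report an error.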

In the case of modules, it’s clear that the print function shouldn’t even need a this argument value. I pondered this problem, and had the idea that possibly, method calls could have a different syntax from property accesses. The “arrow syntax” would make explicit the passing of the this argument:

// Call of the method m of object o
// The "arrow syntax" is specifically for method calls, and passes a this argument
o->m(2);

// Equivalent function call, passing an explicit this argument
o.m(o, 2);

io = import "core.io";

// Regular function call, print does not take a this argument, it is not a method
io.print(2);

// This works correctly, as one might expect
print = io.print;
print(2);
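
One way to read the arrow syntax is as sugar for a property lookup followed by a call that passes the object first. The JavaScript sketch below emulates that desugaring; `arrowCall` is a hypothetical helper standing in for `o->m(...)`, and the `account` and `io` objects are illustrative only.

```javascript
// Hypothetical desugaring of  obj->name(args...)  into
// obj.name(obj, args...)
function arrowCall(obj, name, ...args) {
    const fn = obj[name];
    return fn(obj, ...args);
}

// A method declares its receiver as an ordinary first parameter
const account = {
    balance: 10,
    deposit: function (self, amount) {
        self.balance += amount;
        return self.balance;
    }
};

// A module function takes no receiver at all
const io = {
    format: function (value) {
        return 'value: ' + value;
    }
};

const afterDeposit = arrowCall(account, 'deposit', 5); // account->deposit(5)
const viaDot = io.format(2);                           // regular call through the module
const format = io.format;                              // extraction is safe here
const viaPlain = format(2);
```

Because the dot is only a property access, extracting `format` changes nothing about how it is called.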

The solution I came up with is not perfect: it raises some potential problems. For one, with a special arrow syntax for method calls, it becomes possible to call object properties using both regular function calls, and arrow style method calls which pass a this argument. Experience tells me that if both styles are possible, people are going to use both, which could be messy. For example, what do you do if you have a set of methods which do not need a this argument? Do you declare one anyway? Would you end up with a mix of regular function calls and method calls on the same object?

15 Comments
  1. FWIW Lua does it like this – they use colon (:) as a “call with implicit self” syntax. So a:foo(bar) is equivalent to a.foo(a, bar). A nice consequence is that you can cache the function object (f = a.foo; f(a, bar)) which works naturally because dot is just a property access and colon is a property access followed by a function call.

    • Thank you for bringing this up. I didn’t know Lua had this. It’s interesting to see that it’s used in a language and works well. I might do the same.

    • I also prefer the Lua style over the JS style, but after using Lua for a few years I have realized it’s easy to forget it sometimes, and it can result in bugs that take some time to track down. My top pick in this case (Lua/JS style language) would be to just use closures to make objects (a function that returns a table/object with the public functions).

  2. Tobias

    Python has the concept of “bound” methods that retain the this/self/…-argument as closured value:

    ```
    >>> class A(object):
    ...     def x(self, y):
    ...         print y
    ...
    >>> A.x
    <unbound method A.x>
    >>> A.x('ho')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: unbound method x() must be called with A instance as first argument (got str instance instead)
    >>> a = A()
    >>> a
    <__main__.A object at 0x...>
    >>> a.x
    <bound method A.x of <__main__.A object at 0x...>>
    >>> y = a.x
    >>> y
    <bound method A.x of <__main__.A object at 0x...>>
    >>> y('ho')
    ho
    >>>
    ```

    I think this is pretty elegant and uniformity is not lost.
    Maybe this helps?

    • I think this works in Python because Python has classes and objects as class instances. In JS, it seems weird to have even more special semantics hidden inside the “.” operator (capturing the object when accessing a member function).

      • Tobias

        That makes sense. But then again, that’s somehow the cost of the prototype-based OO AND having non-method functions, I think. If every function had to belong to an object, this wouldn’t be a problem ;)

        • Having every method belong to an object doesn’t seem very natural to me. Functions can exist without objects. There’s also the concept of multiple dispatch which is interesting.

  3. Peter Goodman

    It kind of seems like this problem stems from the “flexibility” of letting one assign functions to things, as opposed to declaring things as methods directly. This seems like it causes more problems than it solves.

    One alternative solution, that tries to maintain the flexibility, would be to have method and function definitions. A function that accesses `this` is an error. It would also be an error to call a method that isn’t bound to an object.

    So you’d have something like this:

    o.m1 = method() { … }
    m2 = method() { … }
    o.m2 = m2

    o.m1(); // OK
    m2(); // Runtime error
    o.m2(); // OK

    o.f1 = function() { foo(this); }
    f2 = function() { foo(this); }
    o.f2 = f2
    o.f3 = function() { /* no this access */ }

    o.f1(); // Runtime error
    f2(); // Runtime error.
    o.f2(); // Runtime error.
    o.f3(); // OK

    • I’ve thought about this solution too. It’s not a bad idea, but it likely means that some type/error checking has to happen at every call site, which isn’t the end of the world, but there is some extra cost. For now, I implemented the Lua method call syntax. I aim to design my language in such a way that the syntax and semantics will be redefinable by users, so this solution could be implemented too, and tested.

  4. Job van der Zwan

    Out of curiosity: what’s your motivation for starting from JavaScript’s approach to a hidden this-argument in the first place? It’s probably the least intuitive convention I’ve ever seen in programming, and I still don’t know what the benefit is supposed to be.

    • It’s a language I’m very familiar with. I think it has some nice qualities.

      The advantage of the hidden “this” argument is that it addresses the problems I’ve outlined in this post fairly well.

      • Job van der Zwan

        Ah alright. I must have misread your post :D. The way I interpreted the problems was that you argued that the problems were caused by the existence of “this” to begin with.

        Let’s try that again with better reading comprehension on my side this time… ;)

  5. Any language with a “this” concept is bad in my opinion. When you are reading a section of code, you should not have to wonder what your environment carries. Also, closures are a disaster for readability and ensuring correctness, because the function you are creating does not exist anywhere in your source code exactly. How can you possibly prove (or even approach proof) something is correct when it is dynamically created? Closures are a way of partially implementing code-as-data, which was LISP’s great strength (and weakness). The way of the future is to have a declarative language syntax, where as much as possible the code is shifted toward non-executed constants, and code that is executed automatically as needed. The real defects of JS are: 1) insufficient fundamental data types, 2) lack of closure when using standard operations, 3) lack of ability to track erroneous or missing data, 4) lack of integration of drawing and event tracking into the syntax, 5) a weak data structure that tops out at JSON or one-dimensional arrays. JS is insanely fast for an interpreted language, but large-scale coding in JS is a total nightmare, as proven by the over 2 billion dollar cost for the healthcare.gov website.

    • If the healthcare.gov website really did “cost” two billion dollars, that has nothing to do with programming language choices. It’s most likely because someone somewhere directly pocketed 99% of that money, as happens with most other “expenses” in the American health industry.

      As for any language with a “this” concept being bad, this is your personal opinion of object-oriented programming. By making the this/self argument visible, some of the magic hidden syntax of JS is made explicit, which seems like a step in the right direction. I think you would have the same issue with multimethods, or anything else of the sort.

  6. Regrets for arriving late to this party. It seems you have settled your approach.

    I faced the same dilemma for my language Acorn and picked a different solution (similar to an option you rejected). In Acorn, ‘self’ (used similarly to ‘this’ in JavaScript) is ALWAYS the implicit first parameter for every function. ‘self’ is never specified as part of a parameter list, it cannot be declared, and it is immutable. The pseudo-variable ‘self’ is used to retrieve its value (it always exists and always has a value as part of the execution context). Let’s examine your use cases:

    1) dragon.speak(“Leave or die”). This works just like in JavaScript: it retrieves the function held by the ‘speak’ property of the dragon object and executes that function with dragon implicitly passed just before “Leave or die”. During this call, whenever the ‘speak’ function refers to ‘self’, it gets the dragon object.

    2) speak(“Leave or die”), where speak is a variable holding an anonymous function. This will call the function directly, with ‘null’ as the implicit first parameter representing ‘self’. If the function definition ever references ‘self’, it becomes a closure which captures the value of ‘self’ used by the function that dynamically created the function/closure. Thus, anonymous closures share ‘self’ with their creator and anonymous functions do not use ‘self’ and are effectively oblivious to its ‘null’ value.

    If by “host function”, you mean functions imported from a named or default library, these nearly always use the first use case’s syntax. Mostly, such functions are true object-oriented properties intended to apply to instances. However, any such functions that do not operate on an instance are still packaged as part of a namespace, so (say) Console.print(“Hello”) would retrieve print from the Console namespace. While executing “print”, ‘self’ is Console. However if “print” is not instance-oriented, it would not notice or care that it got a namespace for ‘self’.

    For the Acorn programmer, this approach is syntactically economical, easy to understand and hard to specify incorrectly. It has a minor performance penalty (a trivial stack push) given that ‘self’ is always passed, even when it is not needed or used. This penalty is quite small for a very minor proportion of function calls (most are true OO instance calls). It was a price I felt comfortable incurring given the simplicity benefits I believe it yields to the Acorn programmer.
