
Software Bugs That Cause Real-World Harm

Years ago, when I was an undergraduate student at McGill, I took a software engineering class, and as part of that class, I heard the infamous story of the Therac-25 computer-controlled radiotherapy machine. Long story short: a software bug caused the machine to occasionally deliver radiation doses hundreds of times greater than normal, which could result in grave injury or death. This story gets told in class to make an important point: don’t be a cowboy. If you’re a software engineer working on safety-critical systems, you absolutely must do due diligence and implement proper validation and testing, otherwise you could be putting human lives at risk. Unfortunately, I think the real point kind of gets lost on many people. You might hear that story and think that the lesson is that you should never ever work on safety-critical systems where such due diligence is required, and that you’re really lucky to be pocketing hundreds of thousands of dollars a year working on web apps, where the outcome of your work, and all the bugs that may still remain dormant somewhere in your code, will never harm anyone. Some people work on safety-critical code, and these people bear the weight of tremendous responsibility, but not you: you’re using blockchain technology to build AirBnB for dogs, which couldn’t possibly harm anyone even if it tried. I’d like to share three stories with you. I’ve saved the best for last.

Back in 2016, I completed my PhD, and took my first “real” job, working at Apple in California. I was joining a team that was working on the GPU compiler for the iPhone and other iDevices. While getting set up in California prior to starting the job, it occurred to me that showing up to work with an Android phone, while being part of a team that was working on the iPhone, might not look so great, and so I decided to make a stop at the Apple store and bought the best iPhone that was available at the time, an iPhone 6S Plus with 128GB of storage. Overall, I was very pleased with the phone: it was lightweight, snappy and beautiful, with great battery life, and the fingerprint sensor meant I didn’t have to constantly type my pin code like on my previous Android phone, a clear upgrade.

Fast forward a few months and I had to catch an early morning flight for a work-related conference. I set an early alarm on my phone and went to sleep. The next day, I woke up and instantly felt like something was wrong, because I could see that it was really sunny outside. I went to check the time on my iPhone. I flipped the phone over and was instantly filled with an awful sinking sense of dread: it was already past my flight’s takeoff time! The screen on my phone showed that the alarm I had set was in the process of ringing, but for some reason, the phone wasn’t vibrating or making any sound. It was “ringing” completely silently, but the animation associated with a ringing alarm was active.

I did manage to get another flight, but I needed my manager’s approval, and so I had to call him and explain the situation, feeling ashamed the whole time (I swear it’s not my fault, I swear I’m not just lazy, this bug is real). Thankfully, he was a very understanding man, and I did make it to the conference, but I missed most of the first day and opening activities. It wasn’t the first or the last time that I experienced this bug; it happened sporadically, seemingly randomly, over the span of several months. I couldn’t help but feel angry. Someone’s incompetence had caused me to experience anxiety and shame, but it had also caused several people to waste time, and the company to waste money on a missed flight. Why hadn’t this bug been fixed after several months? How many other people were impacted? I had a cushy tech job where, if I showed up to work late, people would ask if I was doing alright, but some people have jobs where being late can get them fired on the spot, and some of those people might have a family to support, and be living paycheque to paycheque. A malfunctioning alarm clock probably isn’t going to directly cause a person’s death, but it definitely has the potential to cause real-world harm.

The point of this blog post isn’t to throw Apple under the bus, and so I’ll share another story (or maybe more of a rant) about poor software design in Android OS and how it’s impacted my life. About 3 years after working at Apple, when the replacement battery in my iPhone 6S Plus started to wear out, I decided to try Android again, and so I got myself a Google Pixel 3A XL. This phone also had a nice fingerprint scanner, but the best differentiating feature was of course the headphone jack. Unfortunately, Android suffers from poor user interface design in a few areas, and one of the most annoying flaws is simply that the stock Android OS doesn’t offer flexible enough options for controlling when the phone rings, which is one of the most important aspects of a phone.

Being a millennial, I don’t particularly like phone calls. I would much prefer to be able to make appointments and file support tickets using an online system. However, my deep dislike for phone calls probably stems from a more personal issue, which is that my mother is an unmedicated schizophrenic. She doesn’t respect my boundaries. She has done things such as randomly call me in the middle of the night because her irrational paranoia causes her to be worried that shadowy evil figures are coming after me. Thankfully, Android now has a “bedtime mode” feature, which allows me to make it so that phone calls won’t cause my phone to ring between 10PM and 8:30AM. If my mom happens to die in a hospital in the middle of the night, I’ll just have to find out and be sad the next day. My sleep is sacred, and bedtime mode allows me to enforce some basic boundaries using software.

Bedtime mode is quite useful, but I still have the other problem that my mom could decide to randomly call me in the daytime as well, and unfortunately I rarely want to take her phone calls. However, I also don’t want her to end up homeless or in jail (which has happened before, but that’s a story for another time), and so I don’t want to block her and completely lose the ability to receive her calls. This means I almost always have my phone set to “do not disturb”, so that I don’t get disturbed at random times by unwanted phone calls. I wish that Android had an option to set a specific person to never cause the phone to ring; it seems like that would be an easy feature to implement, and one that would have a real positive impact on the quality of life of many people, but I digress.

The real problem is that, although I hate phone calls, our society is still structured in such a way that sometimes, I have to receive “important” phone calls. For instance, my doctor recently placed a referral for me to see a specialist. I’ve been told that the hospital is going to call me some time in the next few weeks. I don’t want to miss that phone call, and so I have to disable “do not disturb”. However, because the stock Android OS has only one slider for “Ring & notification volume”, disabling do not disturb means that my phone will constantly “ding” and produce annoying sounds every time I get a text message or any app produces a notification, which is very disruptive. The fact is, while I occasionally do want my phone to ring so I can receive important phone calls, I basically never want app notifications to produce sound. I’ve been told that I should go and individually disable notifications for every single app on my phone, but you tell me, why in the fuck can’t there simply be two separate fucking sliders for “Ring volume” and “Notification volume”? In my opinion, the fact that there isn’t simply highlights continued gross incompetence and disregard for user experience. Surely, this design flaw has caused millions of people to experience unnecessary anxiety, and should have been fixed years ago.

This is turning out to be a long-ish blog post, but as I said, I’ve kept the best story for last. I’m in the process of buying a new place, and I’ll be moving in two weeks from now. As part of this, I’ve decided to do some renovations, and so I needed to get some construction materials, including sheets of drywall. This is a bit awkward, because I’m a woman living in the city. I don’t have a car or a driver’s license. Sheets of drywall are also quite heavy, and too big to fit in the building’s elevator, meaning they have to be carried up the stairs to the third floor. Yikes.

In Montreal, where I live, there are 3 main companies selling renovation supplies: Home Depot, Rona and Reno-Depot. Home Depot is the only one that had all the things I needed to order, so I went to their website and added all the items to my cart. It took me about 45 minutes to select everything and fill out the order form, but when I got to the point where I could place the order, the website gave me a message saying “An unknown error has occurred”. That’s it, no more details than that, no description of the cause of the error, just, sorry lol, nope, you can’t place this order, and you don’t get an explanation. I was really frustrated that I had wasted almost an hour trying to place that order. A friend of mine suggested that maybe she could try placing the order and it would work. I printed the page with the contents of my cart to a PDF document and sent it over. It worked for her, she was able to place the order, and so I sent her an electronic payment to cover the costs.

Since my new place is on the third floor, heavy items would have to be carried up the stairs, and we had some time pressure to get things done, so we paid extra specifically to have the items delivered inside the condo unit within a fixed time window between noon and 3PM. The total cost for delivery was 90 Canadian dollars, which seems fairly outrageous, but sometimes, you just have no choice. I was expecting my delivery before 3PM, and the Home Depot website had said that I would get a text 30 minutes before delivery. At 2:59PM, I received two text messages at the same time. The first said “Your order has just been picked up”. The second said “Your order has just been delivered, click here to rate your delivery experience”. Again, I was filled with a sense of dread. Had they tried to reach me and failed? Had they just dumped the construction materials outside? I rushed downstairs. There was no sign of a delivery truck or any of the materials. I figured there must be another software bug; despite what the second text message said, the delivery clearly hadn’t happened yet.

Sure enough, at 3:27PM, 27 minutes after the end of my delivery window, I received a phone call from a delivery driver. He was downstairs, and he was about to dump the construction materials on the sidewalk. NO! I explained that I had paid extra to have the materials delivered inside the unit. I could show him the email that proved that I had paid specifically for this service. He argued back: according to his system, he was supposed to dump the materials at the curb. Furthermore, they had only sent one guy. There was no way he alone could carry 8-foot-long, 56-pound sheets of drywall up to the third floor. I raised my voice, he raised his. After a few minutes, he said he would call his manager. He called back. The delivery company would send a second truck with another guy to help him carry the materials upstairs. I felt angry, but also glad that I had stood my ground in that argument.

The first guy waited, sitting on the side of the curb in the heat, looking angry, doing nothing, for about 30 minutes until the second guy showed up to help. When the second delivery guy arrived, he asked to see the email. I showed him proof that I had paid to have things delivered upstairs. He also stated that their system said they only had to drop things in front of the building, but that he believed me. The delivery company was a subcontractor, and this was a software bug they had encountered before. This bug had caused multiple other customers to be extremely upset. So upset, in fact, that one customer, he said, had literally taken him hostage once, and another one had assaulted him. Gross, almost criminal incompetence on the part of one or more developers somewhere had again caused many people to waste time and to experience stress, anger, and even violence. The most infuriating part of this, of course, is that bugs like this are known to exist, but they often go unfixed for months, sometimes even years. The people responsible have to know that their incompetence and their inaction are causing continued real-world harm.

The point of this blog post is that, although most of us don’t work on software that would directly be considered safety-critical, we live in a world that’s becoming increasingly automated and computerized, and sometimes, bugs in seemingly mundane pieces of code, even web apps, can cause real-world suffering and harm, particularly when they go unfixed for weeks, months or even years. Part of the problem may be that many industry players lack respect for software engineering as a craft. Programmers are seen as replaceable cogs and as “code monkeys”, and not always given enough time to do due diligence. Some industry players also love the idea that you can take a random person, put them through a 3-month bootcamp, and get a useful, replaceable code monkey at the other end of that process. I want to tell you that no matter how you got to where you are today, if you do your job seriously, and you care about user experience, you could be making a real difference in the quality of life of many people. Skilled software engineers don’t wear masks or capes, but they can still have cool aliases, and they truly have the power to make the world better or worse.

Memory, Pages, mmap, and Linear Address Spaces

We don’t always think of it this way, but on modern machines, memory and pointers are an abstraction. Today’s machines have virtual memory, divided into blocks called “pages”, such that the addresses represented by pointers don’t necessarily map to the same address in physical RAM. In fact, mmap even makes it possible to map files to memory, so some of these addresses aren’t even mapped to RAM addresses at all.

Two weeks ago, I wrote about UVM, the small virtual machine I’ve been building in my spare time. This VM has a relatively low-level design, with untyped instructions and pointers, for instance. Generally speaking, I’ve done my best to design the VM to be fairly “conventional”, in the sense that most design elements are aligned with common practice and unsurprising. In my opinion, this is important because it keeps the design approachable to newcomers. Having a design that’s unsurprising means that new users don’t need to read a 100-page manual and familiarize themselves with new terminology to get something done.

Even though I’ve done what I could to keep the design of UVM unsurprising, there is one aspect that’s unconventional. At the moment, UVM uses what’s known as a Harvard Architecture, where code and data live in two separate linear address spaces. Essentially, code and data live in two separate arrays of bytes. There’s actually a third address space too: the stack is distinct from the address space used for the heap. That means you can’t directly get a pointer to something that’s stored on the stack.

It’s maybe not that unconventional when you think about it, because WASM works the same way. You can’t directly get a pointer to a stack variable in WASM either, and you also can’t directly read/write to code memory. Same goes for the JVM. It just seems unconventional because UVM presents itself as a fairly low-level virtual machine that gives you pointers, and yet there are restrictions on what you can do with those pointers.

There are a few reasons why the stack, the heap and executable memory are separate in UVM. The main reason is performance. By creating distinct address spaces, we make accesses to these different address spaces explicit. At the moment, UVM is interpreted, but my goal is to eventually build a simple JIT compiler for it. That brings in performance considerations. If everything lived in a single address space, then potentially, every write to memory could write anywhere. Every single time that you write to memory, UVM would have to validate, somehow, that you didn’t just overwrite the code that is about to be executed.
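Here’s an illustrative sketch of that overhead (not UVM’s actual code): if code occupies part of one flat address space, every store has to be checked against the code region before it executes, whereas a separate data address space makes writes to code impossible by construction, leaving only a bounds check on the hot path.

```c
#include <stdint.h>

// Hypothetical flat layout: code occupies the first 4 KiB of a flat
// 64 KiB address space.
enum { CODE_END = 4096, MEM_SIZE = 65536 };

static uint8_t flat_mem[MEM_SIZE];

// Flat-space store: every write pays a check against the code region,
// because any write could clobber the instructions about to execute.
// Returns 0 on success, -1 if the write is rejected.
int flat_store_u8(uint32_t addr, uint8_t val) {
    if (addr < CODE_END) return -1;  // would overwrite code
    if (addr >= MEM_SIZE) return -1; // out of bounds
    flat_mem[addr] = val;
    return 0;
}

// Separate data address space: writes cannot reach code at all, so
// only the bounds check remains on the hot path.
static uint8_t data_mem[MEM_SIZE];

int data_store_u8(uint32_t addr, uint8_t val) {
    if (addr >= MEM_SIZE) return -1; // out of bounds
    data_mem[addr] = val;
    return 0;
}
```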

There are also performance considerations for stack variable accesses. In a JIT, it can be useful for some performance optimizations to be able to assume, for instance, that you know the type of things that are stored on the stack. It can also be useful to be able to store stack elements in registers in a way that’s not visible to the running program. If you can directly get pointers to stack elements, then it becomes much harder for the JIT to be able to make any kind of assumptions about what may or may not be on the stack. As an aside, to some degree, you can guard against that by using a register machine. Registers, when you think about it, are a kind of private address space accessible only to the currently running thread.

I ended up having a discussion with Justine Tunney on Twitter about whether UVM should use a more traditional architecture with pages and mmap instead of a Harvard Architecture.

Justine’s Blink VM uses a SIGSEGV trick in order to catch writes to executable memory without having to explicitly check for them. It leverages the hardware’s page protection to do the checking. I’m not sure if this trick would be portable to a platform such as Windows, for example, but it does solve that problem on Linux/Mac and most POSIX platforms.

However, I don’t think that having code live in the same memory space as data is all that useful, and Justine seemed to agree. Writes to executable memory are relatively rare. They have to be, because it’s an expensive operation. On modern-day systems, whenever you write to executable memory, you need to perform a system call to change memory protection twice, and you may also need to flush the instruction cache. As such, needing a system call to write to executable memory instead of being able to directly do it through a pointer is hardly an inconvenience.

A more important question with regards to the design of UVM is whether it should or shouldn’t have a concept of pages and a primitive like mmap, instead of a flat linear address space for code. At the moment, there are no pages in UVM’s heap, just a long array of bytes, which you can extend and shrink with a primitive similar to sbrk. I thought that this could be problematic when it comes to freeing memory, because it means UVM can only release memory to the OS by shrinking the heap. It can’t “punch holes” to release the memory of individual pages to the OS. However, Justine mentioned that the dlmalloc allocator has been designed to favor sbrk, by ensuring that the topmost memory is more likely to be empty. The conclusion of that discussion is that implementing mmap in UVM would add complexity, and probably isn’t necessary.
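The sbrk-style heap described above can be sketched like this (the names are illustrative, not UVM’s real API): the heap is one contiguous byte array, grown and shrunk only from the top. There is no way to “punch a hole” in the middle and hand that page back to the OS, which is exactly the limitation discussed, and why an allocator like dlmalloc that keeps the topmost memory free plays well with this design.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

// The heap: a single contiguous, resizable byte array.
typedef struct {
    uint8_t *bytes;
    size_t size;
} Heap;

// Grow or shrink the heap to new_size, loosely mirroring sbrk.
// Returns the new size, or (size_t)-1 on failure. Existing contents
// below the new break are preserved; freshly grown memory is zeroed,
// like pages newly mapped from the OS.
size_t heap_resize(Heap *h, size_t new_size) {
    uint8_t *p = realloc(h->bytes, new_size ? new_size : 1);
    if (!p) return (size_t)-1;
    if (new_size > h->size)
        memset(p + h->size, 0, new_size - h->size);
    h->bytes = p;
    h->size = new_size;
    return new_size;
}
```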

At the moment, I think that the design of UVM’s memory address space seems to satisfy most of the needs I can anticipate, except for one. In the future, it seems like it would be useful to be able to share memory between different processes. The question is: how do you do that in a VM that doesn’t have mmap? If you can’t directly map a chunk in your memory space to be shared memory, then how do you share memory at all? There are at least three possible answers to that question. The first would be to not support shared memory, the second would be to add support for an mmap-like primitive later on (not impossible), and the last would be to share memory using a different mechanism.

Ideally, if UVM supports shared memory in the future, I’d like to be able to provide strong guarantees. For example, I’d like to be able to guarantee that all reads/writes to shared memory are atomic. I think it might also be useful to be able to get a lock on a chunk of shared memory, or to implement something like transactional memory (but only within that specific shared chunk). I don’t know how well that works with mmap, because we’re talking about much stronger guarantees than what is normally provided by hardware. It seems it could be possible, however, to allocate shared memory blocks with a unique ID, and to provide special primitives to access these shared memory blocks. Here we’re essentially talking about a 4th memory space that would have to be accessed through a special kind of “fat pointer”.
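A purely hypothetical sketch of that fourth memory space (none of these names exist in UVM): shared blocks are identified by ID, a “fat pointer” is a block ID plus an offset, and programs can only touch a block through dedicated primitives, which is what lets the VM guarantee every access is atomic.

```c
#include <stdatomic.h>
#include <stdint.h>

// A "fat pointer" into shared memory: which block, and where in it.
typedef struct {
    uint32_t block_id; // which shared block
    uint32_t offset;   // word offset within the block
} SharedPtr;

#define MAX_BLOCKS 16
#define BLOCK_WORDS 1024

// Shared blocks live outside the normal heap address space.
static _Atomic uint64_t shared_blocks[MAX_BLOCKS][BLOCK_WORDS];

// Word-granular atomic accessors: the only way in or out of a block,
// so unsynchronized access isn't even expressible.
uint64_t shared_load(SharedPtr p) {
    return atomic_load(&shared_blocks[p.block_id][p.offset]);
}

void shared_store(SharedPtr p, uint64_t v) {
    atomic_store(&shared_blocks[p.block_id][p.offset], v);
}
```

Stronger primitives, such as locking a whole block or a transactional update, could be layered on the same access path, since the VM mediates every operation.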

In conclusion, at the moment, UVM doesn’t have pages, mmap or shared memory. The design is pretty simple and it works well. I can see a path towards adding mmap support in the future if it turns out to be necessary, but I don’t think that it will be. There are still some quirks that I think I may need to fix. One odd property in UVM is that address zero in the heap is a valid address, and accessing it doesn’t fault. However, C, Rust and other languages tend to rely on address zero being invalid. As such, it might make sense to add that restriction to UVM as well, to be more in line with established practice. Another quirk is that at the moment, the heap can be resized to have any requested size. It might make sense to only allow the heap to have a size that is a multiple of some arbitrary “page size”, such as 4KiB or 16KiB, if only to allow future optimizations.
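The page-size restriction floated above amounts to rounding every resize request up to a multiple of the page size. A small sketch (nothing here is UVM's actual behavior yet):

```c
#include <stddef.h>

// Round n up to the next multiple of page_size, which must be a
// power of two (e.g. 4096 or 16384, the sizes mentioned above).
size_t round_to_page(size_t n, size_t page_size) {
    return (n + page_size - 1) & ~(page_size - 1);
}
```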

Building a Minimalistic Virtual Machine

Just over a year ago, I wrote a blog post about a topic that’s very important to me, which is the problem of code rot, of software constantly breaking because of shifting foundations, and the toll it takes on programmers, and on society at large. We’re no doubt collectively wasting billions of dollars and millions of human hours every year because of broken software that should never have been broken in the first place. My own belief is that stability is undervalued. In order to build robust, reliable software, it’s important to be able to build such software on stable foundations.

One of the potential solutions I outlined last year is that if we could build a simple Virtual Machine (VM) with no dynamic linking and a small set of minimalistic APIs that remain stable over time, it would make it a lot easier to build software without worrying about becoming randomly broken by changing APIs or dependencies. Such a VM wouldn’t necessarily meet everyone’s needs for every possible use case, but it could help a lot of us build software for the long term.

This is something I’ve been thinking about for at least two years, but in all honesty, I didn’t really dare even get started on this project, because I felt scared of the amount of work it represented. I was also scared of potentially getting a fairly negative reception with lots of criticism and cynicism. Last December though, I decided that, well, fuck it, I’m really interested in working on this project, I keep thinking about it, so I’m doing it. I’m well aware that the concept of a VM is not a new idea, and I’m sure that some people are going to tell me that I’m basically reinventing WASM, or that I should base my system on an existing processor architecture and work on something like Justine Tunney’s blink instead. I’ll elaborate a bit on why I’m taking a different approach.

WASM is trying to be a universal binary format and satisfy many different use cases in very different areas. Because it has so many stakeholders, it evolves very slowly. For instance, we were promised 5 years ago that WASM would have support for JIT compilers and include a garbage collector, but this support still isn’t here today. At the same time, even though WASM is evolving relatively slowly, there are a ton of new features in the works, and it will surely become a very complex system. With so many stakeholders, the risk of massive feature creep is real.

In my opinion, the focus on minimalism is crucial for guaranteeing the longevity of both the VM itself and the software running on it. Exposing large, complex APIs to software running on the VM can become a liability. One of the biggest issues with modern web browsers, in my opinion, is that they’ve become so complex that it’s basically impossible to guarantee that your web app will behave the same on different browsers.

Working on my own web-based music app, I discovered that there are major differences in the way the pointer events API behaves on Chrome, Firefox and Safari. That’s kind of insane to even think about, because handling mouse clicks is something basic and fundamental that so many web apps rely on, and this API has been around at least as far back as 2016. Why can’t it behave the same in every browser? In my opinion, it’s in part because browsers are such hugely complex beasts that are growing in complexity so fast, that even tech giants can’t manage to have enough developers to implement all the APIs properly. If you don’t believe me, just take a look at this audio API bug that I ran into two years ago, that was reported 4 years ago, and still isn’t fixed at the time of this writing.

UVM has a fairly strict focus on minimalism, which will help keep the VM portable and maintainable. It’s also a lot smaller than other systems. It’s a simple untyped stack-machine. I made that choice because I want the VM to be both intuitive and easy to target. It’s not based on emulating an existing Instruction Set Architecture (ISA), because in my opinion, existing ISAs have a lot of quirks and a broad surface area (the ISA itself is a large, complex API). There’s also a phenomenon where if I said, for example, that UVM emulates an x86 CPU and Linux system calls, then people would expect UVM to support more and more x86 instructions, as well as more and more system calls. In order for the VM to maintain its strict focus on minimalism and simplicity, it has to be its own thing.

At this stage, UVM is just an interpreter with a tiny set of APIs. It’s still very immature and should be considered hobbyist-grade at best. That being said, one of its strengths is that it’s a small system that’s designed to be easy to understand. It’s important to me that software be approachable. It can run programs written in its own assembler syntax, but unlike other systems, the assembler doesn’t use some weird esoteric syntax; it uses a simple syntax based on NASM/YASM. If you’ve programmed in assembler before, and you understand how a stack machine works, then the syntax of UVM’s assembler should seem fairly intuitive to you.
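To give a feel for how simple an untyped stack machine can be, here’s a tiny interpreter in the same spirit. The opcodes and encoding are made up for illustration, not UVM’s real bytecode: values on the stack are just 64-bit words, and the VM doesn’t track what they represent.

```c
#include <stddef.h>
#include <stdint.h>

// Made-up opcodes for the demo.
enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

// Run a bytecode program until OP_HALT; return the top of the stack.
int64_t run(const int64_t *code) {
    int64_t stack[256];
    int sp = 0;    // stack pointer (next free slot)
    size_t pc = 0; // program counter
    for (;;) {
        switch (code[pc++]) {
        case OP_PUSH: stack[sp++] = code[pc++]; break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp]; break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}
```

A program computing (2 + 3) * 4 is just the flat array `{OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT}`.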

I’m also in the process of building a toy C compiler that targets UVM. It’s currently lacking several features, but it already supports macros and enough C to be able to write something like a simple paint program, a snake game, and a few other simple graphics programs. UVM provides a frame buffer API that’s really easy to use. Just two function calls and you can plot pixels into a window, which makes the system fun to develop for as you can write a simple 2D game in something like 200 lines of C code without any boilerplate.

So here it is, UVM is currently in very early stages, and I don’t expect everyone to understand the purpose of this project, but I would really like to connect with people who share the vision, and find potential collaborators. If you’d like to know more, I recorded a short video demo and wrote a decent amount of documentation on GitHub, including some notes to explain the various technical decisions made in the design of UVM. There’s also automatically-generated documentation for the system calls that UVM exposes to software running on the VM. I’m also happy to answer any questions and to accept pull requests to improve the documentation.

In terms of what’s coming next, I want to improve the C compiler and I’d like to add an audio API. I could use some input on how to best design a simple file IO and networking API. Longer-term, I would also like to design a simple and safe parallelism model for UVM. Probably something based on actors, with or without the possibility of sharing memory, but parallel computation is not my area of expertise. I could honestly use some input. If UVM is something that’s interesting to you, feel free to reach out via GitHub issues and discussions or via twitter.

Typed vs Untyped Virtual Machines

One of the things that’s been on my mind recently is the idea of building a virtual machine for code archival purposes. Something that’s optimized for long-term stability and longevity, with the goal of helping prevent code rot. This is a fundamentally hard problem to solve, because the world changes, and so software changes with it, but at the same time, there’s no fundamental reason why software written 5, 10, 20 or even 50 years ago couldn’t run anymore.

To some extent, you can still run 50 year old software if you have the right emulator, but I think that this is going to become harder and harder to do, because modern software stacks are becoming increasingly complex and typically have a lot of external dependencies. Then there’s the added problem that you also have to worry about the emulator itself suffering from code rot. A friend of mine who’s learning about programming was asking me the other day why it is that iPhone apps need to be updated so often when the functionality of the apps isn’t changing. It’s because the foundation that these apps are sitting on keeps changing, and if said apps aren’t routinely updated, they’ll quickly stop working. Often, in the software world, dependencies are liabilities.

Many kinds of software could be designed to work just fine with very limited kinds of inputs and outputs. If you think about something like a spreadsheet program, a text editor, or many kinds of 2D and 3D games, most of these programs fundamentally only need access to a keyboard, a mouse, the filesystem and a display. All of these things have been available for over 30 years, and although the fundamentals of what they do haven’t really changed, the APIs to access them are changing constantly. Many software programs could be much better protected from code rot if they were written to run on a VM with stable APIs/ABIs and fully statically linked.

In my opinion, to protect software from code rot, we need an approach to software design that’s more intentional, minimalistic and disciplined. That approach to design should start with the VM itself. The VM should provide a stable set of APIs. It’s easier to keep the set of APIs stable if the VM itself is minimalistic and small. Keeping the VM small makes it easier to port to new architectures, which makes keeping software running easier. A small VM is also easier to implement correctly and consistently across platforms.

There are many different ways to design a VM. The design I have in mind is something like a single-threaded bytecode VM. The VM could be register-based or stack-based. I have a certain fondness for stack-based VMs because they’re very simple, easy to target, and stack-based bytecode has the neat side-effect that it provides implicit liveness information (values popped off the stack are dead). At the end of the day, whether a VM is register-based or stack-based isn’t very important because it’s entirely possible to convert between the two.

Another important design axis which is not as trivial and can have much broader implications is whether the VM is typed or untyped. The WASM and Java VMs are typed in the sense that they have a built-in notion of the type of values, functions, modules and objects. In contrast, something like a Commodore 64 emulator would emulate the instruction set of its 6510 CPU, but does so without assigning types to values in registers or in memory, treating values as bits and bytes without really tracking what they represent.

There are advantages to building a type system into a VM. There’s the potential for optimizations, potentially increased safety, and also the ability to more easily decompile and repair broken programs. However, on the flip side, a VM design that incorporates a type system seems inherently more complex and more constrained. Simply put, if you want to enforce rules around typing in your VM, then you have to build typing rules into your design, and that forces you to try to precisely categorize the kinds of computations your VM can perform with a lot more detail. This in turn forces the typed VM design to take on a lot more responsibilities.

In order to track the types of object fields and array elements, the typed VM design has to have a notion of what an object (or struct) is, what an array is, etc. It has to have a pre-established notion of what a function is so that it can assign types to function calls. It also has to have a notion of every kind of control flow structure that is supported so that it can assign types to them. In contrast, an untyped VM design can easily represent any control flow mechanism, be it function calls, exceptions, continuations or coroutines, using simple jump instructions. The untyped VM also doesn’t care about the way objects and arrays are implemented, because it treats memory as a linear array of bytes.

Beyond having the ability to assign types to everything, it seems to me that a typed VM design must inherently take on more responsibilities than an untyped VM design. In order to maintain typing constraints while offering decent performance, the typed VM is forced to take the responsibility of implementing a sophisticated JIT compiler that will understand and enforce all these constraints. This is because the running program can’t be trusted to implement its own JIT compiler as this can’t be proven safe. Furthermore, if the typed VM wants to support garbage collection, it also has to take on the responsibility of implementing its own GC, because once again, the running program can’t be trusted to manage its own memory while respecting typing constraints.

An untyped VM design can be much more minimalistic. It has to enforce a small set of hard constraints, such as making sure that pointer dereferences respect valid address bounds so that the running program can’t crash the host VM, but it doesn’t really need to care about the types of values or enforcing typing constraints. For performance, it can implement a very barebones JIT compiler based on dynamic binary translation, but it doesn’t have to care about optimizing type checks, and it can even allow the running program to manage its own memory and implement its own garbage collector.
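As a sketch of what that small set of hard constraints amounts to, here’s what bounds-checked access to a guest program’s linear memory could look like (the class and method names below are invented for illustration):

```python
# The one guarantee an untyped VM must enforce: guest loads and stores
# stay inside the guest's linear memory. Values are just bytes; the VM
# assigns no types to them.

class TrapError(Exception):
    """Raised when the guest program violates a memory bound."""

class LinearMemory:
    def __init__(self, size):
        self.mem = bytearray(size)

    def _check(self, addr, length):
        if addr < 0 or addr + length > len(self.mem):
            raise TrapError(f"out-of-bounds access at address {addr}")

    def load_u32(self, addr):
        self._check(addr, 4)
        return int.from_bytes(self.mem[addr:addr + 4], "little")

    def store_u32(self, addr, value):
        self._check(addr, 4)
        self.mem[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")

mem = LinearMemory(64)
mem.store_u32(8, 123456)
print(mem.load_u32(8))  # 123456
```

Within those bounds, the guest is free to lay out its own objects, run its own allocator, or implement its own GC; the VM doesn’t need to know or care.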

In summary, I think there’s a strong case to be made that an untyped VM design can easily be much smaller and more minimalistic than a typed VM design. A simpler untyped VM has two major strengths. The first is that it doesn’t place as many restrictions on running programs. Programs can implement their own control flow structures, their own GC, or even their own JIT. The second is that a smaller, simpler VM is much easier to port, reimplement and maintain. If you think about the amount of effort that would be required to build a new Java VM from scratch and make it perform well, you quickly realize that such an undertaking is only possible for a massive corporation. There is still no official RISC-V support by the JVM, and it’s easy to understand why.

Minimalism in Programming Language Design

Four years ago, I wrote a blog post titled Minimalism in Programming, in which I tried to formulate an argument as to why it’s usually a good idea to try to minimize complexity in your programming projects. Today, I want to write about something I’ve been thinking about for a long time, which is the idea that we also ought to take a more intentionally minimalistic philosophy when designing programming languages.

Designing a programming language to be intentionally minimalistic is an idea that’s highly underrated in my opinion. Most modern programming languages adopt much more of a maximalist design approach. Rapidly adding new features is seen as a competitive edge over other programming languages. The general thinking seems to be that if your language doesn’t have feature X, then people will choose to use another language, or that adding more features is an easy way to show progress. This line of thinking is simplistic, and disregards many other key aspects that are necessary for a programming language to succeed and thrive, such as learnability, stability, tool support and performance.

Change and Churn

I’d like to make the argument that intentionally designing a programming language to have fewer features, and to change less rapidly over time, is in itself a powerful feature. When a programming language changes often, it necessarily causes breakage and churn. Tools become out of date, codebases need to be updated, libraries break, and it causes churn on the human side too.

I first started programming in C++ around 1998. I haven’t really touched the language in a few years, and I have to say, I feel kind of lost. So many new features have been added that it’s a different language now. Last year, I wanted to use C++20 modules in a new project, only to find that support in G++ and Clang was so incomplete that modules were just not a viable feature. My general impression at the time was that there aren’t enough people working on C++ compilers to keep said compilers up to date. The language has become so complex, and so many new features have been added, that compiler developers are kind of burned out. It seems to me that slowly but surely, C++ is crumbling under its own weight.

Something that many people forget is that for a language to succeed, there has to be good tool support. If the language and its feature set keep changing, then tools need to be updated constantly. One of the many problems with C++ is that its grammar is very hard to parse. That was already the case back in 1998. If you add on top of that the problem that the grammar changes to become even more complex every year or two, what do you think the impact of that will be? The people maintaining C++ tools are going to want to go do something else with their lives, and so will the users of those tools.

Learnability and the Human Element

More recently, colleagues and I have decided to port a C codebase to Rust. I’m generally pleased with the core feature set of Rust and I feel that in many ways it’s a great improvement over C and C++. However, one of the main weaknesses of Rust, in my opinion, is its high complexity. Both at the syntactic and semantic level, Rust is a very complex language. The syntax can get very verbose, and there’s a lot to know, a lot of rules and unintuitive subtleties about what you can and can’t do where. The learning curve is steep and the cognitive load is high.

Last week, I was pair programming with a colleague when he said “I feel like the Rust compiler is always telling me that I’m too stupid”. That remark surprised me, because I’d had the same thought. Somehow Rust feels unergonomic, and the high complexity of the language surely contributes to that feeling that the language is a bit user-hostile. It breaks your intuition, and it constantly feels like the compiler is telling you that you’re writing code wrong. Two days after my colleague made that remark, I saw a post appear on Hacker News titled Rust: A Critical Retrospective which echoed similar feelings about Rust’s complexity.

In a lot of ways, I feel like designing a language to be minimalistic, to have fewer concepts, and to choose primitives that combine well together, is a good way to make the language easier to learn. If the programming language has fewer concepts, there’s less to learn, and your level of proficiency will increase faster. Code written in a more minimalistic language may also be easier to read. If we think about C++ code, we have a situation where the language has so many redundant features that a typical workplace will mandate that code be written in a subset of C++, with some language features being explicitly banned. That can mean that people writing C++ code at different workplaces will have a hard time reading each other’s code because foreign C++ code will be written in a different dialect.

In some ways, I feel like intentionally minimizing complexity and keeping the feature set small is a way of better respecting programmers. It means we respect that programmers are people with potentially busy lives and many things to do, and that they probably don’t have time to read hundreds of pages of documentation to learn our language. Programming languages are user interfaces, and as such, they should obey the principle of least surprise. Minimizing complexity is also a way to reduce cognitive load and respect human limitations. Human beings are amazingly capable creatures, but we’re also basically just clever monkeys that can talk. We can only keep a few items in our working memory, we can only account for so many design constraints, and we can only focus for so long. A well-designed programming language ought to help us succeed despite our human limitations.

At the end of the day, I think that a language’s complexity and how intuitive it feels is going to affect its ability to attract and retain new users. In my opinion, the focus on reducing friction contributed greatly to Python’s initial success and rapid increase in popularity. I think it’s also fair to say that many people were frustrated when the complexity of the Python ecosystem increased, for example, during the switch from Python 2 to 3, or when the redundant walrus operator was introduced.


So far, I’ve made multiple references to minimalism and I’ve also briefly mentioned the principle of least surprise. I’ve hinted that minimalism means having a smaller feature set and fewer concepts to learn. Minimalism doesn’t just mean a smaller feature set, though. It also means carefully choosing features that combine together seamlessly. If we design a language with a large feature set, there’s a combinatorial explosion in how these different features can interact, which means we’re more likely to end up with situations where some language features interact together poorly.

Imperative programming languages typically make a grammatical distinction between statements and expressions. Functional languages instead tend to be structured so that everything inside a function body is an expression. The latter is more minimalistic, and also imposes fewer constraints on the programmer. Some languages impose a distinction between code that can be run at compile time and code that can be run at program execution time. This distinction often increases the complexity of the language, as there tends to be a duplication of language features and fairly arbitrary restrictions as to what code the compiler is able to run at compilation time.

In terms of minimizing surprise, we want to avoid introducing strange corner cases that only show up in some circumstances. Another important pitfall to avoid is introducing hidden behaviors that the programmer may not expect. An example of this is the equality (==) operator in JavaScript, which performs implicit type coercion, meaning that 1 == "1" evaluates to true (the string is silently converted to a number before the comparison). Because of this undesirable hidden behavior, JS has a separate strict equality operator (===) which doesn’t perform the hidden conversion. This suggests to me that JS should only ever have had a strict equality operator, and that if you want to convert the values you’re comparing before testing equality, you should have to explicitly spell that out.

Implementation Complexity

Language design is hard because the space of possible programming languages is infinite, and so compromises have to be made. It’s difficult to provide hard numbers to quantify what makes one design better than another. Among the things that can be quantified to some degree are the complexity of a language’s implementation, and how well a particular implementation performs.

My PhD thesis involved the implementation of a JIT compiler for JavaScript ES5. As such, I got to become intimately familiar with the semantics of the language and everything that has to go on behind the scenes to make JavaScript code run fast. At times, that was a frustrating experience. I’ve become convinced that a lot of the complexity and the hidden behaviors in JS and in many other languages are essentially bad for everyone.

Unnecessary complexity in a language is bad for those learning the language, because it makes the language less intuitive and harder to learn. It’s bad for the programmers working with the language everyday, because it increases their cognitive load and makes it harder to communicate about code. It’s bad for language implementers and tool maintainers, because it makes their job harder, but at the end of the day, it’s also bad for end users, because it leads to software with more bugs and poorer performance.

To give you an example of unnecessary implementation complexity, many object-oriented languages have this idea, borrowed from Smalltalk, that everything should be an object, including booleans and integer values. At the same time, implementations of these languages have to do a lot of work behind the scenes to represent integers efficiently (as machine integers) while presenting an interface to the user that resembles that of an object. However, the abstraction presented to the user for an integer object is typically not really the same as that of a normal OOP object. It’s a leaky abstraction: redefining integer values makes no sense, integer values have to behave as singletons, and storing properties/attributes on integers is terrible for performance, so it typically isn’t allowed.
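As a sketch of what implementations actually do behind the scenes, many VMs pack small integers directly into a machine word, using a low tag bit to tell them apart from heap pointers. The exact encoding below is illustrative:

```python
# Pointer tagging sketch: a value is one machine word. If the low bit
# is 1, the word encodes a small integer directly; otherwise it would
# be a pointer to a heap object. No object is ever allocated for an
# integer, even though the language pretends integers are objects.

def tag_int(n):
    return (n << 1) | 1      # shift left, set the tag bit

def is_int(word):
    return word & 1 == 1

def untag_int(word):
    return word >> 1         # arithmetic shift recovers the integer

word = tag_int(21)
assert is_int(word)
print(untag_int(word))  # 21
```

The trick works precisely because integers are not allowed to behave like full objects: since there is no heap object, there is nowhere to store user-defined properties.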

Ultimately, integers are not objects in the object-oriented sense. They’re a distinct type of atomic value with a special meaning, and that’s okay. The mistaken idea that “everything should be an object” doesn’t actually simplify anything in practice. We’re lying to ourselves, and in doing so, we actually make the lives of both language implementers and programmers more complicated.

Actionable Advice

This blog post has turned into more of a rant than I expected it to be. It’s easy to critique the status quo, but I’ll also try to conclude with some actionable advice. My first piece of advice for aspiring language designers is that you should start small. Your language is a user interface, and an API which people use to interface with machines. The smaller the API surface, the less you risk introducing accidental complexity and subtle design mistakes.

My second piece of advice is that if you can, you should try to keep your language small. Limiting yourself to a smaller feature set likely means you will want to choose features that don’t overlap and that provide the most expressiveness, the most value to programmers. If you do want to grow your language, do it slowly. Take some time to write code in your language and work through the potential implications of the design changes that you are making.

It’s easy to add new features later on, but if you add new features and people begin using them, it’s going to be hard or even impossible to take these features back, so choose wisely. Remember that you don’t have to please everyone and say yes to every feature request. No language or tool can possibly satisfy every use case, and in my opinion, trying to do so is a mistake.

Lastly, remember that language design is an art. It’s a delicate balance of many different constraints, just like user interface design. Brainfuck is a language that is very small and has very few concepts, but nobody would call it expressive or elegant. Lisp is regarded by many as one of the most beautiful and elegant languages in existence, but my PhD advisor, a Scheme fanatic, had the habit of writing code with single-letter variable names and very few comments. An elegant language doesn’t automatically make for elegant code, but you can encourage good coding practices if you lead by example.

Democracy, Dictatorships and Access to Information

The last two weeks have been a painful reminder that, as comfortable as we may have it in the west, democracy is not the default state of the world. Historically speaking, democracy has been a relatively recent and scarce concept. Even today, two of the largest countries in the world are ruled by authoritarian dictatorships. It’s both scary and sad to be reminded that one person’s ego, when left unchecked, can cause an incalculable amount of suffering, and the deaths of tens of thousands, or even millions.

One clear similarity between authoritarian regimes is that in order to maintain their power, they limit and suppress access to information. They can’t tolerate criticism, debate or dissent. They control the narrative by supplying one message and one storyline, and silencing all other voices by any means necessary. Suppressing the free flow of information has the added effect that even if people disagree with the sanctioned narrative, they may feel completely isolated, and so they are understandably too afraid to act. In a way, it’s almost tautological: centralized control relies on preventing the free flow of information, because allowing multiple voices to be heard automatically takes power away from centralized control.

Recently, I’ve been pleased to see that Signal, the end-to-end encrypted messaging system, has gained almost mainstream popularity. People are beginning to care a little bit more about privacy, and to look for alternatives to services like Facebook. However, I don’t think Signal is enough. The reliance on a phone number to identify users is anti-privacy by definition (though I’ve heard this is going to change). Secondly, while Signal provides an alternative to Facebook Messenger and groups, it doesn’t do much to replace Facebook’s events. Lastly, Signal still seems to rely on centralized servers, which makes the service inherently vulnerable to disruption.

In my humble opinion, we need something more decentralized and privacy-conscious. Something like a cross between Signal and BitTorrent, with a little bit of Tor sprinkled in there. End-to-end encryption is great, but it would be nice if there was a way to implement a messaging service without relying on one centralized server to identify and connect people. BitTorrent solves this problem by having a long list of “trackers”. Maybe this concept can be adapted by having many servers which act as dead drops, used by friends to exchange messages while still remaining anonymous. Possibly, this could also incorporate the peer-to-peer component of BitTorrent.

I’m not a cryptography expert and I’m not the best person to solve this problem, but I tend to believe that something like this should be possible, and that it’s a problem worth working on. It should be possible for each user to create a public and private key pair that lives only on a local device. Then, without needing a centralized account, each user can use their public key as their online identity, without even providing a username. Possibly, two users could become “friends” in a decentralized system by sharing a public/private key pair that only they are aware of. This shared key could be created and shared completely offline if desired, over a USB stick, over bluetooth, by tapping cellphones together in a coffee shop, or even using a QR code.

A shared key pair can be used to drop messages on a server or on a peer-to-peer basis without having the users identify themselves or the intended audience when sending the messages. Alice can encrypt a message for Bob, but the encrypted message contains no visible information saying it comes from Alice, or that it’s destined for Bob. Alice and Bob can download many messages from a server without letting the server know which messages they are looking for. If you think of this in a peer-to-peer context like that of BitTorrent, I think it paints an interesting picture. Alice can download many messages from peers and also upload many messages to other peers. Most of these messages are encrypted and she has no access to them. At some point, she writes a message for Bob and encrypts it using their secret key. She injects her own message in the stream and it starts to propagate among the network. Alice doesn’t tell anyone that this new message is her own, but Bob will be able to identify it when he sees it.
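The recognizability part of this scheme can be sketched with nothing but the Python standard library. This is a toy illustration, not a real cryptographic protocol (a real system would encrypt payloads with a proper authenticated encryption scheme), but it shows how a tag computed from a shared key lets Bob pick his messages out of an otherwise opaque stream:

```python
import hashlib
import hmac
import os

def drop(key, payload):
    # Prefix the payload with an HMAC tag under the shared key. In a
    # real system the payload itself would also be encrypted.
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload

def try_pickup(key, blob):
    # Recompute the tag; a match means this message was written by
    # someone holding our shared key. Nothing in the blob names a
    # sender or a recipient.
    tag, payload = blob[:32], blob[32:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    return payload if hmac.compare_digest(tag, expected) else None

shared_key = os.urandom(32)          # agreed on offline by Alice and Bob
stream = [
    os.urandom(64),                  # other people's traffic: opaque blobs
    drop(shared_key, b"meet at noon"),
    os.urandom(64),
]

found = []
for blob in stream:
    msg = try_pickup(shared_key, blob)
    if msg is not None:
        found.append(msg)
print(found)  # [b'meet at noon']
```

To anyone without the key, every blob in the stream looks the same, which is exactly the property described above.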

In terms of community organization, it may be possible to share details about an event, or to create something equivalent to a Twitter hashtag, by broadly sharing a decryption key using a QR code. This would be the equivalent of telling many users on the network to follow your channel/feed.

The system I’ve loosely described above has flaws and could certainly be iterated on, but the key point is, something like that should be technologically possible, and it’s a problem worth solving. Access to information, freedom of expression and safe channels of communication are essential if we want to protect our democracies and the well-being of our species in general.

NoiseCraft Project Update

Two months ago, I wrote about NoiseCraft, the browser-based visual programming language slash modular synth that I’ve been working on in my spare time. This is very much a passion project, but what I’m hoping to achieve with it is to build a small online community for the exchange of synth patches and musical ideas, and to learn a lot more about sound synthesis along the way.

Since the announcement, the project went on to get over 520 stars on GitHub, the app got over 66,000 hits, and 155 new users registered accounts. The first thing I noticed is that the number of people who created accounts is much larger than the number of people who shared projects through the app. Maybe that’s just because people are too shy to share what they’ve been working on. Maybe it’s because I gave people the option to save projects to local files. Regardless, I’m going to keep that option because I think people like being able to own their own project files, to own their data so to speak.

The project made it to Hacker News and I would say the reception was generally good. I was very happy to see all the cool projects that were shared, but I was a bit disappointed that the project attracted only a few open source contributions. Multiple people complained that the user interface didn’t behave like their favorite graph editor and suggested I should go try out <paid-commercial-product-i’ve-never-heard-of> for inspiration, but few people volunteered to pitch in and help. Some of the criticism was definitely valid and helpful, but I have to say, it’s kind of hard running an open source project and, after months of work, receiving mostly negative feedback. A lot of people are eager to criticize, but few people take the time to give positive feedback when things work well.

In the last two months, I’ve made various incremental improvements to the app. I improved the help page, fixed a number of bugs, made ergonomic improvements to the UI, added a featured section on the browse page, implemented some new node types, and most importantly, got NoiseCraft to work on Firefox. The situation is unfortunately not as good on Safari. With some help from open source contributors, we found that there were at least 3 problems preventing NoiseCraft from working well there. Among other things, Safari’s pointer capture API is broken. This API is several years old at this point and very commonly used, which makes it hard to understand why these obvious problems haven’t been fixed. Safari doesn’t have a public bug tracker that I’m aware of, which makes it hard to even report problems like these.

In terms of upcoming new features, I’ve been making incremental progress towards adding the ability to group multiple nodes into user-created modules. I think that will open up many interesting possibilities. Among other things, I’d love to be able to easily create something like a drum machine in NoiseCraft, with modules corresponding to different sounds or sound effects, which can easily be reused in new projects.

When it comes to attracting new users, I haven’t been aggressively promoting NoiseCraft because there’s always the sense that more could be done to make the app better, and there’s a limit to how much free time I can spend on the project. However, I’ve been thinking that I might, in the next couple of months, create a small beat making contest with a small prize for the best entry.

If you think NoiseCraft is cool and would like to contribute, there’s a number of issues on GitHub marked with the help wanted tag. I’m also very curious to see what you can create with the app :)

Code that Doesn’t Rot

One of the most frustrating experiences, both as an end user and as a programmer, is to try to run a piece of software that used to work perfectly not that long ago, only to find out that it’s now broken, the software won’t run anymore, and it’s not clear why. The software didn’t change, but something broke, seemingly for no reason. This is often due to the phenomenon known as “code rot”, or “bit rot”.

I used to work in a deep learning research lab, and I’ve seen cases where researchers had published code just six months earlier, but one or more of the dependencies they relied on had since made breaking changes. That put us in the unfortunate position of having to troubleshoot someone else’s code. Sometimes you were lucky, and the problem was just that one of the Python packages their code needed had made breaking changes, and the problem could be fixed by simply editing the project manifest. Sometimes people imported packages which they didn’t really use, and we could completely remove the dependency. Other times we had to resolve conflicts between the multiple Python package managers that could be installed on a given system (pip2, pip3, Conda, and Ubuntu’s apt).

Editing a list of dependencies or wrangling with package managers doesn’t sound too bad, but PyTorch, the deep learning framework which was key to a lot of this work, had this irritating feature that each release needed a specific version of the Nvidia GPU driver to run. Installing an older GPU driver required root access, which we often didn’t have when working on remote compute clusters. Even if you were doing this on a local machine where you did have root access, the process of installing a new GPU driver was quite slow and tedious. Thanks to Nvidia’s unfriendly driver installers, this process couldn’t be automated, and once you were done, you might succeed in getting a specific release of PyTorch to work, but your other projects couldn’t run anymore, because they needed the latest release instead.

Researchers are encouraged to publish their code for the sake of replicability, but there’s not much point if nobody else is able to run said code a few months down the line. As such, we started to encourage those wishing to publish their code to rely on containerization using either Docker or Singularity. That fixed issues such as broken Python packages, incompatible versions of PyTorch or TensorFlow and missing libraries on the host system. However, there was another problem, which is that much deep learning code can’t really run fast enough on a CPU alone to be usable. Most of the code we saw required GPU acceleration. The solution to enable that was to use nvidia-docker, a wrapper around Docker which gives containerized code access to the host’s Nvidia GPU drivers. This, however, reintroduced the problem that the code running inside the container needed a specific version of the GPU drivers to be installed on the host machine to run correctly. Nvidia’s solution to give people access to GPUs inside a Docker container was to breach the container and expose details of the host system in the process.

How much time do we collectively waste, every year, fixing bugs due to broken dependencies? How many millions of hours of productivity are lost every single day? How much time do we spend rewriting software that worked just fine before it was broken? In my opinion, code rot is a problem that we ought to try to fix, or at least alleviate. Because of fundamental reasons I’ll discuss later in this post, code rot probably can’t ever be fully eliminated, but I think the situation could be made a lot better through more principled and forward-thinking software engineering. At the very least, the situation would be improved if we acknowledged that code rot is a problem, and understood what causes it to happen in the first place.

How can we avoid code breakage? Linus Torvalds seems to think that in general, when compiling software, it’s better to statically link libraries if you can, because very few shared libraries are actually version safe, and with dynamic linking, you’re always adding complexity and exposing yourself to the risk that the system your software is going to get installed on won’t have a compatible version of the libraries you need.

In my opinion, for software that relies on a package manager, it’s better to fix package version numbers if possible, that is, to specify directly in the manifest of your package which version of each dependency to use. The reason for this is that you unfortunately can’t trust newer versions of your dependencies not to make breaking changes, and often, one broken dependency is all it takes to render your software broken. In some cases, programmers may avoid specifying fixed version numbers because package managers such as Python’s pip don’t support having multiple versions of a given package installed at the same time, which could mean that the version of a package you request clashes with other software already installed on a given system. This is a shortcoming that needs to be addressed by package managers if we want to build reliable software.
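Concretely, in a pip requirements manifest this means writing exact pins rather than open-ended ranges (the packages and versions below are just examples):

```text
# requirements.txt with exact pins: a future breaking release
# cannot silently replace the versions you tested against.
numpy==1.24.4
requests==2.31.0

# The loose form, by contrast, lets any future release slip in:
# numpy>=1.24
```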

There’s another problem though. Part of Python’s appeal is that it makes it easy to link with C code through its Foreign Function Interface (FFI). This helped make Python popular, because anyone can write a package that interfaces with commonly available C libraries and benefits from the strength of the C ecosystem. However, this convenience comes at a cost. The FFI is essentially a trap door through which software gets access to external dependencies that can’t be controlled by the package manager, which vastly increases the risk of code breakage. Third party package managers such as Conda try to address this by managing the installation of external libraries as well as Python code, but this can potentially introduce conflicts with Python packages installed through other means.
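To illustrate the trap door, here’s what the FFI boils down to using Python’s built-in ctypes module. This sketch assumes a Unix-like system where the C math library can be located; the point is that the bound library sits entirely outside the package manager’s control:

```python
import ctypes
import ctypes.util

# Bind to the system's C math library: a dependency that no Python
# package manager knows about, installs, or version-controls.
libm = ctypes.CDLL(ctypes.util.find_library("m"))
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

print(libm.sqrt(2.0))  # 1.4142135623730951
```

If the host system ships a different or incompatible library, nothing in the Python package metadata will catch it; the failure only shows up at run time.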

In my opinion, the most practical way to manage these problems in the real world is to take a conservative and minimalistic approach to software design. Purposefully minimize dependencies where possible. Don’t add new dependencies unless the value added is really worth the added cost of potential code breakage down the line. Avoid external dependencies as much as it’s reasonable to do so, and when you do choose to rely on external packages and libraries, pick libraries that have a long history of being stable, well-maintained, backwards compatible and easy to install. Remember that it can take just one broken dependency for your software to fail to run on a user’s system, and if your software breaks, they might not even tell you it’s broken.

Good software engineering practices can go a long way towards minimizing the risk of code rot, but I think it’s also valuable to ask ourselves what is the cause of code rot in the first place. Could we possibly build software that doesn’t rot? One interesting observation is that such software does exist, in a sense. People still write software for retro gaming platforms such as the Super NES and the Atari 2600. These platforms are essentially frozen in time, with a fixed set of devices and limited I/O capabilities. The fixedness of the platform, its simplicity, and the impossibility of relying on external software packages means that any software you write for it is unlikely to be broken by changes in the platform itself.

The root cause of code rot is change. The world itself is changing, and so is software. As such, the only way to be completely protected from code rot is to target a platform that never changes. Not just the platform itself, but every interface the platform has to the outside: every device, file format and networking protocol. We can’t stop the world from changing, but we can try to build software on more stable foundations. Like San Francisco’s Millennium Tower, modern software is built on soft soil that keeps shifting under our feet, but it doesn’t have to be that way.

Although the world does change, many elements of the computing world remain fairly stable. Computer keyboards have been around since the 1950s. Color displays have been around since the 1980s. Touch devices have been commonplace for over a decade. IPv4 was introduced in 1981 and IPv6 in 1995. If I’m writing a program that only needs to read input from a keyboard and pointer device, and display pixels on a screen, there’s no real reason why that program ever needs to break. The APIs for getting input from a keyboard and rendering frames to a display can be extremely simple. The world will keep changing and new I/O devices will be invented, but even 200 years from now, the concept of a keyboard and a color display should be easy to understand and interface with.

Recently, I’ve been thinking about the design of virtual machines. If we want to create software that doesn’t break, maybe what we need is some kind of executable code archival format. A minimalistic virtual machine with a small set of I/O devices that are interfaced through a small API surface, a small RISC instruction set, and an overall design that’s generally kept as simple and stable as possible. Something like the modern equivalent of a Commodore 64, with a high-resolution color display and the performance of a modern machine. I don’t think something like that would be applicable to every use case, but I surmise that much of the software we use only really needs to interface with the outside world in fairly simple ways. For example, it needs to get input from the user in the form of mouse clicks or touch events, it needs to draw pixels to display a user interface, and it might also need to read and write files, and maybe access the network. Such software doesn’t fundamentally need to link to any external libraries. Everything can be statically linked; it just needs simple, stable interfaces to access the outside world.
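To illustrate how small such a VM's core can be, here is a toy interpreter for a handful of RISC-like instructions operating on four registers. The instruction set and encoding are invented for illustration; a real design would use a fixed binary encoding, but the point is that the entire semantics fits in a page and could plausibly stay frozen forever.

```python
def run(program):
    """Execute a list of instruction tuples on a tiny register machine."""
    regs = [0] * 4   # four general-purpose registers, r0..r3
    pc = 0           # program counter: index into the instruction list
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "li":      # li rd, value   -- load immediate
            regs[args[0]] = args[1]
        elif op == "add":   # add rd, rs1, rs2
            regs[args[0]] = regs[args[1]] + regs[args[2]]
        elif op == "jnz":   # jnz rs, addr   -- jump if register non-zero
            if regs[args[0]] != 0:
                pc = args[1]
        elif op == "halt":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
    return regs

# A trivial program: r2 = 2 + 3.
prog = [("li", 0, 2), ("li", 1, 3), ("add", 2, 0, 1), ("halt",)]
print(run(prog))  # [2, 3, 5, 0]
```

The stability argument is that an interpreter this simple can be re-implemented from its specification on any future platform, so programs targeting it never rot even if the host machine changes completely.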

The idea of a Virtual Machine (VM) isn’t new. The Java VM tried to achieve this but largely failed. Sun Microsystems coined the famous “write once, run anywhere” slogan, which people began to jokingly mock as “write once, debug everywhere”. In my opinion, the area where most VM designers go wrong is that they tend to expose too many APIs, and each of them has an API surface that is much too large. When an API has a large surface, it’s very easy for subtle bugs and corner cases to creep in. You almost inevitably end up with different implementations of the API behaving in different ways. The Web Audio and Canvas HTML APIs are good examples of this. APIs with a large surface are created because VM designers believe this is more convenient for programmers, and will yield better performance. However, this convenience has a cost, in that it makes code breakage much more likely. Fundamentally, to output audio, one should be able to output a simple list of floating-point samples, and to draw pixels, one should be able to output a grid of pixels. It doesn’t have to be much more complex than that, and if the APIs were kept simpler, they would be much less likely to break.
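The "small API surface" idea from the paragraph above can be sketched concretely: audio output reduced to a flat list of floating-point samples, and video output reduced to a grid of RGB pixels. Below, both are serialized to simple, long-stable file formats (WAV and binary PPM) using only the Python standard library; this is an illustration of the principle, not a proposal for an actual VM interface.

```python
import math
import struct
import wave

def write_wav(path, samples, rate=44100):
    """Write mono float samples in [-1.0, 1.0] as a 16-bit WAV file."""
    with wave.open(path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)        # 16-bit signed samples
        w.setframerate(rate)
        frames = b"".join(
            struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767))
            for s in samples
        )
        w.writeframes(frames)

def write_ppm(path, pixels):
    """Write a grid (list of rows) of (r, g, b) tuples as a binary PPM image."""
    height, width = len(pixels), len(pixels[0])
    with open(path, "wb") as f:
        f.write(b"P6\n%d %d\n255\n" % (width, height))
        for row in pixels:
            for r, g, b in row:
                f.write(bytes((r, g, b)))

# One second of a 440 Hz sine tone, and a 2x2 test image.
tone = [0.5 * math.sin(2 * math.pi * 440 * t / 44100) for t in range(44100)]
write_wav("tone.wav", tone)
write_ppm("image.ppm", [[(255, 0, 0), (0, 255, 0)],
                        [(0, 0, 255), (255, 255, 255)]])
```

Notice that neither function exposes codecs, device enumeration, mixing graphs or any of the other machinery large audio and graphics APIs accumulate; two implementations of an interface this narrow are far less likely to diverge in behavior.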

In order to minimize the risk of code breakage, API boundaries are not the only thing one has to think about. If your software interfaces with the outside world, you also need to think about file formats and networking protocols. In this area, choosing mature, stable, open standards is generally preferable. I don’t know if my idea of a code archival format, or a virtual machine designed for maximum API stability will see the light of day. For the moment, it remains a thought experiment, but in the meantime, I encourage every developer to think about purposefully minimizing dependencies, about designing smaller, more stable APIs and about packaging their software in a way that will maximize its useful shelf life. Engineering more robust and durable software is one way in which you can improve the lives of a large number of people in small ways and help reduce electronic waste.

The Internet Changed My Life

I’ve seen multiple discussions online as to the negative effects of the internet on society. There’s definitely harmful content online. It makes me sad to see the internet being used as a tool to spread anger and hate, and to further the political divide, but today I’m going to share a personal story about how, in the late 90s and early 2000s, the internet changed my life.

When I was a kid, my mother worked as a journalist. She would often bring me to book launches and events of the sort. I distinctly remember being offered Perrier water to drink and hating it (how could anyone drink this?), and being bored out of my mind. Book launches were one of the worst places you could possibly bring a kid, but she often didn’t have a choice, being a single mom with no father in the picture. She was well connected and had a wide circle of friends. Her income was modest but we were doing alright. We lived in a fairly roomy two bedroom apartment in a co-op smack in the middle of downtown Montreal.

Unfortunately, things took a turn for the worse. My mother started to complain about the neighbors being too loud. Just a little at first, but eventually, it became quite obsessive. My mother was complaining but I never heard any noise. At first, I couldn’t understand what was going on or why she was so upset. It turns out these were just early symptoms of the development of her mental illness. Over the course of two painful years, she lost it all: the job, the connections, the friends, the apartment, the little savings that she had. Even her own sister decided to cut contact with her.

Fast forward to the start of high school, I was 12 years old, and my mother was working as a cook. Unfortunately, even though she was way overqualified for that job, she lost it too, and we ended up on welfare, living in a much smaller apartment with no windows in the living room, and a black mold problem. On a social level, things weren’t going too well for me either. The other kids at school would pick on me and I’d often get into physical fights. I got suspended twice and was nearly kicked out of school. I can’t say for sure why I got into trouble so much more than my peers. Part of it was probably just that I was a nerdy kid, and teenagers are assholes, but another part of the problem is likely the belief system I grew up with. My only parent would repeatedly tell me that the world was full of bad people who are out to get you and can never be trusted. Being raised with that kind of belief system doesn’t exactly help you make friends.

Sometimes, I’d get home from school and my mother seemed to be doing alright that day. I’d settle down, sit at my desk and get started on homework, but then I’d suddenly jump, surprised by a loud shriek. My mother would suddenly become angry, and loudly shout back insults at the voices in her head. She was subject to extreme, unpredictable mood swings. One moment she’d be kind, the next she’d be angry. I tried to explain how disruptive and painful this was for me, but no amount of explaining seemed to help. I couldn’t find peace anywhere. Not at home, not at school, sometimes not even in my sleep. I felt truly alone.

After my first year of high school, the summer came. I had few friends, and the friends I did have were much wealthier than me. I didn’t have an allowance so I couldn’t ever go with them to shop, or to the movie theater or even to eat at a burger joint. I had to wear clothes purchased at the Salvation Army, which mostly looked OK but which other kids occasionally made fun of. I felt like living in poverty contributed further to my isolation. I spent most of that summer alone. I’d get out of bed and just lie on the couch, feeling bored out of my mind, with no energy to do anything. My mom became worried about how apathetic I’d become and took me to see a doctor. We did some blood tests, and everything came back normal. Looking back on it, I think what I was experiencing was a major depressive episode. I was still just a kid, and I had hit rock bottom.

I was very interested in computers, but our aging 386 PC had just died, which contributed to my feelings of despair. We were poor, but as tortured and dysfunctional as she had become, my mother still deeply cared about me and always did the best that she could to be a good parent. She knew I loved computers, and she knew they were useful for school work, so she took some of the little money that was left in her retirement account and bought us a brand new Pentium computer. We couldn’t afford any software for it, but that was a solvable problem.

Around the same time, my best friend got internet access at home through AOL. He was nice enough to share his access information with me and I started logging in through his account. I was instantly hooked. There was so much content, so much to read, chat rooms with so many people to talk to. Soon enough, I got an angry phone call from my friend. I’d used up his 100 hours of monthly internet access and his access was cut out until the next billing period. Oops.

I started doing the leg work of convincing my mom that we should get our own unlimited internet access subscription. The cost was 28 dollars a month, which, out of the $800-900 welfare cheque she was getting, was a lot of money. I told her this would be very useful for school, we’d have access to so much information, and I could play video games online, I’d finally have something to do. It took a lot of convincing, but I think she saw how passionate I was about the whole thing, and she eventually accepted.

In 1998, I got internet access at home, and I feel like this was a genuine turning point in my life. From that point on, my life started to gradually improve. It wasn’t all uphill, there were lots of ups and downs, but I was never bored again. There was always something to read, something to learn. I could play video games online and quickly started making online friends. My feelings of loneliness were alleviated because I always had people to talk to. English was my second language, but I became almost fluent very quickly. As silly as it might sound, through online chats and by making friends online, I also started to develop some much needed social skills and a better idea of what normal, healthy human interactions could look like.

I’m not sure how old I was exactly, but not that long after I got internet access, I decided to do some online searches about mental illness. I found a webpage that described the symptoms my mom had. She was a textbook case of paranoid schizophrenia. She matched the description perfectly. I took it upon myself to have a conversation with her and try to explain, as gently as I could, that she needed to go see a psychiatrist to get some help, for both our sakes. Unfortunately, that conversation went about as poorly as you can imagine. She got extremely angry, screamed at me, and locked herself in her bedroom. In her world, it wasn’t her that was crazy, it was everyone else.

I was at my wits’ end and I thought about reporting myself to child protection services, but a few days later, on the evening news, I heard a story about children in foster homes being molested for years and living in horribly abusive conditions. I made the cold calculation that as painful as my life was, it was probably better than rolling the dice with the child protection services. I had food, shelter, clothing, access to a decent education, and most importantly, internet access. Devil you know, devil you don’t. I realized that the only way forward was stoicism and hard work. I’d need to succeed in life so that I could be independent.

Through the internet, I learned various computer maintenance skills, eventually buying new hardware and upgrading our home computer. I started to learn about programming. I connected with a guy who went by the nickname SteveR, a tech professional who became my friend slash mentor, and answered many of the questions I had about C++ programming and video game development. My passion for computers, technology, and all the things I could learn about and people I could meet online are a big part of what kept me going. I always had something positive to focus on and fill my time with.

Fast forward a few years, around the time I was 15, and I was running a side-hustle of sorts. I’d learned enough IT skills that I was starting to become a competent computer technician. My friend’s parents were paying me $20 an hour to do tasks such as hardware upgrades, installing newer versions of Windows, backups, installing wifi routers and troubleshooting various problems. I didn’t have a car, so they’d either come pick me up or bring their computers to our apartment. I often got to keep the spare parts after computer upgrades, which I’d either use to upgrade my own machine or go trade at the nearby computer store. I bartered my 14” monitor and a graphics card for a 17” monitor. I don’t know how realistic it would be to do that today, you certainly couldn’t barter computer hardware at Best Buy, but I think the store owner had a soft spot for me, he respected the hustle.

As soon as I reached age 16 and was legally old enough to be employed, I decided to look for a part-time job. My mom suggested that I should drop off my CV at the local computer store. I thought that was a bit silly (who would hire a 16 year old for this?), and I felt even more silly when, after dropping off my CV, the owner told me he wasn’t looking to hire anyone. However, a few weeks later, the next time I came by hoping to barter some parts, the owner said that he was now looking to hire someone, and I could have the job if I wanted it. I didn’t have an allowance, but it didn’t matter anymore. I earned my own money, and with that came a little bit of freedom and hope that I could build myself a better future.

We were never able to afford cable TV at home, but eventually, my mom grew tired of the phone line being constantly in use, which gave me good ammunition to argue that we should get high speed internet. We eventually got DSL and with that, I was able to download movies and TV shows. This gave me access to more entertainment, but also helped me become even more fluent in English, which I knew would be important for a career in technology.

As I progressed through high school, my mental state improved, but I still felt very lonely. What I lacked in terms of real-world interaction, I tried to make up for with online friendships. I spent some time hanging out on various IRC channels. One of the channels I hung out in was simply called #montreal. It was mostly an endless torrent of stupid jokes and shitposting, but one night, I noticed something strange. Among all the stupid comments, one message stood out. A woman had written “I’m about to kill myself and I’d like to talk to someone before I go, message me”. The other IRC users ignored her, and she repeated her message one more time. I messaged her. She explained how she felt lonely, alone and unloved. I told her that I very much could relate, and that whatever she wanted to say, I was there to listen. I tried to say nice things to her, to explain that things were probably not as hopeless as she thought, but it was no use. She said she had just swallowed a bunch of pills, and she quickly logged off.

It was a distress call, but she didn’t really want to hang around and talk. I didn’t have any information about her, not her name or her phone number or address, but when she had logged into the chat, the IRC server displayed her IP address. I felt very awkward and was afraid of not being taken seriously, but I dialled 911 and explained the situation to them. I gave them her IP address (which I had them repeat back to me), the name of her internet service provider and the time when she was logged on. I told them that if they called the ISP and they gave them the IP address and the time, the ISP would know her home address. The woman on the phone said that they would take it from there.

The next day, I had the TV on in the background and the evening news program was just starting. At the start of the bulletin, they gave a quick outline of the stories they were going to cover. Among those stories, the news anchor read something along the lines of “an internet user saves the life of a young woman in distress”. Then they cut for an ad break. I was very excited to hear the actual news story, but suddenly, my mother called out “dinner is ready!” and insisted that I come and sit down to eat. I got distracted, and I never did hear the full news story (ha!). Where is that woman now? How is she doing? Is she still alive? I’ll never know, but in that moment, I was able to be present, and to do something to help somebody else, and I felt proud of that. It gave me hope that I could make a positive difference in the world.

In high school, I was a B+ student at best. I was never particularly motivated, and most probably too (di)stressed to thrive, but by the time I made it to university, I’d been programming in C++ for three years and had a huge head start on everybody. I had gotten accepted into a computer science program at a local university, and I decided that since computers were my turf, I was going to show everyone what I could do by getting the best grades. I was going to beat everyone without getting into a fight. I completed my undergraduate degree with a 3.97/4.00 GPA. Out of 30 courses, I received 27 As and 3 A-minus grades.

I don’t want this to read like a story about how I overcame every obstacle alone and pulled myself up by my bootstraps with no outside help. I struggled a lot along the way, but the reality is that as challenging as my life situation was, as lonely and misunderstood as I felt at times, there was luck in my misfortune, and I did receive help. My mom was mentally ill, but despite this, she didn’t suffer from alcoholism or any other addiction. She was always able to cook, pay the bills, and perform the most basic functions a parent needs to do. The situation in my home was often very tense, but there was never physical violence. My mother, being university educated herself, cared about my education and genuinely wanted me to succeed. She invited me to keep living with her during my university studies to save money. I sure wanted to get the hell out of there, but it made financial sense to stay a little bit longer.

Right around the time that I was starting university, after a few years on a waiting list, we got access to a subsidized apartment with more sunlight and no mold. This apartment was a 30-minute walk away from the university which allowed me to get some exercise every day. Thanks to Canada’s low tuition costs, I was able to earn enough from summer jobs to pay for tuition and not have to work during the school year, which allowed me to better concentrate on my studies. The classmates I had who were forced to work during the school year understandably struggled with the heavy computer science curriculum.

On the internet, I was able to access resources about psychology and how to cope with trauma, which I also found helpful. YouTube became available in 2006, and through YouTube, I’ve watched many lectures from leading psychologists about depression, PTSD, meditation and many other interesting topics. I opened up about my suffering and received support and valuable advice from friends I’d met online. I’m not going to pretend that being your own DIY therapist is the key to better mental health. I was lucky, through my university, to get access to professional therapists at discounted rates, which helped me begin my own healing process.

This is part of the story of how the internet changed my life for the better. I’m an early millennial and I was raised online. Through the internet, I found friends, support, and the human connection that I was lacking in real life. I also found valuable information that helped me help myself and sometimes help others. The key with information is always to effectively filter the good from the bad, which is a genuine life skill unto itself. My life today isn’t perfect, but it’s better than it’s ever been. My message to all the people out there who are struggling is to believe in yourself. If you help yourself and you let others help you, things are never hopeless.

What Could You Use a Tesla Bot For?

Like many of you I’m sure, I watched the Tesla Bot announcement and felt that this presentation was premature. They didn’t even have a plastic mockup of the robot to show, just some guy dancing in a costume. I’m very excited about the potential of general purpose robotics, but it seems like Elon Musk is once again underestimating the challenges in this problem space. Tesla’s Full Self Driving project is barely getting to the point where it might become useful, and now they’re going to attack a problem 100 times harder. Well, okay then.

That being said, as an investor in Tesla, I’m still excited about the project. Tesla is profitable and has $16 billion in cash on hand at this point, which means they can throw more money at the problem than any individual company or research group ever has. The company also has their own lightweight, low power neural network accelerator, and they’ve successfully designed high efficiency motors and battery systems. Their extensive manufacturing expertise also means that, if they were to mass produce such a thing, they really could bring the cost down.

My own belief is that the only way to ever build general purpose robotics is to have engineers working on the problem. It’s one thing to have academics working on toy problems in simulation, but real-world robotics is where the rubber meets the road. In my opinion, there is great power in incremental improvements. The way to bring forth general purpose robotics is to build robots that can perform some basic tasks in a narrow real-world domain, and then gradually extend their abilities and circle of competence. A lot can be achieved through trial and error, and iterating on a design to keep improving it.

Most engineers would probably tell you that building the Tesla bot, the physical component of it, is not the hard part. We know how to build the hardware, or at least, we can figure it out fairly quickly. The hard part is building the software that drives it. A general purpose humanoid robot will have to have a detailed understanding of the real world around it, and be able to handle many more special cases than a self driving car. Even the “simple” problem of teaching a robot how to reach for and grasp a teacup without spilling the tea or breaking the cup is a hard problem. Brewing the tea is even harder. Doing that in a house you’ve never visited before is harder than rocket science. So where does that leave us? Where do you start? What could you do with a Tesla bot?

In my opinion, there are simpler, easier, better defined problems to attack before you try and put a humanoid robot in a home or office environment. It so happens that Tesla is looking at building an automated network of so-called robotaxis, the self-driving equivalent of Uber and Lyft. In order to build something like that, you’ll need a workforce to perform tasks like routinely cleaning both the inside and the outside of the self-driving cars. This is a simple, repetitive task that requires only limited communication, which makes it an ideal real-world use case for the first humanoid robots.

Tesla could build a Tesla car wash which would gradually expand into an automated robotaxi service station. The problem domain of cleaning the inside and outside of Tesla’s robotaxis is interesting because it’s challenging but also fairly repetitive and predictable. You can perform the task with a limited number of basic tools, and it’s unlikely to put anyone’s life in danger. The environment in which the task is performed can be somewhat controlled and the tools can be standardized, but there’s also going to be an interesting amount of diversity in the scenarios encountered. Lastly, it should be fairly realistic to have humans provide a decent amount of demonstration data by remotely piloting the robot or through video footage.

This thought experiment has me optimistic that even though fully general purpose robotics could take another 20 years or more to materialize, it’s not impossible to think that something like a Tesla bot could start to become useful in less than 5 years. It’s a narrow problem domain, but it’s easy to think that you could start by having the robot wash the outside of the car, and then expand to having it clean the inside. You could start with just one model of car and then expand to more. You could gradually add new features such as inspecting cars for damage, charging cars, and eventually changing tires or even performing more advanced maintenance. You could even laterally expand to other similar domains such as cleaning Tesla’s cafeterias or factory floors.

This is a hard problem domain, but it’s not impossible to find a starting point that we can realistically tackle, and once the robots start to become useful, a virtuous cycle can begin. As Tesla begins to mass produce and iterate on the robot’s design, the cost can come down. This can allow academic partners and various startups to buy them at affordable prices and begin to do research of their own. The sophistication of the robot’s understanding of the world can improve, and its circle of competence can expand. The most important part is to get the ball rolling.