
Minimalism in Programming

I’m 32, and I’ve been programming actively for over 16 years at this point. I don’t have a lifetime of experience doing this, but over the years, I’ve come to develop a certain style, a kind of philosophy or methodology that I try to apply in everything I do. Crucially, I would say that I’m a minimalist. I like to build things that are only as complex as they need to be to accomplish their purpose. I like to distill ideas to their simplest form.

Much of what I will discuss in this post may seem like common sense to many of you. I’m probably not the first one to tell you about the principles of KISS and YAGNI. Unfortunately, I think that the art of building simple, robust software is something that is rarely taught in universities, poorly understood, and often disregarded. We live in a world full of bug-ridden, poorly written software, with a thousand useless bells and whistles. I don’t believe it has to be this way. In my view, many of the bugs we encounter could be avoided if more programmers followed some basic principles to minimize complexity.

Back when I was a teenager, in the early 2000s, one of the first programming projects I embarked on was an ambitious 3D game. I recruited several people to work on this project with me. We produced a lot of art, and we did implement a game engine and even a map editor. We had some nice screenshots to show. Unfortunately, no game ever came out of it. One of the main issues was a lack of focus on my part. I wanted to build something more awesome than Unreal, Quake 3 and Half-Life, and I thought I needed killer tech to do this, but I didn’t really have a specific game in mind. I had no specific goal, and so no concrete plan. I would guide the project in whatever direction seemed most interesting at the moment. Every two weeks, I’d start on some new feature for the game engine, but never quite finish it. We most likely could have built a game if I’d been willing to aim at a simpler, more realistic objective.

These days, before I even start on a new project, I try to spend some time doing some research to convince myself that this project is worth doing, that I have the time to do it, and that I can set realistic goals. I try to start small. I ask myself what is the smallest, simplest version of my idea that I could implement, with the least amount of features, and I try to plan out the steps I will need to complete to get to that. Simply put, the first step, in my view, is to clearly outline what the Minimum Viable Product (MVP) is going to be. Defining an MVP helps me stay focused, and it also ensures that I have a goal simple enough that I can be sure I’ll stay motivated long enough to get there.

Many people make the mistake of thinking that if they don’t immediately account for all the features they could possibly want to add to a project from the beginning, they might paint themselves into a corner, unable to refactor the code, unable to bring the project where they ultimately want it to be. My counter-argument would be that refactorings are inevitable. You will make design choices that turn out to be wrong. You will need to change your code. You simply can’t account for every possibility and every single interaction from the beginning, because there are too many unknowns. If you start with an MVP, you will gain a lot of insight in the process. You will also have a working product that is very simple, and so very easy to refactor.

Trying to build simple products will help you keep people motivated, gain insights, and ultimately reach your goals. It might also help you avoid bugs. Less code means fewer corner cases, fewer things to test, less that can break, and less to debug. This is something that good engineers understand very well. When you minimize the number of moving parts, when you minimize complexity, you minimize the chances that your product, and your project, will fail.

I think most of what I’ve said so far is commonly accepted wisdom among seasoned programmers. What I’d like to bring up next is that minimizing the complexity of your software, eliminating possible points of failure, is not just about simplifying the code you write. In my opinion, it also applies to the things your software stands on. The software you build necessarily makes a number of assumptions, and has external dependencies. Most programmers, it seems to me, follow the “don’t reinvent the wheel” philosophy. If something has already been implemented, you should just use it, never implement your own. This is seen as a way to minimize the complexity of your software. The problem is that not all external dependencies are created equal.

Every library that you import is a piece of software you don’t have control over. It’s something that needs to be built and installed in order for your software to run. It’s a black box with its own many dependencies and possible points of failure. How often have you tried to install a library or piece of software and found that it was broken out of the box? If your software has 15 external dependencies, then quite possibly, over the next year, one of these will break, and your software will be broken along with it. If you’re programming in Python, chances are that your software will break several times over the next few months, as packages are changed and broken under your feet.

When I write software, I try to minimize the number of dependencies I rely on. I do this both to minimize possible points of failure, and to make sure that people installing my software won’t have a terrible time getting it to work. When I have to rely on external dependencies, I try to pick the ones that are more established and well-maintained rather than obscure ones. Sometimes, I will “reinvent the wheel”, when I judge that the effort required is small enough. Obviously, this doesn’t always make sense. If you roll your own crypto and you’re not a crypto researcher, you deserve to be slapped upside the head. However, if you need to load textures in your indie game, you very well could implement a parser for 24-bit TGA images instead of relying on some library which itself has 50 external dependencies. This can be done in less than 100 lines of code.
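
To give an idea of what I mean, here is a rough sketch of what such a loader could look like, in Python for the sake of illustration. It only handles the simple case (uncompressed, 24-bit, no color map), which is often all an indie game needs:

    import struct

    def load_tga_24bit(path):
        """Minimal loader for uncompressed 24-bit TGA files.

        Returns (width, height, pixels), where pixels is a list of
        (r, g, b) tuples in top-to-bottom, left-to-right order.
        """
        with open(path, "rb") as f:
            header = f.read(18)
            id_len, cmap_type, img_type = struct.unpack("<BBB", header[0:3])
            width, height = struct.unpack("<HH", header[12:16])
            depth, descriptor = struct.unpack("<BB", header[16:18])

            # Only the simple case: uncompressed true-color, 24 bpp, no
            # color map. Anything else is out of scope for this sketch.
            if img_type != 2 or cmap_type != 0 or depth != 24:
                raise ValueError("unsupported TGA variant")

            f.read(id_len)                     # skip the optional image ID field
            data = f.read(width * height * 3)  # raw BGR pixel data

        rows = []
        for y in range(height):
            row = []
            for x in range(width):
                i = (y * width + x) * 3
                b, g, r = data[i], data[i + 1], data[i + 2]  # TGA stores BGR
                row.append((r, g, b))
            rows.append(row)

        # Bit 5 of the descriptor says whether rows are stored top-down;
        # by default TGA images are bottom-up, so flip them.
        if not (descriptor & 0x20):
            rows.reverse()

        return width, height, [px for row in rows for px in row]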

There are other ways in which you can minimize points of failure and make your software more reliable. You can prefer simple file formats, prefer open standards, and avoid proprietary solutions. Ultimately, the best way to keep things simple, however, is to simply have less features. I’m a minimalist. I prefer to build software that does one thing and does it well. Ultimately, you’ll never make everyone happy, you’ll never satisfy every use case, not without contorting your design into something unmaintainable and fragile.

 


My new Job in RL & Robotics Research

After I completed my PhD in early 2016, I took a job with the GPU compiler team at Apple and moved from Canada to the United States. I wasn’t sure if Silicon Valley was for me, but I figured that if I was going to try living in another country, it was best to make that kind of move when I was young, without property, children or a life partner.

My job at Apple was in many ways a dream job. It was sunny almost every day, I worked on a small team of highly qualified, friendly engineers, had an awesome boss, ate lots of insanely great organic food, and spent my work days sitting in a comfy Aeron chair. Unfortunately, I wasn’t quite happy there. For one thing, I didn’t quite mesh with Apple’s secretive culture. I didn’t love the culture of Silicon Valley much either, but the biggest problem was that the isolation got to me. Living in an American suburb far from everything I knew, and dealing with illness in the family back home, I ended up having to take antidepressants for the first time in my life. I decided to move back to Montreal, the city I love, in the hope of living a healthier and happier life.

It’s been nine months since I started my new job as a staff member slash research assistant at the Montreal Institute for Learning Algorithms (MILA). It’s one of the biggest (if not the biggest) university research labs focused on artificial intelligence, headed by Professor Yoshua Bengio. This job doesn’t come with a Silicon Valley salary, but in my view, it’s a good mix of the perks I could get in an industry job, combined with the freedom that comes with academic research. The schedule is very flexible, the work is highly experimental, and best of all, as a research assistant, I have the opportunity to publish, but not the obligation.

You might be surprised by this shift in career plans. I did my PhD in compiler design, so why am I working in a deep learning lab? Obviously, there’s a lot of excitement surrounding machine learning and AI right now. I, like many others, believe that this technology will transform the world in a million ways, most of which we haven’t even begun to imagine. That’s one reason why I’m here: I’ve shared that excitement for a long time. However, that’s not the only reason. Fundamentally, I like research, and this is a very active area of research. Compilers, as cool as they are, are not a very active research topic anymore. Most industry compiler jobs revolve around maintenance and the implementation of tried-and-tested ideas, and most academic compiler research is focused on incremental improvements. There unfortunately isn’t that much pioneering going on.

I feel very fortunate that my new job has allowed me to pick which projects I get involved in. I’ve chosen to focus on projects in the areas of reinforcement learning and robotics. Reinforcement learning holds a lot of promise as a technique for teaching algorithms new tricks. Robotics has fascinated me since I was a child, and offers me the opportunity to tinker with electronics and more concrete projects. Another great perk of this new job is that, being an academic lab, MILA fully embraces the open sharing of ideas and information. I will be allowed to blog and discuss in detail the projects that I am working on. Stay tuned!

 

We All Live in a Bubble

Almost exactly two years ago, Star Wars: The Force Awakens came out in theaters. Hundreds of millions of people (including several of my own friends) were eagerly awaiting this release. I, on the other hand, was completely oblivious. I only found out about the existence of this movie about a week after its release, when I overheard spoken conversations during the holidays. I was surprised and disappointed at my own ignorance. Clearly, I’d been living under a rock.

I am still, in many ways, living under a rock, in my own bubble. I know quite a few things about compilers, machine learning, and the soap opera that is American politics, but I’m largely unaware of what goes on in municipal and local politics, for instance. I also know very little about local (Quebecois) artists and musicians. To tell you the truth, it’s not that I don’t want to know about these things, it’s that this information never makes its way to me, and I don’t go out of my way to find it.

When I was a child and a teenager, I was in many ways more informed about what might be termed “mainstream” culture. Every night, my mom and I watched the evening news and various talk shows. Much of the information we got about our local culture was delivered to us in this form. Today, we celebrate the death of television and the cord-cutting that came with the arrival of the information age: the end of a dominant thread of culture imposed upon us, in favor of access to a seemingly unlimited amount of information.

We have the freedom to choose what we’re exposed to. The information doesn’t come packaged and predigested in evening news bulletins ready for mass consumption. The downside of this, however, is that to some extent, we’re all living in our own bubble. There is no longer a common thread of culture that I share with everyone I know. We each can pick and choose what interests us, and ignore all the rest. The thing is, to some extent, we don’t even make that choice.

Facebook, Google, Twitter and Reddit run ranking algorithms of which we only see the end result. If I had some input, I would tell Facebook to please show me more information about local news and politics, because I’d like to stay in the loop. Unfortunately, they never asked for my opinion. They’re not in the business of making sure that I’m a well-rounded individual; they’re in the business of captivating my attention and selling some share of it to advertisers. The unfortunate truth is that in some ways, the internet isn’t so different from television: which information we get or don’t get to see is a choice that’s increasingly being made for us.

I think the internet has brought us many wonderful things. We have more access to information than ever before. We also have the ability to reach many more people than we ever could. However, it seems to me that our ability to choose is also contributing to worsening social isolation in the modern world. We’ve been given the freedom to find threads of culture online that cater to our precise tastes. That can be wonderful and enriching, but there is also a huge downside: people have the freedom to pick a cultural bubble online that is as comfortable as possible, and never make the effort of stepping out of it. That comfort breeds confirmation bias.

Some political commentators have said that modern politics is becoming increasingly polarized. More and more, things are being spun as left versus right, liberals against conservatives. There’s a lack of empathy, with each side portraying the other as morally bankrupt and unsalvageable. I think that the internet and the attention economy may be largely to blame for this situation. With the internet, no matter how radical your beliefs are, you can find people and sources online who will tell you exactly what you want to hear: that you were right all along, that those who disagree can only be sick individuals, and that you can rest comfortably in your sense of moral superiority.

Google, Facebook, Twitter, etc. are not concerned with your personal growth, your edification, or rigorous fact-checking. They’re also not concerned with your long-term well-being and whether you have healthy relationships with the people who surround you. They’re happy to spoon-feed you a diet of whatever they think you like and will keep your attention the longest. I would say that it’s up to you, me, and all of us to make the uncomfortable effort of standing up and stepping outside of our cozy little bubble. However, I’m also quite worried that this simply isn’t going to happen, that most of us are going to remain addicted to the information equivalent of shitty french fries, reconstituted chicken meat and flavored sugar-water.

 

The AI Risk Isn’t What You Think

Recently, a number of prominent figures, including Elon Musk, have been warning us about the potential dangers that could arise if we can’t keep artificial intelligence under control. The fear surrounding AI dates back a long time. Novels and stories about robot takeovers date back as far as the 1920s, before the advent of computers. A surge of progress in the field of machine learning, and the subsequent investment of hundreds of billions of dollars into AI research by giants such as Google, Amazon, Facebook and Microsoft, has brought this fear back to the forefront. People are waking up to the fact that the age of AI and robots is coming soon, and that self-aware AI could very likely become a reality within their lifetime.

The 2014 book Superintelligence: Paths, Dangers, Strategies by Nick Bostrom embodies this fear. In his book, Bostrom details multiple scenarios in which AI could spiral out of control. I believe that the author has achieved his goal: he has successfully scared many researchers into paying attention to the existential threat surrounding AI, to the point where AI safety is now a serious field of research in machine learning. This is a good thing. However, I think that Bostrom’s book is in many ways alarmist, and detracts from some of the bigger, more immediate threats surrounding AI.

Many of the doomsday scenarios in the Superintelligence book are centered on the idea that AI entities will be able to rapidly improve themselves and reach “escape velocity”, so to speak: that they will go from human-level intelligence to something far beyond it in a ridiculously short amount of time. In many ways, I believe this portrays a poor understanding of the field of machine learning, and of the way technology usually progresses. I see at least three factors that make this scenario unlikely:

  1. While the idea of an AI entity rewriting its own machine code may be seductive to sci-fi authors, the way deep neural networks operate now, they would be hard pressed to do such a thing, particularly if they weren’t designed with that purpose in mind.
  2. Currently, machine learning researchers are struggling to put together enough computational power to train neural networks to do relatively simple things. If an AI became self-aware tomorrow, it probably couldn’t double its computational power overnight, because doing so would require access to physical computing resources that simply aren’t there.
  3. Sudden explosive progress is not the way any past technology has progressed. As rapidly as computers have evolved, it took decades and decades to get from the ENIAC to the computers we have now. There is no reason to think that AI will be any different: so far, the field of machine learning has seen a fairly gradual increase in the capabilities of its algorithms, and it took decades of research to get to where we are now.

Silicon Valley likes to tell us that technological progress goes at an exponential rate, but fails to deliver any real evidence backing this dogmatic belief. In the case of self-aware AI, I think a more likely scenario is that we will be building machines with increasing levels of awareness of the world. We’ll build robots to clean up around our homes, and the first ones will be fairly stupid, limited to a small set of tasks. With newer generations, they’ll become capable of doing more and more, and of understanding more and more complex instructions. Until, eventually, you’ll be talking to a robot, and it will understand you as well as another human being would.

In my opinion, the advent of self-aware AI will require several more breakthroughs in machine learning. It may also require several generations of hardware that is designed with the sole purpose of accelerating neural networks. The good thing is that if self-aware AI takes a long time to emerge, the first general-purpose AIs will have a fairly limited understanding of the world, and limited computational capabilities. This means those first AIs will simply not be capable of taking over the world. It also means we may have several years to test a number of fail-safe mechanisms between the time where AIs start to have a useful understanding of the world, and the point where they are genuinely dangerous.

I think that, in some ways, the focus on the existential threat surrounding AI detracts us from a bigger, more immediate danger. AI is an immensely powerful tool. In the hands of giant corporations like Google and Facebook, it can be used to sift through every text message and every picture you post online. It can be used to analyze your behavior, control the information you see. The biggest risk posed by AI, in my opinion, is that it’s a tool that can be used to manipulate your life in ways that are useful to those who control the AI. It’s an incredibly powerful tool which is controlled by a very small few.

 

Zeta’s JITterpreter

About six weeks ago, I made ZetaVM open source and announced it on this blog. This is a compiler/VM project that I had been quietly working on for about 18 months. The project now has 273 stars on GitHub. This is both exciting and scary, because so much of what I want to do with this project is not yet built. I really want to show this project to people, but I also find myself scared that people may come, see how immature/incomplete the project is at this stage, and never come back. In any case the project is open source now, and I think it would be good for me to write about it and explain some of the design ideas behind it, if only to document it for current and potential future collaborators.

One of the main goals of ZetaVM is to be very approachable, and enable more people to create their own programming languages, or to experiment with language design. With that goal in mind, I’ve designed a textual, stack-based bytecode format that resembles JSON, but allows for cycles (objects can reference one-another). Functions, basic blocks, and instructions in this bytecode format are all described as a graph of JS-like objects. This is very user-friendly. Anyone can write a Python script that outputs code in this format and run the output in ZetaVM. It’s also very powerful, in a LISP-y code-is-data kind of way: you can generate and introspect bytecode at run time, it’s just made of plain objects and values that the VM can manipulate. ZetaVM has first-class bytecode.
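
To give a feel for what this looks like, here is a small illustrative example of generating such object-based bytecode from Python. Note that the opcodes and field names below are invented for the sake of illustration; they are not the actual ZetaVM instruction set:

    import json

    # Instructions, basic blocks and functions are just plain objects; the
    # whole program is an ordinary object graph that any script can build.
    def make_add_function():
        entry_block = {
            "instrs": [
                {"op": "get_arg", "idx": 0},
                {"op": "get_arg", "idx": 1},
                {"op": "add"},
                {"op": "ret"},
            ]
        }
        return {"num_args": 2, "entry": entry_block}

    if __name__ == "__main__":
        # Because the bytecode is plain data, emitting it is just
        # serialization (a format with cycles needs named references
        # rather than plain JSON, but the idea is the same).
        print(json.dumps(make_add_function(), indent=2))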

The downside of all this, as you can imagine, is that it inherently has to be absolutely dog slow. Or does it? The first version of the Zeta interpreter traversed the graph of bytecode objects the naive way, and it was indeed dog slow. I’ve since written a new interpreter which removes the overhead of the object-based bytecode by dynamically translating it to an internal representation (think dynamic binary translation). The internal IR is compact and flat, executing it involves no pointer hopping.
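
As a rough illustration of the idea (a sketch, not the actual Zeta code), flattening a graph of instruction objects like the one sketched above into a compact instruction stream could look something like this:

    # Flatten a function's graph of instruction objects into a flat list of
    # (opcode, operand) tuples that an interpreter can step through by
    # index, with no pointer hopping through objects at run time.
    def flatten_function(fn):
        flat = []         # the flat instruction stream
        offsets = {}      # block identity -> offset in `flat`
        patches = []      # (index in flat, target block) to fix up later
        worklist = [fn["entry"]]

        while worklist:
            blk = worklist.pop()
            if id(blk) in offsets:
                continue
            offsets[id(blk)] = len(flat)
            for instr in blk["instrs"]:
                op = instr["op"]
                if op == "jump":
                    target = instr["target"]
                    patches.append((len(flat), target))
                    worklist.append(target)
                    flat.append((op, None))  # offset patched below
                else:
                    flat.append((op, instr.get("idx")))

        # Second pass: replace block references with numeric offsets.
        for idx, target in patches:
            flat[idx] = ("jump", offsets[id(target)])
        return flat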

The new interpreter generates code on the fly, which means it is, by definition, a Just-In-Time (JIT) compiler. Its architecture is based on Basic Block Versioning (BBV), a compilation technique I developed during my PhD (talks and papers on the topic are linked here if you’re curious). BBV has the nice property that it generates code lazily, and the generated code naturally ends up compact and fairly linear. This is not done yet, but BBV also makes it possible to specialize code to eliminate dynamic type checks very effectively, and to perform various optimizations on the fly.

You might be wondering why I’m bothering with an interpreter, instead of just writing a JIT that generates machine code. One of the motivating factors is that Zeta is still at a very early stage, and I think that an interpreter is a better choice for prototyping things. Another factor is that it occurred to me that I could potentially make Zeta more portable by having the interpreter do much of the compilation and optimization work. The interpreter can do type-specialization, inlining and various code simplifications.

The interpreter will be designed in such a way that the internal IR it produces is optimized and in many ways very close to machine code. It should then be possible to add a thin JIT layer on top to generate actual machine code. The resulting JIT will hopefully be much simpler and easier to maintain than if one were compiling directly from the raw object-based bytecode format. Another benefit of this design is that all of the optimizations the interpreter performs will not be tied to the specifics of x86 or other architectures; they will remain portable.

At the moment, the new interpreter is at a point where it lazily compiles code into the flat internal format, but performs no other optimization. This was enough to get a 7x performance improvement over the naive interpreter, but the current system is still quite a bit below the performance level of the Python interpreter, and there is definite room for improvement. Some of the first optimizations I would like to introduce are the elimination of redundant branching instructions, and the use of inline caching to speed up function calls.

The elimination of redundant branches is fairly easy to do with BBV. Code is lazily compiled and appended linearly into a buffer. When generating code for a branch, if the block we are jumping to is the very next one to be compiled, then the branch is redundant. BBV naturally tends to generate code that flows linearly along hot code paths and branches out-of-line for infrequent ones. That is, the default path usually comes immediately next in the buffer and requires no branching.
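
In sketch form (again, this is illustrative pseudo-Python rather than the actual implementation), the fall-through logic amounts to something like this:

    # When lazily compiling a branch, decide whether a jump instruction is
    # actually needed, or whether execution can simply fall through.
    def compile_branch(code_buf, target_block, compile_queue, offsets):
        if id(target_block) in offsets:
            # Target already compiled elsewhere: emit an explicit jump.
            code_buf.append(("jump", offsets[id(target_block)]))
        elif compile_queue and compile_queue[0] is target_block:
            # Target is the very next block to be compiled: it will land
            # immediately after this point in the buffer, so no jump is
            # needed at all.
            pass
        else:
            # Otherwise, schedule the target and emit a jump whose offset
            # gets patched once the target is compiled.
            compile_queue.append(target_block)
            code_buf.append(("jump_to_block", target_block))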

Inline caching is a classic technique that was pioneered by the Smalltalk and SELF VMs. It’s used, among other things, to eliminate dynamic property lookups when polymorphic function calls are performed (see this excellent blog post for more information). Currently, ZetaVM, when performing a function call, needs to read multiple properties on function objects. For instance, it needs to find out how many arguments the function has, and what is its entry basic block. These property lookups are dynamic, and relatively slow. The end result is that the call instruction is very slow compared to other instructions.

Most call instructions will end up always calling the same function. Hence, dynamic overhead on function calls can largely be eliminated by caching the identity of the function being called by a given call instruction. That is, one can cache the number of arguments and the entry basic block associated with a function object the first time a call instruction is run, and then reuse this information when calling the function again, provided we keep calling the same function. This information will be cached in line, in the instruction stream, right after the call instruction opcode, hence the name inline caching. I anticipate that with inline caching, function calls in Zeta can be made several times faster.
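
Here is an illustrative sketch of the idea (not the literal ZetaVM code): the cache slots live right next to the call instruction, and get filled in the first time the call executes:

    # Resolve the callee's argument count and entry block for a call
    # instruction, using an inline cache stored on the instruction itself.
    def resolve_callee(call_instr, closure):
        if call_instr.get("cached_fn") is closure:
            # Cache hit: skip the dynamic property lookups entirely.
            return call_instr["cached_num_args"], call_instr["cached_entry"]

        # Cache miss: do the slow lookups once, then remember the results
        # alongside the instruction for the next execution.
        num_args = closure["num_args"]
        entry = closure["entry"]
        call_instr["cached_fn"] = closure
        call_instr["cached_num_args"] = num_args
        call_instr["cached_entry"] = entry
        return num_args, entry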

ZetaVM, my new compiler project

Like many of you I’m sure, I’ve wanted to create my own programming language for a long time. I think it’s a common aspiration for many programmers, to create a language with all the features we love and none of the ones we hate. Our own “ultimate programming language”, created in our own image. What I’ve come to realize over time, however, is that I’m actually not quite sure what should go into my ultimate programming language. I have a lot of ideas, but they don’t necessarily fit into a coherent whole. It’s also the case that my ideas about what the “ultimate programming language” should be like keep changing as I gain more programming experience and get exposed to new ideas.

My PhD was in compiler design, and this is something I truly enjoy playing with. As such, I’ve decided that my next programming language project wouldn’t be a programming language per se, but a compiler: a platform for creating new programming languages. I’m doing this in part because I enjoy it, and it’s something I feel confident I’m good at, but also because I think I can build a platform that will make it much easier for myself and others to do programming language design and experimentation. ZetaVM is going to have, as one of its main design goals, to make creating new programming languages very accessible. It will make it possible for anyone who’s mastered a language such as Python to create a language of their own, in less than 2000 lines of code. Not only that, but ZetaVM will instantly make your language decently fast. It will have JIT optimizations suitable to languages such as Python/Lua/JavaScript, and will instantly give you fast objects, type-specialization, etc.

ZetaVM is a virtual machine for dynamic programming languages. It will provide native support for dynamic typing and for the most common data types found in Python/Lua/JS/Ruby, such as strings, extensible objects, and extensible arrays. What makes it particularly easy to get your own language running on this VM is that Zeta’s Intermediate Representation (IR) is representable as a textual format similar to JSON. This makes it fairly trivial for you to write, say, a parser for your new language in Python, and generate textual Zeta IR as the output. You don’t have to worry about implementing dynamic typing, register allocation, garbage collection, or arrays and objects; all of that is done for you. I’ve created a simple language called Plush (JS and Lua’s bastard child), which demonstrates how this can be done, and serves to help me bootstrap and test the system.

Beyond making it easy for myself and others to create programming languages, Zeta will be a platform for me to try some bold new ideas in the design space of programming languages and JIT compilers. I would like to try and tackle one of the biggest issues plaguing programming languages today, which is that of code rot. My goal is to eventually freeze the IR and APIs provided by Zeta, so that code that runs on Zeta today might have a chance of still working in 20 years, without any changes. This goal is ambitious, but I have some ideas which I believe might make it work.

Finally, one big disclaimer I should give is that Zeta is still a young and immature project. In its current state, Zeta is experimental and will have many breaking changes, as most new languages/platforms do. Zeta also currently only has a naive interpreter which walks the object-based IR and is dog slow, at about 200K instructions per second. I’m currently working on an interpreter that will compile the object-based IR into a lower-level internal IR. This interpreter will use Basic Block Versioning (BBV) and self-modifying code. I believe it should realistically be able to reach speeds of 100 MIPS within the coming months. My plan after that is to build a lightweight JIT which will sit on top of the optimizing interpreter and compile the internal IR to machine code.

Technodiversity

The year is 2048. Migration to IP-V6 has just been completed. Every object in the world, including your coffee mug and the chair you’re sitting on, comprises a manycore RISC-V cluster, running Linux, with its own unique IP address. Haskell, because of its mathematically provable superiority, has come to supplant every other programming language. Writing code in any language other than Haskell is now a misdemeanor, punishable by up to 64 days of bandwidth throttling and a fine of up to 0.125BTC.

Wouldn’t things be much simpler if every computer system was built on the same platform, ran the same operating system, and every program was written in the same programming language? I think that in many ways, this is the wet dream of many programmers. No more dealing with cross-language boundaries, portability issues, and multiple incompatible implementations of subpar standards. Things could be much simpler and more efficient than they are now.

The biggest problem, however, is that in a world where every computer system runs the same software, the same version of the same operating system, on the same hardware, every computer system has exactly the same bugs and security weaknesses. Given that some security flaw exists, a single computer virus could potentially contaminate every computer system in the world in a very short amount of time. In today’s world, this Hollywood-like doomsday scenario seems extremely implausible. The real world is too messy, or too diverse, for it to be practical to build a single virus or worm that could infect every system.

In a lot of ways, the chaos of the technological world resembles that of nature. Like animal species competing for survival, various technologies and standards compete for funding and mindshare. In nature, species specialize to exploit new environments. Diversity, in the technological world, exists in part because specialization makes systems more efficient, which allows the exploitation of new market niches. In nature, genetic diversity, or biodiversity, makes it near-impossible for a single virus to wipe out the entirety of life on earth.

Today’s technological world is definitely messy, but we can take comfort in the fact that competition really does foster innovation. The technological diversity that exists right now is a sign that we live in a thriving ecosystem, as opposed to one that is stagnant. With self-driving cars and a variety of home automation devices on the horizon, we can also take comfort in the idea that technodiversity may actually be keeping us safe.