A Radical Introduction to Programming

July 10, 2012

Java for the Masses

A few years ago, when I was just starting my M.Sc. in computer science at McGill, I had to teach the COMP-202 Intro to Computing class. This is essentially an introductory programming class. The students enrolled were mostly first year computer science or computer engineering undergrads. The language I was to introduce to them was Java. During the first class, I presented the students with a typical Java “Hello World” program. I thought this would be trivial and obvious, but shortly after I showed them the program, there was a storm of questions:

“Why does it say class there?”
“What’s public?”
“What’s static void?”
“What’s main?”
“What’s System.out?”
“What are those brackets doing there?”
“What’s bytecode?”

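For reference, the program in question was the canonical Java hello-world, essentially this:

    public class HelloWorld {
        // Printing a single line already requires a class, an access
        // modifier, a static method and a call into the class library.
        public static void main(String[] args) {
            System.out.println("Hello, World!");
        }
    }
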
It quickly became obvious to me that Java was not an ideal language for beginners. Its verbosity was cumbersome and explaining example programs required the students to have faith in my teachings: “I can’t explain all the details right now, but you’ll have to trust me, this isn’t relevant to the example, it will be explained later”. A few weeks into the class, it pained me to realize that while about 60% of the class did quite well, the remaining 40% struggled with every programming concept I introduced. For some people, programming seemed intuitive and easy, but for others, it remained foreign and mysterious. Having only limited teaching experience, I assumed I was the one at fault, until I found myself in a teaching assistant position for a similar class at another university, where I witnessed the same scenario.

Many computer science departments across the world have chosen Java as their introductory language of choice. Other common choices are Python, Scheme and ML (mostly taught in Europe). My own university will likely soon be teaching JavaScript as an introductory language. These languages were chosen because they are believed to be simple and relatively easy for beginners to understand. At the very least, easier and less confusing than a language such as C, C++ or Fortran. Why is it then, that in most introductory programming classes, a large proportion of the class struggles? Why do “easy” languages still seem so difficult for some students to grasp?

I believe that one of the main issues when teaching a language such as Java, Python or Scheme to newcomers is that these languages are too high-level. Programming is all about getting computers to accomplish work. To properly understand programming requires some mental model of the computational engine that will perform this work. It shouldn’t be surprising, then, that so many students struggle. Most introductory programming classes teach nothing about the way computers work. They are taught by example using high-level languages that are fairly disconnected from how computers work internally. It remains very difficult to teach programming to students when you can’t fully explain to them how source code relates to computation and instead have to rely on handwaving.

Odd Beliefs in Computer Science

Many computer science professors hold the deep-seated belief that it’s not important at all to understand how modern computers work. The great Alan Turing has taught us that all universal Turing machines can simulate each other. Therefore, CPUs, clocks, registers, RAM, ALUs, logic gates and any other implementation details are completely irrelevant. Deep down, computer science is only pure math, the structured manipulation of symbols with abstract machines. Programmers need only understand the semantics of their language of choice. Turing would surely agree… Well, I actually found some interesting quotes about Alan Turing recently:

“Although a mathematician, Turing took quite an interest in the engineering side of computer design.”
– Maurice V. Wilkes

“He was particularly fond of little programming tricks (some people would say that he was too fond of them to be a “good” programmer) and would chuckle with boyish good humor at any little tricks I may have used.”
– James H. Wilkinson 

You may not know that Alan Turing, after his time at Bletchley Park, worked on the design of a real-world general-purpose computer, the ACE, which was ahead of its time with fast in-CPU registers, microcode, a subroutine stack and floating-point arithmetic. Turing knew how to use a soldering iron and dabbled in electronics. He was very much interested in the implementation details of real computers. Why? Because it mattered. Turing wanted to build something: a computer that could simulate a mind. Without actual computers and programming languages, computer science would be nothing more than a pipe dream.

Find me a highly-skilled programmer, someone with at least a few years’ experience, someone you admire for their coding prowess, and I can guarantee you that this person has at least a fair understanding of how computers work internally. They have an accurate mental model of computers, their capabilities and the semantics of the code they write. This understanding is part of what allows them to be masters of their craft. I’m not saying that you have to have programmed in C, assembly or VHDL to be a good programmer, of course not. I’m saying that any skilled programmer could likely manage if they had to.

Focus on the Fundamentals

Perhaps the right thing to do would be to focus on teaching intro to programming classes with something lower-level than Java. It might actually make sense to teach people assembly as a first programming language. I can hear people cringing at the thought of this idea. Assembly seems too complex, too low-level, too archaic, too arcane, tied into the implementation details of actual hardware (not portable), and not that useful in real-world programming tasks. Teaching assembly to newbies might seem like some anachronistic and weird idea to many of you.

At the same time, there are advantages to assembly as a teaching language. When you’re learning assembly, you’re learning how an actual computer processor does what it does. You’re learning about the instruction pointer, basic arithmetic operations, branching, memory layouts, the stack and how it can be used to implement recursive function calls. There is no compiler, there are no complex semantics. Everything is as plain and clear as it could be. For each instruction, the processor performs a precisely defined unit of work. The computational model is clear and unambiguous, and the root of such basic programming concepts as control flow and integer arithmetic is obvious.
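
To make that last point a little more concrete, here is a rough sketch in Java (rather than in any particular assembly dialect) of how a recursive factorial maps onto explicit stack operations, which is essentially what call and return instructions do with the machine’s stack:

    import java.util.ArrayDeque;
    import java.util.Deque;

    public class StackFactorial {
        // Ordinary recursion: each call implicitly pushes a frame on the call stack.
        static long factorial(long n) {
            return (n <= 1) ? 1 : n * factorial(n - 1);
        }

        // The same computation with an explicit stack, mimicking what the
        // processor's stack does during the recursive calls above.
        static long factorialExplicit(long n) {
            Deque<Long> stack = new ArrayDeque<>();
            while (n > 1) {              // "call": push the pending operand
                stack.push(n);
                n--;
            }
            long result = 1;             // base case reached
            while (!stack.isEmpty()) {   // "return": pop and combine
                result *= stack.pop();
            }
            return result;
        }

        public static void main(String[] args) {
            System.out.println(factorial(10));          // 3628800
            System.out.println(factorialExplicit(10));  // 3628800
        }
    }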

You might still think I’m crazy. x86 assembly has hundreds of instructions and many ugly oddities that are purely due to historical accidents. There are simpler assembly languages out there (ARM, MIPS), but they are still fairly difficult for the newcomer to grasp. Here’s the thing: if you’re going to teach assembly, you don’t necessarily need to bother with these complexities. It should be possible to design a simplified assembly language for a virtual processor that still exposes the interesting concepts to be taught. Something with the following features, for example:

  • Minimal instruction set with branching, stack manipulation and integer arithmetic (15-30 instructions total)
  • 32 truly general-purpose registers, all of the same size (e.g.: 32 bits)
  • Instruction and stack pointers
  • Large, uniformly addressed memory space (e.g.: 4MB)
  • Simple calling convention: push everything on the stack; instructions can be provided to save and restore all registers at once
  • All instructions operate on general-purpose registers in the same way
  • All instructions can take a constant where a register operand can go
  • Simple and obvious instruction syntax, favoring longer names instead of cryptic mnemonics (e.g.: r3 = shift_left r1, 2)

Such an assembly language could be implemented in a simple virtual machine, for example in a web browser, to make it widely accessible. Edited code could be written to memory and run instantly with only one click. I/O operations would be kept minimal and simple, with output memory-mapped directly to video and sound buffers, and some simple library routines to make it that much easier. Useful learning tools such as real-time visualization of register contents and a color-coded view of the machine’s memory space could also be provided.
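
To give a rough idea of how small the core of such a machine could be, here is a sketch of a fetch-decode-execute loop in Java (the opcodes, the encoding and the tiny instruction set are all invented for illustration; a real teaching machine would have the full feature list above):

    public class ToyVM {
        // Hypothetical opcodes; a real teaching ISA would have 15-30 of these.
        static final int HALT = 0, LOAD_CONST = 1, ADD = 2,
                         JUMP = 3, JUMP_IF_ZERO = 4, PRINT = 5;

        int[] regs = new int[32];   // 32 general-purpose registers
        int ip = 0;                 // instruction pointer

        // Each instruction occupies 4 slots: opcode, a, b, c (unused slots are 0).
        void run(int[] code) {
            while (true) {
                int op = code[ip], a = code[ip + 1], b = code[ip + 2], c = code[ip + 3];
                ip += 4;
                switch (op) {
                    case HALT:         return;
                    case LOAD_CONST:   regs[a] = b;                 break; // rA = b
                    case ADD:          regs[a] = regs[b] + regs[c]; break; // rA = rB + rC
                    case JUMP:         ip = a;                      break;
                    case JUMP_IF_ZERO: if (regs[a] == 0) ip = b;    break;
                    case PRINT:        System.out.println(regs[a]); break;
                    default: throw new IllegalStateException("bad opcode " + op);
                }
            }
        }

        public static void main(String[] args) {
            // Count down from 5 to 1, printing each value.
            int[] program = {
                LOAD_CONST, 0, 5, 0,     // r0 = 5
                LOAD_CONST, 1, -1, 0,    // r1 = -1
                PRINT, 0, 0, 0,          // offset 8: print r0
                ADD, 0, 0, 1,            // r0 = r0 + r1
                JUMP_IF_ZERO, 0, 24, 0,  // if r0 == 0, jump to HALT at offset 24
                JUMP, 8, 0, 0,           // otherwise loop back to the PRINT
                HALT, 0, 0, 0            // offset 24
            };
            new ToyVM().run(program);
        }
    }

In a real implementation, the same loop would also update the register and memory visualizations after every step.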

Essentially, students would be programming a fancy “Commodore 64”, with all the teaching benefits of direct interaction with the hardware, but without the discomfort and impracticality brought by the restrictions of 1980s hardware and what was practical to implement at the time. The teaching system would be simplified wherever possible while still resembling a real machine in all important aspects.

Because such a simplified assembly language wouldn’t have all the complex syntax and the long feature list of a high-level programming language, it should actually be possible to teach it to students fairly quickly (perhaps within only a week or two). The true benefit is that teaching such a language also implies teaching students about a realistic computational model. Once the students understand this, teaching them about high-level languages should be much easier and much more rewarding.

24 Comments
  1. It is unfortunate that so many people, especially educators, would turn their backs on the way that computers actually work. I think that if you don’t understand at a low level what your program is actually doing, your program will suffer for it. I don’t think starting with assembler is the right way to go, but definitely learning some assembler at some point is important.

    Perhaps courses should work in both directions. Start with something like a sorting algorithm in C. In one direction, move to higher-level languages and constructs which abstract further, and in the other direction, work towards the machine code. This would tie together what happens at the highest level with what happens at the lowest level.

    • I think assembly could work as a teaching language, and give students a more solid foundation, but only provided it’s made accessible enough. I definitely don’t think one could realistically teach real-world assembly to a whole class as a first language, but the simplified assembly I proposed might just work. I’m thinking it could actually be taught and learned rapidly as the first part of a programming class.

      I think I might just try to program a “virtual retro computer”, write some programs for it and see if people like it. It will definitely make for a fun JavaScript project.

  2. Ryan permalink

    My college uses Pep/8, which was a great help for my understanding of how computers work. It really prepared me for C programming in particular. It is a simplified assembly language running on simple simulated hardware.

    Here’s a link:
    http://code.google.com/p/pep8-1/

    Excerpt from the site:
    The Pep/8 computer is a 16-bit complex instruction set computer (CISC). It is designed to teach computer architecture and assembly language programming principles. Its 39 instructions are based on an expanding opcode and are either unary (one byte) or nonunary (three bytes). The eight addressing modes and eight dot commands are designed for straightforward translation between C/C++ and assembly language.

  3. Vashish permalink

    I would suggest checking out Karel the Robot. There’s a textbook and simulators for most every computer out there. I was well advanced beyond it before I learned it in school, but I found it quite interesting just the same because of the challenge of doing things with such limitations.

  4. While learning a bit of assembly is indeed a very important part of any programmer’s education, I would say that the best *introduction* to programming is through Scheme or some other functional language. The algebraic view of computation seems to be a very gentle bridge from mathematical thinking into computational thinking.

  5. abe permalink

    I totally agree that a virtual assembly language is the way to go. I learned the LC-3 (http://en.wikipedia.org/wiki/LC-3) in college and it really exposed the “magic” of computer programming to me. It made it tangible.

    I’m more partial to a 16-bit machine because it’s easier to visualize and you can show how the 1s and 0s make up an instruction on a single page of a book.

    The intro class I took was amazing in that it started with how p and n transistors work (in a simple “switch” analogy), then showed how those transistors made simple gates, gates made registers and other logic, and then it all came together to make the physical layout of the LC-3. Then we programmed it. That’s one of the best classes I ever took.

    • I was thinking 32-bit registers because then any register can be directly used as a pointer to any part of the memory space, and it mostly eliminates the need for things like “add with carry” instructions.

      With 16-bit registers, I need to compromise and either say that the memory space is 64KB, or make pointer manipulation more complex (segment register, anyone?). If the memory is 64KB, it means the “map the video and sound output directly into memory” (DMA) scheme needs to go, or gets a little more complex. I was thinking of making the screen 512×384. This is 196,608 pixels; even at 1 bit per pixel, it would take 24KB.

      I do agree there is some charm to 64KB of RAM, however. It makes it possible to visualize the RAM and visibly see the stack grow as recursion occurs.

      • abe permalink

        64KB is plenty for someone just learning to program in assembly. The NES had a total of 4KB of RAM. The microcontroller I used in my Microcontroller class had 34KB of addressable memory (2K RAM and 32K ROM) and we programmed a tank A.I. at the end of that class to fight with other tanks in an arena.

        You don’t need fancy things like an “add with carry” instruction. 16-bit arithmetic is fine for most beginner programs. I also find it comforting to work with limited resources.

        If you really wanted to, you could make it word-addressable only and get a 128KB address space. Then you could do 256-color video (8 bits/pixel) at 256*192 resolution (enough to play a decent video game), but doing full video and audio is kind of unnecessary and over the top.

        If you ask students to make a game with hardware that can do anything, they tend to lose focus and work a lot on drawing the game art or adding sound files. When you give them hardware that can put a few color boxes on the screen and make some beeps, you’ll find that they have just as much fun, but do more actual learning. They work on making their square man jump and shoot and collide, etc., because they don’t get so distracted.

        • I’m curious. How did the NES do the drawing of sprites? Where were they stored, and how did the video controller work, roughly? If there’s an elegant way to do decent graphics and sound with 64KB of memory, it might be acceptable. I just don’t want the system to be clunky, tedious to use.

  6. David Slaybaugh permalink

    Have you tried “The Definitive Guide to How Computers Do Math: Featuring the Virtual DIY Calculator” by Clive Maxfield and Alvin Brown? It sounds exactly like what you are describing: it comes with a simple instruction set, a debugger, and several code labs. This really helped me to understand what is happening inside a processor. More details at http://www.diycalculator.com/

    • Seems like a good teaching method, but I think it might be hard getting students excited about building a calculator!

  7. Edward permalink

    Have you ever heard of MMIX?

  8. rdm permalink

    I think that teaching assembly/machine language has a lot of merits.

    I am a bit dubious about teaching a fantasy virtual machine, however. Instead of inventing a virtual machine, why not just teach using a subset of ARM, for example? (I would suggest a subset of Intel 64-bit, but in this context a simple subset seems too limiting.) There should be no requirement that the student master all instructions, all registers, all hardware, all addressing modes.

    I would not object to teaching Scheme — it can be very much the same kind of thing as machine code. Lisp, after all, was originally a machine code system, which is where names like car and cdr came from.

    That said, education should not be viewed as a linear process. And I would not hesitate to teach different points of view (for example, for describing hardware capabilities: classes based on Iverson’s notation for describing hardware processes have been comprehensible to ordinary students in high school and grade school).

    My point here is: we should not cringe away from teaching useful subjects in introductory classes. Some parts of these subjects will be approachable.

    That said… in any large class there’s a risk of leaving some students behind while not progressing at a rate suitable for other students. This seems to be inherent in the learning process, and I feel that it’s a significant problem. Students learn where they are interested and engaged, and that’s something that depends on the student. Standardized education should be about granting equal opportunities, but students’ lives will not be standardized after they leave school, and there’s probably something to be said for educational systems which are based on keeping students engaged without requiring that they remain in lock-step with each other.

  9. I’m currently a student and, interestingly, my college uses a processor called P3, completely 16-bit and with 64KB of RAM, with a simulator they made themselves (they do have a real hardware version, but I never got to use it). A problem, though, is that part of the documentation is degrading and most of it is in a foreign language.

    The only thing online about it is in here: http://p3emu.sourceforge.net/ And even then, links to the actual specifications and such are all dead.

  10. Thoughtful post. I wonder if you’d be interested in “The Elements of Computing Systems”? You can find it here (course & book): http://www1.idc.ac.il/tecs/.

    • That book has amazing reviews on Amazon, nothing under 4 stars!

      • I own a copy and have read it; it’s quite a good instructional text. I think I would have gotten more out of it had I actually followed the course and done the exercises/assignments, but it was good even without doing all of the work it comes with.

        I’m also partial to Jon Stokes’ “Inside the Machine”. A great book on systems internals, though it doesn’t go as high up the abstraction layers as the previous book.

  11. shogg permalink

    The VM you’re talking about already exists:
    http://en.wikipedia.org/wiki/Core_War

    This VM has something like 10-20 opcodes and 1-10 KiB of memory.

    Today it’s a game (of sorts). Your students would write little programs and let them battle each other.

  12. I’ve talked to a lot of friends about this exact topic and how horrible it is to start teaching programming with high-level languages. The top-down approach used for teaching computer science these days leads to students who treat the physical computer as a deep black box. Even worse, day by day they internally develop a growing fear of _anything_ related to low-level software development.

    I believe that such a teaching method won’t lead to a generation which can someday invent a new programming language, a new compiler, or an efficient networking protocol. The top students will make it _anyway_, regardless of the teaching method used; my point is always about “the masses” of CS students.

    To put my money where my mouth is, I designed a small & unofficial summer course at my university to teach first-year students programming from the ground up: we started from Babbage engines and moved slowly to Intel x86 assembly and (a little) operating system details. I’ve even gone one step further: no textbooks; only the original primary references for each topic were used.

    I’ll never forget the expressions on the students’ faces and their ever-engaging questions. Once they were in the mindset of “deep understanding” and “historical progression” of the topics laid out, they began to ask hard-core questions all the time! Honestly, it felt like opening a can of worms, and I had to go back and re-check lots of references to make sure that I was providing valid answers ;-)

    People grasped topics like computation, pointers, compilation, linking, and even complex operating system details in a snap, and I was very happy. Furthermore, and as nicely portrayed in your example of Alan Turing, this approach is not really radical _at all_: Donald Knuth can be another nice example since his original quest was compilers after all.

    In brief, the history of computer science reveals that the physical details of computers and mathematical theory were never mutually exclusive: they deeply _complemented_ each other. So why are we creating such stupid and artificial boundaries in our CS curricula?

    • Sounds like a brave and successful effort on your part. Are you sharing your material online? :)

      I think that there’s a large amount of disdain for programming and low-level details, including hardware. Those things, while foundational to our discipline, are seen as lowly, primitive, detracting from more “pure” mathematical quests.

      I think the truth is that many people have difficulty grasping those things. Debugging programs forces you to reason about often extremely complex interactions within a system and can be very challenging; most people would rather avoid it. One way to avoid it is to do as little programming as possible.

      Mathematicians prefer to believe that computation has nothing to do with the machines it runs on, that mathematical expressions float in some void somewhere in an ethereal space, applying themselves to each other endlessly… That if only they could will the purity of their thoughts into a machine, there would no longer be any bugs.

      The truth is that bugs happen because once a system becomes sufficiently complex, its behavior becomes too difficult for any human mind to fully grasp and predict a priori. Writing code will cause bugs, writing good code will always be challenging, no matter the language.

      • The material was basically a folder of collected papers from:
        * IEEE Annals (for the Babbage and von Neumann stuff)
        * Articles by Vannevar Bush and the like (they provide a great bridge between the past and the future)
        * Some references on the UNIVAC (it had some nice parallels with the 8086 regarding the culture it created)
        * a single paper on MIT’s CTSS (to introduce the basic ideas of a supervisor/kernel)
        * official references for the Intel x86 architecture and the early PC stuff.
        * and finally a plain-text file outlining the course’s path

        Assignments and projects were taken from a very nice free book called “Programming from the Ground Up” by Jonathan Bartlett. But much of the focus was centered on the primary references.

        It was an unofficial and cozy gathering (I only have a bachelor’s degree, no master’s or Ph.D. yet), but I tried to measure my performance by student engagement and final project results. There were around 20 students, resulting in four groups. Two groups had a final project of reversing .wav audio files, with success. Another created a heap allocator similar to the one in K&R’s C book, with success. The last group faced some difficulties, but around 50% of their project was working.

        And the above did not affect their growth on the “high-level” or “software engineering” front at all: most of them are now Java and web/mobile development folks anyway. So I guess I can consider it a successful experiment.

        I wish to evangelize this teaching method a bit more, and my little new blog is a first step toward that.

  13. Look at Cecil. I started on Z80 assembler and soon after 6502. Cecil was taught in school as a reasonable emulation of how something like a PDP-8 would work and I wrote an interpreter in Z80 assembler for the class to use.

    Spend a few lessons doing stuff like that and then bring in a higher-level language when the concepts are clear. Most of the syntax of higher-level languages stems from the old architecture.

  14. Michael permalink

    You really should take a look at the http://nand2tetris.org course.

Trackbacks & Pingbacks

  1. Designing a toy CPU « Pointers Gone Wild
