
Debugging by the Seat of the Pants

October 24, 2012

Two people recently asked me what IDE or other tools I use to develop. My answer was that I only use a compiler, GNU make and a text editor. What seemed to surprise these people most was that I don’t use a debugger. I never have. I don’t even really know how to use one. This seems to really puzzle some people. I’ve written fairly complex software, including compilers, a 3D game engine and a type analysis. Like every other programmer out there, I have to find and fix bugs on a daily basis. Most of these are trivial, but others are complex and hard to find. In this post, I’ll discuss some of the strategies I employ to find bugs without special-purpose debugging tools.

Debugging is very much detective work. Your program isn’t behaving the way you expect. The goal is to find what the root cause of the problem might be, and the best way to go about this is by reasoning logically about the problem. One possible technique is to proceed by elimination, or, as Sherlock Holmes would say:

“When you have eliminated the impossible, whatever remains, however improbable, must be the truth”.

That is, you can use your instinct, your hunches, to make an educated guess as to what parts of the program’s code might be causing the problem to happen. Most of the bugs I encounter are simple, and my intuition usually leads me right to the cause. Some bugs are unfortunately trickier. One of the worst mistakes new programmers make when it comes to debugging is to assume that some part of their code is perfect and cannot possibly be the cause of a given bug. Wrongly eliminating a possible cause can lead you to search for hours in the wrong places without ever finding the bug. Debugging can begin with a hunch, but you should never discount a possible cause until you have verified that the cause is actually elsewhere.

The question, then, is how to go about narrowing down the possible cause of a bug. When using a debugger, one can set breakpoints in a program to stop it and step through its execution, examining local variables. I simply use print statements to accomplish the same thing. By placing print statements at various points, I can see where the program execution stops (in the case of a crash) and also print local variables and data structures. By iteratively inserting and removing print statements, using my intuition in the process, I can narrow down the probable cause of a bug very quickly. The key is to determine where the state of the program stops conforming to my expectations and becomes corrupted, progressively narrowing down which piece of code caused this to happen.
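As a rough illustration of the print-statement approach described above, here is a minimal Python sketch. The `trace` helper and the `mean_of_positive` function are hypothetical names made up for this example; they are not taken from any of the projects mentioned in this post. The idea is simply to print a labelled snapshot after each step so you can see at which step the state stops matching what you expect.

```python
import sys


def trace(label, **values):
    """Print a labelled snapshot of the given variables to stderr.

    Hypothetical helper for this sketch; it also returns the line it printed.
    """
    parts = ", ".join(f"{name}={value!r}" for name, value in values.items())
    line = f"[trace] {label}: {parts}"
    print(line, file=sys.stderr)
    return line


def mean_of_positive(text):
    """Average the positive integers in a whitespace-separated string."""
    numbers = [int(token) for token in text.split()]
    trace("parsed", numbers=numbers)        # is the input what I expect?

    positive = [n for n in numbers if n > 0]
    trace("filtered", positive=positive)    # did the filter keep the right values?

    if not positive:
        return 0.0

    result = sum(positive) / len(positive)
    trace("averaged", result=result)        # is the final value plausible?
    return result


if __name__ == "__main__":
    mean_of_positive("3 -1 4 0 5")
```

If the last trace line that looks right is "parsed" and "filtered" looks wrong, the problem is in the filter. Once the culprit is found, the trace calls can be removed again or moved closer to the suspicious code.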

You might still think that not using a debugger is silly. Why not use the tool if it exists? For one, I believe I can find bugs just as fast using the techniques I have developed over the years. There is also a case to be made that being able to debug without a debugger is actually very useful for someone who works in compilers, because often, there is no debugger available. I’ve had to debug x86 machine code generated by my own JIT compiler, and to use a debugger there, I would have had to write one. Instead, I implemented a system to print out numerical values, and this was enough to debug my system. Finally, debugging complex software sometimes requires special techniques that go well beyond what a debugger does.

I recently worked on a dataflow type analysis for JavaScript which involved two simultaneous, interacting fixed point computations and millions of dataflow facts being propagated in the system. Most of the bugs I had to find in this case weren’t crashes, but type inference errors caused by complex interactions in multiple areas of the system. Sitting down and single-stepping my program in a debugger would not have revealed any useful information there. Instead, I had to write one-off, special-purpose pieces of code to detect inconsistencies as early as possible and allow me to get different views on the state of the system while it runs. Debugging complex systems can be difficult, and I often find that the best strategy is to work to prevent latent bugs using unit tests and assertions, but that’s a story for another time.
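To give a flavour of what such a one-off consistency check can look like, here is a small Python sketch. It is not the analysis described above: it is a toy worklist propagation over a hypothetical graph, and the names `check_monotonic` and `propagate` are invented for illustration. It assumes every node appears as a key in `edges`. The check enforces one invariant, that a node’s set of facts may only grow, and fails at the exact step where the invariant is first violated.

```python
def check_monotonic(node, old, new):
    """One-off invariant check: a node's fact set may only grow."""
    if not old <= new:
        lost = sorted(old - new)
        raise AssertionError(f"facts shrank at {node!r}: lost {lost}")


def propagate(edges, seeds):
    """Propagate fact sets along edges until nothing changes (a fixed point).

    edges maps each node to a list of successor nodes; every node must
    appear as a key. seeds maps some nodes to their initial facts.
    """
    facts = {node: set(seeds.get(node, ())) for node in edges}
    worklist = list(edges)
    while worklist:
        node = worklist.pop()
        for succ in edges[node]:
            old = set(facts[succ])
            facts[succ] |= facts[node]
            # Detect the inconsistency as early as possible.
            check_monotonic(succ, old, facts[succ])
            if facts[succ] != old:
                worklist.append(succ)
    return facts


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["c"], "c": []}
    result = propagate(graph, {"a": {"int"}, "b": {"str"}})
    for node in sorted(result):
        print(node, sorted(result[node]))
```

In this toy version the check can never fail, because set union only adds elements. In a real system the update step is far more involved, and that is where a cheap check like this earns its keep.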


4 Comments
  1. I really don’t like source-level debuggers since they make me think much less about the code and the bug in question. _Not_ using a debugger forces me to “think before starting debugging”, and as a bonus, I usually understand my code much better and find a number of unrelated (but important) bugs too.

    I also don’t know how source-level debuggers can be useful in the ugly Heisenbug cases where several sources of parallelism are involved. For example, a combination of kernel threads, multi-core CPUs, inter-processor interrupts, implicit and hidden SMI interrupts, timers, etc. already makes debuggers moot. The only answer then is to build a mental model of the code and work your way out.

    But _binary_ debuggers (like the one provided by the Bochs x86 emulator) saved the day for me twice. I once faced a bug that cost a full six days to solve[*]. Using the “elimination method” discussed in your post above, the problematic area in question was reduced to a single x86 opcode, but the bug was still there! I also couldn’t find any problem in the compiler-generated assembly.

    But using a binary debugger made me notice that GCC kept the leaf C functions’ local state below the stack pointer, thus rendering all of the kernel’s leaf functions interrupt-unsafe. I just couldn’t believe my eyes. Searching the AMD64 ABI document, I found that the x86-64 ABI designers used that trick to mitigate the effects of 64-bit pointers and the resulting larger code size.

    An ordinary disassembly was not useful in my case, and that’s where the binary debugger saved the day, because the bug was compiler-generated and showed up in distant, unexpected areas that differed from run to run.

    [*] http://forum.osdev.org/viewtopic.php?t=21720

  2. David Bruant

    When I went to engineering school, I was taught to never do “printf-based debugging”. I thought it was bullshit. Why fire up a debugger if a couple of well-placed printfs can give me enough information to find where the defect is?
    Moving to JavaScript, where there is no compilation phase, made this approach even better. Place a couple of “console.log” calls, press F5 to reload, and there you go.
    At the same time, webdev debugging tools improved a lot and it’s really easy to use a stepping debugger. It turned out to be extremely useful when debugging a website for someone else, where I did not have the source (or the time to set up a local server).

    One point missing in your article is that, rarely but sometimes, debuggers go beyond what they’re expected to do and introduce Heisenbugs. It happened to me just yesterday: http://code.google.com/p/fbug/issues/detail?id=6027

    Thanks for your great article!

  3. Well said! I also use print statement debugging, and it works really well. We’re in good company too – many of the top programmers interviewed in the book “Coders at Work” use the same method. The book is really interesting for a lot of reasons, and I really recommend it. My review of it at Amazon: http://www.amazon.com/review/R2OV0TG7MJGXGL/

  4. Don

    Reading this article made me realize what’s wrong with debuggers. Their flaw is this: they only show you a snapshot of one moment in time. That is almost never interesting; what you really want is history: time displayed spatially. The most useful thing you can get from a debugger is a stack trace. And what that is, is history displayed spatially.
    My favourite debugging tool is code coverage. In D, I use the -cov option. Telling you how many times each line has been executed is absolutely brilliant.

    printf debugging gives you history turned into space. That’s why it works better than a debugger.
