
The Dreaded Heisenbug

Yay! I used a Bayes decision tree to isolate a bug today, in a fraction of the time it would otherwise have taken.

About 48 hours ago I started work on repairing a Heisenbug. For the less geeky of you, Heisenbugs are a rather nasty class of software fault that “disappears or alters its characteristics when it is researched”.

Most bugs are the result of a single input (or knob, or button, or whatever) being set to a single value (or range, or whatever). Software testers live by this assumption, and 99% of the time it is true; so true that we tend to forget (or is that ignore?) the 1% of bugs that can’t be explained so easily.

After tearing my hair out all day yesterday and going to bed feeling somewhat defeated, I woke this morning with a fresh mind and a small suspicion that perhaps this bug fell into that 1%.

After poking and prodding at my adversary for most of the morning, it was pretty clear that the bug was occurring probabilistically and that some combinations of inputs made it occur more frequently.

It turns out the randomness was due to a race condition between threads, and a combination of three inputs each being in a certain range made it occur more frequently. Knowing those settings (which fell out of the decision tree) was thankfully enough to explain why the bug was occurring.
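For anyone who wants to try the general idea, here is a minimal sketch. It is not the tooling from that day: it swaps in a plain greedy decision tree with Gini impurity rather than anything Bayesian, and `flaky_run`, its four inputs, the failure probabilities and the thresholds are all made up for illustration. The approach is to run the flaky thing many times with random inputs, record pass or fail, grow a shallow tree, and read the input ranges off the leaf with the highest failure rate.

```python
import random

def flaky_run(x, rng):
    # Hypothetical stand-in for the system under test: it fails often only
    # when three of its four inputs sit in a "bad" range at the same time,
    # and very rarely otherwise. All numbers here are invented.
    bad = x[0] > 0.7 and x[1] < 0.3 and x[2] > 0.5
    return rng.random() < (0.6 if bad else 0.01)

def fail_rate(rows):
    return sum(failed for _, failed in rows) / len(rows)

def gini(rows):
    # Gini impurity for a two-class (pass/fail) set of runs.
    p = fail_rate(rows)
    return 2 * p * (1 - p)

def best_split(rows, n_features, min_leaf=20):
    # Try every input at thresholds 0.1 .. 0.9 and keep the split with the
    # lowest weighted impurity, ignoring splits that leave a tiny side.
    best = None
    for f in range(n_features):
        for t in (i / 10 for i in range(1, 10)):
            lo = [r for r in rows if r[0][f] <= t]
            hi = [r for r in rows if r[0][f] > t]
            if len(lo) < min_leaf or len(hi) < min_leaf:
                continue
            score = (len(lo) * gini(lo) + len(hi) * gini(hi)) / len(rows)
            if best is None or score < best[0]:
                best = (score, f, t, lo, hi)
    return best

def worst_leaf(rows, n_features, depth, path=()):
    # Grow the tree greedily and return (failure rate, run count, path)
    # for the leaf with the highest failure rate.
    split = best_split(rows, n_features) if depth > 0 else None
    if split is None or split[0] >= gini(rows):
        return fail_rate(rows), len(rows), path
    _, f, t, lo, hi = split
    return max(
        worst_leaf(lo, n_features, depth - 1, path + ((f, "<=", t),)),
        worst_leaf(hi, n_features, depth - 1, path + ((f, ">", t),)),
    )

def isolate(n_trials=6000, n_features=4, seed=0):
    # Run the flaky code with uniformly random inputs and record each outcome.
    rng = random.Random(seed)
    rows = []
    for _ in range(n_trials):
        x = [rng.random() for _ in range(n_features)]
        rows.append((x, flaky_run(x, rng)))
    return worst_leaf(rows, n_features, depth=3)

if __name__ == "__main__":
    rate, n, path = isolate()
    print(f"worst leaf: {rate:.0%} failures over {n} runs")
    for f, op, t in path:
        print(f"  input[{f}] {op} {t}")
```

The path to the worst leaf is the interesting output: it names which inputs matter together and roughly where their bad ranges start. A greedy tree like this only works when each input has some effect on its own, which holds for an "all three in range" bug but would not for an XOR-style interaction.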
