
Debugging lessons

For the past six weeks or so I’ve been trying to track down an elusive bug in my SNOMED classifier. The difficulty has been that it only manifested with very large input sets (the smallest failing case I managed to reduce it to was about 350,000 concept definitions). This meant that lots of large data-structures and long chains of inferences needed to be traced backwards: tedious and time-consuming work.

Today I found the problem. As I had begun to suspect, there was a simple error in an underlying data-structure.

The lesson? Write unit tests carefully! It turns out that although I had written a test for the faulty method, the particular data-set I used in the test special-cased around the bug. What I should have done was use multiple data-sets (pretty obvious) and make sure they were more realistic (in this case I had used a single contiguous set of bits). If I had done this originally, I would have found the problem much, much earlier.
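To make that concrete, here is a minimal sketch of the kind of test I have in mind. It is not the classifier's real code: SimpleBitSet is a hypothetical stand-in for a hand-rolled bit set, and BitSetOracleCheck and its helper methods are names invented for this illustration. The idea is to feed the same indices to the custom structure and to java.util.BitSet, then compare them bit by bit, using a contiguous run, indices that straddle 64-bit word boundaries, and a seeded random sparse set.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Random;

// Hypothetical hand-rolled bit set (illustration only, non-negative indices assumed).
final class SimpleBitSet {
    private long[] words = new long[1];

    void set(int index) {
        int w = index >>> 6;                       // which 64-bit word holds this bit
        if (w >= words.length) {
            words = Arrays.copyOf(words, Math.max(w + 1, words.length * 2));
        }
        words[w] |= 1L << index;                   // long shifts use only the low 6 bits
    }

    boolean get(int index) {
        int w = index >>> 6;
        return w < words.length && (words[w] & (1L << index)) != 0;
    }

    int cardinality() {
        int n = 0;
        for (long word : words) {
            n += Long.bitCount(word);
        }
        return n;
    }
}

final class BitSetOracleCheck {
    // Sets the same indices in both structures and compares every bit below 'universe'.
    static boolean matchesOracle(int[] indices, int universe) {
        SimpleBitSet mine = new SimpleBitSet();
        BitSet oracle = new BitSet();
        for (int i : indices) {
            mine.set(i);
            oracle.set(i);
        }
        if (mine.cardinality() != oracle.cardinality()) {
            return false;
        }
        for (int i = 0; i < universe; i++) {
            if (mine.get(i) != oracle.get(i)) {
                return false;
            }
        }
        return true;
    }

    // The "easy" data-set: a single contiguous run [from, to).
    static int[] contiguous(int from, int to) {
        int[] out = new int[to - from];
        for (int i = 0; i < out.length; i++) {
            out[i] = from + i;
        }
        return out;
    }

    // A more realistic data-set: scattered indices, reproducible via the seed.
    static int[] randomSparse(long seed, int count, int universe) {
        Random rnd = new Random(seed);
        int[] out = new int[count];
        for (int i = 0; i < count; i++) {
            out[i] = rnd.nextInt(universe);        // duplicates are fine for a set
        }
        return out;
    }
}
```

A test would then call matchesOracle once per data-set, for example with contiguous(0, 100), with {0, 63, 64, 127, 128, 1000} to hit the word boundaries, and with a few randomSparse seeds. The contiguous case alone is exactly the sort of input that can hide a bug.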

[Update: Ironically, I originally used java.util.BitSet instead of my hand-rolled data-structure, but I was running into memory usage problems, so I replaced a bunch of Maps and Sets with my own versions optimised for their particular usage in the algorithm. It turns out that for this particular case, the java.util version is entirely adequate. There's another lesson here :-)]

