Wednesday, November 3, 2010

Frustration and more frustration

It's taking a great deal of personal restraint to keep from setting my laptop on fire at the moment.

After many months of *ahem* time off from my Ph.D. work, I've been getting after it over the last month.  For the past couple of weeks, I've been trying to see why some data I generated for my prospectus looked very odd. Frustrated, I did the runs again tonight and everything turned out fine and looked as expected--which was unexpected. Flabbergasted at the wasted time, I decided to poke around and see what could possibly have changed to fix it--I don't like miracles when it comes to software. Here's the relevant SVN log (log of changes for the unfamiliar):


------------------------------------------------------------------------
r306 | rmay | 2010-03-21 13:23:45 -0500 (Sun, 21 Mar 2010) | 1 line


Fix missing factor when calculating unattenuated power.
------------------------------------------------------------------------
r307 | rmay | 2010-03-21 14:55:43 -0500 (Sun, 21 Mar 2010) | 1 line


Fix jarbled output fields due to bad ordering of dimensions. (Due to me ignoring what was done in arps_reformat.)  Also work around numpy bug when trying to save attributes in pupyere with a single character.
------------------------------------------------------------------------
r308 | rmay | 2010-03-22 13:29:58 -0500 (Mon, 22 Mar 2010) | 1 line


Change units back now that numpy has been fixed.
------------------------------------------------------------------------
r309 | rmay | 2010-03-26 16:44:45 -0500 (Fri, 26 Mar 2010) | 1 line


Fix bug generation of 2-moment interpolation coordinates. Also improve interpolation by using points logarithmically (instead of linearly) distributed for number concentration.  Also fix problem where we divide fall speed by 0, results in bad values for velocity (and phase, breaking time series generation).
------------------------------------------------------------------------
r310 | rmay | 2010-03-26 16:45:28 -0500 (Fri, 26 Mar 2010) | 1 line


Make commas_reformat copy out the model's reflectivity by default.  It's a useful diagnostic.

r309 is the change that I'm pretty sure fixed the bug. Now, since I'm such a proponent of reproducible research and good scientific software packages, I went to the effort of recording, in each data file, the version of the code that generated it. Here's what I see when I look at one of those files:

:VersionNumber = "0.8.dev306" ;

That 306 represents the SVN revision number. Bonus points if you realize what's wrong there. If not: r306 predates the r309 fix, which means that I did all my nice data runs for my General Exam without actually using the most recent (and "correct") version of the code. Oops.
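For the curious, here is a minimal sketch of how that check could be automated rather than done by eyeball. It assumes the version string always ends in ".dev" followed by the SVN revision (as in the attribute above) and that all the commits live on one line of development, so a higher revision number includes every earlier fix. The function names and the FIX_REVISION constant are made up for illustration; they are not part of my actual code.

```python
import re

# Revision of the commit believed to contain the fix (r309 in the log above).
FIX_REVISION = 309


def svn_revision(version):
    """Return the SVN revision embedded in a string like '0.8.dev306'."""
    match = re.search(r"\.dev(\d+)$", version)
    if match is None:
        raise ValueError(f"no SVN revision found in {version!r}")
    return int(match.group(1))


def includes_fix(version, fix_revision=FIX_REVISION):
    """True if the code that wrote the file is at or past fix_revision."""
    return svn_revision(version) >= fix_revision


if __name__ == "__main__":
    stamped = "0.8.dev306"  # the VersionNumber attribute from the data file
    if not includes_fix(stamped):
        print(f"{stamped} predates r{FIX_REVISION}; regenerate this file")
```

Running something like this over a directory of output files before making plots would have flagged every one of my bad runs.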

While I'm pretty torqued off at myself for wasting quite a bit of time chasing a ghost (and for some reason not installing my newest code before running), not to mention using bad data files for some recent plots, hopefully this provides a good use case and motivation for some good scientific software practices. Namely, using version control and putting version information in my data files allowed me to at least diagnose what went wrong.  Without this information, I would have simply had to chalk up the fact that my code is producing the right answer to "magic"...or the code gnomes. If you're not already using version control and putting sufficient information into your data files, you need to start right now.

No, I'm serious. Do. It. Now. I'd recommend either:

This event reminds me that I don't take notes on my research worth anything. If I'd had notes, maybe I would have read them and seen something about an important bug fix that I made 7 months ago.  To rectify this, I'm going to start using this blog to log and take research notes.  If nothing else, it should be entertaining to go back and read when I'm done, to laugh at all the times things blew up. Or cry.

4 comments:

Robert Kern said...

Code gnomes is a perfectly cromulent explanation.

WeatherGod said...

I am gonna have to work code gnomes into my next advisory meeting. As an aside, I have started using git tags to indicate important milestones. Very useful for situations like knowing what version of the code was used for that manuscript draft you sent 4 months ago, or even for the published results from 3 years ago... ::ah-hem::

John Leeman said...

Agreed... While science does have an element of magic it shouldn't be in the code! Sadly it happens in some pubs.

WeatherGod said...

@John: Magic in pubs? I'd pay money to watch a magic show while drinking!