does the program exhibit the buddha way?
Mood: hopeful
Posted on 2006-10-04 07:37:00
Tags: netflixprize programming
Words: 402

I was trying to track down a particularly odd bug at work yesterday and I suddenly realized I was effectively playing a game of Zendo with the program. I had my theory about what sequence of events made the bug show up (i.e. which koan of events had the buddha way), and pretty early on I got a repeatable koan that always exhibited the buddha way. After some time looking at the code to figure out what was going on, I had a theory about what made a sequence of events exhibit the buddha way, and I started trying to make it happen some other way. Just before I left I discovered that something entirely unexpected seems to determine whether a sequence of events exhibits the buddha way. That's the thing about debugging versus Zendo - the rules are a lot looser! Anyway, I miss playing Zendo, but I guess I can have my own little game at work :-)

I spent some time working on Netflix Prize stuff last night - I tried to parse all the data but the program ran out of memory, so I closed Firefox, tightened up the class that represents each rating (12 bytes to 8 bytes makes a big difference...), and got it to successfully parse all 100 million+ entries. The only slight concern is that this doesn't leave a whole lot of memory to do any sort of analysis (I only have 1G, 800M or so of which will be used up by the ratings), but I think I can do just O(n) stuff in space and get by. Or, I could buy more memory (woot!), but the last time I tried to put more than 1G of RAM in my computer, bad things happened. Maybe I'm just being superstitious.
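
To see why 12 bytes versus 8 bytes matters so much at this scale, here is a rough sketch of the arithmetic. The field layouts are purely an assumption for illustration (three 32-bit ints versus a 32-bit user id, 16-bit movie id and 8-bit rating plus a pad byte); the post doesn't say what the real class holds, and the names LOOSE, PACKED and total_megabytes are made up.

```python
import struct

# Hypothetical layouts; the post does not say which fields or types were used.
LOOSE = struct.Struct("<iii")    # user id, movie id, rating as three 32-bit ints
PACKED = struct.Struct("<IHBx")  # 32-bit user id, 16-bit movie id, 8-bit rating, 1 pad byte

NUM_RATINGS = 100_000_000  # "100 million+" in the post, rounded down


def total_megabytes(layout: struct.Struct, count: int = NUM_RATINGS) -> float:
    """Raw storage for `count` records, ignoring any per-object overhead."""
    return layout.size * count / 1_000_000


# Round-trip one made-up record through the packed layout.
record = PACKED.pack(1234567, 17000, 4)
assert PACKED.unpack(record) == (1234567, 17000, 4)

print(LOOSE.size, "bytes each,", total_megabytes(LOOSE), "MB total")
print(PACKED.size, "bytes each,", total_megabytes(PACKED), "MB total")
```

Under those assumptions the 12-byte layout needs about 1200 MB, which won't fit in 1G, while the 8-byte one needs about 800 MB, which matches the figure above.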

The other problem is that it takes around 40 minutes just to read in all the data. I had it calculate the average rating over all ratings last night and it gave me something reasonable, so I was thinking of writing some code to predict ratings for the test set by just rating everything as the average, and submitting that. This would give me a good baseline - although I think they give you that baseline already, it would be nice to have an official entry submitted, even if it's crap. So maybe I'll get to that tonight...
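
The "rate everything the average" baseline is simple enough to sketch. This is a minimal version, not my actual code: mean_rating and baseline_predictions are hypothetical names, and treating the test set as a list of (movie id, user id) pairs is an assumption about its shape. The mean is computed in one streaming pass, so it needs no extra memory beyond two counters.

```python
from typing import Iterable, List, Tuple


def mean_rating(ratings: Iterable[int]) -> float:
    """Average of all ratings in a single pass, using constant extra space."""
    total = 0
    count = 0
    for rating in ratings:
        total += rating
        count += 1
    if count == 0:
        raise ValueError("no ratings to average")
    return total / count


def baseline_predictions(
    pairs: Iterable[Tuple[int, int]], mean: float
) -> List[Tuple[int, int, float]]:
    """Predict the same global mean for every (movie id, user id) pair."""
    return [(movie_id, user_id, mean) for movie_id, user_id in pairs]


# Tiny made-up example; the real data has 100 million+ ratings.
training_ratings = [1, 2, 3, 4, 5]
test_pairs = [(1, 10), (2, 20)]
mean = mean_rating(training_ratings)
print(baseline_predictions(test_pairs, mean))
```

Because mean_rating takes any iterable, it could be fed straight from the parser instead of from a list held in memory.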
