Tag netflixprize (14)
Music: Smashing Pumpkins - "The Beginning Is The End Is The Beginning"
Posted on 2008-11-24 10:31:00
Tags: netflixprize math links
I'm actually feeling a bit down from finishing my talk - I guess I was looking forward to it more than I thought? It's weird. Maybe I need to work on whereslunch.org some more or something. Anyway, to cheer myself up:
arena rating graph
Posted on 2007-11-09 14:51:00
Tags: netflixprize math projects worldofwarcraft
Went out to lunch today with WoW people and we talked about this neat idea for an Arena rating graph. So you start at a 1500 rating and every match you play you gain or lose 1-29 points, and it would be neat to see the progression of that as the season goes on. Also nice to see what composition of teams we're good against and which we're not, and the ratings of the teams we win or lose to. I'm having to try really hard not to just start writing down ideas here since it's not work-related :-)
My first thought was to do the graph in Flex, but I forgot that Flex Charting is a separate package that costs money. So I did some research and there are lots of options for this sort of thing (found this excellent article on interactive charts and graphs via del.icio.us). AmCharts looks like the best option, with FusionCharts Free close behind. I'll have to download and try them out at home; of course the hard part will be figuring out how to display all that data in a useful way.
Here's an excellent article on how Simon Funk's team got into the top 10 for the Netflix Prize - lots of juicy juicy math. Makes me want to get back into that sort of thing, which would make three active projects at once, which is too much.
Posted on 2006-11-06 08:59:00
Tags: netflixprize projects proandcon
(I've been kinda unhappy and antsy the last few days, so the below is as much for myself as for anyone else...)
So I'm generally happiest when I have a side project that I'm working on outside of work, especially when work stuff is boring (i.e. now). I have been working on the Netflix Prize, but as of this evening (when my last run finishes and produces the same unimpressive results) I'll be done. Things that made it a good project:
- I used C++, which I haven't used in a little while
- The whole contest framework gave me a way to quantify how well the project was going
- Using real data with real movies is always neat
Not so good things:
- Because there was so much data, the nature of the project was to work on something for a few minutes or an hour and then leave it running for days. This meant that at any given point I probably couldn't work on the project, and with my computer tied up, doing other things was painful.
- I like projects that other people can use (vanity?), but there's not much anyone else can get out of this one. I'm thinking of putting up a related-movies finder, but that might violate the TOS of the contest.
- Because other teams are so good, it became pretty darn hard to get on the leaderboard.
- It was fun to play around with algorithms, but apparently you need a good idea or to do a lot of research to find a good one, which I didn't have/do.
Anyway, so like I said I'm putting this project to bed this evening. My question is, what will I work on next? I have some ideas stored up but I'm not thrilled with any of them.
LiveJournal backup - provide some way to backup all LJ posts and comments to those posts.
Minus: There's already a decent way to do this, except for comments.
Plus: This would be potentially useful for myself and others.
Minus: I have a hard time figuring out what output to produce: one giant page with tons of entries and comments? A zip file with pages for each month?
Minus: Because of friends-locked posts, you would only want to view it on your own computer (and not publish it or anything) unless I somehow tied LJ users in with viewers of the page, which is not going to happen.
Plus: Ooh, I could provide some interesting statistics on moods and such. Or even some kind of randomish text from your LJ posts. Maybe.
Some kind of World of Warcraft mod
Plus: I've already done this, so I have the basics down at least.
Minus: I've already done this so it wouldn't be as interesting.
Minus: I don't have an idea of what kind of mod would be useful - there are lots of them out there already.
Minus: Developing for WoW is kind of a pain, since you have to open WoW a lot and develop at the same time, which is kinda slow.
Minus: I'm already spending a lot of free time in WoW...on the one hand this means I might be using whatever mod I make a lot, but I'd also prefer a non-WoW-related project.
Adding annotations to the baseball Win Expectancy Finder graph.
Minus: This project already exists, so it's less interesting to add little features to it.
Minus: It's not baseball season anymore, making this seem less appealing.
Minus: The main problem is coming up with a placement algorithm for annotations so they don't run into each other or the graph, which sounds pretty hard.
Add some kind of Getting Things Done style tickler list to my todo list
Plus: I kinda use the todo list, and working on it might be more incentive for me to use it again.
Minus: ...but I don't really use Getting Things Done stuff anymore. Without that, the todo list is pretty good as it is.
Minus: Again, not a new project.
Posted on 2006-11-01 09:29:00
Last night djedi and I went down to Fell's Point in Baltimore for Halloween festivities. Met up with his boss and we wandered the streets for a while. It was fun! Lots and lots of people there, and lots of good costumes. Off the top of my head, I remember seeing two different Duffmans, two Homers, Mario and Luigi, the Three Musketeers (who I had seen earlier at work!), Rainbow Brite(sp?), Jem, a bag of Jelly Bellys, Christopher Reeve (a guy in a wheelchair with a Superman costume...clever or tasteless?). I even saw Aqua Teen Hunger Force! (I took a picture - we'll see how it turns out) We also saw no less than four people wearing wall clocks as a necklace - like Flava Flav back in the day, right? Anyone know why this was so popular? It was very odd...
Also, we had our first ever trick-or-treaters! Our apartment complex had distributed bags with a few pieces of candy and instructions to "hang the necklace on the door" if you wanted trick-or-treaters. We're still confused by this, as (I can't stress this enough) a plastic bag is not a necklace! But we carved a pumpkin and put the fake candle-like light giving object that came with the carving kit inside (irritatingly it doesn't stay lit up but blinks, and not too frequently either) so apparently this was enough for these parents and kids to knock on our door. (although they mentioned the thing about the necklace...)
I'm still working on the Netflix Prize - I implemented a new scheme that used both movies and users, and ran it, and it did no better than using just movies (.98 or so - stopped it before it was done). So I tweaked some of the parameters and had it look at 8000 similar movies instead of just 300. This means it's going to take like 10 days to run completely. If that doesn't work I'll probably give up for good. Oh well!
I'm feeling a bit achy and such today - hope I'm not getting sick. Might be allergies, but it's hard to say.
Nintendo announced the games available on the Wii Virtual Console - looks like good stuff!
Posted on 2006-10-24 08:59:00
So, my best result on the netflix prize scored an RMSE of .981 on the probe set (this was using the movie correlation). My attempt to use user-based correlation resulted in a hideous 1.03 RMSE, which is still better than just taking the average of each movie and using that rating for each user, but not by much. So there is a little data in there, and this morning when I should have been showering I cooked up a little script to take a weighted average of the two results (obviously weighted towards the movie correlation one since it scored much better). The outputs were not horribly encouraging - I managed to get the RMSE down to .976 or so, but that's it. The last person on the leaderboard has an RMSE of .9597 right now, so I'm way off...
I do have one more way of calculating user correlations that is running now, but assuming that doesn't yield fabulous results I'll probably give up and reclaim my computer within the week. It's a little disappointing for it to end this way, but I did give it a good shot and even made it on the leaderboard for a short amount of time. And I had fun, and kept up my C++ skills a little. So it wasn't a waste!
Also, I've read that the movie correlation data is kinda interesting (which movie did people like around as much as Miss Congeniality), so I'll probably cook up a little script to show that data somehow. That would be fun.
I read a paper that suggests including data from IMDB about the movies (actors, directors, etc.) could improve things, but there are 17770 of them and only one of me, and I'm short on free time as it is.
a little discouraged
Posted on 2006-10-21 13:32:00
Since my last jubilant entry I've fallen off of the leaderboard, and the highest RMSE on the leaderboard is .9668 as of right now, which means getting on has become pretty difficult. My latest attempt at user correlations is still running (it's a lot slower to predict ratings because it has to load a file for every prediction) and it doesn't look like the RMSE is going to be very good (1.035 and it's a little less than halfway done).
I have a few tweaks and one more big idea to try, but they're going to take time, and more time means harder to make it on the leaderboard. Still gonna take a shot at it, but I guess I'm less optimistic than I was before. Oh well.
On a random note, I was just at the gym, and a woman two down from me had the TV on to a cooking show. Granted, it seemed like it was vaguely healthy cooking, but it still seems a little self-defeating :-)
on the leaderboard!!
Posted on 2006-10-19 19:15:00
So I submitted my last entry and it got an RMSE of .9748!! (which is better than on the probe set) So I'm on the leaderboard! (look for teamgreg!)
crunch, crunch, crunch (data)
Posted on 2006-10-18 09:35:00
So the Netflix Prize just changed a rule - now instead of only being able to submit once a week (see my earlier screwup), starting tomorrow you can submit once a day! This is good in general, but given the progress that people are making, it leads me to believe that the contest might be over in January (the earliest it can be over under the rules). Neat that people are doing so well, but I feel a bit outclassed. At least they added more people to the leaderboard so my next submission has a shot of making it on there!
But for now, I'm continuing to crunch data. In my WoW-addled state last night I started computing the correlations between all pairs of users and storing all that data, before realizing that since there are around 500,000 users this would be 250 billion lines of data, which I don't have the hard drive space for and, given how slow file IO seems to be, would take forever. Lo and behold this morning just under 2500 users were done, which would mean it would take around 9.5 weeks to finish. So now I'm only storing the top 100 results for each user - we'll see how big a win that is when I get home. I might have to go to a probabilistic approach if even that is going to take too long (right now it still calculates all pairs of correlations, just doesn't write them all to disk). Had a conversation with djedi this morning to solidify some ideas about how to manage that data...
After work we're planning on going to LL Bean and getting a real jacket, although it's nice weather out today (high of 77!).
Posted on 2006-10-17 08:41:00
So my new netflix entry finished overnight and I submitted it. Unfortunately, I didn't bother to use their format checker, and there was a problem (it predicted a rating greater than 5), so I fixed that in my prediction program and am rerunning it now. This means I can't submit again for another week, which is disappointing. Maybe I'll make some more progress with user-based correlation by then...
On the other hand, I lost 1.6 pounds this week!
On the way in this morning, WAMU (our local NPR station) was doing a fund drive and they had Diane Rehm (who does a local show) talking about stuff. She's old and her voice is kinda slow and irritates the heck out of me. I eventually had to change the station. Kojo Nnamdi (isn't that a great name?) is another guy who does a DC politics show, which is always interesting, but more importantly his voice makes me calm and happy. He sorta sounds like Morgan Freeman...
Also, apparently North Korea is taking the UN sanctions as a "declaration of war" which is both ridiculous and seems like really bad news.
Posted on 2006-10-16 18:40:00
(one downside of this netflix prize work is that my computer is heavily bogged down most of the time, so I don't get to check LJ, etc. as much as I'd like to...)
Breakthrough! So I've been working on some movie-based correlation ways of predicting ratings (if movie A is "like" movie B, and user U likes movie A then she'll probably like movie B). After some tweaking, I got some probe data to have an RMSE of 0.981815, which is just barely off the leaderboard (although the real data will be slightly different, so maybe I'll make it on!) I just finished computing the correlation scores for the real data (took about 24 hours of computer time) and tonight (after WoW) I'll start doing the ratings for them, so I might be able to submit as early as tomorrow.
I have some ideas of what to try next, but it's nice to see that I'm definitely making progress. It would be awesome if I were on the leaderboard again :-)
progress on the netflix front
Posted on 2006-10-11 08:39:00
So, although djedi's parents were here most of the last week, I managed to make some progress on the Netflix Prize. (probably at the cost of being a bit rude - sorry!) Most of the time I spent was building up a test framework so I could easily test rating schemes, so there isn't a whole lot of note. The one big thing was that I stored the data provided to us in a binary format, so it takes up 500 MB on disk instead of 2 GB, which brings the loading time down to 15-20 minutes instead of 40 minutes.
So the way it works is that there is a real set of ratings to be made that counts, and there is a "probe" set of ratings to be made. The "probe" ratings are all already known (they're included in the 500 MB of data), so it makes it easy to exclude that data, run your algorithm, and then see how good your predictions are. Last night was the first time I got to actually run some algorithms and see how they did on the probe data. The scoring method used is RMSE (root mean squared error), so lower is better. For comparison, Cinematch (Netflix's algorithm) scores 0.9514, if you get below 0.9419 (1% improvement) you can win $50,000, and if you get below 0.8563 (10% improvement) you can win $1,000,000. (here's the current leaderboard - note that someone qualifies to win $50,000 already!)
Just taking the average rating over all movies and applying that gives an RMSE of 1.13 or so. Taking the average rating for each movie and using that on a per-movie basis gives 1.05. The other two things I tried: first, taking the average movie rating and average user rating and averaging them - this gives a modest improvement, to 1.015. I then weighted them by the inverse of their standard deviations (since a higher standard deviation means more variability in that data, and thus less reliability), but that only improved it to 1.013.
So now I'm calculating the correlation between every pair of movies using the dot product. (I ran this overnight and it took around 7 hours, but I need to store the data in a different format so I started that before leaving for work). Once I have that data it should be fairly straightforward to apply that to all the other movies a user has rated and come up with a new rating. I think this might push me below 1.0, which would be really nice...and I might even submit that even though right now it needs to be below 0.9884 to make it on the leaderboard.
I also found a good paper that I'm reading through and I might try next, although it looks computationally really expensive.
Anyway, that's most of what's been on my mind lately. I like getting to play with raw data!
Posted on 2006-10-05 09:35:00
So I submitted my first entry last night - it did poorly, of course, but at least I'm on the Leaderboard! (I'm "teamgreg" because I really didn't want to think of a name) Crunched some more numbers last night in support of my next submission, which I have to wait a week for. It's a little annoying that just reading in all 100 million entries takes, at a minimum, 40 minutes (not to mention any processing, but that's all been pretty quick so far), although it works decently enough that I can start a run and then get back to something else. As long as that "something else" isn't WoW, because it really sloooows my computer down. :-)
I'm impressed one team already has an RMSE (root mean squared error) of .9571, which is within spitting distance of .9474, which is how Netflix's algorithm does on the data.
does the program exhibit the buddha way?
Posted on 2006-10-04 07:37:00
Tags: netflixprize programming
I was trying to track down a particularly odd bug at work yesterday and I suddenly realized I was effectively playing a game of Zendo with the program. I had my theory about what sequence of events made the bug show up in the program (i.e. what koan of events had the buddha way), and I pretty early on got a repeatable koan that always exhibited the buddha way. After some time trying to look at the code to figure out what was going on, I had my theory about what made a sequence of events exhibit the buddha way and I started trying to make it happen some other way. Just before I left I discovered that whether a sequence of events exhibits the buddha way seems to be determined by something entirely unexpected. That's the thing about debugging versus Zendo - the rules are a lot looser! Anyway, I miss playing Zendo, but I guess I can have my own little game at work :-)
I spent some time working on Netflix Prize stuff last night - I tried to parse all the data but it ran out of memory, so I closed Firefox, tightened up my class that was representing each rating (12 bytes to 8 bytes makes a big difference...), and got it to successfully parse all 100 million+ entries. The only slight concern is that that doesn't leave a whole lot of memory to do any sort of analysis (I only have 1G, 800M or so of which will be used up by the ratings), but I think I can do just O(n) stuff in space and get by. Or, I could buy more memory (woot!), but the last time I tried to put more than 1G of RAM in my computer, bad things happened. Maybe I'm just being superstitious.
The other problem is that it takes around 40 minutes just to read in all the data. I had it calculate the average rating over all ratings last night and it gave me something reasonable, so I was thinking of writing some code to do the ratings on the test set and just rate everything the average and submit that. This would give me a good baseline - although they give you that baseline already, I think, it would be nice to have an official entry submitted, even if it's crap. So maybe I'll get to that tonight...
netflix prize away!
Posted on 2006-10-03 09:09:00
Tags: netflixprize programming
Sunday I was feeling a little down, mostly because I didn't have a project to be working on and none of the ones I had on my long-term radar sounded very interesting (or possible). The very next day Netflix announced the Netflix prize, which is basically a competition to improve their suggestions engine (i.e. if you liked movie X and Y you'll probably like movie Z but not movie W). And they're releasing 700M worth of data to train on. I think I'll work on that next!
Since the amount of data is so massive, I decided to work in C++ instead of Ruby or Python - I'm a little worried about keeping all the data structures in memory, but we'll see how that goes. I got some basic parsing of the data done and hope to submit a very basic entry soon! (you can submit an entry once a week and it will score it to let you know how you're doing)
This backup was done by LJBackup.