benchmark analyzer!
Mood: proud
Posted on 2009-02-24 10:20:00
Tags: projects work
Words: 164

So I wrote this benchmark analyzer for benchmarks that have many parameters; it tries to figure out which parameters are the important ones. To that end, it creates a decision tree in R and generates lots of boxplots to visualize the different parameters. Here's a sample report (with boring data). It also (somewhat cleverly) saves the original .csv file with the web page it generates, so you can save the page and send it to a colleague, who can play with the options and rerun the analysis.

It occurred to me as I was prettying this up that most benchmarks don't involve lots of parameters, so the odds that anyone else will find this useful are pretty low. I'm OK with that because:
- it's useful to me right this very moment for work stuff
- I had fun writing it and learned some more about R
- pretty graphs!
- it was good to work on something other than whereslunch.org for a while


6 comments

Comment from spchampion:
2009-02-24T22:37:28+00:00

I feel like I should understand what this is doing, but I don't. Can you give an example about how you would use this?

Comment from gregstoll:
2009-02-25T10:28:44+00:00

Sure! Let's say you work on software that has a function F. F takes lots of different parameters that are more or less orthogonal, and you have benchmarks of how well F performs with lots of combinations of these parameters.

Now you want to make some optimizations to F and see how they affect the benchmarks, but you have 500 different combinations of parameters: for some combinations it got a lot better, for some it got a little better, and for some it got a little worse. The analyzer splits up the benchmarks by parameter, so you can see that, say, it got a lot faster for parameter X but a bit slower for parameter Y. That tells you to look at what it's doing for parameter Y and try to fix it.
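
To make the "split up the benchmarks by parameter" idea concrete, here is a minimal Python sketch. It is not the analyzer itself (that uses R, a decision tree, and boxplots); it only groups results by each parameter value and averages the speedup. The column names (size, threads, mode, before_ms, after_ms) and all the numbers are made up for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical benchmark results: one row per parameter combination,
# with timings before and after the optimization to F.
SAMPLE_CSV = """size,threads,mode,before_ms,after_ms
small,1,read,10.0,5.0
small,4,read,8.0,4.0
large,1,read,100.0,50.0
large,4,read,80.0,40.0
small,1,write,10.0,11.0
small,4,write,8.0,8.8
large,1,write,100.0,110.0
large,4,write,80.0,88.0
"""


def speedup_by_parameter(csv_text, before="before_ms", after="after_ms"):
    """Return {parameter: {value: mean speedup}}, with speedup = before / after."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Every column that is not a timing is treated as a parameter.
    params = [name for name in rows[0] if name not in (before, after)]
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        speedup = float(row[before]) / float(row[after])
        for param in params:
            groups[param][row[param]].append(speedup)
    return {
        param: {value: mean(speedups) for value, speedups in values.items()}
        for param, values in groups.items()
    }


summary = speedup_by_parameter(SAMPLE_CSV)
for param, values in summary.items():
    for value, speedup in values.items():
        print(f"{param}={value}: mean speedup {speedup:.2f}x")
```

In this made-up data, size and threads show the same average speedup for every value, while mode splits cleanly: reads got about 2x faster and writes got a bit slower. That is the kind of signal that says "go look at the write path".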

Comment from spchampion:
2009-02-25T12:02:53+00:00

Ah, interesting. BTW, I'm getting a 404 when I click Reanalyze in your sample:

Not Found

The requested URL /~gregstoll/benchmarkanalysis/doanalysis.cgi was not found on this server.

Comment from gregstoll:
2009-02-25T13:06:00+00:00

Dang it, I knew I should have regenerated that instead of manually editing the HTML. Fixed. (and thanks!)

Comment from spchampion:
2009-02-26T10:43:43+00:00

Got another one: if I change the p-value in the options to 0.95, the decision tree still shows p < 0.001. Or am I misunderstanding?

Comment from gregstoll:
2009-02-26T11:00:54+00:00

The p-value option sets the certainty needed to do a split, and the tree shows the p-value each split actually got. Showing p < 0.001 is correct - if you bump the option up to .9999 or something then the split should disappear.
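
A tiny sketch of that rule, under an assumption: the option is treated as a minimum value of 1 - p that a split has to reach, which is how the mincriterion setting of ctree in R's party package behaves. The thread doesn't say which tree library the analyzer uses, so the function name and the exact comparison here are hypothetical.

```python
def split_allowed(p_value, min_certainty):
    """Keep a split only if its certainty (1 - p) reaches the threshold.

    Assumed rule for illustration; not taken from the analyzer's source.
    """
    return (1.0 - p_value) >= min_certainty


# Hypothetical p-value for a node that the tree reports as "p < 0.001".
p = 0.0005

print(split_allowed(p, 0.95))    # True: certainty 0.9995 clears 0.95
print(split_allowed(p, 0.9999))  # False: 0.9995 is below 0.9999
```

So setting the option to 0.95 leaves a p < 0.001 split in place, and only a threshold stricter than the split's own certainty removes it.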
