mostly computer stuff

Mood: hopeful

Posted on 2006-02-26 14:06:00

Words: 405

So destroyerj was invited to the Where 2.0 conference to talk for 5 minutes about scipionus.com. He's trying to score me an invite as well, which would very much rock :-)

Another WoW player found out about my guild information table mod, and they now have an information page up. Good stuff!

I got a random email asking if I wanted to interview for a 2-month job in California. I don't, of course, but it was kinda neat to get that sort of email.

Don Knotts died! That's sad - I really enjoyed The Ghost and Mr. Chicken.

So I'm working on an interesting computer/math problem. If you're interested, read on:

I have lots of markers on a map of the US. I'd like to have an interface where you can click on a city name and have the map move to that city. (I started out doing this for major cities, but the process involves finding the city on Google Maps, getting the coordinates and entering them into a file, which can take quite a while, and I'm sure not doing that for all cities.)

Now I'd like some way to automate this. Each marker knows what city it's in, so the simple idea is, for each city, to find the markers that are in the city, take the mean of their latitudes and longitudes, and center the map there. This seems like it will work, but I'm a bit concerned about markers that have their city wrong (for example, if the city we're doing is Austin and there's a marker in Pennsylvania mistakenly saying it's in Austin, the map will be centered way off from Austin). So what I really want to do is reject outlying points. djedi had the idea of taking the mean, calculating the standard deviation and throwing out anything more than 1 standard deviation away. (Presumably I'd do this separately in the latitude and longitude dimensions.) My statistics book talks about finding the interquartile range (the difference between the third and first quartiles), and rejecting anything more than 2 or so interquartile ranges away from the median.

Anyway, I'm gonna cook up a script to try both these approaches, and hopefully at least one will turn out well. I'm going to have it spit out which markers it's rejecting, too, to hopefully get some idea of the rate of false positives here...
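Here's a rough sketch of what the two approaches might look like in Python. It's only an illustration: the function names, the cutoffs `k`, and the sample coordinates (a handful of made-up Austin markers plus one mislabeled Pennsylvania marker) are all assumptions for the example, not the real script or real data. It also assumes every city has at least one marker.

```python
from collections import defaultdict
from statistics import mean, median, pstdev, quantiles


def _split(points, bounds):
    """Split points into (kept, rejected) given per-axis (low, high) bounds."""
    kept, rejected = [], []
    for p in points:
        inside = all(lo <= p[axis] <= hi for axis, (lo, hi) in enumerate(bounds))
        (kept if inside else rejected).append(p)
    return kept, rejected


def split_by_stddev(points, k=1.0):
    """Reject points more than k standard deviations from the mean,
    checked separately for latitude (axis 0) and longitude (axis 1)."""
    bounds = []
    for axis in (0, 1):
        values = [p[axis] for p in points]
        center, spread = mean(values), pstdev(values)
        bounds.append((center - k * spread, center + k * spread))
    return _split(points, bounds)


def split_by_iqr(points, k=2.0):
    """Reject points more than k interquartile ranges from the median,
    checked separately for latitude and longitude."""
    bounds = []
    for axis in (0, 1):
        values = [p[axis] for p in points]
        if len(values) < 2:
            # quantiles() needs at least two values; nothing to reject here.
            bounds.append((float("-inf"), float("inf")))
            continue
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        mid, iqr = median(values), q3 - q1
        bounds.append((mid - k * iqr, mid + k * iqr))
    return _split(points, bounds)


def center_of(points):
    """Mean latitude and mean longitude of the given points."""
    return mean(p[0] for p in points), mean(p[1] for p in points)


# Hypothetical sample data: (city, latitude, longitude).
markers = [
    ("Austin", 30.27, -97.74),
    ("Austin", 30.30, -97.70),
    ("Austin", 30.25, -97.75),
    ("Austin", 30.28, -97.73),
    ("Austin", 30.26, -97.76),
    ("Austin", 30.29, -97.72),
    ("Austin", 40.27, -76.88),  # mislabeled: really in Pennsylvania
]

# Group marker coordinates by the city each marker claims to be in.
by_city = defaultdict(list)
for city, lat, lon in markers:
    by_city[city].append((lat, lon))

# Try both approaches and report what each one rejects.
for city, points in by_city.items():
    for name, method in (("stddev", split_by_stddev), ("iqr", split_by_iqr)):
        kept, rejected = method(points)
        center = center_of(kept) if kept else None
        print(city, name, "center:", center, "rejected:", rejected)
```

On this made-up data both methods throw out only the Pennsylvania marker. One thing to watch with the 1-standard-deviation cutoff: if a city's markers are spread out roughly normally, it would be expected to reject around a third of perfectly good points in each dimension, so the printed reject list is worth eyeballing.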

I'll post with results once I figure them out :-)

2 comments

Comment from quijax:

2006-02-26T16:30:24+00:00

Your math problem does sound interesting. Let us know how it works out. If you log your outliers it could be a way for you to suss out mislabels.

Comment from wildrice13:

2006-02-27T12:21:29+00:00

That's very cool! I'll be interested to see what you come up with as well.
