new project: congress ratings
Posted on 2007-03-19 18:48:00
Tags: haskell projects programming congressvotes
So my next project is fun and neat. I can't remember exactly how I got here, but I noticed that all roll-call votes in Congress are accessible in a handy XML format (Here's an example vote - note that the page you see is that XML file after being processed with an XSLT stylesheet, so you'll have to hit View Source to see the raw data). My idea (a little vague at this point) is to take all of the votes, figure out which issues I care about and which way I would have voted, and then "rate" representatives by how closely their votes align with mine. This is a little grand in scope.
Anyway, it took me a while to get HaXml (a Haskell XML parser) installed, because I kept failing to compile it from source for various stupid reasons. I finally figured out that it was in fact in Debian in the libghc6-haxml-dev package, which made my life about 5 times easier.
So I saved a sample vote and have it parsing and I'm extracting simple data from it, which is exciting! I have a few questions, though: does anyone know the answer to these?
- I have these two functions:
nothing :: Maybe a -> Bool
nothing Nothing = True
nothing (Just _) = False
findElementContent :: String -> Content -> Maybe Element
findElementContent target (CElem el) = findElement target el
findElementContent target _ = Nothing
findElementContents :: String -> [Content] -> Maybe Element
findElementContents _ [] = Nothing
findElementContents target (c:cs) = if (nothing (findElementContent target c))
    then findElementContents target cs
    else findElementContent target c
findElementContent takes in a target tag and some data (Content), and returns the element that has that tag name if it exists, and Nothing otherwise. (findElementContents is just a helper function to do the same thing with a list of Content.) But findElementContents looks pretty ugly to me - what I want it to do is return findElementContent target c if that isn't Nothing, and otherwise recurse on the rest of the list. The code is correct, but is it inefficient since I'm calling findElementContent target c twice? My limited understanding says no, because findElementContent is referentially transparent since it doesn't use monads (i.e. if you call it again with the same inputs it will return the same thing, always), but I'm not entirely clear on this.
- As I mentioned, findElementContents seems a little inelegant - is there a better way to do this? Is there some builtin nothing that I couldn't find?
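For what it's worth, here is a rough sketch of two tidier shapes for the helper. The standard library's Data.Maybe module exports isNothing, which does the same job as the hand-written nothing. The Element and Content types below are simplified stand-ins rather than HaXml's real definitions, and findElement is a hypothetical depth-first version, since the post doesn't show the real one. Binding the result with case means the lookup appears only once per item, which sidesteps the question of whether the repeated call gets evaluated twice.

```haskell
import Data.Maybe (isNothing, listToMaybe, mapMaybe)

-- Simplified stand-ins for HaXml's Element and Content types (assumption:
-- the real types have more constructors and fields than shown here).
data Element = Elem String [Content] deriving (Eq, Show)
data Content = CElem Element | CString String deriving (Eq, Show)

-- The hand-written `nothing` is the same function as Data.Maybe.isNothing.
nothing :: Maybe a -> Bool
nothing = isNothing

-- Hypothetical findElement: depth-first search for the first element
-- whose tag name matches the target.
findElement :: String -> Element -> Maybe Element
findElement target el@(Elem name children)
  | name == target = Just el
  | otherwise      = findElementContents target children

findElementContent :: String -> Content -> Maybe Element
findElementContent target (CElem el) = findElement target el
findElementContent _      _          = Nothing

-- Variant 1: bind the result once with case, so the lookup is written
-- (and evaluated) a single time per list item.
findElementContents :: String -> [Content] -> Maybe Element
findElementContents _      []     = Nothing
findElementContents target (c:cs) =
  case findElementContent target c of
    Nothing -> findElementContents target cs
    found   -> found

-- Variant 2: the same search as a pipeline. Laziness means mapMaybe only
-- runs until listToMaybe has its first result.
findElementContents' :: String -> [Content] -> Maybe Element
findElementContents' target = listToMaybe . mapMaybe (findElementContent target)
```

Both variants should return the first match in document order; which one reads better is mostly a matter of taste.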
Resources I've been using:
- HaXml reference
- standard library reference, including the Prelude
Comment from wonderjess:
hehe you're turning into a political scientist!
Comment from wonderjess:
oh! I just thought of something I can add. you know, with my 1/5th of a phd knowledge. hahaha. so there's this guy here, curt signorino, who's crazy methods man. and he came up with this new way of measuring affinity called an s-score (he swears s isn't for signorino; no one believes him). anyhow (I learned this second hand, not from his work, so it's a little vague), you create an n-vector where n is the number of votes you're interested in, and each element in the vector is a 1 or 0 (for yes or no for the vote). then you make your n-vector. and that's a point in an n-dimensional space, and to compute affinity, you calculate the distance between your point and someone else's point. anyhow, thought you might like that.
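The vector-distance idea in the comment above can be sketched in a few lines. This is only an illustration of "distance between two points": the comment doesn't say which distance is meant, so Euclidean distance is an assumption here, and VoteVector and distance are made-up names. It is not a claim about how the s-score itself is defined.

```haskell
-- One entry per roll-call vote of interest: True for yes, False for no.
type VoteVector = [Bool]

-- Euclidean distance between two vote vectors (assumed to be the same
-- length). With 0/1 coordinates, each disagreement contributes exactly 1
-- to the sum of squares, so the distance is the square root of the number
-- of votes on which the two vectors differ.
distance :: VoteVector -> VoteVector -> Double
distance xs ys = sqrt (fromIntegral disagreements)
  where
    disagreements = length (filter id (zipWith (/=) xs ys))
```

A smaller distance means closer agreement, and identical voting records are at distance 0.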