I wanted to test some ideas that have been brewing.
As an example, this project is a useful model worth repeating. There is more to learn here, but the result is interesting.
On Sunday, February 24, 2013, the White House released documents detailing the projected costs to states of the upcoming sequester. Several news outlets carried the story - see this Washington Post article. However, only the Huffington Post had links to each individual state document.
Surprised that these documents contained real data but were published only as PDFs, a colleague, @LearonDalby, and I discussed how the data could be presented much better. Late Sunday night we decided to enter the data into a machine-readable format and start a small project to visualize it.
We began by looking through the PDFs for a repeatable pattern in the data, then decided to enter the data into a Google Spreadsheet. My colleague read the numbers out as I typed them in. We listed each state in the first column and added a column for each of the areas listed in the PDFs (e.g., Teachers and Schools, Clean Air and Water, Public Health).
When this was complete, I exported the data to this CSV and loaded it into a PostgreSQL database. My intent was to make a GeoJSON file of the data, so I joined it, on a common state name field, to a table containing state centroids as point geometry. I then used this SQL script to export the result to GeoJSON. Finally, I created this Git repo, added a gh-pages branch, and began committing these files.
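For readers without a PostgreSQL setup, the join-and-export step can be roughly sketched in plain Python. This is a stand-in, not the SQL script I actually used: the function name `rows_to_geojson`, the column names, the state names, and every number below are hypothetical placeholders, not figures from the White House documents.

```python
import csv
import io
import json


def rows_to_geojson(data_rows, centroids):
    """Join per-state rows to centroids by state name and build a
    GeoJSON FeatureCollection of point features.

    data_rows: iterable of dicts with a "state" key plus one key per category.
    centroids: dict mapping state name -> (longitude, latitude).
    """
    features = []
    for row in data_rows:
        name = row["state"].strip()
        if name not in centroids:
            # Behaves like an inner join: rows without a centroid are skipped.
            continue
        lon, lat = centroids[name]
        properties = {k: v for k, v in row.items() if k != "state"}
        properties["state"] = name
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder input: made-up names and values, for illustration only.
SAMPLE_CSV = """state,teachers_and_schools,public_health
Exampleland,100,20
Samplevania,250,35
Nowhere,1,1
"""

# Placeholder centroids: made-up coordinates, for illustration only.
SAMPLE_CENTROIDS = {
    "Exampleland": (-100.0, 40.0),
    "Samplevania": (-90.0, 35.0),
}

sample_rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
collection = rows_to_geojson(sample_rows, SAMPLE_CENTROIDS)
print(json.dumps(collection, indent=2))
```

Note that `csv.DictReader` leaves every value as a string, so numeric columns would need converting before they can drive something like a scaled map symbol. Doing the join in the database has the advantage that the geometry and the attributes stay in one place for later queries.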
I also engaged another colleague, @qinxiaoming, asking him what he could come up with. Completely on his own time he developed the above example in D3. In parallel I developed the above example in Mapbox.
I think there is a repeatable process here that people should know about.