State Sequester

By The Numbers

Goal

I wanted to test two ideas that had been brewing:

  1. data as part of the web ecology and
  2. launching interactive maps and visualizations with no software, hardware, or infrastructure costs.

This project is an example of a model worth repeating. There is more to learn here, but it is an interesting start.

Background

On Sunday, February 24, 2013, the White House released documents detailing the projected costs to states of the upcoming sequester. Several news outlets carried the story (see this Washington Post article). However, only the Huffington Post had links to each individual state document.

Surprised that these documents contained real data but were presented as PDFs, a colleague, @LearonDalby, and I discussed how the data could be presented much better. Late Sunday night, we decided to enter the data into a machine-readable format and begin a small project to visualize it.

How We Did It

We began by looking at each of the PDFs for a repeatable pattern in the data, then decided to enter the data into a Google Spreadsheet. My colleague read out the numbers as I typed them in. We listed each state in the first column and added a column for each of the areas listed in the PDFs (e.g. Teachers and Schools, Clean Air and Water, Public Health, etc.).
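To make the spreadsheet layout concrete, here is a minimal Python sketch of reading a table shaped this way once it is exported as CSV. The column names, state names, and numbers are placeholders invented for illustration, not the figures from the White House documents, and the function name `read_state_rows` is hypothetical.

```python
import csv
import io

# Hypothetical sample of the wide layout: one row per state, one column per
# funding area. Names and numbers are placeholders, not the real figures.
SAMPLE_CSV = """state,teachers_and_schools,clean_air_and_water,public_health
State A,1000,200,30
State B,4000,500,60
"""


def read_state_rows(text):
    """Parse the wide CSV into a dict keyed by state name, with numeric values."""
    rows = {}
    for record in csv.DictReader(io.StringIO(text)):
        # The first column identifies the state; every other column is an area.
        state = record.pop("state").strip()
        rows[state] = {area: float(value) for area, value in record.items()}
    return rows


rows = read_state_rows(SAMPLE_CSV)
print(rows["State A"]["public_health"])
```

Keeping one row per state and one column per area means the state name can later serve as the key for joining to other tables.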

When this was complete, I exported the data to this CSV and loaded it into a PostgreSQL database. My intent was to make a GeoJSON file of the data, so I joined it to a table containing state centroids as point geometry, using a state name field common to both tables. I then used this SQL script to export the data to GeoJSON. Finally, I created this Git repo, added a gh-pages branch, and began committing these files.
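The join and export were done in PostgreSQL with a SQL script. As a rough illustration of the same idea, the Python sketch below joins per-state values to centroid points by state name and builds a GeoJSON FeatureCollection. It is not the script used in this project; the function name, state names, coordinates, and values are all hypothetical placeholders.

```python
import json


def to_feature_collection(values_by_state, centroids_by_state):
    """Join values to centroids on state name and return a GeoJSON dict."""
    features = []
    for state, values in values_by_state.items():
        centroid = centroids_by_state.get(state)
        if centroid is None:
            continue  # behaves like an inner join: skip states with no centroid
        lon, lat = centroid
        features.append({
            "type": "Feature",
            # GeoJSON orders point coordinates as [longitude, latitude].
            "geometry": {"type": "Point", "coordinates": [lon, lat]},
            "properties": {"state": state, **values},
        })
    return {"type": "FeatureCollection", "features": features}


# Placeholder inputs for illustration only.
example_values = {
    "State A": {"public_health": 30.0},
    "State C": {"public_health": 90.0},  # no centroid, so it is dropped
}
example_centroids = {"State A": (-100.0, 40.0)}

collection = to_feature_collection(example_values, example_centroids)
geojson_text = json.dumps(collection, indent=2)
```

Writing `geojson_text` to a file gives a static asset that can be committed to the repo and served from the gh-pages branch alongside the visualizations.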

I also engaged another colleague, @qinxiaoming, asking him what he could come up with. Completely on his own time, he developed the above example in D3. In parallel, I developed the above example in Mapbox.

Results

Data

Visualizations

Costs

Software

  • Storing the original data (Google Docs) - $0
  • Storing, formatting and joining geodata (PostGIS) - $0
  • Exporting to GeoJSON (Postgres) - $0
  • Visualization with JavaScript (D3) - $0
  • Visualization with map cartography (TileMill) - $0
  • Versioning, governance, issue tracking, history, code sharing, collaboration and general awesomeness (GitHub) - $0

Hardware

  • Hosting map tiles (MapBox) - $0
  • Web hosting (GitHub) - $0

Total

  • $0 spent thanks to open source software and free web hosting

Conclusion

I think there is a repeatable process here that people should know about.