Friday, September 26, 2014

Open Data allows us to tell our own stories

I'm the current lead of Open Data Windsor-Essex and so I can say with some authority that most people have no idea what Open Data is and why they should care enough to ask for it.

So I'd like to give an illustrated example to demonstrate the power of Open Data in trying to make sense of the world.

This story begins with someone I know who asked me if I had a good population map for Essex County as she wanted to do some research in relation to the proposed "mega-hospital" for Windsor-Essex. After some digging around, I found the thematic map for Windsor that shows the population change from the 2006 Census to the one in 2011 [pdf].

http://www12.statcan.gc.ca/census-recensement/2011/geo/map-carte/pdf/thematic/2011-98310-001-559-013-01-00-eng.pdf

What struck me about this map was that it highlights growth in the Windsor CMA but doesn't express much about the degree of population loss. Every census tract that experienced loss is treated equally by the map. 55 of the 73 tracts experienced a loss, but the map makers chose to highlight the differences in growth instead.

Curious about this, I thought I would use Open Data provided by the federal government to make my own map.  I'm planning to write up the technical details of how I did this on my mapping blog in the near future, but for this post, I'll just give you the rough process that was involved in the map making.

First, I used the Canada Open Data portal to find this reference map of Census Tract Boundary files for 2011.  The files are in SHP format, otherwise known as ESRI Shapefiles, and the format needs GIS software for opening, reading, and editing.  I used QGIS to open the file and select only the parts of the map relating to the Windsor CMA.

I then opened a *huge* spreadsheet of census information by census tract also provided by the Open Data catalogue.  This was a particularly messy spreadsheet because it summarizes a number of data sets within the same column and it's huge because it covers all of the census tracts of Canada.  I ended up using Sublime Text with the Filter Lines plug-in to remove what was unnecessary until I was left with a table of data of Windsor Essex census tracts and population numbers for 2006 and 2011.
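For anyone who would rather script this filtering step than do it in a text editor, here is a minimal sketch in Python. It is not what I actually did (I used Sublime Text), and it rests on assumptions: the column names, the sample numbers, and the idea that Windsor tract IDs start with the CMA code "559" are all illustrative, so check them against the real file.

```python
import csv
import io

def filter_tracts(csv_text, id_column, cma_code):
    """Keep only the rows whose tract ID starts with the given CMA code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[id_column].startswith(cma_code)]

# Made-up sample rows; the real spreadsheet has far more rows and columns,
# and its column names will differ from these hypothetical ones.
sample = """Geo_Code,Population_2006,Population_2011
5590001.00,4000,3800
5350001.00,5000,5200
5590002.00,3000,3100
"""

# "559" is assumed here to be the prefix shared by Windsor CMA tract IDs.
windsor = filter_tracts(sample, "Geo_Code", "559")
for row in windsor:
    print(row["Geo_Code"], row["Population_2006"], row["Population_2011"])
```

The same idea scales to the full file by reading from disk instead of a string.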

After a bit of trial and error, I figured out how to merge this table of data with my map of Windsor-Essex. Then I followed these instructions kindly provided by Peter Rukavina to turn these SHP files into GeoJSON. GeoJSON is a format that allows for all sorts of manipulations on the web, freed from the constraints of geographic information systems. For example, I was now able to take my new dataset and make it available as a Gist on GitHub for others to share or improve.
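To give a sense of what that merge amounts to, here is a rough sketch in Python that works directly on GeoJSON rather than in QGIS. It is an illustration only: the property names "CTUID", "pop2006" and "pop2011", the tract ID, the coordinates and the population counts are all assumptions made up for the example.

```python
import json

def join_population(geojson, population_by_tract, id_property="CTUID"):
    """Copy population counts into the properties of each matching feature."""
    for feature in geojson["features"]:
        tract_id = feature["properties"][id_property]
        counts = population_by_tract.get(tract_id)
        if counts is not None:  # tracts missing from the table are left alone
            feature["properties"]["pop2006"] = counts["pop2006"]
            feature["properties"]["pop2011"] = counts["pop2011"]
    return geojson

# Tiny made-up example: one tract polygon and one matching table row.
tracts = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"CTUID": "5590001.00"},
        "geometry": {
            "type": "Polygon",
            "coordinates": [[[-83.1, 42.3], [-83.0, 42.3],
                             [-83.0, 42.4], [-83.1, 42.3]]],
        },
    }],
}
table = {"5590001.00": {"pop2006": 4000, "pop2011": 3800}}

merged = join_population(tracts, table)
print(json.dumps(merged["features"][0]["properties"]))
```

Each feature keeps its shape and simply gains two extra properties, which is all a web map needs in order to colour it.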

https://gist.github.com/copystar/11e6a56dad7931a9014c


While all the relevant data is there, the above map is clearly not particularly useful, nor does it express much on its own. So I followed this tutorial on how to create an interactive choropleth map using the Leaflet JavaScript library.
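The heart of any choropleth is a small rule that turns a number into a colour. Here is a minimal sketch of that rule in Python (the map itself uses JavaScript). The class breaks and colours below are hypothetical, not the ones in my map; the point is only that losses get several shades of their own instead of being lumped into one.

```python
from bisect import bisect_right

# Hypothetical class breaks (percent change) and colours, running from
# dark red for the largest losses to dark green for the largest gains.
BREAKS = [-10, -5, 0, 5, 10]
COLOURS = ["#b2182b", "#ef8a62", "#fddbc7", "#d9f0d3", "#7fbf7b", "#1b7837"]

def colour_for_change(pct_change):
    """Return the colour of the class that a percent change falls into."""
    # bisect_right counts how many breaks are <= pct_change, which is the
    # index of the matching class (a change of exactly 0 lands in a gain class).
    return COLOURS[bisect_right(BREAKS, pct_change)]

for change in (-12, -3, 12):
    print(change, colour_for_change(change))
```

With breaks on both sides of zero, a tract that lost 12% no longer looks the same as one that lost 3%.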

After some more trial-and-error futzing around (and more help from others kind enough to share their tips), I was finally able to finish a map I could be proud of. Click on the image below or this link to see the map in its full interactive choroplethic glory:


http://theplaceisnow.aedileworks.com/mappings/11-%20PopDrop/WindsorEssexPopChange2006-2011.html

Now we can see a distinctly different snapshot of the population changes of Windsor-Essex: namely, that there has been significant population loss in the West end of the City of Windsor.

Please note that the percentages of population change in my map are similar but not identical to those in the map produced by Statistics Canada. I'll investigate further to determine how much difference there is between the dataset that I used and the dataset used to build the Statistics Canada thematic map.
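For reference, the percentage for each tract is just the change in population divided by the 2006 population. A minimal sketch in Python, with made-up counts and a hypothetical function name:

```python
def percent_change(pop_2006, pop_2011):
    """Percent change from the 2006 count to the 2011 count."""
    if pop_2006 == 0:
        return None  # no meaningful percentage for a tract that started empty
    return (pop_2011 - pop_2006) / pop_2006 * 100

# Made-up counts: a tract that went from 4000 to 3800 people.
print(round(percent_change(4000, 3800), 1))
```

Any difference in the counts going in, for either census year, will show up as a difference in the percentage coming out.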

I like to think that the measure of a good map is that it tells a story but also invites further questions. If you have such questions, please let me know in the comments.