Cartogram The Art of Software
ScapeToad Cartogram Tutorial (formerly Cartogram CrashCourse)
Wednesday, 08 Feb 2012
Posted by craig in Code Snippets, Data Analysis, Notes
Tags: cartogram, gis, scape toad

This post provides a tutorial on how to create a cartogram using ScapeToad v1.1.
In addition, it describes how to work with a few common GIS file formats. Upon
completion you will have created a cartogram that shows the per-state population
of the United States, and you will have learned a bit about the DBase and shape
file formats. Along the way some simple Python programming will be required. All
of the data files used for this tutorial, as well as the Python script, can be
found on GitHub here.
However, before we start it might be useful to get an idea of how cartograms help
to visualize geographic information. Mark Newman's pages are particularly good
for understanding the importance of this data visualization method. Have a look
at the 2008 U.S. Presidential Election Results, and also at World Mapper.
To begin the tutorial we will need a shape file that describes the state-by-state
geometry of the United States. This can be downloaded from the Census Bureau's
website. Click the above link, then select States (and equivalent), click
submit, and then from the 2010 box, select the all-in-one national file option.
Clicking the download button will give you a zip file with the relevant
information in it. Explore the other options to see what additional shape
files are available.
Now that you have the zip file downloaded, unpack it. Assuming the zip file was
named tl_2010_us_state10.zip you should have a single directory with five files
in it. Each of the five files has the same base name as the directory itself, but
each has its own file extension. For our purposes here we care about the shape
file and the DBase file, which have extensions shp and dbf respectively.
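If you would rather unpack the archive from Python than by hand, the step above can be sketched roughly as follows. This is a minimal sketch using only the standard library; the helper name unpack_shapefile_zip is made up for illustration.

```python
import os
import zipfile


def unpack_shapefile_zip(zip_path, dest_dir):
    """Extract zip_path into dest_dir and map each file extension to its path.

    Hypothetical helper for illustration; it is not part of the original script.
    """
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        names = archive.namelist()

    by_extension = {}
    for name in names:
        ext = os.path.splitext(name)[1].lower()
        if ext:
            by_extension[ext] = os.path.join(dest_dir, name)
    return by_extension
```

Calling unpack_shapefile_zip("tl_2010_us_state10.zip", "tl_2010_us_state10") and looking up ".shp" and ".dbf" in the result should give the two files this tutorial cares about.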
The shape file itself contains geometric information, and can be thought of as a
list of geometric entities, where each item corresponds to a particular state's
geometry. Wikipedia has a write-up worth reading. The detailed technical
specification for the file format is here. Arc Explorer and Shape Viewer are two
free (as in beer) programs for viewing shape files.
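To make the file format a little more concrete, here is a minimal sketch that decodes the fixed 100-byte header at the start of a .shp file, following the layout in the published shape file specification. The function name parse_shp_header and the dictionary keys are my own choices for illustration, and only the most common 2D shape types are named.

```python
import struct

# Shape type codes from the shape file specification (2D types only).
SHAPE_TYPES = {0: "Null", 1: "Point", 3: "PolyLine", 5: "Polygon", 8: "MultiPoint"}


def parse_shp_header(header):
    """Decode the 100-byte main header of a .shp file given as bytes."""
    if len(header) < 100:
        raise ValueError("a shape file header is 100 bytes long")

    # The file code and file length are stored big-endian.
    (file_code,) = struct.unpack(">i", header[0:4])
    if file_code != 9994:
        raise ValueError("not a shape file: unexpected file code")
    (length_in_words,) = struct.unpack(">i", header[24:28])

    # Everything from the version onward is little-endian.
    version, shape_type = struct.unpack("<ii", header[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])

    return {
        "file_bytes": length_in_words * 2,  # length is counted in 16-bit words
        "version": version,
        "shape_type": SHAPE_TYPES.get(shape_type, shape_type),
        "bbox": (xmin, ymin, xmax, ymax),
    }
```

Passing the first 100 bytes of tl_2010_us_state10.shp (opened in binary mode) should report the shape type and bounding box of the state geometry.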
The DBase file is a table of properties where, by convention, each row in the table
contains the attributes of the item in the shape file with the same index. For
example, the 10th shape in the shape file is presumed to have attributes given
by the 10th row in the DBase file.
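The index convention can be illustrated with a few lines of Python. This is only a sketch: shapes and rows stand in for whatever your shape file and DBase readers return, and pair_by_index is a hypothetical helper.

```python
def pair_by_index(shapes, rows):
    """Pair the i-th shape with the i-th attribute row.

    Raises ValueError when the two lists differ in length, since the
    convention only makes sense when every shape has exactly one row.
    """
    if len(shapes) != len(rows):
        raise ValueError(
            "shape count (%d) does not match row count (%d)" % (len(shapes), len(rows))
        )
    return list(zip(shapes, rows))
```

This is also why any replacement DBase file has to keep its records in the same order as the original one.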
Note that DBase files can be opened with Excel for viewing, and that there is also
a Python library for manipulating them.
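If you have neither Excel nor a DBase library at hand, the format is simple enough to peek at with the standard library alone. The following is a rough sketch, not a full reader: it assumes a plain dBASE III style file, returns every value as a stripped string, and ignores deleted-record flags and memo fields. The name read_dbf is made up for illustration.

```python
import struct


def read_dbf(data):
    """Return (field_names, rows) parsed from the raw bytes of a DBase file."""
    # Bytes 4-11: record count, header length, record length (little-endian).
    num_records, header_len, record_len = struct.unpack("<IHH", data[4:12])

    # Field descriptors are 32 bytes each, starting at byte 32, and the
    # list is terminated by a single 0x0D byte.
    fields = []
    pos = 32
    while pos < header_len - 1 and data[pos:pos + 1] != b"\r":
        name = data[pos:pos + 11].split(b"\0", 1)[0].decode("ascii")
        (length,) = struct.unpack("B", data[pos + 16:pos + 17])
        fields.append((name, length))
        pos += 32

    rows = []
    for i in range(num_records):
        start = header_len + i * record_len
        record = data[start:start + record_len]
        offset = 1  # the first byte of each record is the deletion flag
        row = {}
        for name, length in fields:
            row[name] = record[offset:offset + length].decode("latin-1").strip()
            offset += length
        rows.append(row)

    return [name for name, _ in fields], rows
```

Reading the file with open(path, "rb").read() and passing the result to read_dbf gives the column names and one dictionary per row.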
The tutorial uses the following software:
Python (to create DBase files; optional if you got the dbf files from GitHub)
dbfpy (to create DBase files; optional if you got the dbf files from GitHub)
ScapeToad (to view shape files and create cartograms)
Shape Viewer (to view shape files; slightly better UI than ScapeToad)
Excel (to view DBase files; optional)
Next we'll create a DBase file that contains the U.S. population data using the
following Python script.
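#!/usr/bin/env python
# Build a new DBase file that holds the population of each state.
# Requires the dbfpy package.
from dbfpy import dbf

# Population keyed by postal abbreviation.
POP = {
    "CA": 37691912, "TX": 25145561, "NY": 19465197, "FL": 19057542,
    "IL": 12869257, "PA": 12742886, "OH": 11544951, "MI": 9876187,
    "GA": 9815210, "NC": 9656401, "NJ": 8821155, "VA": 8096604,
    "WA": 6830038, "MA": 6587536, "IN": 6516922, "AZ": 6482505,
    "TN": 6403353, "MO": 6010688, "MD": 5828289, "WI": 5711767,
    "MN": 5344861, "CO": 5116769, "AL": 4802740, "SC": 4679230,
    "LA": 4574836, "KY": 4369356, "OR": 3871859, "OK": 3791508,
    "PR": 3706690, "CT": 3580709, "IA": 3062309, "MS": 2978512,
    "AR": 2937979, "KS": 2871238, "UT": 2817222, "NV": 2723322,
    "NM": 2082224, "WV": 1855364, "NE": 1842641, "ID": 1584985,
    "HI": 1374810, "ME": 1328188, "NH": 1318194, "RI": 1051302,
    "MT": 998199, "DE": 907135, "SD": 824082, "AK": 722718,
    "ND": 683932, "VT": 626431, "DC": 617996, "WY": 568158,
}

# The original DBase file fixes the record order; the new file takes its place.
olddb = dbf.Dbf("tl_2010_us_state10-orig.dbf", readOnly=True)
newdb = dbf.Dbf("tl_2010_us_state10.dbf", new=True)
# Assumption: two columns are enough here; ABBREV is an illustrative name.
newdb.addField(("ABBREV", "C", 2), ("POPULATION", "N", 10, 0))

for oldrec in olddb:
    # Assumption: the postal abbreviation is in the STUSPS10 column of the
    # 2010 state file; adjust the name if your file differs.
    abbrev = oldrec["STUSPS10"]
    rec = newdb.newRecord()
    rec["ABBREV"] = abbrev
    if abbrev in POP:
        rec["POPULATION"] = POP[abbrev]
    else:
        # Print a message if we cannot find the population
        # for a given record.
        print("BAD POP KEY: %s" % abbrev)
        rec["POPULATION"] = 0
    rec.store()

olddb.close()
newdb.close()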
The script should be run in the directory where the shape and DBase files are
located. Before running it, however, rename the file tl_2010_us_state10.dbf to
tl_2010_us_state10-orig.dbf. We do this because the Python script reads the old
DBase file to determine the order in which to write records into the new file,
and then writes the new file to the original location, since the DBase file used
with any particular shape file must have the same base name as the shape file
itself. Edit the script to account for any differences in file names.
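The rename can be done by hand, or with a few lines of Python along these lines. This is a sketch only; back_up_dbf is a hypothetical helper, and it refuses to overwrite an existing backup so that a second run cannot destroy the original data.

```python
import os


def back_up_dbf(dbf_path, suffix="-orig"):
    """Rename e.g. states.dbf to states-orig.dbf and return the new path."""
    base, ext = os.path.splitext(dbf_path)
    backup_path = base + suffix + ext
    if os.path.exists(backup_path):
        raise IOError("backup already exists: " + backup_path)
    os.rename(dbf_path, backup_path)
    return backup_path
```

For this tutorial the call would be back_up_dbf("tl_2010_us_state10.dbf").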
Alternatively, you can skip running the script and download the appropriate
DBase file from my GitHub page.
At this point, if you have Excel you might also want to open both the original and
new DBase files and see for yourself what is in them.
Now we can fire up ScapeToad. When it comes up, click the add layer button in
the tool bar. Navigate to the shape file and select it. If the shape file came in
correctly you should see something like this on your screen.
Note that the DBF file you created must have exactly the same base name as the
shape file, and that both must be in the same directory. Otherwise we won't
be able to create a cartogram.
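A quick way to check the naming requirement before loading the layer is sketched below. The helper name missing_companions is invented for this example, and it only checks that the files exist, not that their contents match.

```python
import os


def missing_companions(shp_path, required=(".dbf",)):
    """Return the companion files that should sit next to shp_path but do not."""
    base = os.path.splitext(shp_path)[0]
    return [base + ext for ext in required if not os.path.exists(base + ext)]
```

An empty list from missing_companions("tl_2010_us_state10.shp") means the DBase file is where it is expected to be.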
Next click the Create cartogram icon in the toolbar. Click next, next, and
then ensure that POPULATION is selected in the drop-down menu. Click next
again. And again. And then compute. Now wait.
After the computation is finished you should see a cartogram that looks
something like this on your screen.
Unfortunately ScapeToad has no zoom feature, so to get a close-up look at the
cartogram you'll want to export it as a shape file and bring it up in Shape Viewer.
There, however, you will lose the legend and be left with just the
distorted shapes. C'est la vie.
If you have gotten this far then congratulations! You have succeeded in creating a
simple cartogram that shows how the population of the United States is spread
across its geography.