Okay, so this week’s unit isn’t really about statistics, but I’ve felt my data illiteracy more strongly these past few days! Working with QGIS was an exercise in experimentation as much as frustration, and maybe this illustrates one of the assumptions of GIS projects: that researchers will already have a clear question they want to ask of the sources, or a specific hypothesis to test, rather than exploring in the hopes of discovery.
It wasn’t all bad to start; getting into QGIS and joining the polygon ‘map’ layers with a related dataset worked mostly as expected. Not having much experience with election maps (or much idea of what to do with these varied numbers), though, I had serious difficulty working with the Fairfax Congressional Election Results dataset. Thanks to a patient classmate, it became clear that this would require more than just joining two compatible tables: the Fairfax data would need multiple layers to support any kind of comparison between candidates. After more “see what happens if I do this,” I was able to get an approximation of an election map, showing which candidates won with at least 40% of the vote across the precincts. Intellectually, I recognize that this can be used, and misused, for many purposes. For one, these visualizations really only seem capable of highlighting majority candidates. Without overlays or animations that could be controlled by a user, it’s almost impossible to see anyone outside of the two major parties.
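To make the “won with at least 40%” rule concrete, here is a minimal sketch in plain Python rather than QGIS. The precinct names, candidate names, and vote counts are all invented for illustration (they are not from the Fairfax dataset), and `leading_candidate` is a hypothetical helper, not anything QGIS provides.

```python
# Hypothetical precinct tallies: precinct -> {candidate: votes}.
# These numbers are made up; they are not from the Fairfax dataset.
precincts = {
    "Precinct A": {"Candidate X": 520, "Candidate Y": 410, "Candidate Z": 70},
    "Precinct B": {"Candidate X": 300, "Candidate Y": 310, "Candidate Z": 390},
}


def leading_candidate(tally, threshold=0.40):
    """Return (name, vote share, passes_threshold) for the top candidate."""
    total = sum(tally.values())
    if total == 0:
        return None, 0.0, False
    name = max(tally, key=tally.get)  # candidate with the most votes
    share = tally[name] / total
    return name, share, share >= threshold


results = {p: leading_candidate(t) for p, t in precincts.items()}

for precinct, (name, share, passes) in results.items():
    print(f"{precinct}: {name} leads with {share:.0%} (shown on map: {passes})")
```

In this toy example the candidate who leads Precinct B falls just under the threshold, so a map styled on that rule would show nothing for that precinct at all, which is one small way the choice of cutoff shapes what a reader can see.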
With the London Plague dataset, I felt more comfortable in figuring out what needed to be visualized (spread of plague week by week) and recognizing how my choices in defining and grouping the data could tell this story. Too few classes, and it appears as though plague descended virtually overnight on half of London’s parishes; likewise, an arbitrary choice of classification mode obscures any useful visual information about when the plague reached certain regions. I wasn’t fully sure how each mode worked to divide/distribute the data, but the tried-and-true method of “click and see what happens” was surprisingly useful.[1] I realized the “Equal Count (Quantile)” mode was dividing the data points evenly across the classes, so that each class held roughly the same number of records regardless of how many weeks it spanned. That might be useful in other visualization applications, but maybe less so if one is interested in seeing a pattern of spread over time. Given that, I switched between different presets and eventually landed on classes of roughly four weeks each, which seemed to provide enough granularity to show patterns of change over time. This approach made sense while also highlighting the perils of GIS: it’s very easy to lie with visualizations.
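The difference between the two kinds of classification can be sketched outside of QGIS. The example below is only an approximation under stated assumptions: the week numbers are invented (not the London Plague data), the function names are hypothetical, and QGIS’s own modes may compute their breaks somewhat differently.

```python
import numpy as np


def equal_interval_breaks(values, n_classes):
    """Upper class bounds that split the value range into equal-width bins."""
    lo, hi = float(np.min(values)), float(np.max(values))
    return np.linspace(lo, hi, n_classes + 1)[1:]


def equal_count_breaks(values, n_classes):
    """Upper class bounds that put roughly the same number of values in each bin."""
    probs = np.linspace(0, 1, n_classes + 1)[1:]
    return np.quantile(values, probs)


def class_counts(values, upper_bounds):
    """Count how many values fall at or below each successive upper bound."""
    idx = np.searchsorted(upper_bounds, values, side="left")
    return np.bincount(idx, minlength=len(upper_bounds))


# Hypothetical data: week of first recorded plague burial for 12 parishes.
first_week = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 9, 13, 16])

interval = equal_interval_breaks(first_week, 4)
count = equal_count_breaks(first_week, 4)

print("equal interval bounds:", interval, "parishes per class:", class_counts(first_week, interval))
print("equal count bounds:   ", count, "parishes per class:", class_counts(first_week, count))
```

With equal-width classes, each color on the map covers the same span of weeks, so an early cluster of parishes shows up as one crowded class. With equal-count classes, every color covers the same number of parishes, which stretches the last class across many weeks and can flatten the sense of timing.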
It’s difficult to separate my frustrations and ineffectual fumbling with these tools from my thoughts on GIS as a whole. To be sure, it has its exemplars: for example, the Mapping Inequality project balances clear visuals and accessible annotations with robust documentation, and its relevance to contemporary concerns of racial disparity and wealth inequality is immediate (though this immediacy isn’t always necessary for DH projects). I keep returning to a question posed by Ian Gregory and Paul Ell:
“To what extent does [the visualization] advance our understanding of the topic?” (Historical GIS, 104)
Or: at what point are you just creating a boutique digital history project? This is a tricky question to navigate, and I don’t want to suggest that obscure or hyper-narrow fields of research need to justify themselves, or “prove” relevance by standard metrics or usage. However, at what point does the work of tracking down sources, digitizing, painstakingly recreating boundaries and topologies over time, and synthesizing multiple disparate modes of data lead to new insights, rather than just creating a fractalized approach to history? Does more and more sophisticated data always reveal greater insights, or is there a point of diminishing intellectual returns?
This feels particularly sharp in Siebert’s example of visualizing the spatial history of Tokyo, and the sheer volume of sources collected and reformatted. This may also be simply a difference in intent and methods: Siebert notes early on that this “data-driven” approach is built on finding and mapping information over time, in order to spark questions based on these patterns, rather than starting with a question and seeking out sources to provide an answer.[2]
I am less convinced of Siebert’s assertion that the analysis is provided by the visualization, and that “interpretation can follow so quickly, one upon the other, that it is difficult to say which comes first or to differentiate them.”[3] This suggests that the visualization IS the analysis, rather than a representation of a very specific set of conditions. Even Siebert’s own examples of revelations in the data seem to require further analysis: the visualizations might reveal change over time, but they leave us with little understanding of how or why this occurred, or of its overall impact. “There appears in this percentage view a rough visual balance, but there was actually a greater loss […] A comparison with population changes for other prefectures would probably reveal where many, but not all, of these people went.”[4]
This critique aside, one (maybe not) surprising commonality across this week’s readings and examples is the mass of paper-based sources and traditional historical methods behind each project. Each project spoke to the need for deep archival investigation and analysis, tracking down fugitive sources, and close reading of materials and comparative sources. (And maybe some of my reservations with GIS also stem from this: that an emphasis on visualization can so easily elide the volume of work and time, but also the selection and interpretation, on the part of historians or researchers.)
1. Had I read Gregory and Ell before starting, I might have had a slightly better time with this. See: Ian Gregory and Paul Ell, “Using GIS to Visualise Historical Data,” in Historical GIS: Technologies, Methodologies and Scholarship (Cambridge: Cambridge University Press, 2007): 97-100.
2. Loren Siebert, “Using GIS to Document, Visualize, and Interpret Tokyo’s Spatial History,” Social Science History 24, no. 3 (2000): 539.
3. Siebert, 556.
4. Siebert, 561.