Historical Research and the Problem of Categories: Reflections on 10,000 Digital Notecards (Fall 2011 version)
Some historians must proceed through research in tidy fashion: frame a research question, gather sources, develop an argument, write up narrative and argument. A quick survey of “methods” texts conveys this impression. Yet many historians define their work differently, acknowledging or even celebrating our discipline’s less linear, more meandering approach. Global historian William McNeill spoke of starting with a historical problem in mind, but then seeing that problem continually redefined through reading and researching.1
My own experience has followed the less tidy model. I start with a hodge-podge of vaguely-formed questions of different magnitudes, and interesting and seemingly relevant source material to explore. It is in that exploring that some questions come to seem more relevant, important, or productive than others. Others fall away entirely. And my writing is similarly non-linear, as I move between pieces of narrative, using writing as a way to find or test out what I am arguing, what glues the pieces of the story together. The big, driving question I am addressing becomes clear only gradually, often long after the evidence-gathering and sense-making, and well into the writing process.
To research my dissertation, “Schooling the Metropolis: Educational Inequality Made and Remade, Nashville, Tennessee, 1945-85,” I started with various questions about desegregation in Nashville, Tennessee: Why did black students ride buses more, and longer, than white students? Who was to blame? Was this a product of planning practice, or power imbalances, or something else? Was Nashville’s heavily “post-industrial” economy relevant to this story? I then worked my way backwards to the big question I came to address – how the politics and economics of growth feed educational inequality.2
I suspect I am not the only historian who feels that they are engaged in the work of research and writing unsure of what their final product will be. Once at an archive, while taking a break, I stood at the snack machine alongside a senior historian. She sighed, explained that she was embarking on a new project, and said that she was reliving how hard it is to be at the point where you don’t know anything yet. How, then, do we proceed to do research – the real nuts and bolts of it – if we acknowledge this uncertainty? How can we organize information, keep it accessible in ways that will facilitate our ongoing thinking, if we are driven not by a single hypothesis, but by changing focal points or areas of interest?
This essay recounts my own dissertation research process. I make no claim to have been a model historical practitioner or to have used particularly cutting-edge research methods. Speaking about research process requires careful attention to many, sometimes minute, details, and thus of necessity I draw on the story I know well. I hope this example can be a provocation to discussion and reflection. I first lay out how I organized my research process and how this process related to my thinking and writing. Then, I venture some connections between that research process and questions in the social history of knowledge and the scholarship of the archive – questions about the making and impact of categories in thought.
In the summer of 2006, I had a viable dissertation prospectus, and was about to embark on the first of my research trips. And I was scared that I would forget things. I knew what it took to manage the information involved in a seminar-length paper. Earlier, I had filled hand-written notebooks and compiled pages of word-processed text, later filtering through them as I built an argument. But what of a project that would extend over years of research and writing? Where, in the most literal sense, would I put all of the information, so that I could find it later? I needed something that would back-stop my own memory, allow for flexible organization, and yet be as accessible as possible. I also wanted to make sure I kept information in the context of its originating source, and that my notes clearly distinguished between what came directly from the sources and what came from my own thinking about them.
I turned to a relational database program, following the example of some more senior graduate students and one faculty member in my department. (FileMaker Pro was my choice, as it was theirs; there are a variety of alternatives now, some more streamlined (like Bento), some free and web-compatible (such as Zotero), and some designed for qualitative research (NVivo, ATLAS.ti).) I created two notetaking layouts, one for sources, and another related one for notes from those sources. (You can download a template of my layout.) Guessing at how I might later hope to sort and analyze my notes, I put in a keyword field. I set up about 15 key terms that I expected to show up a lot in my research – state, for state-level policy, for example, or vocational, for vocational education, or McGavock, for one high school that I thought would figure prominently.
Figure 1: FileMaker layout for each source
Figure 2: FileMaker layout for taking notes, automatically populated with content from the source layout.
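The two related layouts described above can be sketched as two linked tables. This is only an illustration in Python with SQLite, not the FileMaker template itself: the table names, the field names, and the sample row are assumptions made for the example.

```python
import sqlite3

def build_notes_db(path=":memory:"):
    """Create a hypothetical two-table layout: one row per source, many notes per source."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE sources (
            source_id INTEGER PRIMARY KEY,
            title     TEXT NOT NULL,
            archive   TEXT,
            date_text TEXT            -- free-text date, see the date discussion
        );
        CREATE TABLE notes (
            note_id   INTEGER PRIMARY KEY,
            source_id INTEGER NOT NULL REFERENCES sources(source_id),
            quotation TEXT,           -- words taken directly from the source
            comment   TEXT,           -- the researcher's own observations
            keywords  TEXT            -- comma-separated key terms
        );
    """)
    return conn

conn = build_notes_db()

# One source record (placeholder values, not real archival metadata).
cur = conn.execute(
    "INSERT INTO sources (title, archive, date_text) VALUES (?, ?, ?)",
    ("Court transcript, school desegregation suit", "hypothetical archive", "1970"),
)
source_id = cur.lastrowid

# One note linked to that source; quotation and comment are kept in separate fields.
conn.execute(
    "INSERT INTO notes (source_id, quotation, comment, keywords) VALUES (?, ?, ?, ?)",
    (source_id, "placeholder quotation", "placeholder comment", "McGavock, vocational"),
)

# Every note can be read back together with the context of its source.
rows = conn.execute("""
    SELECT notes.comment, sources.title, sources.date_text
    FROM notes JOIN sources ON notes.source_id = sources.source_id
""").fetchall()
```

Keeping quotation and comment in separate fields mirrors the goal of distinguishing what came from the sources from what came from thinking about them; the join is what keeps each note attached to its originating source.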
Although many aspects of historical notetaking are similar to working with other kinds of data, some are distinct. Dates are a good example. Database programs often have standard date formats, which assume or even require a precise, complete date for each entry. This structure does not easily accommodate sources that have vague or partial dates, yet still need to fit (at least roughly) into chronological sorting. I worked around this by entering dates into a text (rather than standardized date) field and formatting dates in a way that worked whether or not I had full information. Jean Bauer’s essay in this volume explores this point further.
Figure 3: I used a YYYY MM.DD format, which allowed FileMaker to sort chronologically even if I had sources with only the year, or only the month with no specific day, specified.
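To see why a text field in YYYY MM.DD form still sorts chronologically with partial dates, here is a minimal Python sketch. The helper name `sortable_date` and the sample dates are invented for illustration, and plain string sorting in Python stands in for whatever sorting the database program applies.

```python
def sortable_date(year, month=None, day=None):
    """Format a full or partial date as 'YYYY MM.DD', omitting unknown parts."""
    text = f"{year:04d}"
    if month is not None:
        text += f" {month:02d}"
        if day is not None:
            text += f".{day:02d}"
    return text

# A mix of complete and partial dates (invented examples).
dates = [
    sortable_date(1970, 3, 15),   # full date
    sortable_date(1971),          # year only
    sortable_date(1970),          # year only
    sortable_date(1970, 3),       # year and month
    sortable_date(1969, 12, 1),   # full date
]

# Plain text sorting works because every part is zero-padded
# and the largest unit (the year) comes first.
ordered = sorted(dates)
```

In this sketch a year-only entry sorts ahead of the more specific dates within the same year, which places vague sources roughly, though not exactly, where they belong.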
In trips to several archives over a year, I collected tens of thousands of pages of documents by photographing them digitally. I read and took notes on a portion on site, in those collections that prohibited digital copying or charged exorbitantly for physical copies. Because I had very limited time to work onsite at archives, most of my note-taking happened once I was back at home. I read digital copies on one screen while entering direct quotes, my own observations and questions, and tentative analyses, into the database layout on the other screen. The vast majority of my notecards were descriptive, but when I had a thought that tied portions of my sources together or hinted at an argument, I made a new notecard, titled “memo to self,” and then these entered the digital stack as well, tagged with keywords.
Figure 4: Sample image of archival material, a portion of a court transcript of Nashville’s school desegregation suit, here from 1970.
Figure 5: Notes from court transcript above
Once I had read through nearly all of my documents, I had nearly ten thousand note cards. I used the database as I began my analysis and sense-making. I first ran large searches based on my keywords: hundreds of notecards on vocational education, for example. I organized these cards chronologically – an action that takes only a few keystrokes – and spent a day reading them through in that order. As themes or patterns began to emerge, or there were connections to other sections of my research that weren’t under the “vocational” heading, I ran separate searches on these, and would incorporate that material into the bin of quotes and comments I was building by cutting and pasting into a new text document. (Databases often have “report” functions that could help this process, but I did not explore that route.) None of this process, of sorting information into relevant groups, has to happen with a database. But I found that it happened more quickly and easily with one.
Figure 6: Collected sections of notes from a keyword search
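The search, sort, and collect routine described above can be sketched in a few lines of Python. The note records, their field names, and the two helper functions are hypothetical stand-ins for the database's own search and sort commands, and the note texts are placeholders.

```python
# Hypothetical notecards; dates use the YYYY MM.DD text format.
notes = [
    {"date": "1971 05", "keywords": ["vocational"], "text": "placeholder note B"},
    {"date": "1968", "keywords": ["state"], "text": "placeholder note C"},
    {"date": "1966 09.12", "keywords": ["vocational", "McGavock"], "text": "placeholder note A"},
]

def notes_for_keyword(notes, keyword):
    """Return the notes tagged with a keyword, in chronological order."""
    matches = [n for n in notes if keyword in n["keywords"]]
    return sorted(matches, key=lambda n: n["date"])

def as_reading_file(notes):
    """Join notes into one block of text, like pasting them into a new document."""
    return "\n\n".join(f"[{n['date']}] {n['text']}" for n in notes)

vocational = notes_for_keyword(notes, "vocational")
reading_text = as_reading_file(vocational)
```

Each pasted note here carries only its date; a fuller version would also carry a source code so that the excerpt can be traced back to its origin.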
Having used the notecards in this way to reorient myself to my research material and sources, I began to write. I started to write before I was sure of the precise structure of the chapter or my detailed argument. I used writing as a way to find and refine my argument, rather than simply presenting it.3 Crafting a basic narrative often helped me identify what I was missing, what I needed to find out more about. I tried to keep moving in my writing by thinking, “who did what, when, and with what result?” Then, why? Writing in this exploratory fashion, I often needed bits of information that may not have made it into the first batch of notecards I was working with, so it was easy to flip back to the database and get those pieces when I needed them.
Using a relational database did accomplish the most basic of my goals. It proved a reliable and convenient way to keep notes and contextual information in the same place, and it addressed my most basic fear of forgetting by allowing searches for information in myriad ways – by title, content of notes, direct quotations, keywords, dates.
It was once I began to write, though, that I came to appreciate how the database’s full-text searchability allowed me not only to follow my original questions, but to explore ones that I had not anticipated at the start of my research. The digital mode of notetaking I chose allowed me to see things as I wrote and thought that I would not have seen otherwise – likely only because of the difficulty of tracking down notes without such a database.
Let me illustrate this with an example. One central problem in my work has been understanding the multiple layers of inequality at work in Nashville’s desegregation story. There are of course salient and central differences by race and by class, but these divisions were often expressed in the language of geography. By the mid-1960s, residents, planners, and educators used the phrase “inner city” to indicate predominantly black neighborhoods, or neighborhoods where planners anticipated that the black population would continue to grow. I had noticed this pattern in my own reading, and captured examples of such language and other descriptions and imaginings of geographic space with a keyword – cognitive map, as I chose to label it. When I went to write about this phenomenon, I read through all of my “cognitive map” notes, in chronological order, and over several iterations of conference papers and draft chapters developed an argument about how what I called pro-suburban bias informed Nashville’s busing plan. In early versions, I seemed to imply that in Nashville residents’ cognitive maps, the correlation between suburban space and white residents, and urban space and black residents, was absolute. But I wondered whether there were exceptions to these generalizations made in imagined landscapes. What could I do to test this? It occurred to me that I could read all of the instances where my sources used the phrase “inner city.” Of course, I may not have written down every single instance, as I did not imagine this textual analysis to be a part of the project from the outset. Nonetheless, I had enough to provide a basis from which to work.
When I read my sources in this way – some of which I had labeled as about “cognitive maps” and some of which I had not – I saw something that I had not fully noticed before. Among the critics of schooling in the “inner city,” and the smaller group of its defenders, there was a case that proved that the identification of urban space with black residents was not complete, at least for some city residents. I had made earlier notes about, but had not remembered to come back to, the story of a central-city school that was historically segregated white, remained largely working class, and had a local council representative fighting to retain the school in conjunction with what he labeled its surrounding inner city neighborhood. William Higgins, the council representative, asked, “You’re taking children from the inner city and busing them to suburbia. Why place the hardship on them? Why not bring children from suburbia to the inner city?,” and later proposed that “All new schools … should be unified with the inner-city, otherwise the city finds itself a lonely remnant, disunited and eventually abandoned.”4 When I read these passages, in the first years of my research, I had not thought to tag them as about “cognitive maps.” Thus they did not show up in that keyword search when I began my writing over two years later. I was able to discover them again because I could use a broader search based on a phrase laden with meaning and insinuation. Doing so yielded access to notes and sources that became quite important in my understanding how categories of race, geography, and class overlapped, and where they diverged, in my story. (Other essays in this volume explore the utility of full-text searchability in digital archives; the tool is useful as well within one’s own notes in digital form.)
Figure 7: Notes from text search for “inner city”
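The difference between the two kinds of search can be made concrete with a small Python sketch. The three note records are hypothetical: two are placeholders, and the second reuses part of the Higgins quotation given above, left untagged as it was in the original notes. The function names are likewise invented for the example.

```python
# Hypothetical notecards: only the first carries the "cognitive map" keyword.
notes = [
    {"id": 1, "keywords": {"cognitive map"},
     "text": "Placeholder: a planning report refers to the inner city."},
    {"id": 2, "keywords": set(),
     "text": "Higgins: Why not bring children from suburbia to the inner city?"},
    {"id": 3, "keywords": {"vocational"},
     "text": "Placeholder: a note on vocational education."},
]

def keyword_search(notes, keyword):
    """Find notes by the category assigned when the note was taken."""
    return [n["id"] for n in notes if keyword in n["keywords"]]

def full_text_search(notes, phrase):
    """Find notes by a phrase in their text, regardless of how they were tagged."""
    phrase = phrase.lower()
    return [n["id"] for n in notes if phrase in n["text"].lower()]

tagged = keyword_search(notes, "cognitive map")
mentioned = full_text_search(notes, "inner city")

# Notes that use the phrase but were never given the keyword.
missed_by_tag = [i for i in mentioned if i not in tagged]
```

A literal phrase match like this one would skip variant spellings such as the hyphenated “inner-city,” so a careful search would try those forms as well.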
In another case, I found that the database allowed me to reframe an initial research question about school location into a broader one about the distribution of public goods – schools and otherwise – in the metropolis. Such a topic links my work to the broader matter of what political and economic structures support metropolitan inequality. From the start, my dissertation was centrally concerned with why schools were built where they were, how locations got chosen, to suit which interests. I understood that in this way schools were a kind of good being struggled over in political and economic terms. But it was not until I was through the process of analyzing the local politics of school construction that I understood that my story was not just about schools, but about the distribution of public goods generally in the metropolitan area.
I had been tracing how urban renewal funds subsidized school construction, and how, in the context of a metropolitan government, such subsidies could allow a municipality to shift more of its own tax revenues to its suburban precincts. I suspected that this use of urban renewal dollars to reduce the local commitment to supporting city areas in favor of suburban ones was visible in other areas of city services as well. How could I illustrate that, provide some evidence for this broadened claim? I could see what my sources – planning reports, maps, records of community meetings – said about another kind of public good, to see if the dynamics were similar. I knew that I had made some notes about the building and repair of proper sewer lines for the city and surrounding suburbs, but I had not expected to write about them, so I had no related keyword. Text searchability of the database meant that I could very easily track down everything I had about sewers, organize it chronologically, and test if the pattern I saw for schools fit for sewers as well. Without fully searchable notes, I would have been looking through stacks of notecards, organized to fit another set of categories entirely. I may not have felt I had the time, at least at this stage of research, to expand my original question to a broader one.
In each case, the database helped relevant information jump out of the noise of years of research and thinking, and helped make that information available relationally, easily connected to other information. It is possible that I am overvaluing what the program did, however, as my appreciation of it comes from contrasting its use with other approaches I have taken before. Other researchers may have developed other approaches to keeping their own systems of information-gathering, digital or otherwise, flexible.
Categories and the making of historical knowledge
Reflecting upon my use of this digital tool for notetaking has led me to questions about how we think about our research practice and how we understand the relationship between how we research and what we learn. Recent work in the social history of knowledge and the history of the archive shares a core interest in categories – where they come from, what assumptions or values they represent, how they can be reified on paper or in practice. These interests are relevant to our thinking about research methods. In the writing of my dissertation, I felt fortunate to be able to set out initial categories of analysis (via keywords), but to have technological tools that made it possible, at no great expense of time, to throw these out. Sometimes I used my initial keywords, and sometimes I skipped over these to evaluate new connections, questions, or lines of analysis. If I had used pen-and-paper notebooks or a set of word processing documents, regrouping information would have required a great expenditure of time. I would have been less likely, then, to consider these new avenues, and thus my earlier thinking about categories of analysis would have been more determinative of my final work, even though those earlier categories were set out when I really did not know anything yet, in the words of that senior historian at the snack machine. Since there was virtually no time cost involved in trying out new questions, I could do so easily and in an exploratory fashion, without commitment. That is, thinking about how my database worked and how it helped my analysis got me thinking about how historians construct, use, and rely upon categories in our work.
It makes sense that historians would think about categories, as we encounter them in many ways in our work. As new graduate students, we learn to identify ourselves by sub-field – “I do history of gender,” or “I’m an Americanist.” And we are trained implicitly and explicitly to organize information and causal explanations into categories of analysis – race, class, gender, sexuality, politics, space, etc. – when in fact these categories are never so neat and separate, whether in an individual’s life or in a historical moment. Then we research in archives that establish and reify their own categories – legal records divided by plaintiff or defendant, institutions that keep their records with an eye to confirming their power or reinforcing their independence. To make sense of a sometimes overwhelming volume of fact, all of which needs to be analyzed relationally, we rely on categories that we create as we work – like my database keywords.
This matter of categories connects to at least two fields of scholarship. Scholars of the history of knowledge like Peter Burke have examined the organizational schemes embodied in curricula, in libraries, in encyclopedias, and have shown us how these structures and taxonomies represent particular ways of seeing the world. Burke then shows that such schemes reify or naturalize those ways of seeing, helping to reproduce the view of the world from which they came. They also make some kinds of information more, or less, accessible.
Think, for example, of the encyclopedia. We are accustomed to its A to Z organization of topics, but this structure in fact represented a break away from previous reference formats that grouped subjects under a structure of classical disciplines. The alphabetized encyclopedia came about at a point when the previous disciplinary categories were no longer so stable as to be able to contain growing knowledge, and a new, more horizontal or less hierarchical model took their place, a model that allowed readers access to information by topic, outside of the hierarchies of a discipline.5 Burke points us to the importance of how we categorize information, where these categories come from, and how categorizations affect our access to and experience of information.
Anthropologist Ann Stoler comes to the problem of categories from a different perspective. Stoler thinks of the archive as an active site for ethnography, and seeks to understand how archives are live spaces in which the Dutch colonial state in Indonesia built, among other things, social categories. She traces how colonial administrators through their archiving categorized, and assigned particular rights and privileges to, people with different national heritages. As they categorized, they made some peoples and experiences of the colonial state visible and obscured others. Stoler writes that categories are both the explicit subject of archives and their implicit project: “the career of categories is also lodged in archival habits and how those change; in the telling titles of commissions, in the requisite subject headings of administrative reports, in what sorts of stories get relegated to the miscellaneous and ‘misplaced.’” She then frames the archive as a place to understand “how people think and why they seem obliged to think, or suddenly find themselves having difficulty thinking,” in certain ways.6
The work of scholars like Burke and Stoler poses important questions about how historians understand their own research process. Burke’s work suggests that we investigate how categories of thought, either between disciplines or within them, affect us. Think of academic sub-fields, for example, the boundaries of which still shape the literatures we read even as many try to transcend them, and still guide which archives we pursue or whether we think of particular questions as in or out of our domain. Stoler raises a different kind of question. At what points in our research, out of pragmatic necessity, out of a desire for intellectual order, or for yet other reasons, do we set out categories of evidence, of thought, that influence what we see and what we don’t see? What kinds of tools could help us be more aware of these categories, or have the flexibility to move beyond them when we need or want to?
I hypothesize here that relational databases offer a kind of flexibility in working with notes that can allow us to create and recreate categories as we work. That flexibility means that we can evaluate particular ways of categorizing what we know, and then adapt if we realize that these categories are not fully satisfactory. When I think of building categories as part of what I am doing when I research and analyze, I am reminded to evaluate those categories, those ways of organizing or thinking, for how they help or what they leave out. I have the flexibility to adjust my categories as I know more about my sources, about how they relate to one another, about how they relate to the silences I’m finding. I have tried here to illustrate this by talking about keyword versus full-text searching.
The matter of flexible categorization touches upon another strand of scholarship about archives, in which archivists debate what postmodernism has meant for their work. How does the growing understanding of archives as spaces in which certain kinds of power are codified and justified, and where information has to be understood relationally, matter for the practice of archiving? One archival theorist, Terry Cook, posited that finding aids and item descriptions should be constantly evolving, adapting to new relevant knowledge about the item’s sources and its relationship to other archived and unarchived materials.7 Working with relational databases provokes historians to think about how our notetaking and organizing practices may achieve this same level of flexibility and relationality.
Yet there are at least two cautions to think of, as well. One comes from the flatness of databases like the one I used. In Burke’s terms, my database is not a reference text organized along disciplinary lines. It was more like an A-to-Z encyclopedia. Without hierarchies that keep each fact locked in relationship to others – through the structure of earlier historiography, for example, or through the structure of an archive’s collections – the historian has to be more intentional about seeing information in its context. If we can look across all of our notes at a very granular level, and make connections across categories that we or others created, it becomes too easy to look at these bits of information devoid of context – a danger visible even in my own way of cutting and pasting out of my database, which linked bits of notes only to a source code, meaning that they could be read in less than direct connection to their origins. Digital bits seem very easily severed from their context.
More importantly, despite its usefulness in helping see things we might have otherwise forgotten or missed, no database does the work of analysis. The two are, of course, interdependent – and that interdependency exists in a digital or a non-digital form of notekeeping. The analytical work, the crucial sense-making that pushes history writing from chronology to critical interpretation, still happens in our own heads. And there, other implicit categories or habits of thought might be shaping our analysis. Here we decide whose stories to tell first, for example, or prioritize one set of historical drivers over another. Some of these habits reflect the deepest-held of assumptions and beliefs. It is less easy to talk of these, and certainly less easy for an author to identify their own, than it is to talk about notetaking and notekeeping. But, maybe if we are critically conscious of the mechanical, we can be prompted to more reflection about the conceptual, as well.
It is also worth considering what kinds of concerns may lead historians – even technology-savvy members of the youngest generation in the profession – to resist using digital tools like databases in their own research. The choice to use only word-processing files, for example, may stem in part from a concern about what might be lost in adopting a more mechanized information organization system. Charlotte Rochez, responding to an earlier version of this essay, explained that she worried about sacrificing “some of deeper insights, interpretations and understanding induced from being more involved in sorting and interpreting the sources.”8 Historians surely value, maybe even romanticize, the encounter with sources in the archives. Does converting that textual, even textural, experience into digital notecards somehow deaden it? Does it render our research uncomfortably close to a social scientist’s coding and writing up of findings? It is important to clarify that digital notetaking may add to, but does not of necessity replace, varied encounters between researcher and sources. It remains possible to meander through your notes from a given collection or source, to look back at the original page (even if in pdf or photocopied form). But it also becomes newly feasible to look across those collections and sources.
One prompt for this volume came from the Journal of American History’s 1997 special issue that made public the process of academic peer review. David Thelen’s introduction to that issue raised questions about the work of history-writing that seem important to revisit in light of digital innovations. The centerpiece of the issue was a submission by Joel Williamson, in which Williamson recounted his failure to perceive the centrality of lynching to, and its origins in, American and southern history. Two reviewers received Williamson’s piece with shock and dismay that he could have missed what they had known, had appreciated as central in their field, for years. Despite this disagreement, or perhaps because of it, Thelen saw Williamson’s piece as issuing a challenge to historians to “think about what we see and do not see, to reflect on what in our experience we avoid, erase, or deny, as well as what we focus on.”9 I see my attention to categories, to the possibilities and implications of how we choose to organize the information upon which our interpretations rest, as a kindred effort.
About the author: Ansley T. Erickson is Assistant Professor of History and Education at Teachers College, Columbia University.
Acknowledgements: The author thanks Jack Dougherty and Kristen Nawrotzki for the invitation to reflect on research practice and for good feedback on this essay, Courtney Fullilove for reading suggestions, and Seth Erickson for ongoing conversations about archives and information architecture. She also thanks all those who commented on the 2010 version of this essay. The dissertation research described here was supported by a Spencer Dissertation Fellowship, a Clifford Roberts/Eisenhower Institute Fellowship, and a Mellon Interdisciplinary Graduate Fellowship at the Paul Lazarsfeld Center, Institute for Social and Economic Research and Policy, Columbia University.
1. McNeill quoted in Tracy L. Steffes, “Lessons From the Past: A Challenge and a Caution for Policy-Relevant History,” in Kenneth Wong and Robert Rothman, eds., Clio at the Table: Using History to Inform and Improve Education Policy (New York: Peter Lang, 2008), 267-8.
2. Ansley T. Erickson, “Schooling the Metropolis: Educational Inequality Made and Remade, Nashville, Tennessee, 1945-1985” (Ph.D. Diss., Columbia University, 2010).
3. Lynn Hunt argues for a similar approach, encouraging scholars not to delay writing by organizing notes or other getting-ready activities, and to use writing to further thinking, at http://www.historians.org/perspectives/issues/2010/1002/1002art1.cfm. I agree that writing should begin sooner rather than later, even before all of the questions about what the argument is or what the structure is have been resolved. I think that digital note-keeping can help, as the accessibility of information reduces barriers between a “getting organized” phase and the actual writing. James B. McSwain tells a similar story about his experience with the qualitative research software Nota Bene in a comment on an earlier version of this essay (http://writinghistory.wp.trincoll.edu/2010/10/06/organize/#comment-148). I wrote more about my approach to getting started and keeping going in writing at http://writinghistory.wp.trincoll.edu/2010/10/06/erickson-thinking/.
4. William Higgins, “Suggestions for Development of Guidelines for an Unitary Plan for the Metropolitan Board of Education,” 1979, Kelly Miller Smith Papers, Vanderbilt University Special Collections and University Archives, Box 69, File 8; Saundra Ivey, “School Closing Plan Draws Fire,” Tennessean, Nov. 23, 1977.
5. Peter Burke, A Social History of Knowledge: From Gutenberg to Diderot (Cambridge: Polity Press, 2000). See especially 184-7.
6. Ann Laura Stoler, Along the Archival Grain: Epistemic Anxieties and Colonial Common Sense (Princeton: Princeton University Press, 2010), 36, emphasis in original.
7. Terry Cook, “Fashionable Nonsense or Professional Rebirth: Postmodernism and the Practice of Archives,” Archivaria No. 51, Spring 2001, accessed at http://journals.sfu.ca/archivar/index.php/archivaria/issue/view/428/showToc, Sept. 12, 2010.
8. Charlotte Rochez, in Jack Dougherty and Kristen Nawrotzki, eds., Writing History in the Digital Age. Under contract with the University of Michigan Press. Web-book edition, Trinity College (CT), Fall 2011. http://writinghistory.trincoll.edu/evolution/discuss/
9. Journal of American History, Vol. 83, No. 4 (Mar. 1997), and David Thelen, “What We See and Can’t See in the Past: An Introduction,” in that volume, via http://www.jstor.org/stable/295898, accessed Aug. 20, 2010.