A Distinction Worth Exploring: “Archives” and “Digital Historical Representations”

In the original presentation of these papers at the AHA session, I was the final speaker on the panel, and so my talk was framed as a response to and expansion of the points made by the previous speakers.

In preparing for the panel “Digital Historiography and Archives” at the 2014 meeting of the American Historical Association, I had my usual trepidations about how the other speakers and the audience would frame their conception of “archives.” In writing my talk I read an article Josh had written for an archival journal in 2011[1] and was pleased to see his careful usage of the phrase “digital historical representations” as an umbrella term covering some of the resources presented by archives, as well as a range of products from other sources.

In discussing archives with historians and other humanities scholars, I often feel somewhat pedantic in my continual emphasis on the meaning of words.[2] But after all, words represent concepts and perceptions of reality, and if those words aren’t clearly communicating what we intend, then it’s hard to achieve meaningful progress. The approach I chose for my remarks at the digital historiography session was to illustrate the points the other speakers had made about the importance of questioning, understanding, and articulating the context of creation of digital historical representations. I did so by discussing the differences among the types of digital information sources created and used by historians, many if not most of which are referred to as “archives.” In all of these cases, the context in which the information source was created is critical to understanding the problems that may be inherent in that source and that the researcher should take into consideration. I am not a historian, but I would think that understanding why and how an information resource was created (that is to say, its context) is more important than ever in digital historiography.

Most readers will be familiar with what, for lack of a better term, I’ll call “traditional” archives: that is, primarily paper-based (or non-digital), largely unique materials, brought together in repositories in aggregations created either by the originating organization or person, or by a third party, such as a scholar, manuscript dealer, or the repository itself (as in special collections). Appraisal and selection of such materials is a multi-dimensional process with many factors involved, often including political influence, censorship on the part of the creator or collector, resource limitations on the part of the repository, random chance, and acts of God. How and why the materials on our shelves end up there is not always a straightforward story, and it is usually not captured in detail in the public description of the materials. How the materials were aggregated and for what purpose is usually described at some level in the finding aid, but documentation in this area can be sporadic. I would guess most archivists believe, rightly or wrongly, that metadata fields like “Custodial History,” “Appraisal, Destruction and Scheduling Information,” and “Administrative/Biographical History” are not valued by most users. Even among historians I’m not sure how often they are of interest, or at least how often historians ask the archivist for more information if the finding aid is skimpy in this regard.

Again, those are “traditional” physical archival materials, represented digitally by descriptions in online finding aids, catalog records, etc. For these materials, I think what has changed for historians in the modern digital age is the increased expectation (and reality) that more descriptive information about materials will be made available online, along with the ability to easily create their own digital copies with digital cameras and smartphones.

Next we have collections of digitized analog historical materials—sometimes called “digital archives.” These may be topically based—assembled from holdings of many repositories, like the William Blake Archive or the Wilson Center Digital Archive. Or they may be all from one repository—as in the recently launched FRANKLIN site, which provides online access to digitized collections from the Franklin D. Roosevelt Presidential Library and Museum. These collections may be created by archivists, librarians, historians, passionate amateurs, nonprofit organizations or for-profit companies. Because these digital historical representations are created by such a wide range of sources, it’s critical to know about the context of these collections—including who assembled them, what their purpose was, and what criteria they used.

Often, when historians talk about archives and I probe to see what they mean, it is these kinds of collections they are referring to. In her paper, Katharina Hering observed that it’s important to know where the individual original materials are located and where they fit in their archival context, and that is certainly true. But it’s also important to understand where materials fit in the context of the new digital collection. On what basis were items added to this collection? Why were some items excluded? To what extent is what’s being presented a subset of what’s available? Where does the metadata come from? How was it created and reviewed? As with online finding aids for physical collections, what is being accessed in this kind of digital collection is a surrogate: a description of an object or aggregate, created by a person to represent it. Even a scanned image of a document is a surrogate, although one hopes an accurate one. Descriptions and metadata can be subjective and are also subject to errors.

It seems to me that these kinds of collections, or “digital archives” as they’re commonly called, would raise a host of questions in terms of digital historiography, some similar to those presented by online information for “traditional” archives and many others that are different.

Yet another kind of aggregate, also sometimes called “digital archives,” consists of groups of born-digital materials, as opposed to digital surrogates of analog originals. These types of aggregates, kept together because they come from a single source or creator, reside primarily within archives and special collections repositories, and consist of records created or received by an organization in the course of business, maintained by it, and transferred to its associated archival repository. The electronic records created by the Census Bureau and transferred to the National Archives are an example of this kind of aggregate. Another example can be found in the equivalent of the “papers” of a person or family, such as the Salman Rushdie collection at Emory, which contains the contents of his personal computers. For these kinds of aggregates, archives have most of the same kinds of issues with selection, appraisal, and custodial history as they do with non-digital materials, but with additional issues raised by their digital format related to reliability and authenticity as well as how to provide access.

And last but not least, you can have assembled collections of born-digital materials, yet another category of what are termed “digital archives.” The September 11 Digital Archive, created by the Roy Rosenzweig Center for History and New Media, is a good example of this type of collection. In this case, and also with the Internet Archive, the collection serves a critical function: acquiring born-digital materials that might not otherwise survive. Many born-digital materials are more fragile than their analog counterparts for various reasons, and so some of these collections are similar in function to special collections libraries, which pull together valuable individual items for preservation. It’s also worth noting that in digital collections, copies of materials can reside in more than one collection. For example, in the September 11 collection there are copies of documents created by the New York City Fire Department (Incident Action Plans). Presumably there are also copies of these born-digital records being transferred to the official repository for the municipal records of New York City. These kinds of “digital archives” combine the issues related to assembled collections (that is, the necessity of exploring who is creating them, for what purpose, and using what methods) with the concerns related to born-digital materials as far as preservation and authenticity.

Coming back to the term “digital historical representations,” I’m happy to see this broader term being used in discussions about “archives” and digital historiography. Many products that could fall into this category—such as databases and sources like Google Books—would be removed one step (or more than one step) too far to be categorized as “archives.” I would consider these as separate intellectual products created from archival sources. And, indeed, in a way, so are any of the collections in which copies of archival materials are removed from their original context and “re-mixed” to be part of a new creation—a new “digital archives” like Valley of the Shadow, to use a classic example. In fact, in a pre-digital era analogous versions of the scholarly products mentioned here (other than databases) would still have existed, I think, and been called something other than “archives”—they would have taken the form of exhibits, edited volumes of letters or printed collections of documents, assembled and edited by historians or other sources. The question of why the word “archives” has been adopted to refer to collections of materials is one for a different discussion, but I do think it’s worth noting that this co-opting of the word does seem to be a rather recent development.

I hope the efforts discussed in this session encourage more rigorous assessment of digital historical representations and result in a greater understanding and appreciation of what makes archives distinct from these other kinds of products. I often fear that this appreciation and understanding is being lost as fewer historians work with “old-fashioned” physical archival collections and more do most of their work online, where it is easy to think that all digital collections are the same. The value of the collections of materials preserved in archives often lies in the relationship of the records to each other (what’s called the archival bond), which means that the whole is greater than the sum of the parts. As a whole, the materials provide evidence about the activities of their creator or the person or organization who brought them together.

Discussions of digital historiography and the archives should be a two-way street. It was heartening to see archival concepts such as appraisal and provenance being discussed at an AHA session, and so to see information flow from the archival literature to that audience. It is unclear what kind of awareness most historians have of archival theory or practice. Anecdotal evidence provided by many archivist colleagues suggests that such knowledge is, at best, uneven.[3] In return, it is certainly also the case that digital historiography (that is, the study of the interaction of digital technology with historical practice) can inform the work of the archival profession.

The papers from this session discussed how technology has changed the way historians do their work, and it certainly has affected the way archivists do our work as well. Among the most significant of those changes is the increased workload placed on archivists to create descriptions and digital copies to share online, to find ways to collect and preserve digital materials, and, of course, to actively connect with the public via the ever-widening world of digital tools and social media. Digital technology has also increased the user base for archival resources, meaning that the connection between our historian users and archivists is more diluted than it was in the past. In prioritizing our work and establishing our practices, archivists are trying to meet the needs of the broadest range of users. In so doing, it’s possible that the more specialized needs of historians (if indeed they are different from those of other users) are not being met. We need to keep an ongoing dialog between our two professions to ensure that we’re all working together as effectively as possible to support the historical enterprise.

Originally published by Kate Theimer on January 15, 2014. Revised for Journal of Digital Humanities August 2014.

[1] Joshua Sternfeld. “Archival Theory and Digital Historiography: Selection, Search, and Metadata as Archival Processes for Assessing Historical Contextualization.” American Archivist, Fall/Winter 2011, 544-575.
[2] See, for example, Kate Theimer. “Archives in Context and as Context.” Journal of Digital Humanities, Vol. 1, No. 2, Spring 2012, http://journalofdigitalhumanities.org/1-2/archives-in-context-and-as-context-by-kate-theimer/.
[3] The need for greater communication between historians and archivists was discussed in the concluding chapter of Francis Blouin and William Rosenberg, Processing the Past (Oxford University Press, 2012).

About Kate Theimer

Kate Theimer is the author of the popular blog and Twitter account ArchivesNext and a frequent writer, speaker, and commentator on issues related to the future of archives. She is the editor of the series Innovative Practices in Archives and Special Collections, in which books on description, management, outreach, and reference and access were published in 2014. She is the author of Web 2.0 Tools and Strategies for Archives and Local History Collections and the editor of A Different Kind of Web: New Connections between Archives and Our Users. She has published articles in The American Archivist and the Journal of Digital Humanities. Kate served on the Council of the Society of American Archivists from 2010 to 2013. Before starting her career as an independent writer and editor, she worked in the policy division of the National Archives and Records Administration in College Park, Maryland. She holds an MSI with a specialization in archives and records management from the University of Michigan and an MA in art history from the University of Maryland.