Remix and Digital Archives

[This post was first drafted in December 2020 and re-edited in July 2022 by Caitlin Birch and Michelle Warren. It spent the intervening months lost in a draft folder!]

As we come up on the end of Remix’s anniversary year, we’ve been doing some thinking about the long-term future. A day will surely come when even an open-ended project like Remix comes to a close. What will happen to this website then? We have always considered it one of our research products, but it is surely less durable than formally published articles. A quick survey of https://sites.dartmouth.edu/RemixBrut on https://archive.org was not encouraging: automated web-crawling hasn’t captured “the site” but rather an incoherent collection of snippets.

Michelle asked Laura’s advice, and Laura referred the question to Caitlin Birch, Assistant Archivist for Digital Collections. On Dec 10, 2020, Caitlin and Michelle had a lively Zoom chat about the digital archiving ecosystem in Rauner Special Collections at Dartmouth Library. An edited summary follows!

Michelle: What can happen to the site if we decide we’re done with the project?

Caitlin: At the request of the Library, Dartmouth’s Web Services team (part of ITC) began saving websites that had been replaced with newer versions (around 2011 or 2012). The sites were technically still live on the web but were moved to a restricted subdomain that the public couldn’t access. My position was created in 2014 and I was tasked with launching a web archiving program. After becoming a partner in the Internet Archive’s Archive-It service in 2016, Dartmouth had the capacity for web archiving for the first time. Our sole focus in the first year or two was the capture of the mothballed content from Web Services. Since then, we’ve focused on a comprehensive capture of all Dartmouth-created web content with enduring value. The notable exception to this has been rapid response collecting centered around the pandemic and the most recent racial justice movement. Eventually our program will expand to support other areas of strength in teaching and research at Dartmouth. Meanwhile, web archiving in Rauner Special Collections has mainly been focused on college records.

I should note the importance of distinguishing between web archiving and digital archiving. Archivists still make a distinction (even though websites are of course a type of digital collecting) because websites and born-digital archives/manuscripts have required separate technical infrastructure and, as a result, distinct advocacy within institutions to support each. The result is that at an institution like Dartmouth, a digital archives program has existed longer than a web archiving program has.

Michelle: Once a website is archived, how can people find it?

Caitlin: This question is still a bit unresolved. We use Archive-It (from the Internet Archive) to create copies of websites that can be accessed through Dartmouth’s searchable instance of ArchivesSpace. It’s important to distinguish between archives (boxed collections) and manuscripts/rare books. An archived website is analogous to the former: you know the box is there, but you don’t know what’s in it unless an archivist has created an additional finding aid. I worry about the fracturing of discovery systems. This is a general issue across all archives and libraries, not particular to our systems.

Michelle: So true! In my work with another manuscript (Corpus Christi College, Cambridge, MS 80), I’ve been thinking about all the ways in which digital infrastructures have been making cataloguing more fragmented rather than more complete. But these aren’t obstacles to overcome, rather arrangements to understand. As a user, these arrangements are determining what I can find even before I start looking. And if I don’t understand the arrangements, then I don’t understand the constraints on my own research. I could make the mistake of thinking that lack of results means lack of materials, when the real issue might be lack of metadata or some other cataloguing or technical issue.

Michelle: Once the website is archived, can people still Google “Dartmouth Brut” and find it?

Caitlin: Maybe, maybe not…One solution might be a redirect to the archival copy from a live URL. Linking to the archived website in the library’s MARC record for the Brut manuscript would be another possible means of discovery. That would be new for our treatment of web archives.
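[A rough sketch of the redirect idea, added for illustration. It assumes the archival copy lives in the public Wayback Machine at archive.org, whose addresses take the form prefix/timestamp/original URL. The copy made through Archive-It may well live at a different address, and the function names and the "2022" timestamp here are hypothetical.]

```python
# Assumption: the public Wayback Machine address pattern. A partial timestamp
# such as a bare year generally resolves to the capture nearest that date.
WAYBACK_PREFIX = "https://web.archive.org/web"


def archived_url(live_url: str, timestamp: str) -> str:
    """Build the address of an archival copy of live_url (hypothetical helper)."""
    return f"{WAYBACK_PREFIX}/{timestamp}/{live_url}"


def redirect_response(live_url: str, timestamp: str):
    """Return (status, headers) for a permanent redirect to the archival copy."""
    return 301, {"Location": archived_url(live_url, timestamp)}


status, headers = redirect_response("https://sites.dartmouth.edu/RemixBrut", "2022")
print(status, headers["Location"])
```

With something like this in place on the live server, anyone following an old link or a search result would land on the archival copy rather than on an error page.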

Michelle: That reminds me about the digital Brut website itself. There are actually several parts. They’re more than ten years old and have become historical artifacts in their own right (in my opinion!). Digitized manuscripts aren’t really presented this way any more, in medieval studies (and probably in other fields too). Practices have been trending toward interoperability and linked open data so that digital manuscripts from throughout the world can be aggregated for cross-collection discovery. Biblissima is a good example of this effort. The Dartmouth Brut is an interesting case study because it came into a library so late (2006) and on to the web so early (2009). It can be both extremely easy to find and impossible to know about. And digital aggregation makes things even more complicated. The Digitized Medieval Manuscripts app, for example, has picked up Dartmouth’s collection, but via a webpage that hasn’t been updated to include the Brut. So, the Brut—the one on the shelf and the one online—slips from the “dark archive” to the “bright archive” depending on how you start looking for it.

Caitlin: I would say that this is a legacy problem that many institutions face. When libraries began getting into the digitization game, early practice favored treating the digitized object as new and in some ways separate from the original. That’s how you end up with two separate records in the catalogue: one for the original and one for the digital. I would say that among archivists (and I’m speaking generally here and also from my own professional perspective, which doesn’t represent all archivists), there’s a growing break from that library practice. “Intellectual arrangement” is a key part of the archivist’s work: how does item x relate to item y, and how do we make that context clear to the researcher at the point of discovery? From that approach, you’d be more likely to end up with a single descriptive record for the Brut, with “child” records indicating the instances of manuscript, digitization, remix website, etc.
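[To make the parent-and-child idea concrete, here is a minimal sketch added for illustration. It is not how ArchivesSpace or any library catalogue actually stores records; the Record class, its fields, and the titles are all hypothetical.]

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Record:
    """One descriptive record; children are instances described under it."""
    title: str
    kind: str
    children: List["Record"] = field(default_factory=list)

    def add(self, child: "Record") -> "Record":
        """Attach a child record and return it."""
        self.children.append(child)
        return child

    def outline(self, depth: int = 0) -> List[str]:
        """Return an indented listing of this record and everything under it."""
        lines = ["  " * depth + f"{self.title} [{self.kind}]"]
        for child in self.children:
            lines.extend(child.outline(depth + 1))
        return lines


# Hypothetical titles: one parent record with three child instances.
brut = Record("Dartmouth Brut", "work")
brut.add(Record("Manuscript on the shelf in Rauner", "manuscript"))
brut.add(Record("Digitized Brut", "digitization"))
brut.add(Record("Remix website", "website"))
print("\n".join(brut.outline()))
```

The point of the single parent is that a researcher who finds any one instance is a step away from all the others, rather than meeting two catalogue records that don’t know about each other.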

Michelle: This is so interesting! The fact that the MARC record for the digital Brut and the one for the book on the shelf in Rauner don’t refer to each other does feel strange, from the point of view of a person trying to study the manuscript.

Michelle: Who decides what gets archived?

Caitlin: I work on the technical side. Appraisal is done by Julia Logan, Assistant Archivist for Acquisitions, and by Peter Carini, College Archivist and Records Manager. Julia and I both report to Peter. The Brut is interesting because the book on the shelf is managed through a different chain of responsibility than the related digital objects. The manuscript is the responsibility of the rare books colleagues.

Michelle: This has been so fascinating! Can we write a blog post capturing this conversation?

[Caitlin said yes and here’s the post!]
