Library of Congress Announces Release of 4,240 New Web Archives on Loc.gov (Largest Single Release to Date)
From The Signal/Library of Congress:
The Library of Congress Digital Content Management Section is excited to announce the release of 4,240 new web archives across 43 event and thematic collections on loc.gov, our largest single release of web archives to date! Web archives such as Slate Magazine from 2002 to present, Elizabeth Mesa’s Iraq War blog, and Sri Lanka’s current president Maithripala Sirisena’s campaign website (no longer live on the web) are now waiting to be discovered alongside millions of other Library items. Keep watching The Signal for deeper dives into the unique collections with web archives now available on loc.gov. The Web Archiving Team sends its deepest gratitude to all involved in this significant achievement for the Library.
With over 20,000 web archives among 114 ongoing and finished collections, the scale of the Library’s web archive has grown significantly, presenting compelling new challenges for description along the way. To provide access at the same rate the archive continues to expand, the Web Archiving Team (WAT), representatives from Acquisitions and Bibliographic Access (ABA), and Web Services created an innovative new MPLP cataloging approach. The approach, known internally as the minimal-record approach, combines the descriptive talents of cataloging librarians with the power of Python scripting to automatically create MODS records.
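As a rough illustration of what scripted MODS creation can look like, here is a minimal sketch using Python's standard library. It is not the Library's actual script or schema: the function name `build_minimal_mods`, the choice of fields, and the placeholder URL are all assumptions made for this example. Only the MODS element names (`titleInfo`, `title`, `location`, `url`, `relatedItem`) and the MODS version 3 namespace come from the published MODS standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the MODS version 3 schema.
MODS_NS = "http://www.loc.gov/mods/v3"


def build_minimal_mods(title, url, collections):
    """Return a small MODS record as an XML string.

    Hypothetical helper for illustration only; the fields chosen here
    are an assumption, not the Library's minimal-record schema.
    """
    ET.register_namespace("mods", MODS_NS)
    mods = ET.Element(f"{{{MODS_NS}}}mods")

    # Title of the archived site.
    title_info = ET.SubElement(mods, f"{{{MODS_NS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS_NS}}}title").text = title

    # Address of the site that was archived.
    location = ET.SubElement(mods, f"{{{MODS_NS}}}location")
    ET.SubElement(location, f"{{{MODS_NS}}}url").text = url

    # One relatedItem per collection, so a single record can point to
    # every collection the web archive belongs to.
    for name in collections:
        related = ET.SubElement(
            mods, f"{{{MODS_NS}}}relatedItem", {"type": "host"}
        )
        related_title = ET.SubElement(related, f"{{{MODS_NS}}}titleInfo")
        ET.SubElement(related_title, f"{{{MODS_NS}}}title").text = name

    return ET.tostring(mods, encoding="unicode")


# Placeholder URL; the collection names are taken from the post.
record = build_minimal_mods(
    "Hark! A Vagrant",
    "http://example.org/",
    [
        "Webcomics Web Archive",
        "Small Press Expo Comic and Comic Art Web Archive",
    ],
)
print(record)
```

Listing collections as repeated `relatedItem` elements is just one way to let a single record serve a web archive that belongs to several collections; the complete blog post describes the schema the Library actually uses.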
The Library successfully implemented the minimal-record approach during its previous releases of the Federal Courts, International Tribunals, and Legislative Branch Web Archive collections. In planning subsequent releases, WAT saw that many web archives overlap between thematic collections. This is possible because of the way the Library collects and manages the collections when building them. For example, Hark! A Vagrant appears in both the Webcomics Web Archive and the Small Press Expo Comic and Comic Art Web Archive. In the current release, there are even more complicated examples, such as Beliefnet, which appears in three different collections curated by four different library units.
Learn More About the Minimal-Record Schema, View Screenshots, in Complete Blog Post
Direct to the Library of Congress–Web Archives
Direct to LOC Web Archiving FAQs
About Gary Price
Gary Price (gprice@gmail.com) is a librarian, writer, consultant, and frequent conference speaker based in the Washington D.C. metro area. He earned his MLIS degree from Wayne State University in Detroit. Price has won several awards, including the SLA Innovations in Technology Award and Alumnus of the Year from the Wayne State University Library and Information Science Program. From 2006 to 2009 he was Director of Online Information Services at Ask.com.