Curious about the text, GIS, or scientific datasets available through the University Libraries? It might be surprising, but the Digital Archive includes historical content that lends itself well to data-driven research. Inspired by the Collections as Data movement, the Libraries are now making these materials more visible through a new web portal.
The idea for a portal first arose from digital collections like the Senator Dean Heller Press Archive. Consisting of over three thousand press releases and statements, the Heller Archive posed a challenge: how can librarians give users enough information to find a press release without spending months reading the entire corpus of documents? The solution involved harvesting text from the files and then using text analysis to describe each item.
“In essence, we let machines read the documents for us,” said Nathan Gerth, Head of Digital Services. “It then occurred to us that we could make the full set of text files available to users as a dataset, something that we were able to do in the Digital Archive.” Four years later, the data package has been downloaded over a hundred and fifty times, making it one of the more popular items in our collections.
The success of the first dataset in the Digital Archive encouraged the teams working with similar collections to think bigger. The result was a portal that highlights two types of content. On the one hand, it directs users to the Data to Go materials. These datasets might include anything from text harvested from word-processing documents to transcripts created from audio or video files. On the other hand, the portal also highlights thirteen data-rich collections in the following categories: scientific, textual, and GIS. Some of these materials, like the James Edward Church snow survey notebooks, have only recently been digitized.
“We may think of the STEM disciplines as being solely forward thinking,” said Tara Radniecki, Head of DeLaMare Science and Engineering Library. “But the ability to analyze historic data, such as Dr. Church's snow sampling records, or to add metadata ourselves to primary historical documents, like georeferencing regional maps, can help researchers better understand what has changed and how we can move forward using what we've learned.”
University students, faculty, and staff can access the Collections as Data portal via the Libraries website. Users interested in viewing other unique digital materials can browse them in the Libraries’ Digital Archive.
The University Libraries embrace intellectual inquiry and innovation, nurture the production of new knowledge, and foster excellence in learning, teaching and research. During each academic year, the Libraries welcome more than 1.2 million visitors across their network of three libraries: the Mathewson-IGT Knowledge Center, the DeLaMare Science and Engineering Library and the Savitt Medical Library. Visitors check out more than 90,000 items and complete more than 2 million database searches.