Transparency, provenance and collections as data
The National Library of Scotland’s Data Foundry
Keywords: digital humanities, datasets, provenance, transparency, collections as data, digital scholarship
‘Collections as data’ has become a core activity for libraries in recent years: it is important that we make collections available in machine-readable formats to enable and encourage computational research. However, while this is a necessary output, discussion of the processes and workflows required to turn collections into data, and to make collections data available openly, is just as valuable. With libraries increasingly becoming producers of their own collections – presenting data from digitisation and digital production tools as part of datasets, for example – and making collections available at scale through mass-digitisation programmes, the trustworthiness of our processes comes into question. In a world of big data, often of unclear origins, how can libraries be transparent about the ways in which collections are turned into data, how do we ensure that biases in our collections are recognised and not amplified, and how do we make these datasets available openly for reuse? This paper presents a case study of work underway at the National Library of Scotland to present collections as data in an open and transparent way – from establishing a new Digital Scholarship Service, to workflows and online presentation of datasets. It considers the changes to existing processes needed to produce the Data Foundry, the National Library of Scotland’s open data delivery platform, and explores the practical challenges of presenting collections as data online in an open, transparent and coherent manner.
This work is licensed under a Creative Commons Attribution 4.0 International License.