Happy Open Data Day! Today we celebrate with some incredible collections as data. Cultural institutions, including galleries, libraries, archives and museums (GLAM), collect, preserve and share objects of research interest including books, diaries, artefacts, photographs, sound recordings, specimens and much more. These institutions work in innovative ways to digitise their collections and make them findable, accessible, interoperable and reusable. Below are some exciting projects that have digitised corpora, images and other objects for use as data by researchers and the public.
Late last year the Australian Data Archive released the Historical Census and Colonial Data Archive (HCCDA), a collection of Australian colonial census publications and reports covering the period from 1833 to 1901. The corpus includes 18,638 pages of text and 208 maps, all with full digital images, text conversion and individually identified pages and tables.
The Smithsonian just launched its new Open Access platform, releasing 2.8 million high-resolution two- and three-dimensional images into the public domain for use and downloading.
The Vatican Library recently unveiled Thematic Pathways on the Web, with narratives and over 26,000 annotations for important manuscripts in its collection. The technologies used in the project support comparative analysis of texts and images from different manuscripts, and make annotations easy to save and share.
The National Gallery of Victoria has digitised over 90 per cent of its collection for the public to access online, with more than 30,000 images of works in the public domain available to download at high resolution for free for publications and non-commercial use.
All these GLAM projects rely on standardised metadata to describe and link their objects so that they are discoverable. As a result, some collections are accessible via application programming interfaces (APIs). An API lets a program request and receive specific pieces of data from a large corpus of information, allowing you to explore the data far more deeply than a search engine or index can.
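To make the idea of requesting "specific bits of data" concrete, here is a minimal Python sketch. It is an illustration only: the endpoint URL, the parameter names (q, from, to, limit) and the shape of the response are all assumptions made up for this example and do not describe any particular institution's API. It builds a query URL and then reads record titles out of a sample JSON response, without contacting any server.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only; not a real collection API.
BASE_URL = "https://api.example.org/collection/search"


def build_query_url(keyword, year_from, year_to, limit=10):
    """Return a search URL asking for one specific slice of a collection."""
    # Parameter names are assumptions; a real API documents its own.
    params = {"q": keyword, "from": year_from, "to": year_to, "limit": limit}
    return f"{BASE_URL}?{urlencode(params)}"


def extract_titles(response_text):
    """Pull the record titles out of a JSON response body."""
    payload = json.loads(response_text)
    return [record["title"] for record in payload.get("records", [])]


# Made-up response in the shape this sketch assumes.
sample_response = json.dumps(
    {
        "records": [
            {"title": "Sample record A", "year": 1891},
            {"title": "Sample record B", "year": 1901},
        ]
    }
)

url = build_query_url("census", 1833, 1901)
titles = extract_titles(sample_response)

print(url)
print(titles)
```

In practice you would send the URL to the server with an HTTP client and pass the body of the reply to a function like extract_titles. Each institution's API documentation lists the real endpoints, parameters and response fields.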
A great way to learn and use APIs is via digital historian Tim Sherratt’s GLAM Workbench. Tim has developed tools, tutorials and examples to help researchers “find patterns, extract features, make connections” and more, with data from GLAM collections including Trove and Queensland State Archives.