Early English Print in the HathiTrust (ElEPHãT)
The ElEPHãT project uses Linked Data to enable scholarly investigation across dynamic collections combining Early English Books Online - Text Creation Partnership (EEBO-TCP) and the HathiTrust
The ElEPHãT project (Early English Print in HathiTrust, a Linked Semantic Worksets Prototype) demonstrates the use of Linked Data, via worksets, to combine information from independent collections into a coherent view. That combined view can then be studied and analyzed to support scholarly investigation of the constituent collections.
The project focuses on the potential symbiosis between two datasets:
- Early English Books Online - Text Creation Partnership (EEBO-TCP), a mature corpus of digitized content consisting of English texts from the first book printed through to 1700, with highly accurate, fully searchable, XML-encoded texts;
- a custom dataset from the HathiTrust Digital Library of all materials in English published between 1470 and 1700.
The project is a sub-award of the Mellon-funded Workset Creation for Scholarly Analysis (WCSA) project at the University of Illinois; within Oxford, it is a collaboration between the Oxford e-Research Centre and the Bodleian Libraries. The project is working towards several technical objectives:
- To generate RDF metadata for EEBO-TCP to complement the WCSA HathiTrust RDF;
- To identify suitable ontologies for encoding the EEBO-TCP RDF that can be usefully linked to the HathiTrust data, and other external entities;
- To identify and align co-references to entities within both datasets, and store these as RDF;
- To provide infrastructure to host the RDF datasets and SPARQL query interfaces;
- To create SPARQL queries of sufficient expressivity to parameterise worksets for scholarly investigation;
- To demonstrate the construction and utility of such parameterised worksets through prototype user interfaces, showing how a user might create and view a workset and the content within it.
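As a rough illustration of how the objectives above fit together, the following sketch describes records as triples, aligns co-references between two collections, and selects a workset by a date-range parameter. It is a plain-Python stand-in under stated assumptions, not the project's implementation: the project works with RDF stores and SPARQL, whereas here the records, the `example.org` URIs, the title-plus-year matching rule, and all function names are hypothetical. Only the Dublin Core Terms and `owl:sameAs` predicate URIs are real vocabulary terms.

```python
import re

# Real vocabulary terms (Dublin Core Terms, OWL). Everything under
# example.org is a made-up placeholder, not a project URI.
DCT_TITLE = "http://purl.org/dc/terms/title"
DCT_DATE = "http://purl.org/dc/terms/date"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Invented sample records standing in for the two collections.
EEBO = [
    {"uri": "http://example.org/eebo/A1", "title": "The Anatomy of Melancholy", "year": 1621},
    {"uri": "http://example.org/eebo/A2", "title": "Leviathan", "year": 1651},
]
HATHI = [
    {"uri": "http://example.org/ht/h1", "title": "Leviathan.", "year": 1651},
    {"uri": "http://example.org/ht/h2", "title": "Paradise Lost", "year": 1667},
]


def record_triples(record):
    """Describe one record as (subject, predicate, object) triples."""
    return [
        (record["uri"], DCT_TITLE, record["title"]),
        (record["uri"], DCT_DATE, str(record["year"])),
    ]


def match_key(record):
    """Crude co-reference key: lower-cased alphanumeric title plus year."""
    normalised = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return (normalised, record["year"])


def align(left, right):
    """Emit owl:sameAs triples for records whose keys agree."""
    index = {match_key(r): r["uri"] for r in right}
    return [
        (rec["uri"], OWL_SAME_AS, index[match_key(rec)])
        for rec in left
        if match_key(rec) in index
    ]


def to_ntriples(triples):
    """Serialise triples as N-Triples lines.

    Simplification: any object starting with "http://" is treated as a
    URI; everything else is written as a plain string literal.
    """
    lines = []
    for s, p, o in triples:
        if o.startswith("http://"):
            obj = f"<{o}>"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"<{s}> <{p}> {obj} .")
    return "\n".join(lines)


def workset(records, start_year, end_year):
    """Select member URIs whose year falls within the given range."""
    return [r["uri"] for r in records if start_year <= r["year"] <= end_year]


links = align(EEBO, HATHI)
members = workset(EEBO + HATHI, 1640, 1660)
print(to_ntriples(record_triples(EEBO[1]) + links))
print(members)
```

In a Linked Data setting, the `workset` function would more plausibly be a SPARQL query with the date range as its parameters, and the alignment step would need a far more careful matching strategy than normalised title and year.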