Centre and Micklem Lab awarded Wellcome grant for 'ISA-InterMine'

The Centre and the University of Cambridge have been awarded a Wellcome Technology Development Grant for a new three-year project: 'ISA-InterMine: accelerating and rewarding data sharing'.

The project is a collaboration between the Centre’s Associate Director Dr Susanna-Assunta Sansone and Dr Gos Micklem at the University of Cambridge.

InterMine is a data warehouse system developed by the Micklem group and already used as a framework by a number of model organism databases. The grant covers the continuing development of the InterMine framework, as well as the development of HumanMine, an extensive Homo sapiens genetics, genomics and proteomics resource built using InterMine.

Dr Sansone says, "We are delighted to work with Dr. Micklem's team in Cambridge and our collaborators at Springer Nature, Oxford University Press, EMBO Press and F1000 Research. Our aim is to create a set of tools to reward researchers for collecting metadata, by integrating further data to facilitate analysis as well as easing publication".

Biomedical research is expensive and technically difficult. Recent rapid developments in technology mean that studies often generate large and complex datasets, which can be reused in later studies, especially when integrated with each other. However, for effective reuse at scale, the datasets must themselves be described through “metadata”.

Recognising the importance of data reuse, the funders and publishers of research have recently defined policies requiring the creation of sufficient metadata. This is an additional burden for researchers, who are typically not experts in creating metadata. Over time, incentivising metadata collection will increase the ability to automatically find, collect, integrate and reuse data, so accelerating the rate of biomedical research.

This project builds on two well-established open projects that: (i) have international use and impact, (ii) are selected ELIXIR-UK Node resources, and (iii) are complementary and well aligned with the aims of the Wellcome Trust to promote open research.

ISA is an open-source metadata-tracking framework that facilitates standards-compliant collection, curation, management, publication and reuse of biomedical experiments.

InterMine is a model-based framework for large-scale data integration, allowing flexible querying and providing extensive web services in addition to a web interface. 

The project will further develop and combine ISA and InterMine into an integrated system implementing a workflow to accelerate and reward data-driven biomedical science: (i) A researcher starts by using a sophisticated web-based, cloud-hosted ISA authoring tool to describe their experiment, at a lower conceptual and technical cost than previously possible; (ii) They import the resulting ISA archive (containing richly described experiments, linked to their data files) into a cloud-based InterMine instance, in a seamless process. This triggers a workflow, exploiting the provided metadata, to map the experiment data onto InterMine’s life sciences data model; (iii) The researcher’s data is enriched via automatic integration of third-party data (e.g. mutant phenotypes, pathways) and through InterMine’s library of internal and third-party visualisation and analysis tools; (iv) The researcher can then share their InterMine instance, so facilitating collaboration; (v) Further reward comes from leveraging and extending ISA’s existing functionality to deposit datasets to public repositories, and by delivering new publishing methods, via virtual machines and data paper skeletons.
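For readers who think in code, steps (ii) and (iii) of the workflow above can be pictured roughly as follows. This is only an illustrative sketch and not the project's implementation: every class, function and field name here (ExperimentDescription, map_to_model, enrich, run_workflow, linked_data) is hypothetical, and neither the ISA nor the InterMine APIs are used.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExperimentDescription:
    """Hypothetical stand-in for the metadata carried by an ISA archive."""
    title: str
    organism: str
    data_files: List[str]
    annotations: Dict[str, str] = field(default_factory=dict)


def map_to_model(exp: ExperimentDescription) -> dict:
    """Step (ii): use the metadata to build a warehouse record (made-up shape)."""
    return {
        "class": "Experiment",
        "name": exp.title,
        "organism": exp.organism,
        "files": list(exp.data_files),
        "annotations": dict(exp.annotations),
    }


def enrich(record: dict, third_party: Dict[str, List[str]]) -> dict:
    """Step (iii): attach third-party data, matched here simply by organism name."""
    enriched = dict(record)
    enriched["linked_data"] = list(third_party.get(record["organism"], []))
    return enriched


def run_workflow(exp: ExperimentDescription,
                 third_party: Dict[str, List[str]]) -> dict:
    """Chain the mapping and enrichment steps."""
    return enrich(map_to_model(exp), third_party)


# Assumed example inputs, invented purely for illustration.
exp = ExperimentDescription(
    title="Wing development screen",
    organism="Drosophila melanogaster",
    data_files=["screen_results.csv"],
    annotations={"assay": "imaging"},
)
third_party = {"Drosophila melanogaster": ["mutant phenotypes", "pathways"]}

result = run_workflow(exp, third_party)
print(result["linked_data"])
```

The point of the sketch is that the richer the description supplied in step (i), the more the later steps can do automatically; an experiment with no matching third-party data simply gets an empty linked_data list.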