The Image Data Resource – example of an added-value database

At present, many repositories recommend but do not enforce the use of specific metadata items or formats when bioimaging data is submitted. In most cases, the file format is also left to the submitter's discretion. However, to make data Findable, Accessible, Interoperable, and Reusable (FAIR), a minimum set of criteria should be met as to how the data is curated before submission. Public data curated in this way gains added value from its rich, browsable and, where possible, machine-actionable metadata.

The Image Data Resource (IDR) has been developed as a collaboration between EMBL-EBI at Hinxton, UK, and the Open Microscopy Environment (OME) consortium at the University of Dundee. This repository holds imaging datasets from unrelated fields of research and across different scales, organisms and methods. For example, a large number of images originate from genetic perturbation screens. Because the annotations in the IDR are standardized and ontology-based, it is possible to perform data mining across the different datasets to generate new hypotheses or identify new features. This was showcased in the original IDR publication in Nature Methods (Williams et al., 2017).

Users can interact with the IDR via application programming interfaces (APIs) or through the integrated web browser interface, which is based on a customized OMERO image database. At the time of writing this article, the Image Data Resource holds more than 13 million images with a total size of over 300 terabytes.
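
As a rough illustration of the API access mentioned above, the following minimal Python sketch lists a few studies from the IDR. It rests on several assumptions that are not taken from this article: that the IDR exposes the OMERO JSON API under https://idr.openmicroscopy.org/api/v0, that studies are listed under the "projects" and "screens" endpoints, and that each entry carries "@id" and "Name" fields. The function names are hypothetical helpers chosen for this example, so the details should be checked against the current IDR documentation before use.

```python
import json
from urllib.request import urlopen

# Assumed base URL of the IDR's OMERO JSON API (not stated in this article).
BASE_URL = "https://idr.openmicroscopy.org/api/v0"


def build_url(kind, limit=5):
    """Build a listing URL; 'projects' and 'screens' are assumed endpoint names."""
    return f"{BASE_URL}/m/{kind}/?limit={limit}"


def extract_names(payload):
    """Return (id, name) pairs from a response shaped like
    {"data": [{"@id": ..., "Name": ...}, ...]} (assumed response layout)."""
    return [(item.get("@id"), item.get("Name")) for item in payload.get("data", [])]


def list_studies(kind="projects", limit=5):
    """Fetch one page of studies over HTTP and return their ids and names."""
    with urlopen(build_url(kind, limit), timeout=30) as response:
        return extract_names(json.load(response))
```

Calling list_studies("screens", limit=3) would request the first three screens and return their identifiers and names, provided the assumptions above hold. Keeping the parsing in a separate function makes it possible to test the logic without a network connection.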