Overview of Image Data Management Platforms for Microscopy Data

Microscopy images are a common type of data across research domains. Modern microscopy is a collection of many different imaging techniques of varying complexity. Microscopy data often has a large file size (e.g., data from whole-slide scanners, light sheet microscopes, or high-content imaging systems). In addition, image processing and analysis workflows are either intrinsic steps of image generation, or they are regularly applied to images to obtain quantitative and qualitative information from pixel intensity values. Given the various microscopy modalities and the large number of vendors of microscopy equipment, microscopy data is at present heterogeneous with respect to its file formats and metadata models. To enable full access to all image information, proprietary software is often required.

For researchers, organizing raw data and having a good strategy to document the full provenance throughout the data life cycle are important aspects of good scientific practice. Large file sizes and proprietary formats make data handling demanding. In particular, when data must be shared with other researchers in a collaborative setting, keeping track of who did what to the data and when, how data was transferred, and whether all steps are properly audited is a time-consuming challenge for many experimentalists.

To ease image data handling, several software platforms have been developed to help manage and/or analyse bioimaging data with different degrees of specialization. Using a suitable data management platform helps to organize, preview, annotate, and otherwise act on image data in a well-structured manner. According to the 2021 NFDI4BIOIMAGE community survey, many researchers acknowledge the benefits of data management systems but are undecided about the effort-to-benefit ratio of implementing such software (Schmidt & Hanne et al., 2022). Dedicated time and hardware resources are needed to set up a well-working data management platform. Hence, providing image data management as a central service for many researchers creates synergy.

Here, we provide an overview of some selected open-source software platforms for bioimaging data management. Some of these platforms are available in combination with commercial services, and in addition, many commercial providers offer software suites for data management and analysis. Commercial offers are not included here. Many of the introduced software platforms have shared hallmarks as they:

  • are suited for (large) imaging-specific data
  • are extensible to work with third-party software
  • provide organization and analysis functionalities
  • have open-source code and a maintainer community
  • have been successfully used in many research projects

I3D:bio's focus is on OMERO because, at the time of project planning, it was the best-established platform among the applicants, and it was generally reported to be the best known among (German) bioimaging scientists (Schmidt & Hanne et al., 2022).

OMERO – OME Remote Objects

Keywords:
Storage, visualisation, rendering, sharing, analysis, Bio-Formats, figure-creation, Tags, Key-Value Pairs, metadata, web-based

OMERO (OME Remote Objects) is an open-source software product developed by the Open Microscopy Environment (OME) consortium. It is an image data management system that enables users to

  • store and visualize bioimaging data
  • render and share data
  • analyse or connect to analysis software
  • enrich data with metadata (Key-Value pairs & Tags)
  • and more…

OMERO is available via the OME website: https://openmicroscopy.org

OMERO is freely available and has open-source code. The OMERO guides help with all aspects of using OMERO.

Who can install OMERO?

OMERO is usually installed on central IT hardware resources. An installation on virtual machines is possible, and OMERO can be installed in container environments.

In principle, anyone can install and run OMERO. In practice, however, OMERO requires IT support and a good implementation plan that includes core facility stakeholders and users.

Implementing OMERO from scratch at a university or institute can take a few months before all functions run smoothly and reliably. The reason is that the installation must be adapted to local IT environments. Communication between all relevant stakeholders is key. Hence, installing a new OMERO instance needs to be regarded as a project with a sufficient degree of organization.

OMERO is a mature image data management system.

+ great support from an international community using OMERO (e.g. on image.sc)

+ very good for organization of data and creating figures directly with original files

o New users need to get used to the object-oriented storage that does not allow deep folder hierarchies

o Making use of Key-Value Pairs and Tags might need to be reinforced by Data Stewards or Managers
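To illustrate the point about folder hierarchies: OMERO organizes images as Project > Dataset > Image, so folder levels deeper than that have to be expressed differently, for example as Tags. The following Python sketch shows one possible mapping. The function name, the example path, and the rule "first folder becomes the Project, second the Dataset, all deeper folders become Tags" are assumptions made for illustration; they are not an OMERO convention or part of any OMERO API.

```python
from pathlib import PurePosixPath


def map_folder_path(path):
    """Map a deep folder path onto a flat Project/Dataset/Image layout.

    Assumed rule (illustrative only): first folder -> Project,
    second folder -> Dataset, any deeper folders -> Tags.
    """
    parts = PurePosixPath(path).parts
    if len(parts) < 3:
        raise ValueError("expected at least project/dataset/image")
    *folders, image_name = parts
    return {
        "project": folders[0],
        "dataset": folders[1],
        "tags": list(folders[2:]),
        "image": image_name,
    }


# Hypothetical path from a lab file share
example = map_folder_path("LabA/Mouse_Liver/2023-05-01/confocal/sample01.czi")
print(example)
```

In this sketch, the information that used to live in nested folder names is kept as Tags, which can then be used for filtering across Datasets.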

OME Remote Objects (OMERO) is open-source software developed by the Open Microscopy Environment Consortium (OME). The software consists of a central middleware (OMERO.server) that coordinates all data management tasks of its different components (Allan, 2012). Images can be safely stored on a separate disk that is mounted to the OMERO server. Users thus cannot accidentally corrupt the original image data. OMERO reads the original data files and translates them on the fly into the open OME data model, on which the OME-TIFF file format is based, using the Bio-Formats translation library. Recent versions of OMERO also support the new OME-NGFF file format. With the help of a database, the image data is organized, displayed depending on user access rights and user settings, and can be annotated with various metadata. For basic image analysis, OMERO comes with built-in functions. Extensions and plug-ins extend the functionality. With OMERO.figure, users can create publication-ready figures that are directly linked to the original raw data: only the rendering settings change, and no images are duplicated or exported to lossily compressed copies.

According to the 2021 NFDI4BIOIMAGE community survey, OMERO was the best-known and most widely used image data management platform among core facilities and researchers in Germany (Schmidt & Hanne et al., 2022). Building on the open-source code, a large community of maintainers and developers around OMERO contributes new functions, e.g., the Metadata Editor OMERO.mde (Kunis et al., 2021), or scripts to upload CSV tables as metadata annotations. Researchers can use OMERO via a web browser (OMERO.web). Introduction videos about OMERO can be found online (e.g., in the Global BioImaging Workshop from January 2022).
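To give an idea of what turning a CSV table into Key-Value Pair annotations involves, here is a minimal Python sketch using only the standard library. It does not connect to an OMERO server, and it is not one of the community scripts mentioned above, which may work differently. The column name `image_name`, the sample table, and the function name are hypothetical.

```python
import csv
import io


def csv_to_key_value_pairs(csv_text, id_column="image_name"):
    """Return {image name: [[key, value], ...]} from a CSV table.

    Each row describes one image; every other non-empty column becomes
    one key-value pair. The column layout is an assumption.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    annotations = {}
    for row in reader:
        name = row[id_column]
        pairs = [[key, value] for key, value in row.items()
                 if key != id_column and value]
        annotations[name] = pairs
    return annotations


# Hypothetical metadata table; the empty "stain" cell is skipped
table = """image_name,organism,stain,objective
sample01.czi,Mus musculus,DAPI,63x
sample02.czi,Mus musculus,,40x
"""

kv_pairs = csv_to_key_value_pairs(table)
for image, pairs in kv_pairs.items():
    print(image, pairs)
```

The resulting lists of pairs would still have to be attached to the matching images on the server, a step that is deliberately left out here.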

BisQue (Bioimage Semantic Query User Environment)

Keywords:
Storage, visualisation, rendering, sharing, analysis, Bio-Formats, metadata, web-based

The web platform BisQue was developed at the University of California, Santa Barbara (UCSB), as a combined image organization and image analysis platform (Kvilekval et al., 2010). UCSB offers an online BisQue platform (https://bisque2.ece.ucsb.edu/client_service/). BisQue can be used to

  • store and visualize bioimaging data
  • render and share data
  • analyse or connect to analysis software
  • enrich data with metadata

Official website: https://bioimage.ucsb.edu/bisque

BisQue is freely available and has open-source code.

For technical details, please review the official website.

According to community exchange in the image.sc forum, BisQue appears to be used less than OMERO overall, which is in line with the results of the 2021 NFDI4BIOIMAGE community survey (Schmidt & Hanne et al., 2022).

In BisQue (Bioimage Semantic Query User Environment), images are stored in original file formats and converted to an open file format upon presentation to the user. For basic quantitative image analysis, BisQue offers modules, e.g., to make measurements in images in the web browser. Several modules exist to enable compatibility of BisQue with external analysis software.

An introduction video from 2016 about the features and use of BisQue is available on YouTube (https://www.youtube.com/watch?v=D5Aktit4h9o). Find more information on the BisQue website.

XNAT (Extensible Neuroimaging Archive Toolkit)

Keywords:
Storage, visualisation, rendering, sharing, analysis, metadata, DICOM, medical imaging, pre-clinical imaging

XNAT originated from the Neuroinformatics Research Group at the Washington University School of Medicine (Marcus et al., 2007). Originally named eXtensible Neuroimaging Archive Toolkit, the XNAT platform has developed over the years into a versatile data management tool that accepts imaging and non-imaging data. Yet, XNAT is predominantly used in the field of neuroimaging, in preclinical and clinical contexts where imaging modalities often produce data in DICOM format (e.g., MRI, ultrasound, PET and CT data). XNAT offers options for data anonymization.

Find more information on XNAT.org.

XNAT is an open-source imaging informatics platform. In XNAT, data is organized into projects, in which authorized persons can view, quality-check and annotate image data. Depending on the setup, XNAT can be used in multi-center and multi-group collaborations to share data. Data can be accessed via several application programming interfaces (APIs) so that processing and analysis can be performed.
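As a small illustration of project-based organization combined with API access, the Python sketch below assembles a resource URL for a project, a subject, or an imaging session. To our knowledge, the path pattern `/data/projects/.../subjects/.../experiments/...` follows the layout of the XNAT REST API, but it should be checked against the documentation of the installed XNAT version. The host name and all identifiers are made up, and the function name is hypothetical. The sketch only builds the URL; it does not send a request or handle authentication.

```python
from urllib.parse import quote


def xnat_resource_url(base_url, project, subject=None, experiment=None):
    """Build a URL for a project, subject, or experiment (imaging session).

    The path layout is assumed from the XNAT REST API conventions.
    """
    segments = ["data", "projects", project]
    if subject is not None:
        segments += ["subjects", subject]
        if experiment is not None:
            segments += ["experiments", experiment]
    elif experiment is not None:
        raise ValueError("an experiment needs a subject in this sketch")
    # Percent-encode each identifier so that special characters are safe
    path = "/".join(quote(segment, safe="") for segment in segments)
    return f"{base_url.rstrip('/')}/{path}"


# Hypothetical server and identifiers
url = xnat_resource_url("https://xnat.example.org", "MouseStudy", "M001", "M001_MR1")
print(url)
```

An analysis script could request such a URL with an HTTP client of choice, using the credentials and permissions that the project grants to the user.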

Since the underlying code is open source, a large community of contributors exists. For example, researchers at the University of Torino (and members of the Euro-BioImaging ERIC) have developed an extension of XNAT intended for use in preclinical imaging centers, termed XNAT-PIC (Zullino et al., 2022), which was also presented in a talk at Euro-BioImaging's Virtual Pub in 2021.

The XNAT software is a versatile tool that can be installed as a stand-alone application on researchers' computers or be set up as a central or multi-center instance for collaborative research. XNAT is often used in medical and preclinical imaging, but it is suitable for microscopy data, too.

XNAT Central and the Open Access Series of Imaging Studies are examples of repositories based on XNAT that publicly share imaging data.

CATMAID (Collaborative Annotation Toolkit for Massive Amounts of Image Data)

Keywords:
Storage, visualisation, rendering, sharing, metadata, neurobiology

The Collaborative Annotation Toolkit for Massive Amounts of Image Data (CATMAID) was developed by Saalfeld et al. (2009) and supports collaborative work on large image data sets. A feature of CATMAID is that it can load data from other internet-accessible storage locations without duplicating the data. It has been used frequently in neurobiology research projects, for example to map neuronal connectomes from large EM datasets of different animals and developmental stages.

Find more information on the CATMAID website.

CATMAID can be installed as a central instance or used by individual labs.

There are also public image resources based on CATMAID, for example VirtualFlyBrain.org.

Cytomine

Keywords:
Storage, visualisation, rendering, sharing, metadata, histology, pathology

Development of this software started in 2010 at the University of Liège as a platform for histopathology image analysis. The open-source project is publicly accessible at https://cytomine.org (not to be confused with cytomine.com, which is a for-profit corporation selling services on top of the software). Since Cytomine was developed primarily for histology, the software is mostly known for this type of data. However, the functionality of Cytomine extends well beyond the needs of histology, and it can be used for bioimaging data management in general (or for images from a broad range of research disciplines, also beyond the life sciences).

The software consists mainly of four components:

  • Cytomine-Core is a web and database software that manages projects, image annotations and metadata annotations.
  • Cytomine Image Management System is the backend server software that performs image operations (e.g., loading and working on images stored in the underlying file system).
  • Cytomine Web User Interface is the application that allows users to interact with their data in a web browser.
  • Cytomine DataMining adds image analysis modules to the platform.

An introduction to Cytomine, including its current and upcoming features, was given by Raphael Marée in a Euro-BioImaging Virtual Pub in July 2022.

Cytomine has increasingly focused on integrating images from multiple imaging modalities (e.g., correlated light and electron microscopy) and on incorporating them together in image analysis workflows. Cytomine can be installed as a central service at an institution or run by single users on their desktop PC or laptop. The Cytomine developers are creating a web platform for access to data through the internet, so that annotating images and running image analysis algorithms can be done from remote locations and by many collaboration partners. Natively, the software supports many 2D file formats and the pyramidal (multi-resolution) formats that are frequently used in histology. Via Bio-Formats and other libraries, additional file formats can be loaded after pre-conversion steps.
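For readers unfamiliar with pyramidal images, the following Python sketch computes the resolution levels of a simple image pyramid: each level halves the width and height of the previous one until the whole image fits into a single tile. The halving scheme and the tile size of 256 pixels are common conventions assumed here for illustration; they are not taken from the Cytomine documentation, and the function name is hypothetical.

```python
def pyramid_levels(width, height, tile_size=256):
    """Return [(level, width, height), ...] for a halving image pyramid.

    Level 0 is the full resolution. Levels are added until the image
    fits into one tile of tile_size x tile_size pixels.
    """
    levels = []
    level = 0
    while True:
        levels.append((level, width, height))
        if width <= tile_size and height <= tile_size:
            break
        # Halve both dimensions, rounding up so no pixels are lost
        width = max(1, (width + 1) // 2)
        height = max(1, (height + 1) // 2)
        level += 1
    return levels


# Hypothetical whole-slide image of 100,000 x 80,000 pixels
for level, w, h in pyramid_levels(100_000, 80_000):
    print(f"level {level}: {w} x {h} px")
```

A viewer that works with such a pyramid only needs to load the tiles of the level that matches the current zoom, which is why these formats are convenient for very large histology scans viewed in a web browser.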

For detailed information about Cytomine, read the original publications (Marée, 2016, Bioinformatics; Rubens, 2019, Proteomics Clin Appl). Some parts of Cytomine have been redesigned in recent years.