Table 2 Three tiers of metadata within Majora

From: CLIMB-COVID: continuous integration supporting decentralised sequencing for SARS-CoV-2 genomic surveillance

| Tier | Implementation | Properties | Example |
| --- | --- | --- | --- |
| Primary | Database model | Fast queries via object-relational mapping; Takes up space in database even if unused; Significant work to add to the database model, API and user templates | Biosample identifier; Patient sex, age; Digital resource file path, size, hash |
| Secondary | Database model | Fast queries via object-relational mapping; Additional lookups necessary to link back to the primary database model; Cannot assume a primary model will have a secondary | Cycle threshold metrics for biosamples; BAM coverage metrics; Patient healthcare worker or care home status |
| Tertiary | Key-value row in generic model | More difficult to manage artifacts based on tagged properties alone; Highly flexible; No work required to add new tags at any time | Locally relevant tags not implemented in a model; Additional anonymised patient information; Additional sequencing run information |
  1. Majora stores submitted metadata about artifacts and processes in an SQL database. Metadata is stored differently depending on its priority. Fields that are a core part of a model (for example, a sample identifier or the name of a file) are considered primary metadata and are stored in a distinct database model. Metrics, such as the results of a PCR Ct test or the coverage levels of a BAM file, are treated as secondary metadata: they are also stored in a distinct database model and are attached to primary models through a database foreign key. Arbitrary metadata can then be stored as key-value pairs (not backed by any particular database model) and tagged to primary and secondary models as appropriate.
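
The three tiers described above can be illustrated with a minimal sketch using Python's built-in `sqlite3` module. This is not Majora's actual schema: the table names (`biosample`, `ct_metric`, `metadata_tag`), the column names, the helper functions and the sample values are all hypothetical and chosen only to show how a primary model, a foreign-key-linked secondary model and a generic key-value tag table might relate to each other.

```python
import sqlite3

# Hypothetical schema; names are illustrative, not taken from Majora.
SCHEMA = """
CREATE TABLE biosample (            -- primary tier: core fields are columns
    id INTEGER PRIMARY KEY,
    sample_id TEXT NOT NULL UNIQUE,
    patient_sex TEXT,
    patient_age INTEGER
);
CREATE TABLE ct_metric (            -- secondary tier: linked by foreign key
    id INTEGER PRIMARY KEY,
    biosample_id INTEGER NOT NULL REFERENCES biosample(id),
    ct_value REAL NOT NULL
);
CREATE TABLE metadata_tag (         -- tertiary tier: generic key-value rows
    id INTEGER PRIMARY KEY,
    target_table TEXT NOT NULL,     -- which model the tag is attached to
    target_id INTEGER NOT NULL,     -- row id within that model
    tag_key TEXT NOT NULL,
    tag_value TEXT
);
"""


def build_example():
    """Create an in-memory database with one made-up record per tier."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    cur = conn.execute(
        "INSERT INTO biosample (sample_id, patient_sex, patient_age) "
        "VALUES (?, ?, ?)",
        ("SAMPLE-001", "F", 42),
    )
    biosample_pk = cur.lastrowid
    conn.execute(
        "INSERT INTO ct_metric (biosample_id, ct_value) VALUES (?, ?)",
        (biosample_pk, 21.5),
    )
    # A new tag key needs no schema change: it is just another row.
    conn.execute(
        "INSERT INTO metadata_tag (target_table, target_id, tag_key, tag_value) "
        "VALUES (?, ?, ?, ?)",
        ("biosample", biosample_pk, "local_ward_code", "W7"),
    )
    conn.commit()
    return conn


def fetch_sample(conn, sample_id):
    """Gather all three tiers of metadata for one sample, or return None."""
    row = conn.execute(
        "SELECT id, sample_id, patient_sex, patient_age "
        "FROM biosample WHERE sample_id = ?",
        (sample_id,),
    ).fetchone()
    if row is None:
        return None
    pk = row[0]
    # Secondary metadata needs an extra lookup and may be absent.
    ct_values = [
        r[0]
        for r in conn.execute(
            "SELECT ct_value FROM ct_metric WHERE biosample_id = ? ORDER BY id",
            (pk,),
        )
    ]
    # Tertiary metadata is found by matching the generic target columns.
    tags = dict(
        conn.execute(
            "SELECT tag_key, tag_value FROM metadata_tag "
            "WHERE target_table = 'biosample' AND target_id = ?",
            (pk,),
        ).fetchall()
    )
    return {
        "sample_id": row[1],
        "patient_sex": row[2],
        "patient_age": row[3],
        "ct_values": ct_values,
        "tags": tags,
    }
```

In this sketch, adding a new primary field would require altering the `biosample` table, whereas adding a new tag is only an insert into `metadata_tag`. That mirrors the trade-off in the table between query convenience and flexibility.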