Fig. 1 | Genome Biology

Fig. 1

From: Storing and analyzing a genome on a blockchain

SAMchain design and implementation. a Overview of the SAMchain network ecosystem. The network consists of owner, sequencer, clinician, and researcher nodes. The owner node builds the SAMchain, and the sequencer node accesses the chain and inserts SAM data into it. The clinician and researcher nodes access the SAMchain and analyze the on-chain SAM data. b Details of data storage in SAMchain. A read is typically stored in a SAM file as a record containing several features. Our data structure is organized by genomic location. A single stream, named metaData, contains all of the header data and other chain information. Many other streams serve as bins by genomic location and hold the SAM feature data and MODCIGAR. A FLANK feature indicates whether a read's position spans two consecutive bins. Each stream item corresponds to a single read. A single stream, named unmappedANDcontigs, stores unmapped reads and contigs. c Overview of the query process. Upon querying a genomic location, our algorithm searches through the binned streams to obtain the SAM data and MODCIGAR features corresponding to the specified location. These data, in combination with a reference genome, yield a complete SAM read. Our algorithms and stream-based data structures are built on top of MultiChain, which provides the underlying blockchain, stream design, and network configuration.
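The binning and query logic described in panels b and c can be illustrated with a minimal in-memory sketch. This is not the SAMchain implementation and makes no MultiChain calls: plain Python lists stand in for streams, and the bin width (`BIN_SIZE`), the stream naming scheme (`stream_name`), the item layout, and the choice to store a flanking read in every bin it overlaps are all assumptions made for illustration. The `MODCIGAR` values are placeholders, since the caption does not describe that encoding, and the final step of rebuilding a full SAM read from a reference genome is omitted.

```python
from collections import defaultdict

BIN_SIZE = 1000  # hypothetical bin width in bases (not stated in the caption)


def stream_name(chrom, bin_index):
    """Hypothetical naming scheme for a binned stream."""
    return f"{chrom}:{bin_index}"


def insert_read(streams, qname, chrom, pos, ref_len, features):
    """Add one read as a stream item; pos is 1-based, ref_len is bases covered."""
    if chrom is None:  # unmapped reads go to the dedicated stream
        streams["unmappedANDcontigs"].append({"qname": qname, **features})
        return
    first = (pos - 1) // BIN_SIZE
    last = (pos + ref_len - 2) // BIN_SIZE
    item = {"qname": qname, "pos": pos, "ref_len": ref_len,
            "FLANK": last > first, **features}
    # Assumption: a flanking read is stored in each bin it overlaps.
    for b in range(first, last + 1):
        streams[stream_name(chrom, b)].append(item)


def query(streams, chrom, start, end):
    """Return reads overlapping the 1-based inclusive interval [start, end]."""
    hits = {}
    first_bin = (start - 1) // BIN_SIZE
    last_bin = (end - 1) // BIN_SIZE
    for b in range(first_bin, last_bin + 1):
        for item in streams.get(stream_name(chrom, b), []):
            read_end = item["pos"] + item["ref_len"] - 1
            if item["pos"] <= end and read_end >= start:
                hits[item["qname"]] = item  # de-duplicate flanking reads
    return sorted(hits.values(), key=lambda r: r["pos"])


streams = defaultdict(list)
streams["metaData"].append({"header": "@HD\tVN:1.6"})
# MODCIGAR values below are placeholders, not the real encoding.
insert_read(streams, "r1", "chr1", 950, 100, {"MODCIGAR": "placeholder"})
insert_read(streams, "r2", "chr1", 1500, 100, {"MODCIGAR": "placeholder"})
insert_read(streams, "r3", None, 0, 0, {})

result = query(streams, "chr1", 1001, 1100)
print([r["qname"] for r in result])
```

In this sketch, read `r1` starts in one bin and ends in the next, so it carries `FLANK = True` and is found by a query that touches only the second bin. Storing the read once and consulting the neighbouring bin at query time would be an equally plausible reading of the caption.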
