Team:Concordia-Montreal/Database

Astroyeast - Accelerating outer space exploration through synthetic biology

...

Database Modelling

We used the most popular database for modern apps as our backend.

Introduction

MongoDB

MongoDB is part of the backend software of AstroBio. It is a document-oriented NoSQL database designed for high-volume data storage. Instead of the tables and rows used in relational databases, AstroBio makes use of collections and documents. A collection holds a set of documents and is the equivalent of a table in a traditional relational database (MongoDB Inc., 2020).

iGEM Concordia 2020 Database

The iGEM Concordia 2020 database contains collections, which in turn contain documents. Documents can have varying numbers of fields. This structure mirrors how classes and objects are typically constructed. The MongoDB data model allows us to represent hierarchical relationships and to store arrays easily. Moreover, the environment is highly scalable.
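As a rough illustration of the document model described above, the sketch below represents a collection as a plain Python list of dictionaries. The field values are hypothetical placeholders, not records from the actual database; only the ideas of varying fields, nested documents, and arrays come from the text.

```python
# A hypothetical collection: a list of documents (dicts) with varying fields.
collection = [
    {"_id": 1, "accession": "GSE4136", "treatment": "placeholder"},
    {
        "_id": 2,
        "accession": "GSE50881",
        # Nested document: a hierarchical relationship stored in place.
        "Gene": {"symbol": "placeholder", "title": "placeholder"},
        # Array field: several values stored under one key.
        "PMIDs": ["placeholder-1", "placeholder-2"],
    },
]


def field_names(documents):
    """Return the set of top-level field names used across all documents."""
    names = set()
    for doc in documents:
        names.update(doc.keys())
    return names


all_fields = field_names(collection)
```

The two documents share only some of their fields, which a relational table would not allow without empty columns.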

Collections

metaData

The metaData collection contains experiment information, which includes the following fields: Accessions, Treatment, Description, Experimenter, PMID, Institution, Assay Type, Design, Summary, and finally the URL of the study.

Metadata Schema

{
  "_id" : "",
  "field" : "",
  "accession" : "",
  "treatment" : "",
  "description" : "",
  "Link" : "",
  "Experimenter" : "",
  "Contact" : "",
  "Title" : "",
  "URL" : "",
  "PMIDs" : "",
  "Institute" : "",
  "Design" : "",
  "PlatformID" : "",
  "Type" : "",
  "Summary" : ""
}
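To make the schema concrete, here is a minimal sketch of a helper that builds a metaData document with every schema field present. The helper name `new_metadata_document` and the sample values are hypothetical; only the field names are taken from the Metadata Schema above.

```python
# Field names copied from the Metadata Schema above.
METADATA_FIELDS = [
    "_id", "field", "accession", "treatment", "description", "Link",
    "Experimenter", "Contact", "Title", "URL", "PMIDs", "Institute",
    "Design", "PlatformID", "Type", "Summary",
]


def new_metadata_document(**values):
    """Build a metaData document, defaulting every schema field to ""."""
    unknown = set(values) - set(METADATA_FIELDS)
    if unknown:
        # Reject fields that the schema does not list.
        raise KeyError(f"fields not in the schema: {sorted(unknown)}")
    doc = {name: "" for name in METADATA_FIELDS}
    doc.update(values)
    return doc


# Hypothetical usage: the values here are placeholders.
doc = new_metadata_document(accession="GSE4136", Institute="placeholder")
```

Rejecting unknown fields is a design choice of this sketch; MongoDB itself does not enforce a schema unless validation rules are configured.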



geneResults

The geneResults collection contains basic gene information, as well as gene ontology information, gene annotations, and differential expression analysis results.

Available Schemas

GSE4136 Schema

{
  "_id" : " ",
  "ID" : " ",
  "adj" : { "P" : { "Val" : " " } },
  "P" : { "Value" : " " },
  "t" : " ",
  "B" : " ",
  "logFC" : " ",
  "Gene" : { "symbol" : " ", "title" : " " },
  "Platform_ORF" : " ",
  "GO" : { "Function" : " ", "Process" : " ", "Component" : " " },
  "Chromosome" : { "annotation" : " " },
  "EGEOD" : " ",
  "Organism" : " ",
  "Species" : " ",
  "Strain" : " ",
  "StudyType" : " ",
  "AssayType" : " ",
  "Gen" : " ",
  "EssentialforFlight" : " ",
  "meta_data" : " "
}

GSE40648 and GSE95388 Schemas

{
  "_id" : " ",
  "adj" : { "P" : { "Val" : " " } },
  "P" : { "Value" : " " },
  "t" : " ",
  "B" : " ",
  "logFC" : " ",
  "Gene" : { "symbol" : " ", "title" : " " },
  "Platform_ORF" : " ",
  "GO" : { "Function" : " ", "Process" : " ", "Component" : " " },
  "Chromosome" : { "annotation" : " " },
  "EGEOD" : " ",
  "Organism" : " ",
  "Species" : " ",
  "Strain" : " ",
  "StudyType" : " ",
  "AssayType" : " ",
  "Gen" : " ",
  "meta_data" : " "
}

GSE50881 Schema

{
  "_id" : " ",
  "adj" : { "P" : { "Val" : " " } },
  "P" : { "Value" : " " },
  "AveExpr" : " ",
  "F" : " ",
  "logFC" : " ",
  "Gene" : { "symbol" : " ", "title" : " " },
  "Platform_ORF" : " ",
  "GO" : { "Function" : " ", "Process" : " ", "Component" : " " },
  "Chromosome" : { "annotation" : " " },
  "EGEOD" : " ",
  "Organism" : " ",
  "Species" : " ",
  "Strain" : " ",
  "StudyType" : " ",
  "AssayType" : " ",
  "Gen" : " ",
  "meta_data" : " "
}
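The geneResults schemas above differ only in a few fields. The sketch below compares their top-level field names to show which fields are shared and which are specific to a dataset. It assumes that adj, P, Gene, GO, and Chromosome are sub-documents and that every other key sits at the top level, which is one reading of the schemas as listed.

```python
# Top-level fields that appear in every schema listed above (assumed reading).
COMMON = {
    "_id", "adj", "P", "logFC", "Gene", "Platform_ORF", "GO", "Chromosome",
    "EGEOD", "Organism", "Species", "Strain", "StudyType", "AssayType",
    "Gen", "meta_data",
}

# Each dataset adds its own statistics or annotation fields.
SCHEMAS = {
    "GSE4136": COMMON | {"ID", "t", "B", "EssentialforFlight"},
    "GSE40648": COMMON | {"t", "B"},
    "GSE95388": COMMON | {"t", "B"},
    "GSE50881": COMMON | {"AveExpr", "F"},
}

# Fields present in all schemas.
shared = set.intersection(*SCHEMAS.values())

# Fields that each dataset has beyond the shared set.
specific = {name: sorted(fields - shared) for name, fields in SCHEMAS.items()}
```

A query that should work across all datasets would be limited to the fields in `shared`, such as logFC or the nested adj and P values.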



reducedGenes

The reducedGenes collection is the output of a map-reduce operation that splits genes with multiple names and maps each name to the same statistical analysis results.

Map-Reduce Schema

{ "_id" : " ", "value" : " " }
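The map-reduce idea can be sketched in plain Python. The source does not show the team's actual map and reduce functions, so everything here is an assumption made for illustration: the `///` separator between alternative gene names, the use of the result `_id` as the emitted value, and the comma-joined string as the reduced value. Only the `_id` and `value` output shape comes from the Map-Reduce Schema above.

```python
from collections import defaultdict

SEPARATOR = "///"  # assumption: delimiter between alternative gene names


def map_gene(doc):
    """Map step: emit one (name, result id) pair per gene name in the document."""
    for name in doc["Gene"]["symbol"].split(SEPARATOR):
        name = name.strip()
        if name:
            yield name, doc["_id"]


def reduce_ids(ids):
    """Reduce step: collapse all result ids emitted for one name into one value."""
    return ",".join(str(i) for i in ids)


def map_reduce(documents):
    """Group the mapped pairs by gene name, then reduce each group."""
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_gene(doc):
            grouped[key].append(value)
    return [
        {"_id": key, "value": reduce_ids(values)}
        for key, values in grouped.items()
    ]


# Hypothetical input documents with placeholder gene names.
docs = [
    {"_id": "r1", "Gene": {"symbol": "geneA///geneB"}},
    {"_id": "r2", "Gene": {"symbol": "geneB"}},
]
reduced = map_reduce(docs)
```

Each output document is keyed by a single gene name, so a lookup by any alias leads to the same analysis results.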

References

MongoDB Inc. (2020). Retrieved from https://www.mongodb.com/