Biomedical Term Service Help

Data Structure

The CV-BTS is essentially an API wrapper around a pre-built database structure. Understanding that structure is the key to understanding what the system can do and the logic behind it.


The two databases powering BTS are:

  • Neo4j, storing all the term IDs as nodes, and the connections between them (parent-child, replaced-by, etc.) as relationships. It handles term traversal, expansion and similarity search.

  • MongoDB, storing the term details, such as label, description, synonyms, etc. It is used for term auto-completion and for retrieving term details.

The term IDs are stored in both databases and act as the common key that links the term details in MongoDB to the relationships in Neo4j.
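The split between the two stores can be illustrated with a minimal sketch. It does not use the real BTS schema or either database driver: plain Python dictionaries stand in for Neo4j and MongoDB, and every term ID, label, field name and function name below is hypothetical. It only shows how a shared term ID lets a traversal result from the graph side be joined to details from the document side.

```python
# Illustrative only: plain dictionaries stand in for the two stores.
# All IDs, labels, field names and relationship names are made up for this sketch.

# Graph side (the role Neo4j plays): term IDs as nodes, typed relationships as edges.
GRAPH = {
    "EX:0001": {"child_of": []},
    "EX:0002": {"child_of": ["EX:0001"]},
    "EX:0003": {"child_of": ["EX:0002"]},
    "EX:0004": {"child_of": ["EX:0001"], "replaced_by": ["EX:0002"]},
}

# Document side (the role MongoDB plays): term details keyed by the same term ID.
DETAILS = {
    "EX:0001": {"label": "Root term", "synonyms": []},
    "EX:0002": {"label": "Middle term", "synonyms": ["Intermediate term"]},
    "EX:0003": {"label": "Leaf term", "synonyms": ["Terminal term"]},
    "EX:0004": {"label": "Obsolete term", "synonyms": []},
}


def ancestors(term_id):
    """Walk child_of edges upward and return every ancestor ID, nearest first."""
    seen = []
    queue = list(GRAPH[term_id].get("child_of", []))
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(GRAPH[current].get("child_of", []))
    return seen


def autocomplete(prefix):
    """Return IDs whose label or any synonym starts with the prefix (case-insensitive)."""
    p = prefix.lower()
    return sorted(
        tid
        for tid, doc in DETAILS.items()
        if any(name.lower().startswith(p) for name in [doc["label"], *doc["synonyms"]])
    )


def expand_with_details(term_id):
    """Traverse on the graph side, then join to details via the shared term ID."""
    return [{"id": tid, **DETAILS[tid]} for tid in ancestors(term_id)]
```

In this sketch, `autocomplete` touches only the document side, `ancestors` touches only the graph side, and `expand_with_details` needs both, which is why the term ID has to be present in each store.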

Ontologies, Genes and other term sets

The BTS contains multiple sets of controlled vocabulary terms to support different use cases. The term sets currently supported, together with those marked as planned, are:

  • Human Phenotype Ontology (HPO)

  • Orphanet Rare Disease Ontology (ORDO)

  • Online Mendelian Inheritance in Man (OMIM) (planned)

  • SNOMED CT, SNOMED UK Edition, SNOMED UK Drug Extension, SNOMED UK Clinical Extension (SNOMED)

  • National Cancer Institute Thesaurus (NCIT) (planned)

  • HUGO Gene Nomenclature Committee (HGNC) symbols and IDs

  • National Center for Biotechnology Information (NCBI) Gene IDs (planned)

  • Reactome Pathways (planned)

Last modified: 09 October 2024