Because names for biomedical entities can change often (e.g., HGNC gene symbols are revised frequently), it is much safer to reference them using stable identifiers. However, humans prefer names to identifiers when interacting with data and knowledge, so there needs to be a fast, unified way to resolve identifiers to names.
Ontologies like the Gene Ontology can be accessed through one of several unified lookup services such as the OLS, AberOWL, OntoBee, and BioPortal. However, these services can only be used for biomedical entities appearing in ontologies, and not for other important nomenclatures such as HGNC, UniProt, or PubChem. Alternatively, small databases like SwissProt (i.e., the reviewed portion of UniProt entries) can be exported and wrapped in small packages like protmapper that provide easy lookup of names based on identifiers. Larger databases like PubChem Compound and dbSNP are instead accessed through a programmatic API because they cannot be easily exported or quickly loaded into memory.
The Biolookup Service is a unified platform that covers not only biomedical entities in ontologies, but also those in both small and large databases.
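The idea of a single lookup that spans several resources can be illustrated with a minimal sketch. It assumes identifiers are written as compact `prefix:identifier` strings and that each resource has already been loaded into an in-memory table. The `NAMES` table and the `parse_curie` and `lookup_name` functions are hypothetical names chosen for illustration and are not the Biolookup Service's actual implementation or API.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory tables, one per resource prefix. A real service
# would populate these from ontology files and database exports.
NAMES: Dict[str, Dict[str, str]] = {
    "go": {"0006915": "apoptotic process"},
    "hgnc": {"1100": "BRCA1"},
}


def parse_curie(curie: str) -> Tuple[str, str]:
    """Split a 'prefix:identifier' string into its two parts."""
    prefix, sep, identifier = curie.partition(":")
    if not sep or not prefix or not identifier:
        raise ValueError(f"not a prefix:identifier string: {curie!r}")
    # Prefixes are compared case-insensitively in this sketch.
    return prefix.lower(), identifier


def lookup_name(curie: str) -> Optional[str]:
    """Return the name for an identifier, or None if it is unknown."""
    prefix, identifier = parse_curie(curie)
    return NAMES.get(prefix, {}).get(identifier)


print(lookup_name("GO:0006915"))
print(lookup_name("hgnc:1100"))
```

Keying the tables by prefix means that the same function serves ontologies and databases alike. Adding a resource only requires adding another table.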
The first set of resources ingested in the Biolookup Service are those listed in the Bioregistry as having either an OWL or OBO ontology file. This mostly covers the OBO Foundry as well as additional resources like Cellosaurus. They are parsed with a combination of the obonet and pronto Python packages. Unfortunately, many ontologies listed in the OBO Foundry that only appear with an OWL build artifact have issues that make them impossible to parse. An advantage of relying on the Bioregistry is that the resource list is externally maintained, so the Biolookup Service automatically benefits from any improvements to the upstream data source. Ontologies in the BioPortal are not automatically listed in the Bioregistry in the same way as ontologies in the OBO Foundry due to their lack of quality control.
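To make the ingestion step concrete, the following sketch extracts an identifier-to-name mapping from a small OBO-formatted snippet. It is a deliberately simplified stand-in written with the Python standard library only. The obonet and pronto packages mentioned above handle the full format, which this sketch does not attempt. The `names_from_obo` function and the embedded example text are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional

# A tiny OBO-formatted example; real ontology files are far larger.
OBO_TEXT = """\
format-version: 1.2
ontology: go

[Term]
id: GO:0006915
name: apoptotic process

[Term]
id: GO:0008150
name: biological_process

[Typedef]
id: part_of
name: part of
"""


def names_from_obo(lines: Iterable[str]) -> Dict[str, str]:
    """Map term identifiers to names, reading only [Term] stanzas."""
    names: Dict[str, str] = {}
    in_term = False
    current_id: Optional[str] = None
    for raw in lines:
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            # A new stanza begins; only [Term] stanzas are of interest.
            in_term = line == "[Term]"
            current_id = None
        elif in_term and line.startswith("id: "):
            current_id = line[len("id: "):]
        elif in_term and line.startswith("name: ") and current_id is not None:
            names[current_id] = line[len("name: "):]
    return names


id_to_name = names_from_obo(OBO_TEXT.splitlines())
print(id_to_name)
```

The sketch ignores everything except the `id` and `name` tags. That is enough for name lookup, but it discards the synonyms, definitions, and relationships that a full parser would retain.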
The second set of resources ingested in the Biolookup Service comprises any resources (ontologies, databases, etc.) available through the pyobo Python package.
Additional resources can be suggested for inclusion in the Biolookup Service via the PyOBO issue tracker.