Orphadata: Data and services to harness the power of the Orphanet knowledge base

Orphanet provides a knowledge base on rare diseases and orphan drugs. To manage the increasing number of data requests it receives from the scientific community, Orphanet has created Orphadata to provide comprehensive, massive, re-usable and computable quality data sets related to rare diseases.

Jan 29, 2024

Charlotte Rodwell

Partnerships and tech transfer officer, INSERM, US14 - Orphanet


Over 300 million people in the world live with a rare disease, united by common challenges, including the poor visibility of their disease. Since 1997, Orphanet (coordinated by INSERM, the French Institute of Health and Medical Research) has provided a knowledge base on rare diseases and orphan drugs. This knowledge base is structured around the Orphanet nomenclature of rare diseases, a dedicated and globally recognised standard that improves data interoperability between healthcare and research information systems.

Image showing the Orphadata elements in four boxes

As Orphanet has become increasingly well known as the reference source for knowledge on rare diseases, it receives a growing number of requests for its high-quality data. Orphadata was created to meet this need for large-scale data extraction. Orphadata aims to help accelerate R&D and to facilitate the global adoption of the Orphanet nomenclature. Orphadata Science, the open-access section of the platform, was designated as an ELIXIR Core Data Resource at the start of 2019 and as a Global Core Biodata Resource at the end of 2022.

To meet the evolving needs of the life science data community, Orphadata services will soon be extended with additional customised features and APIs. In the meantime, we hope that this article will help you understand the services you can use to harness the rare disease knowledge in the Orphanet database.

Orphadata Science: Open access data from Orphanet

Orphanet has chosen to make a number of its datasets and ontologies available through the Orphadata Science portal under the CC BY 4.0 licence. The aim is to help the research community make the best use of our curated data and to ease the adoption and integration of ORPHAcodes into health information systems, thereby improving the interoperability of rare disease data. This licence is Open Science compatible and allows our data to be integrated into third-party resources and tools (including commercial resources) as long as provenance is cited. These resources are updated twice a year (apart from the nomenclature pack, which is an annual release).

Orphadata Science includes the following resources:

An inventory of rare diseases, cross-referenced with OMIM, ICD-10, MeSH, MedDRA, UMLS and GARD, and with genes in HGNC, OMIM, UniProtKB, IUPHAR and Genatlas. Annotations on the typology of diseases and genes and on gene-disease relationships. Definitions of rare diseases. Available in XML and JSON formats.
A classification of rare diseases established by Orphanet, based on the literature and expert consensus classifications.
Epidemiological data on rare diseases, based on the literature: point prevalence, birth prevalence, lifetime prevalence and incidence, or the number of families reported, with their respective intervals per geographical area, together with type of inheritance and the interval for average age of onset.
Phenotypes associated with rare disorders (annotations using HPO terms), as well as their frequency.
Linearisation of rare diseases: for analytical purposes, each disorder is attributed to a preferred classification (linearisation) by linking it to the head-of-classification entity.
Orphanet Rare Diseases Ontology (ORDO)
HPO-ORDO Ontological Module (HOOM)
Orphanet nomenclature files for coding (Nomenclature pack)
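
The linearisation entry in the list above can be pictured as a mapping from each disorder to one preferred head-of-classification entity, which lets counts be made without double-counting disorders that appear in several classifications. The following is a minimal sketch only: the record layout, the ORPHAcode placeholders and the classification names are all hypothetical and do not reflect the actual Orphadata file schema.

```python
from collections import defaultdict

# Hypothetical records (placeholders, not real Orphadata content).
# A disorder may appear in several classifications, but it has exactly
# one preferred (linearised) head-of-classification entity.
DISORDERS = [
    {"orpha_code": "ORPHA:X1", "classifications": ["HEAD_A", "HEAD_B"], "preferred": "HEAD_A"},
    {"orpha_code": "ORPHA:X2", "classifications": ["HEAD_B"], "preferred": "HEAD_B"},
    {"orpha_code": "ORPHA:X3", "classifications": ["HEAD_A", "HEAD_C"], "preferred": "HEAD_A"},
]


def group_by_preferred(disorders):
    """Group disorder codes under their single preferred classification."""
    groups = defaultdict(list)
    for disorder in disorders:
        preferred = disorder["preferred"]
        # Sanity check: the preferred head must be one of the disorder's classifications.
        if preferred not in disorder["classifications"]:
            raise ValueError(f"{disorder['orpha_code']}: preferred head not in classifications")
        groups[preferred].append(disorder["orpha_code"])
    return dict(groups)


if __name__ == "__main__":
    for head, codes in group_by_preferred(DISORDERS).items():
        print(head, len(codes), codes)
```

Because each disorder contributes to exactly one group, the group sizes add up to the total number of disorders, which is the property that makes linearisation useful for statistics.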

The Orphanet nomenclature pack compiles various files which provide the computable information needed to implement ORPHAcodes in health information systems and to make coding easier and more accurate. These files are updated once a year and are available in nine languages: Czech, Dutch, English, French, German, Italian, Polish, Portuguese and Spanish. The nomenclature is also available through dedicated APIs, and a human-readable view is provided through a Dataviz.

ORDO (the Orphanet Rare Disease Ontology) was initially developed jointly by Orphanet and the EBI to provide a structured vocabulary for rare diseases that captures relationships between diseases, genes and other relevant features, forming a useful resource for the computational analysis of rare diseases. It is derived from the Orphanet knowledge base, a multilingual database dedicated to rare diseases, populated from the literature and validated by international experts. It integrates a nosology (classification of rare diseases), relationships (gene-disease relations, epidemiological data) and connections with other terminologies (MeSH, UMLS, MedDRA), databases (OMIM, UniProtKB, HGNC, Ensembl, Reactome, IUPHAR, Genatlas) and classifications (ICD-10). Orphanet classifications can be browsed in the OLS view. ORDO is updated every six months and follows the OBO guidelines on deprecation of terms. An ORDO SPARQL endpoint is available (beta test).
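
As an illustration of how a SPARQL endpoint of this kind might be queried, here is a minimal sketch that builds a query for the label of one ORDO class and prepares (but does not send) the HTTP request. It relies on two assumptions: the endpoint address is a placeholder, to be replaced with the one published by Orphadata, and ORDO class IRIs are assumed to follow the pattern http://www.orpha.net/ORDO/Orphanet_<number>. The function names are ours.

```python
import urllib.parse
import urllib.request

# Assumption: placeholder address, NOT the real ORDO endpoint.
ENDPOINT_URL = "https://example.org/ordo/sparql"


def build_label_query(orpha_number: int) -> str:
    """Return a SPARQL query asking for the rdfs:label of one ORDO class."""
    # Assumed IRI pattern for ORDO classes.
    iri = f"http://www.orpha.net/ORDO/Orphanet_{orpha_number}"
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        f"SELECT ?label WHERE {{ <{iri}> rdfs:label ?label }}"
    )


def build_request(query: str, endpoint: str = ENDPOINT_URL) -> urllib.request.Request:
    """Prepare a GET request carrying the query in the `query` parameter,
    as described by the SPARQL 1.1 Protocol, and ask for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


if __name__ == "__main__":
    request = build_request(build_label_query(558))
    print(request.full_url)
    # To actually run the query against a real endpoint:
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode("utf-8"))
```

Keeping query construction separate from the network call makes it possible to inspect or test the query before pointing it at a live service.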

HOOM: Orphanet provides phenotypic annotations of the rare diseases in the Orphanet nomenclature using the Human Phenotype Ontology (HPO). HOOM is a module that qualifies the annotation between a clinical entity and phenotypic abnormalities according to their frequency, with further annotations (diagnostic criterion, pathognomonic sign) when appropriate. In ORDO, a clinical entity is either a group of rare disorders, a rare disorder or a subtype of a disorder. The “Clinical Entity” branch of ORDO has been refactored as a logical import of HPO, and the HPO-ORDO phenotype-disease annotations are provided as a series of triples in OBAN format, in which associations, frequency and provenance are modelled. HOOM is provided as an OWL (Web Ontology Language) file, using the OBAN, ORDO and HPO ontological models. HOOM offers extra possibilities for researchers, pharmaceutical companies and others wishing to co-analyse rare and common disease phenotype associations, or to re-use the integrated ontologies in genomic variant repositories or match-making tools. A HOOM SPARQL endpoint is available (beta test).
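
To make the idea of a qualified annotation more concrete, the sketch below represents each disease-phenotype association as a small record carrying a frequency, a provenance and an optional diagnostic-criterion flag, then filters by frequency. It is an illustration under stated assumptions, not the HOOM data model: the class, the field names, the frequency labels and every identifier are hypothetical placeholders.

```python
from dataclasses import dataclass

# Assumed frequency labels, ordered from most to least frequent.
FREQUENCY_ORDER = [
    "Obligate", "Very frequent", "Frequent", "Occasional", "Very rare", "Excluded",
]


@dataclass(frozen=True)
class Association:
    clinical_entity: str   # placeholder for an ORDO clinical entity
    phenotype: str         # placeholder for an HPO term
    frequency: str         # one of FREQUENCY_ORDER
    provenance: str        # where the annotation comes from
    diagnostic_criterion: bool = False


# Hypothetical example data, not real annotations.
EXAMPLES = [
    Association("ORPHA:X1", "HP:X10", "Very frequent", "source-1", True),
    Association("ORPHA:X1", "HP:X11", "Occasional", "source-1"),
    Association("ORPHA:X1", "HP:X12", "Frequent", "source-2"),
]


def at_least(associations, threshold):
    """Keep associations whose frequency is `threshold` or more frequent."""
    limit = FREQUENCY_ORDER.index(threshold)
    return [a for a in associations if FREQUENCY_ORDER.index(a.frequency) <= limit]


if __name__ == "__main__":
    for association in at_least(EXAMPLES, "Frequent"):
        print(association.phenotype, association.frequency)
```

The point of qualifying each association, rather than storing a bare disease-phenotype pair, is that the same link can then be weighted or filtered by frequency and traced back to its source.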

ORPHApackets are produced twice a year (synchronised with ORDO generation). An ORPHApacket is a formalised data-sharing container embedding “pieces” of knowledge related to known rare disorders, derived from the Orphanet knowledge base and ORDO (the Orphanet Rare Disease Ontology). The ORPHApacket format is encoded in JSON or YAML.

Orphadata Expert Resources

The Orphanet knowledge base includes a directory of expert resources: centres of expertise, laboratories and diagnostic tests, patient organisations, research projects, clinical trials, registries, biobanks and medicinal products related to rare diseases and, in particular, to the diseases in the Orphanet nomenclature. Orphadata provides a number of datasets extracted from the Orphanet knowledge base to make these resources easier to use (see the catalogue of products and detailed description of datasets).

An inventory of orphan drugs at all stages of development, from EMA orphan designation to marketing authorisation, cross-linked with diseases.
Summary information on each rare disease in eight languages (English, French, German, Italian, Spanish, Portuguese, Dutch, Polish).
A directory of specialised services, providing information on centres of expertise, medical laboratories, diagnostic tests, research projects, clinical trials, patient registries, mutation databases, biobanks and patient organisations in the field of rare diseases, in each of the countries in the Orphanet network.

These datasets are available on request (contact us) via signature of a Data Transfer Agreement (for academia) or via signature of a service contract (for other entities such as for-profits) for use in research and development. The funds raised through these service contracts are used to fund part of Orphanet’s infrastructure.

A range of possibilities for different end users

Whatever your domain of work, end use or need, there is an Orphadata service suited to your case.

Feel free to contact the team at [email protected] for guidance so you can make the most of the knowledge Orphanet has to offer.

Image showing Orphadata data and tools in boxes

Photo by Y K on Unsplash
