The Clinical And Registry Entries Semantic Model (CARE-SM) is a semantic data model for representing patient healthcare information as knowledge graphs expressed in the Resource Description Framework (RDF). This technical description gives an overview of CARE-SM, its origins, and the design principles that underlie its structure.

Introduction

CARE-SM is the more robust, mature successor to the Common Data Element (CDE) semantic data model. The CDE model was created to represent the set of common data elements for rare disease registration recommended by the European Commission Joint Research Centre. CARE-SM extends it to cover all data elements pertinent to patient registries and clinical encounters.

Foundational Design of CARE-SM

Core Structure

CARE-SM uses the Semanticscience Integrated Ontology (SIO) as its core structural schema. Every concept in the data model is defined with SIO upper-level classes and properties. The resulting knowledge graph serves as a "scaffold" that holds every data element within its structure (Figure 1). Combining these SIO-defined instances makes it possible to represent any clinical entry.

In addition, each instance within CARE-SM is associated with a domain-specific ontological class from the OBO Foundry. For instance, a patient's birthdate is described at the upper level with the ontological term SIO:attribute and at the domain-specific level as ncit:Birthdate. This dual ontological characterization improves data interoperability and makes semantic descriptions more precise.
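As a minimal sketch of this dual typing, the Python snippet below types one instance with both an upper-level and a domain-specific class and prints the result as N-Triples. Every IRI under example.org is a hypothetical placeholder: the local names only mirror the labels SIO:attribute and ncit:Birthdate used in the text, not the identifiers the ontologies actually publish.

```python
# Standard rdf:type predicate.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical placeholder namespaces (assumptions for illustration only).
SIO = "https://example.org/sio/"
NCIT = "https://example.org/ncit/"
DATA = "https://example.org/data/"


def dual_type(instance_iri, upper_class_iri, domain_class_iri):
    """Return two rdf:type triples: one upper-level, one domain-specific."""
    return [
        (instance_iri, RDF_TYPE, upper_class_iri),
        (instance_iri, RDF_TYPE, domain_class_iri),
    ]


def to_ntriples(triples):
    """Serialize (subject, predicate, object) IRI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


birthdate_attribute = DATA + "patient_001_birthdate_attribute"
typing_triples = dual_type(birthdate_attribute, SIO + "attribute", NCIT + "Birthdate")
print(to_ntriples(typing_triples))
```

A consumer that only understands the upper-level class can still find the node as an attribute, while a consumer that knows the domain ontology can select birthdates specifically.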

Figure 1: Core structure

Contextual Layer

To maintain a common core structure, CARE-SM models only one data element at a time. If a data element is not available, its representation is simply not used. On its own, this could leave data insufficiently aggregated. To address this, a layer of metadata has been created around every data element representation (Figure 2).

This metadata describes the context of the data represented in the core structure, attaching temporal information to each data element. The structure is preserved even when a date or time is itself the core observation (e.g., date of symptom onset). The context layer creates a timeline of events around every data element, allowing the model to represent not only individual patient registry entries but also patient clinical encounters precisely.

Beyond the patient's timeline and temporal information, the contexts of otherwise independent data elements can be grouped by connecting them through event nodes. An event provides a common context for data elements that share a specific relationship (such as condition/treatment scenarios or visit-based aggregated information). Implementing events is not mandatory, but the design makes it possible.
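The grouping idea can be sketched as follows, assuming each data element already has its own context node. The predicate name isPartOfEvent and all IRIs are hypothetical and chosen only for illustration; the text does not specify which property links a context to an event.

```python
# Hypothetical namespaces and predicate (assumptions, not CARE-SM terms).
DATA = "https://example.org/data/"
EX = "https://example.org/vocab/"


def group_under_event(event_iri, context_iris):
    """Link every context node to one shared event node."""
    return [(context, EX + "isPartOfEvent", event_iri) for context in context_iris]


visit_event = DATA + "event_visit_001"
contexts = [
    DATA + "context_diagnosis_001",   # context of a diagnosis data element
    DATA + "context_medication_001",  # context of a medication data element
]
links = group_under_event(visit_event, contexts)

for subject, predicate, obj in links:
    print(f"<{subject}> <{predicate}> <{obj}> .")
```

Data elements whose contexts are never passed to such a function stay ungrouped, which matches the optional nature of events.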

This metadata requires a combination of RDF quads and RDF triples, rather than the RDF triples alone that regular knowledge graphs use. The core structure of the model is represented as RDF quads, all of which carry the same context ID URL as their fourth element. This URL is then used as the subject of further RDF triples that define the metadata layer (Figure 2).
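A minimal sketch of this quad-plus-triple pattern, written as plain N-Quads strings in Python: the core statements all carry one context IRI as their fourth term, and the same IRI is the subject of the metadata statements. The predicate names (hasAttribute, hasOutput, hasStartDate, hasEndDate) and all example.org IRIs are assumptions for illustration, not the terms CARE-SM actually uses.

```python
from datetime import date

# Hypothetical namespaces (assumptions for illustration only).
DATA = "https://example.org/data/"
EX = "https://example.org/vocab/"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def core_quads(statements, context_iri):
    """Serialize core statements as quads whose fourth term is the context IRI."""
    return [f"<{s}> <{p}> <{o}> <{context_iri}> ." for s, p, o in statements]


def context_metadata(context_iri, start, end):
    """Describe the context itself with triples that use it as their subject."""
    def literal(d):
        return f'"{d.isoformat()}"^^<{XSD_DATE}>'

    return [
        f"<{context_iri}> <{EX}hasStartDate> {literal(start)} .",
        f"<{context_iri}> <{EX}hasEndDate> {literal(end)} .",
    ]


context = DATA + "context_001"
statements = [
    (DATA + "patient_001", EX + "hasAttribute", DATA + "attribute_001"),
    (DATA + "process_001", EX + "hasOutput", DATA + "output_001"),
]
quads = core_quads(statements, context)
metadata = context_metadata(context, date(2021, 3, 4), date(2021, 3, 4))
print("\n".join(quads + metadata))
```

In N-Quads, a line with only three terms belongs to the default graph, so the metadata lines and the core quads can live in the same file.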

Figure 2: Context representation

List of data elements

Birthdate

The date of birth of the patient, providing essential demographic information.

The module shows how a patient possesses a personal attribute, their date of birth, captured through a specific data process. The output is an ISO 8601-formatted date.
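The shape of this module can be sketched in Python as below. This is an illustration under stated assumptions, not the exemplar from the specification: the predicate names (hasAttribute, isCapturedBy, hasOutput, hasValue) and the example.org IRIs are hypothetical, and the only behavior taken from the text is that the output value must be an ISO 8601 date.

```python
from datetime import date

# Hypothetical namespaces and predicates (assumptions for illustration only).
DATA = "https://example.org/data/"
EX = "https://example.org/vocab/"
XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"


def birthdate_module(patient_id, birthdate_text):
    """Build the statements of a birthdate entry; reject non-ISO 8601 dates."""
    birthdate = date.fromisoformat(birthdate_text)  # raises ValueError if invalid
    patient = f"<{DATA}patient_{patient_id}>"
    attribute = f"<{DATA}patient_{patient_id}_birthdate>"
    process = f"<{DATA}patient_{patient_id}_birthdate_process>"
    output = f"<{DATA}patient_{patient_id}_birthdate_output>"
    literal = f'"{birthdate.isoformat()}"^^<{XSD_DATE}>'
    return [
        (patient, f"<{EX}hasAttribute>", attribute),   # patient possesses the attribute
        (attribute, f"<{EX}isCapturedBy>", process),   # attribute captured by a process
        (process, f"<{EX}hasOutput>", output),         # process produces an output
        (output, f"<{EX}hasValue>", literal),          # output carries the date value
    ]


for statement in birthdate_module("001", "1990-05-17"):
    print(" ".join(statement), ".")
```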

Birthdate Visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Birthyear

The year of birth of the patient, often used when full date details are unavailable.

The module shows how a patient possesses an attribute for the year of birth, captured through a specific data process. The output is the patient’s year of birth.

Birthyear Visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Deathdate

The date of death of the patient, significant for mortality analyses and records.

The module shows how the individual, defined as a patient, possesses a personal attribute which is a date of death. This date is captured through a specific data process that outputs the date of death as an ISO 8601-formatted date.

Deathdate visualization

OBO Foundry terms used:

Exemplar RDF-Quads

First Confirmed Visit

The date of the patient’s first confirmed visit to a healthcare provider.

The module shows how the individual, defined as a patient, possesses a personal attribute which is the date of first confirmed visit. This patient participates in a data capture process that outputs the date of first confirmed visit as an ISO 8601-formatted date.

First confirmed visit visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Participation Status

Indicates whether the patient is actively participating in the registry or study.

The module shows how the individual, defined as a patient, possesses a personal attribute which is their participation status in the medical record. This patient participates in a data capture procedure that outputs the patient's participation status.

Participation status visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Sex

The biological sex of the patient, used for demographic and medical analyses.

The module shows how the individual, defined as a patient, possesses a personal attribute which is their sex at birth. This patient participates in a sex data capture assessment, which outputs the patient’s sex code.

Sex visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Education Level

The highest level of education completed by the patient, relevant for socio-economic studies.

The module shows how the individual, defined as a patient, possesses a personal attribute which is the level of education. This patient participates in an education level questionnaire assessment process that outputs a value measured by the International Standard Classification of Education.

Education visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Disability

Information on disabilities affecting the patient, essential for healthcare planning.

The module shows how the individual, defined as a patient, participates in a questionnaire. The questionnaire defines a specific question that is part of it (in this case, a disability assessment question) and outputs a score.

Disability visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Questionnaire

Responses to patient questionnaires, often capturing subjective and qualitative data.

The module shows how the individual, defined as a patient, participates in a questionnaire. The questionnaire defines a specific question, which is part of the questionnaire, and outputs a score.

Questionnaire visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Diagnosis

Details of diagnosed conditions for the patient, forming the basis of medical care.

The module shows how the individual, defined as a patient, possesses a medical condition that is a certain disease (coded in this case using ORDO codes). This patient participates in a diagnosis procedure that outputs an observational result referencing their condition.

Diagnosis visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Symptom/Phenotype

The symptoms or phenotypes observed in the patient, critical for medical evaluations.

The module shows how the individual, defined as a patient, possesses a medical condition that is a certain sign/symptom/phenotype (coded in this case using HPO codes). This patient participates in a diagnosis procedure that outputs an observational result referencing their condition.

Symptom/Phenotype visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Symptom onset

The time of onset of the patient's symptoms.

The module shows how the individual, defined as a patient, participates in a symptom onset assessment. The output can be an ISO 8601-formatted date or an integer age. The assessment also targets a specific symptom or phenotype, if one is recorded.
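Because the onset output may be either a date or an age, a producer has to pick a datatype per value. The helper below is a minimal sketch of one way to do that in Python; the function name onset_literal is hypothetical, and mapping an integer age to xsd:integer is an assumption rather than something the text prescribes.

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"


def onset_literal(value):
    """Return a typed RDF literal for an onset given as an age or an ISO 8601 date."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so exclude it explicitly.
        raise TypeError("onset must be an integer age or an ISO 8601 date string")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("age at onset cannot be negative")
        return f'"{value}"^^<{XSD}integer>'
    if isinstance(value, str):
        parsed = date.fromisoformat(value)  # raises ValueError if not a valid date
        return f'"{parsed.isoformat()}"^^<{XSD}date>'
    raise TypeError("onset must be an integer age or an ISO 8601 date string")


print(onset_literal(7))
print(onset_literal("2019-11-02"))
```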

Symptoms onset visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Laboratory Measurement

Data from laboratory tests performed on the patient, such as blood tests and biochemical analysis.

The module shows how the individual, defined as a patient, participates in a laboratory testing process. The procedure includes the input (an anatomic structure from which the substance is extracted) and the output (observational result and its value, measured in specific units).
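The input/output structure described above can be sketched as a small Python record plus a function that expands it into statements. The class name, field names, and relation labels (participatesIn, hasInput, hasOutput, hasValue, hasUnit) are hypothetical, and the sample values are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabMeasurement:
    patient_id: str
    input_structure: str  # anatomic structure the substance is extracted from
    value: float          # observed result
    unit: str             # unit the value is measured in


def lab_statements(measurement):
    """Expand one measurement into (subject, relation, object) statements."""
    patient = f"patient_{measurement.patient_id}"
    process = f"lab_process_{measurement.patient_id}"
    output = f"lab_output_{measurement.patient_id}"
    return [
        (patient, "participatesIn", process),
        (process, "hasInput", measurement.input_structure),
        (process, "hasOutput", output),
        (output, "hasValue", measurement.value),
        (output, "hasUnit", measurement.unit),
    ]


sample = LabMeasurement(patient_id="001", input_structure="blood", value=5.4, unit="mmol/L")
for statement in lab_statements(sample):
    print(statement)
```

Keeping the value and its unit on the same output node means a reader never has to guess which unit a number belongs to.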

Laboratory measurement visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Body Measurement

Physical measurements of the patient, including height, weight, and body mass index.

The module shows how the individual, defined as a patient, possesses a personal attribute such as weight or height. This patient participates in a data capture process that outputs the value of the measurement and its unit.

Body measurement visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Medical Imaging

Medical imaging studies performed on the patient.

The module shows how the individual, defined as a patient, participates in a medical imaging procedure. The output is a medical image referencing its findings, targeting an anatomical structure or substance.

Medical imaging visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Genetic

Genetic tests and data related to the patient’s hereditary information.

The module illustrates the genetic testing process, identifying genetic variants using HGVS/HGNC/OMIM nomenclature. Attributes like zygosity are included, along with molecular processes and anatomical input samples associated with genetic testing.

Genetic data visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Medication

Details of medications prescribed to the patient, including dosage and duration.

The module depicts a drug administration process, detailing the prescribed medication, its route of administration, and dosage specifications.

Medication data visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Surgery

Records of surgical procedures undergone by the patient.

The module describes a surgical intervention, including the anatomical structures targeted.

Surgery data visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Biobank

Information about biological samples provided by the patient for research purposes.

Biobank data visualization

OBO Foundry terms used:

Exemplar RDF-Quads

Clinical Trial

Participation of the patient in clinical trials, including trial IDs and outcomes.

This module illustrates the patient’s involvement in clinical trials, referencing medical conditions.

Clinical trial data visualization

OBO Foundry terms used:

Exemplar RDF-Quads
