The Clinical And Registry Entries Semantic Model (CARE-SM) is a semantic data model for representing patient healthcare information as knowledge graphs in the Resource Description Framework (RDF). This technical description gives an overview of CARE-SM, its origins, and the design principles that underlie its structure.
CARE-SM is a more robust and mature successor to the Common Data Element (CDE) semantic data model. The CDE model was created to represent a set of common data elements for rare diseases registration recommended by the European Commission Joint Research Centre. CARE-SM extends that model to cover all data elements pertinent to patient registries and clinical encounters.
CARE-SM is built upon the Semanticscience Integrated Ontology (SIO) as its core structural schema. Upper-level SIO classes and properties define every concept within the data model. The resulting knowledge graph serves as a "scaffold" that holds every data element within its structure (Figure 1). Combining these SIO-defined instances makes it possible to represent every clinical entry comprehensively.
Moreover, each instance within CARE-SM is associated with a domain-specific ontological class from the OBO Foundry. For instance, a patient's birthdate is described at the upper level using the ontological term SIO:attribute and, at the domain-specific level, as ncit:Birthdate.
This dual ontological characterization improves data interoperability and makes semantic descriptions more precise.
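The dual typing described above can be sketched in a few lines of Python. This is a minimal illustration, not the model's official serialization: the namespaces under `example.org` and the local names `attribute` and `Birthdate` simply mirror the labels used in the text and are assumptions, since the real ontologies use their own identifiers.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical stand-in namespaces; the local names used below mirror the
# labels in the text and are NOT the ontologies' real identifiers.
SIO = "https://example.org/sio/"
NCIT = "https://example.org/ncit/"
DATA = "https://example.org/data/"


def dual_typed(instance, upper_class, domain_class):
    """Type one instance with an upper-level class and a domain-specific class."""
    return [
        (instance, RDF_TYPE, upper_class),
        (instance, RDF_TYPE, domain_class),
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


birthdate = DATA + "patient1_birthdate"
birthdate_triples = dual_typed(birthdate, SIO + "attribute", NCIT + "Birthdate")
print(to_ntriples(birthdate_triples))
```

A consumer that only understands the upper-level class can still locate the instance, while a consumer aware of the domain ontology can select birthdates specifically.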
Figure 1: Core structure
To maintain a common core structure, CARE-SM models only one data element at a time. If a data element is not available, the corresponding representation is simply not used. On its own, this could leave data insufficiently aggregated. To address this, a layer of metadata has been created around every data element representation (Figure 2).
This metadata describes the context of the data represented in the core structure model, attaching temporal information to each data element. The structure is preserved even when a date or time is itself the core observation of the model (e.g., date of symptom onset). The context layer creates a timeline of events around every data element, allowing the model to represent precisely not only individual patient registry entries but also patient clinical encounters.
In addition to the patient's timeline and temporal information, data elements that share a common context can be grouped by connecting them through event nodes. An event provides a common context for cases where multiple data elements share a particular relationship (for example, condition/treatment scenarios or visit-based aggregated information). Implementing events is not mandatory, but the design makes it possible.
This metadata layer requires combining RDF-Quads with RDF-Triples, rather than using only the RDF-Triples of regular knowledge graphs. The core structure of the model is represented as RDF-Quads whose fourth element is a shared context ID URL. The same URL is used as the subject of the RDF-Triples that define the metadata layer (Figure 2).
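The relationship between the quads and the metadata triples can be sketched as follows. This is an illustrative sketch only: the `example.org` IRIs and the predicate names `hasAttribute`, `hasStartDate`, and `hasEndDate` are hypothetical placeholders, and the use of `xsd:date` for the timeline is an assumption rather than something the model prescribes here.

```python
from datetime import date

XSD_DATE = "http://www.w3.org/2001/XMLSchema#date"
DATA = "https://example.org/data/"    # hypothetical base IRI for instances
VOCAB = "https://example.org/vocab/"  # hypothetical predicates, for illustration


def iri(value):
    """Wrap an IRI in N-Quads angle brackets."""
    return f"<{value}>"


def date_literal(day):
    """Render a date as a typed literal (xsd:date is an assumption)."""
    return f'"{day.isoformat()}"^^<{XSD_DATE}>'


def with_context(triples, context):
    """Turn core triples into quads that all share one context IRI."""
    return [(s, p, o, context) for s, p, o in triples]


def context_metadata(context, start, end):
    """Describe the context itself: the context IRI is the triple subject."""
    return [
        (context, iri(VOCAB + "hasStartDate"), date_literal(start)),
        (context, iri(VOCAB + "hasEndDate"), date_literal(end)),
    ]


def serialize(statements):
    """Serialize 3- or 4-term statements as N-Quads lines."""
    return "\n".join(" ".join(terms) + " ." for terms in statements)


context = iri(DATA + "context_001")
core = with_context(
    [(iri(DATA + "patient1"), iri(VOCAB + "hasAttribute"), iri(DATA + "patient1_birthdate"))],
    context,
)
metadata = context_metadata(context, date(2021, 3, 1), date(2021, 3, 1))
print(serialize(core + metadata))
```

Because every core statement carries the same fourth element, a query can retrieve the temporal metadata for a data element by following its context IRI.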
Figure 2: Context representation
The date of birth of the patient, providing essential demographic information.
The module shows how a patient possesses a personal attribute, their date of birth, captured through a specific data process. The output is an ISO 8601-formatted date.
OBO Foundry terms used:
Exemplar RDF-Quads
The year of birth of the patient, often used when full date details are unavailable.
The module shows how a patient possesses an attribute for the year of birth, captured through a specific data process. The output is the patient’s year of birth.
OBO Foundry terms used:
Exemplar RDF-Quads
The date of death of the patient, significant for mortality analyses and records.
The module shows how the individual, defined as a patient, possesses a personal attribute which is a date of death. This date is captured through a specific data process that outputs the date of death (defined using ISO 8601 formatted date).
OBO Foundry terms used:
Exemplar RDF-Quads
The date of the patient’s first confirmed visit to a healthcare provider.
The module shows how the individual, defined as a patient, possesses a personal attribute which is the date of first confirmed visit. This patient participates in a data capture process that outputs the date of first confirmed visit (defined using ISO 8601 formatted date).
OBO Foundry terms used:
Exemplar RDF-Quads
Indicates whether the patient is actively participating in the registry or study.
The module shows how the individual, defined as a patient, possesses a personal attribute which is their participation status in the medical record. This patient participates in a data capture procedure that outputs the patient participation status.
OBO Foundry terms used:
Exemplar RDF-Quads
The biological sex of the patient, used for demographic and medical analyses.
The module shows how the individual, defined as a patient, possesses a personal attribute which is their sex at birth. This patient participates in a sex data capture assessment, which outputs the patient’s sex code.
OBO Foundry terms used:
Exemplar RDF-Quads
The highest level of education completed by the patient, relevant for socio-economic studies.
The module shows how the individual, defined as a patient, possesses a personal attribute which is the level of education. This patient participates in an education level questionnaire assessment process that outputs a value measured by the International Standard Classification of Education.
OBO Foundry terms used:
Exemplar RDF-Quads
Information on disabilities affecting the patient, essential for healthcare planning.
The module shows how the individual, defined as a patient, participates in a questionnaire. The questionnaire includes a specific question, in this case a disability assessment question, and outputs a score.
OBO Foundry terms used:
Exemplar RDF-Quads
Responses to patient questionnaires, often capturing subjective and qualitative data.
The module shows how the individual, defined as a patient, participates in a questionnaire. The questionnaire includes a specific question and outputs a score.
OBO Foundry terms used:
Exemplar RDF-Quads
Details of diagnosed conditions for the patient, forming the basis of medical care.
The module shows how the individual, defined as a patient, possesses a medical condition that is a certain disease (measured in this case using ORDO codes). This patient participates in a diagnosis procedure that outputs an observational result referencing its condition.
OBO Foundry terms used:
Exemplar RDF-Quads
The symptoms or phenotypes observed in the patient, critical for medical evaluations.
The module shows how the individual, defined as a patient, possesses a medical condition that is a certain sign/symptom/phenotype (measured in this case using HPO codes). This patient participates in a diagnosis procedure that outputs an observational result referencing its condition.
OBO Foundry terms used:
Exemplar RDF-Quads
The time of onset of the patient's symptoms.
The module shows how the individual, defined as a patient, participates in a symptom onset assessment. The output can be either an ISO 8601-formatted date or an integer for age. The assessment can also target a specific symptom or phenotype, if one is recorded.
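Since the onset output may be either a date or an age, a producer has to pick the matching literal type. The sketch below shows one way to do that; the function name `onset_literal` and the choice of `xsd:date` and `xsd:integer` as datatypes are assumptions made for illustration, not something the module description specifies.

```python
from datetime import date

XSD = "http://www.w3.org/2001/XMLSchema#"


def onset_literal(value):
    """Render an onset value as a typed literal.

    Accepts an ISO 8601 calendar date string (YYYY-MM-DD) or a non-negative
    integer age. The datatypes chosen here are assumptions.
    """
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise TypeError("onset must be a date string or an integer age")
    if isinstance(value, int):
        if value < 0:
            raise ValueError("age at onset cannot be negative")
        return f'"{value}"^^<{XSD}integer>'
    if isinstance(value, str):
        parsed = date.fromisoformat(value)  # raises ValueError if not a valid date
        if parsed.isoformat() != value:
            # Reject alternative ISO spellings such as "20191105".
            raise ValueError("expected the YYYY-MM-DD form")
        return f'"{value}"^^<{XSD}date>'
    raise TypeError("onset must be a date string or an integer age")


print(onset_literal("2019-11-05"))
print(onset_literal(7))
```

Validating the string before emitting it keeps malformed dates such as `05/11/2019` out of the graph.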
OBO Foundry terms used:
Exemplar RDF-Quads
Data from laboratory tests performed on the patient, such as blood tests and biochemical analysis.
The module shows how the individual, defined as a patient, participates in a laboratory testing process. The procedure includes the input (an anatomic structure from which the substance is extracted) and the output (observational result and its value, measured in specific units).
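The shape of this module (patient, process, input, output, value, unit) can be sketched as a small triple generator. Everything named here is hypothetical: the `LabTest` fields, the `example.org` IRIs, and the predicate names stand in for the ontology terms the model actually uses, and `xsd:float` is an assumed datatype.

```python
from dataclasses import dataclass

RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"
XSD_FLOAT = "http://www.w3.org/2001/XMLSchema#float"
DATA = "https://example.org/data/"    # hypothetical base IRI for instances
VOCAB = "https://example.org/vocab/"  # hypothetical classes and predicates


@dataclass(frozen=True)
class LabTest:
    test_id: str
    patient: str
    sample_source: str  # anatomic structure the substance was taken from
    value: float
    unit: str


def lab_test_triples(test):
    """Build the patient -> process -> input/output chain for one test."""
    patient = f"<{DATA}{test.patient}>"
    process = f"<{DATA}{test.test_id}_process>"
    sample = f"<{DATA}{test.test_id}_input>"
    result = f"<{DATA}{test.test_id}_output>"
    return [
        (patient, f"<{VOCAB}isParticipantIn>", process),
        (process, f"<{VOCAB}hasInput>", sample),
        (sample, RDF_TYPE, f"<{VOCAB}{test.sample_source}>"),
        (process, f"<{VOCAB}hasOutput>", result),
        (result, f"<{VOCAB}hasValue>", f'"{test.value}"^^<{XSD_FLOAT}>'),
        (result, f"<{VOCAB}hasUnit>", f"<{VOCAB}{test.unit}>"),
    ]


example = LabTest("test42", "patient1", "Blood", 5.4, "MillimolePerLiter")
triples = lab_test_triples(example)
for s, p, o in triples:
    print(s, p, o, ".")
```

Keeping the value and its unit on the same output node means a measurement is never read without its unit.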
OBO Foundry terms used:
Exemplar RDF-Quads
Physical measurements of the patient, including height, weight, and body mass index.
The module shows how the individual, defined as a patient, possesses a personal attribute such as weight or height. This patient participates in a data capture process that outputs the value of the measurement and its unit.
OBO Foundry terms used:
Exemplar RDF-Quads
Medical imaging studies performed on the patient.
The module shows how the individual, defined as a patient, participates in a medical imaging procedure. The output is a medical image referencing its findings, targeting an anatomical structure or substance.
OBO Foundry terms used:
Exemplar RDF-Quads
Genetic tests and data related to the patient’s hereditary information.
The module illustrates the genetic testing process, identifying genetic variants using HGVS/HGNC/OMIM nomenclature. Attributes like zygosity are included, along with molecular processes and anatomical input samples associated with genetic testing.
OBO Foundry terms used:
Exemplar RDF-Quads
Details of medications prescribed to the patient, including dosage and duration.
The module depicts a drug administration process, detailing the prescribed medication, its route of administration, and dosage specifications.
OBO Foundry terms used:
Exemplar RDF-Quads
Records of surgical procedures undergone by the patient.
The module describes a surgical intervention, including the anatomical structures targeted.
OBO Foundry terms used:
Exemplar RDF-Quads
Information about biological samples provided by the patient for research purposes.
OBO Foundry terms used:
Exemplar RDF-Quads
Information about patient consent for research purposes.
OBO Foundry terms used:
Participation of the patient in clinical trials, including trial IDs and outcomes.
This module illustrates the patient’s involvement in clinical trials, referencing medical conditions.
OBO Foundry terms used:
Exemplar RDF-Quads