
Health Data

The importance of health data


The relevance of health data extends beyond individual patient care: it plays a key role in many aspects of healthcare.

Data types

Health data includes all information about you that is collected and used in the course of medical examinations and treatments.


This may include information about your medical history, blood pressure readings or X-rays. Your doctor or hospital collects this data, for example in your electronic health record.

Health data is very diverse and covers a wide range of information, from personal details about your general health to technical measurements. Careful categorization makes the data easier to understand and more secure to handle. In the Data Sharing Cockpit, health information is grouped into clusters based on internationally accepted classification systems for health data.

| Data categories | Description | Comments / Examples |
| --- | --- | --- |
| Allergies & Vaccinations | Structured medical information on documented allergies and sensitivities, and on vaccinations and immunizations the patient has received in the past | e.g., penicillin allergy, HepB vaccination |
| Tests & Labs | Structured medical information on clinical tests, results from clinical laboratories, and measurements that affect medical risk factors and treatment options, including infectious pathogen data | Measurements (e.g., height, weight, blood pressure); lab tests and results (e.g., blood: CBC, troponin, TSH); tests (e.g., EKG, EMG); risk scores |
| Medications | Structured medical information on the medication treatments recommended to the patient, including the relevant dates and prescription parameters (dosage, regimen, form, route of administration) | e.g., TAB Simvastatin 20 mg PO, once a day |
| Conditions & Procedures | Structured medical information on the medical conditions (chronic and acute) the patient has had in the past (sometimes referred to as the "problem list"), procedures and surgeries the patient has undergone, and metadata on medical encounters with clinical teams and the treating healthcare professionals | e.g., tonsillectomy (1995), UTI (2005), data on urgent care visits for ischemic heart disease (2020) |
| Genetic Risk Factors | Structured list of genetic markers and genetic predispositions to develop specific diseases | Results from genetic analysis, e.g., BRCA1 |
| Demographics | Structured list of potential demographic risk factors, including demographics, family history of chronic disease, and lifestyle | Demographics (e.g., age, sex); lifestyle (e.g., smoking, occupation, diet, physical activity); family history (e.g., mother with DM2) |
| Wearables & Wellness data | Data collected using personal wearable devices and wellness applications (e.g., Fitbit, Google Fit, Apple HealthKit, Samsung Health, Garmin) | e.g., number of daily steps, REM sleep quality |
| Unstructured text | Medical notes and reports containing free text without coding | e.g., physicians' free-text reports |
| Imaging data | Raw data from different imaging studies | e.g., MRI, CT, PET-CT, X-ray |
| Full Genome / Biomics data | Raw data from genetic, genomic, proteomic and other biomic studies | e.g., genome sequencing data |

For maximum transparency, you can explicitly see which data fields are currently being collected and which data types they are assigned to.
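The assignment of data fields to data categories can be pictured as a simple lookup. The following Python sketch is only an illustration: the category names come from the table on this page, while the field names and the `fields_by_category` helper are hypothetical and do not describe how the Data Sharing Cockpit is actually implemented.

```python
from enum import Enum


class DataCategory(Enum):
    """The data categories listed on this page."""
    ALLERGIES_VACCINATIONS = "Allergies & Vaccinations"
    TESTS_LABS = "Tests & Labs"
    MEDICATIONS = "Medications"
    CONDITIONS_PROCEDURES = "Conditions & Procedures"
    GENETIC_RISK_FACTORS = "Genetic Risk Factors"
    DEMOGRAPHICS = "Demographics"
    WEARABLES_WELLNESS = "Wearables & Wellness data"
    UNSTRUCTURED_TEXT = "Unstructured text"
    IMAGING = "Imaging data"
    FULL_GENOME_BIOMICS = "Full Genome / Biomics data"


# Hypothetical field names, chosen only to mirror the examples in the table.
FIELD_CATEGORIES = {
    "penicillin_allergy": DataCategory.ALLERGIES_VACCINATIONS,
    "blood_pressure": DataCategory.TESTS_LABS,
    "simvastatin_prescription": DataCategory.MEDICATIONS,
    "tonsillectomy": DataCategory.CONDITIONS_PROCEDURES,
    "brca1_result": DataCategory.GENETIC_RISK_FACTORS,
    "smoking_status": DataCategory.DEMOGRAPHICS,
    "daily_steps": DataCategory.WEARABLES_WELLNESS,
}


def fields_by_category(field_names):
    """Group collected field names under the category they are assigned to."""
    grouped = {}
    for name in field_names:
        category = FIELD_CATEGORIES.get(name)
        label = category.value if category else "Unassigned"
        grouped.setdefault(label, []).append(name)
    return grouped
```

Calling `fields_by_category(["blood_pressure", "daily_steps"])` would list each field under its category, which is the kind of overview the transparency view described above provides.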

Personal data

In most research and development projects, it is not necessary to link data to a specific person. Health data, in particular, requires a high level of confidentiality, as it contains highly sensitive information. For this reason, health data is typically pseudonymized or anonymized. Below is an example of how the level of personal identification in data can vary.

Three levels are distinguished: identifying data, pseudonymized data and anonymized data.

Identifying data

Identifying data can be directly linked to a specific person. This includes basic information such as name, date of birth, address and other personal identifiers. Because of the sensitivity of this data, special security precautions are required.
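To make the difference between the levels of identification more concrete, here is a minimal Python sketch. It assumes a record stored as a dictionary with the hypothetical fields `name`, `date_of_birth` (ISO format) and `address`; the functions `pseudonymize` and `anonymize` are illustrative names, not part of any real system. Pseudonymization replaces the direct identifiers with a keyed code that only the holder of the secret key can reproduce, whereas the anonymization step here drops the identifiers and coarsens the date of birth. Real anonymization usually requires more than this, because combinations of the remaining fields can still point to a person.

```python
import hashlib
import hmac

# Assumed direct identifiers, following the examples given in the text.
DIRECT_IDENTIFIERS = ("name", "date_of_birth", "address")


def pseudonymize(record, secret_key):
    """Replace direct identifiers with a keyed pseudonym (HMAC-SHA256)."""
    identity = "|".join(str(record[field]) for field in DIRECT_IDENTIFIERS)
    digest = hmac.new(secret_key, identity.encode("utf-8"), hashlib.sha256)
    result = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # The same person and key always yield the same pseudonym,
    # so records can still be linked without revealing the identity.
    result["pseudonym"] = digest.hexdigest()[:16]
    return result


def anonymize(record):
    """Drop direct identifiers and keep only the decade of birth."""
    result = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    birth_year = int(str(record["date_of_birth"])[:4])
    result["birth_decade"] = birth_year - birth_year % 10
    return result
```

The keyed hash is used instead of a plain hash so that someone without the key cannot recompute pseudonyms from known names and birth dates.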


Data analysis

Health data can be made available to data users for research and development in several ways. A basic distinction is made between displaying or releasing data to data users and evaluating data in a secure processing environment.


Display of aggregated statistical data

Aggregated data is consolidated and anonymized data that originates from a large number of individual data sets. This combination makes it possible to identify general trends, patterns and insights without revealing the identity of any single person.
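As a rough illustration of aggregation, the following Python sketch counts how often each value of a field occurs across many records and reports only the counts. The `min_group_size` threshold is an assumption added for this example (suppressing very small groups is a common safeguard, but this page does not specify one), and `aggregate_counts` is a hypothetical helper name.

```python
from collections import Counter


def aggregate_counts(records, field, min_group_size=5):
    """Count records per value of `field`, hiding groups that are too small.

    Only totals are returned, never the individual records. Groups with
    fewer than `min_group_size` members are left out, because a very small
    group could point to a single person.
    """
    counts = Counter(record[field] for record in records)
    return {value: n for value, n in counts.items() if n >= min_group_size}
```

For example, with six records marked as smokers and two as non-smokers, the default threshold would report only the group of six.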


Evaluation in a secure processing environment

The datasets are not made directly available to data users. Technical measures are in place to allow clinics, scientists and technology providers to process the data for specific research questions, but not to view, copy or download individual data. The data therefore remains the responsibility of the operator of that processing environment at all times.
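The idea can be sketched in Python as follows. This is a simplified model under stated assumptions, not a description of a real secure processing environment: the class `SecureProcessingEnvironment` and its `run` method are hypothetical, and the rule that only a single number may leave the environment is an assumption made for this example. Real environments rely on technical isolation and output checks that go far beyond what a few lines of code can show.

```python
from statistics import mean


class SecureProcessingEnvironment:
    """Holds records internally and releases only single numeric results."""

    def __init__(self, records):
        # The records are kept inside the environment; no method returns them.
        self._records = list(records)

    def run(self, analysis):
        """Run an analysis on copies of the records and check its output."""
        result = analysis([dict(record) for record in self._records])
        # Assumed output rule: only one number (not a list, dict or bool)
        # may be returned to the data user.
        if isinstance(result, bool) or not isinstance(result, (int, float)):
            raise ValueError("only a single numeric result may leave the environment")
        return result
```

A data user could then ask for `env.run(lambda rs: mean(r["systolic"] for r in rs))` and receive an average, while an analysis that tries to return the records themselves would be rejected.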


Transmission of individual datasets

In the context of health data, these are anonymized datasets that contain information about an individual patient, such as medical examination results, diagnoses or treatment histories. The handling of individual data records requires special attention to privacy and security to ensure the confidentiality and integrity of personal information.