Multimodal Clinical Monitoring in the Emergency Department (MC-MED)
Emergency department (ED) patients often present with undiagnosed complaints and can exhibit rapidly evolving physiology. Data from continuous physiologic monitoring, in addition to the electronic health record, are therefore essential to understanding the acute course of illness and responses to interventions. The complexity of ED care and the large amount of unstructured multimodal data it produces have limited the accessibility of detailed ED data for research. We release Multimodal Clinical Monitoring in the Emergency Department (MC-MED), a comprehensive, multimodal, and de-identified clinical and physiological dataset. MC-MED includes 118,385 adult ED visits to an academic medical center from 2020 to 2022. Data include continuously monitored vital signs, physiologic waveforms (electrocardiogram, photoplethysmogram, respiration), patient demographics, medical histories, orders, medication administrations, laboratory and imaging results, and visit outcomes. MC-MED is the first dataset to combine detailed physiologic monitoring with clinical events and outcomes for a large, diverse ED population.
Characteristic | All | With Waveforms |
---|---|---|
Visits | 118,385 | 83,623 |
Unique Patients | 70,545 | 53,123 |
Age | | |
Mean | 53 | 55 |
St. Dev. | 20 | 20 |
Sex | | |
Female | 64,272 (54.29%) | 45,500 (54.41%) |
Male | 54,077 (45.68%) | 38,098 (45.56%) |
Unknown | 36 (0.03%) | 25 (0.03%) |
Race | | |
White | 47,504 (40.25%) | 34,320 (41.18%) |
Other | 39,951 (33.85%) | 27,409 (32.89%) |
Asian | 19,430 (16.46%) | 13,927 (16.71%) |
Black or African American | 7,653 (6.49%) | 5,261 (6.31%) |
Native Hawaiian or Other Pacific Islander | 2,468 (2.09%) | 1,734 (2.08%) |
Declines to State | 551 (0.47%) | 365 (0.44%) |
American Indian or Alaska Native | 309 (0.26%) | 231 (0.28%) |
Unknown | 143 (0.12%) | 95 (0.11%) |
Ethnicity | | |
Non-Hispanic/Non-Latino | 84,557 (71.65%) | 60,313 (72.37%) |
Hispanic/Latino | 32,494 (27.54%) | 22,363 (26.83%) |
Declines to State | 633 (0.54%) | 436 (0.52%) |
Unknown | 325 (0.28%) | 230 (0.28%) |
Triage Acuity | | |
1-Resuscitation | 1,066 (0.90%) | 876 (1.05%) |
2-Emergent | 27,880 (23.66%) | 22,185 (26.64%) |
3-Urgent | 77,259 (65.55%) | 56,552 (67.91%) |
4-Semi-Urgent | 10,973 (9.31%) | 3,508 (4.21%) |
5-Non-Urgent | 680 (0.58%) | 150 (0.18%) |
ED Disposition | | |
Discharge | 70,246 (59.34%) | 45,139 (53.98%) |
Inpatient | 29,483 (24.90%) | 23,583 (28.20%) |
Observation | 15,890 (13.42%) | 12,631 (15.10%) |
ICU | 2,766 (2.34%) | 2,270 (2.71%) |
Chief Complaint (Top 7) | | |
Abdominal Pain | 10,568 (8.93%) | 8,079 (9.66%) |
Chest Pain | 6,239 (5.27%) | 5,197 (6.22%) |
Fall | 5,204 (4.40%) | 3,854 (4.61%) |
Shortness Of Breath | 4,507 (3.81%) | 3,818 (4.57%) |
Back Pain | 2,464 (2.08%) | 1,510 (1.81%) |
Headache | 2,173 (1.84%) | 1,560 (1.87%) |
Fever | 2,089 (1.77%) | 1,648 (1.97%) |
Data Dictionary v1
File Tree
./
└── mcmed-stanford-multi/
    └── v1/
        ├── labs.csv
        ├── meds.csv
        ├── numerics.csv
        ├── orders.csv
        ├── pmh.csv
        ├── rads.csv
        ├── split_*.csv
        ├── visits.csv
        ├── waveform_summary.csv
        └── waveforms/
            ├── 000/ (Last three digits of visit id)
            │   ├── 99013000/ (CSN - visit id)
            │   │   ├── II/
            │   │   │   ├── 99180000_1.dat
            │   │   │   └── 99180000_1.hea
            │   │   ├── Pleth/
            │   │   │   ├── 99180000_1.dat
            │   │   │   ├── 99180000_1.hea
            │   │   │   ├── 99180000_2.dat
            │   │   │   └── 99180000_2.hea
            │   │   └── Resp/
            │   │       ├── 99180000_1.dat
            │   │       └── 99180000_1.hea
            │   ...
            ├── 000/
            ...
Physiological Waveforms
This dataset contains three types of physiological waveforms, as described in Table 2. The waveform segments are stored as WaveForm DataBase (WFDB) files nested in the waveforms directory. Each segment's files follow this path pattern:
./waveforms/{CSN[-3:]}/{CSN}/{type}/{CSN}_{segment_number}
Type | Description | Common Uses |
---|---|---|
Electrocardiogram (ECG) | Records the voltage and timing of the heart’s electrical activity | Diagnosis of myocardial injury, arrhythmias, electrolyte derangements; assessment of medication responses |
Photoplethysmogram (Pleth/PPG) | Records changes in blood volume over time at the site of the sensor | Estimation of heart rate, respiratory rate, blood pressure, blood oxygen saturation |
Respiration (Resp) | Estimates chest wall expansion and contraction | Estimation of respiratory rate, tidal volume, respiratory function |
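
The path pattern above can be turned into a small helper. The following is a minimal sketch, not an official loader: the function names `segment_record_path` and `load_segment` are hypothetical, the dataset root is assumed to be the `v1/` directory from the file tree, and reading relies on the third-party `wfdb` Python package, which is assumed to be installed separately. The sketch follows the stated `{CSN}_{segment_number}` pattern; because the file tree listing shows file names that differ from the enclosing CSN directory, it is worth checking the result against the files actually on disk.

```python
from pathlib import Path

WAVEFORM_TYPES = {"II", "Pleth", "Resp"}


def segment_record_path(root, csn, wave_type, segment):
    """Build the extension-less WFDB record path for one waveform segment.

    Follows ./waveforms/{CSN[-3:]}/{CSN}/{type}/{CSN}_{segment_number}.
    """
    if wave_type not in WAVEFORM_TYPES:
        raise ValueError(f"unknown waveform type: {wave_type!r}")
    csn = str(csn)
    return Path(root) / "waveforms" / csn[-3:] / csn / wave_type / f"{csn}_{segment}"


def load_segment(root, csn, wave_type, segment):
    """Read one segment (.hea + .dat) with the wfdb package, imported lazily."""
    import wfdb  # assumption: third-party package, installed separately

    return wfdb.rdrecord(str(segment_record_path(root, csn, wave_type, segment)))


# Example with the sample CSN from waveform_summary.csv; the root is hypothetical.
path = segment_record_path("mcmed-stanford-multi/v1", 99633476, "Pleth", 1)
print(path.as_posix())
```

`wfdb.rdrecord` takes the record name without a file extension, which is why the helper stops at `{CSN}_{segment_number}`.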
Waveform Reference
waveform_summary.csv
– This table is a reference for the waveform data. During a visit, the patient may be disconnected from the monitoring instruments for various reasons. Sections of the record in which the signal did not change were removed, and the remaining sections with varying signal were stored as separate segment files. The number of segments for each waveform type is a result of this process.
- Rows: 188,315
- Columns: 4
Column Name | Description | Data Type | Sample Data |
---|---|---|---|
CSN | Visit identifier (Random ID mapped to original CSN) | int | 99633476 |
Type | One of 3 waveforms: II (electrocardiogram), Pleth (photoplethysmogram), or Resp (respiration) | string | Pleth |
Segments | The number of waveform segments | int | 1 |
Duration | The total duration for all segments for the waveform type, in seconds | float | 119.984 |
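
Since waveform_summary.csv has one row per visit and waveform type, it can be used to select visits before touching any WFDB files. The sketch below is illustrative only: it uses the standard-library `csv` module on a few made-up rows in the documented column layout (only the first row reuses the sample values from the table), and the helper names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the waveform_summary.csv layout; only the first row
# reuses the sample values shown in the table above.
SAMPLE = """CSN,Type,Segments,Duration
99633476,Pleth,1,119.984
99633476,II,2,300.5
99633476,Resp,1,90.0
99000001,Pleth,3,45.25
"""


def summarize(handle):
    """Return {CSN: {Type: (segments, total_duration_seconds)}}."""
    summary = defaultdict(dict)
    for row in csv.DictReader(handle):
        summary[int(row["CSN"])][row["Type"]] = (
            int(row["Segments"]),
            float(row["Duration"]),
        )
    return summary


def visits_with_all_types(summary, required=("II", "Pleth", "Resp")):
    """List the CSNs that have every required waveform type."""
    return sorted(csn for csn, types in summary.items() if set(required) <= set(types))


summary = summarize(io.StringIO(SAMPLE))
complete = visits_with_all_types(summary)
pleth_minutes = summary[99633476]["Pleth"][1] / 60  # Duration is in seconds
```

Replacing `io.StringIO(SAMPLE)` with an open file handle for the real CSV should work the same way, assuming the header row matches the column names documented here.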
Tabular Data Descriptions
Emergency Department Visits
visits.csv
– This table describes high-level characteristics of each visit.
- Rows: 118,385
- Columns: 33
Column Name | Description | Data Type | Sample Data |
---|---|---|---|
MRN | Patient identifier (Random ID mapped to original MRN) | int | 99940664 |
CSN | Visit identifier (Random ID mapped to original CSN) | int | 98874959 |
Visit_no | The visit number for this patient in the dataset | int | 1 |
Visits | Total number of visits for this patient in the dataset | int | 1 |
Age | Patient age in years at time of visit (randomly perturbed by +/- 2 years; ages greater than 90 are set to 90) | int | 90 |
Gender | Patient gender | string | F |
Race | Patient race | string | White |
Ethnicity | Patient Hispanic ethnicity | string | Non-Hispanic/Non-Latino |
Means_of_arrival | Means of arrival to the ED | string | Self |
Triage_Temp | Temperature at triage (C) | float | 36.7 |
Triage_HR | Heart rate at triage (bpm) | float | 80.0 |
Triage_RR | Respiratory rate at triage (breaths per min.) | float | 18.0 |
Triage_SpO2 | Oxygen saturation at triage (%) | float | 100.0 |
Triage_SBP | Systolic blood pressure at triage (mmHg) | float | 128.0 |
Triage_DBP | Diastolic blood pressure at triage (mmHg) | float | 78.0 |
Triage_acuity | Emergency Severity Index (ESI) at triage (1-5) | string | 3-Urgent |
CC | Chief complaint(s) at triage | string | ABDOMINAL PAIN |
ED_dispo | Disposition of patient from the ED | string | Discharge |
Hours_to_next_visit | For patients with a subsequent visit, number of hours from departure of current visit to arrival of next visit | float | 40.0 |
Dispo_class_next_visit | For patients with a subsequent visit, disposition class of the next visit | string | Discharge |
ED_LOS | Length of ED stay, hours | float | 4.82 |
Hosp_LOS | Length of hospital stay (including post-ED admission), days | float | 1.0 |
DC_dispo | Final disposition of patient from the hospital | string | Home/Work (includes foster care) |
Payor_class | Class of primary visit payor | string | Medicare |
Admit_service | For admitted patients, service admitting the patient from the ED | string | Emergency Medicine |
Dx_ICD9 | Primary visit diagnosis, ICD9 code | string | 786.50 |
Dx_ICD10 | Primary visit diagnosis, ICD10 code | string | R07.9 |
Dx_name | Name of ICD10 code | string | Chest pain, unspecified type |
Arrival_time | Time of arrival to ED (Random-shift date, keeping season constant) | datetime | 2262-01-09T03:16:07Z |
Roomed_time | Time of patient rooming (Random-shift date, keeping season constant) | datetime | 2283-03-02T07:36:59Z |
Dispo_time | Time of disposition decision (Random-shift date, keeping season constant) | datetime | 2247-09-22T10:54:42Z |
Admit_time | For admitted patients, time of admission order (Random-shift date, keeping season constant) | datetime | 2283-03-02T12:29:59Z |
Departure_time | Time of departure from ED (Random-shift date, keeping season constant) | datetime | 2209-08-12T11:31:38Z |
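
Because the dates are shifted far into the future, some care is needed when parsing them. For instance, pandas' default nanosecond-resolution timestamps cannot represent dates after 2262-04-11, so a value such as 2283-03-02 would not fit, whereas Python's standard `datetime` handles it. The sketch below parses the documented format and computes an interval in hours. It assumes, which the description above does not state explicitly, that all timestamps belonging to one visit share the same shift so that differences within a visit remain meaningful; the helper names are hypothetical.

```python
from datetime import datetime, timezone

TIME_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # e.g. 2283-03-02T07:36:59Z


def parse_time(text):
    """Parse a shifted timestamp; empty cells (e.g. no admission) give None."""
    if not text:
        return None
    return datetime.strptime(text, TIME_FORMAT).replace(tzinfo=timezone.utc)


def hours_between(start, end):
    """Elapsed hours from start to end, or None if either value is missing."""
    start_dt, end_dt = parse_time(start), parse_time(end)
    if start_dt is None or end_dt is None:
        return None
    return (end_dt - start_dt).total_seconds() / 3600


# Sample Roomed_time and Admit_time values from the table above, assumed
# here (for illustration only) to belong to the same visit.
hours = hours_between("2283-03-02T07:36:59Z", "2283-03-02T12:29:59Z")
```

Only differences between timestamps should be interpreted; the absolute years carry no meaning after shifting.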
Orders
orders.csv
– This table contains all orders placed by the ED physicians during the visit.
- Rows: 3,288,744
- Columns: 7
Column Name | Description | Data Type | Sample Data |
---|---|---|---|
CSN | Visit identifier (Random ID mapped to original CSN) | int | 99139687 |
Order_time | Time of order (Random-shift date, keeping season constant) | datetime | 2226-01-15T15:39:22Z |
Order_type | Type of order (lab, imaging, medication, etc) | string | Lab |
Procedure_name | Name of order | string | CBC WITH DIFFERENTIAL |
Procedure_ID | Identifier for order (Mapped to CPT codes) | string | LABMETC |
First_admin_time | For medications, time of first administration (Random-shift date, keeping season constant) | datetime | 2212-11-03T16:51:00Z |
Result_time | For lab and imaging tests, time of result (Random-shift date, keeping season constant) | datetime | 2295-08-25T16:55:28Z |
Medications
meds.csv
– This table contains patient home medications.
- Rows: 235,718
- Columns: 11
Column Name | Description | Data Type | Sample Data |
---|---|---|---|
MRN | Patient identifier (Random ID mapped to original MRN) | int | 99721983 |
Med_ID | Medication identifier (mapped to NDC id) | int | 14113 |
NDC | National Drug Code identifier | string | 69618-066-10 |
Name | Medication name | string | ASPIRIN 81 MG PO TBEC |
Generic_name | Generic name | string | aspirin 81 mg tablet,delayed release |
Med_class | High-level classification of the medication | string | VITAMIN D PREPARATIONS |
Med_subclass | A more detailed classification | string | Vitamins - D Derivatives |
Active | Indicates whether a patient was thought to be using the medication at the time of the visit | string | Y |
Entry_date | Medication entry date (Random-shift date, keeping season constant) | date | 2270-07-20T00:00:00Z |
Start_date | Medication start date (Random-shift date, keeping season constant) | date | 2275-08-31T00:00:00Z |
End_date | Medication end date (Random-shift date, keeping season constant) | date | 2241-08-06T00:00:00Z |
Past Medical History
pmh.csv
– This table contains prior diagnoses.
- Rows: 755,341
- Columns: 7
Column Name | Description | Data Type | Sample Data |
---|---|---|---|
MRN | Patient identifier (Random ID mapped to original MRN) | int | 99084665 |
Noted_date | Date when the diagnosis was recorded (Random-shift date, keeping season constant) | date | 2219-08-10T00:00:00Z |
CodeType | Whether code is ICD9 or ICD10 | string | Dx10 |
Code | Diagnosis code | string | I10 |
Desc10 | Text description of the code | string | Essential (primary) hypertension |
CCS | Clinical Classifications Software (CCS) category of the diagnosis | float | 259.0 |
DescCCS | Text description of the CCS category | string | Residual codes; unclassified |
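
The CCS column holds integer category numbers stored as floats (for example 259.0), presumably so that missing values can be represented. A minimal sketch of grouping prior diagnoses into a per-patient set of CCS categories follows. The rows are made up for illustration, only a subset of the pmh.csv columns is used, and `ccs_by_patient` is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict

# Made-up rows with a subset of the pmh.csv columns; the codes and CCS
# values are illustrative, not taken from the dataset.
SAMPLE = """MRN,Code,CCS
99084665,I10,98.0
99084665,E11.9,49.0
99000002,R07.9,
"""


def ccs_by_patient(handle):
    """Return {MRN: set of integer CCS categories}, skipping blank CCS cells."""
    categories = defaultdict(set)
    for row in csv.DictReader(handle):
        mrn = int(row["MRN"])
        categories[mrn]  # ensure patients without a usable CCS still appear
        if row["CCS"].strip():
            categories[mrn].add(int(float(row["CCS"])))
    return dict(categories)


history = ccs_by_patient(io.StringIO(SAMPLE))
```

The resulting sets can be joined to visits.csv through MRN, for example to build comorbidity indicator features.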
Lab Test Results
labs.csv
– This table gives the results for lab tests ordered during the ED visit.
- Rows: 5,706,470
- Columns: 12
Column Name | Description | Data Type | Sample Data |
---|---|---|---|
CSN | Visit identifier (Random ID mapped to original CSN) | int | 99880957 |
Order_time | Time of lab order (Random-shift date, keeping season constant) | datetime | 2221-07-03T18:37:12Z |
Result_time | Time of lab result (Random-shift date, keeping season constant) | datetime | 2233-07-07T04:37:14Z |
Display_name | Name of order | string | CBC with Differential (CBCD) |
Abnormal | Flag for abnormal or critical result | string | Abnormal |
Component_name | Name of lab component | string | SODIUM |
Component_result | Lab result (Removed results containing dates or names) | string | Negative |
Component_value | Lab value (Sometimes identical to result, sometimes different, e.g. result may be categorical and value numeric. Removed results containing dates or names) | string | 1 |
Component_units | Units of component_value, where applicable | string | % |
Component_abnormal | Flag for abnormal component_value | string | Normal |
Component_nml_low | Low end of normal range for component | float | 0.0 |
Component_nml_high | High end of normal range for component | float | 5.2 |
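
Component_value is stored as a string and may be numeric or categorical, so numeric use requires tolerant conversion. The sketch below, under the assumption that missing bounds appear as empty cells, converts values where possible and compares them with the documented normal range columns. The rows and reference ranges are made up for illustration, and `range_flag` is a hypothetical helper, not a replacement for the dataset's own Component_abnormal flag.

```python
import csv
import io

# Made-up rows using a subset of the labs.csv columns; the values and
# reference ranges are illustrative only.
SAMPLE = """CSN,Component_name,Component_value,Component_nml_low,Component_nml_high
99880957,SODIUM,128,135,145
99880957,POTASSIUM,4.1,3.5,5.5
99880957,URINE NITRITE,Negative,,
"""


def to_float(text):
    """Convert a cell to float, returning None for blank or categorical text."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def range_flag(row):
    """Classify a lab component against its normal range, where one exists."""
    value = to_float(row["Component_value"])
    low = to_float(row["Component_nml_low"])
    high = to_float(row["Component_nml_high"])
    if value is None:
        return "non-numeric"
    if low is None and high is None:
        return "no-range"
    if low is not None and value < low:
        return "low"
    if high is not None and value > high:
        return "high"
    return "in-range"


flags = [range_flag(row) for row in csv.DictReader(io.StringIO(SAMPLE))]
```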
Radiology
rads.csv
– This table contains the results of imaging studies ordered during each ED visit.
- Rows: 156,866
- Columns: 5
Column Name | Description | Data Type | Sample Data |
---|---|---|---|
CSN | Visit identifier (Random ID mapped to original CSN) | int | 99002166 |
Order_time | Time of imaging order (Random-shift date, keeping season constant) | datetime | 2205-07-07T11:11:36Z |
Result_time | Time of imaging result (Random-shift date, keeping season constant) | datetime | 2291-10-02T05:06:13Z |
Study | Imaging study name | string | XR CHEST 1 VIEW |
Impression | Imaging result impression (de-identified using the Stanford AIMI de-identification tool; see JAMIA, HuggingFace) | string | “1. No acute cardiopulmonary disease.” |
Numeric Monitoring Data
numerics.csv
– This table contains the non-waveform monitoring data measured during each ED visit.
- Rows: 47,821,508
- Columns: 5
Column Name | Description | Data Type | Sample Data |
---|---|---|---|
CSN | Visit identifier (Random ID mapped to original CSN) | int | 99833461 |
Source | Indicates whether a value was recorded by nursing (Chart) or derived directly from the monitoring database (Monitor) | string | Monitor |
Measure | One of 12 measurements: HR (heart rate, heartbeats per minute); RR (respiratory rate, breaths per minute); SpO2 (oxygen saturation by pulse oximetry, %); SBP (systolic blood pressure, mmHg); DBP (diastolic blood pressure, mmHg); MAP (mean arterial pressure, mmHg); Temp (body temperature, degrees Fahrenheit); Perf (mean last-minute perfusion index derived from the PPG waveform); Pain (self-reported pain rating, 0 (no pain) to 10 (worst pain)); LPM_O2 (flow rate of supplemental oxygen, liters per minute); 1min_HRV and 5min_HRV (heart rate variability over the last 1 minute or 5 minutes, calculated as the standard deviation of the beat-to-beat RR interval of the ECG waveform over this period) | string | SpO2 |
Value | Observation value | float | 100.0 |
Time | Timestamp of the measurement. When underlying observations are made more frequently than once per minute, they are aggregated to the mean value over the 60 seconds preceding the timestamp. | string | 2247-05-09T06:40:42Z |
Type | Description |
---|---|
Heart Rate (HR) | Number of heartbeats per minute; measured using ECG or PPG waveform |
Respiratory Rate (RR) | Number of breaths per minute; typically measured using RESP waveform |
Oxygen Saturation by Pulse Oximetry (SpO2) | Percentage of oxygen-saturated hemoglobin in blood; measured using PPG waveform. |
Systolic Blood Pressure (SBP) | Highest pressure in arteries during heart contraction, recorded by sphygmomanometry |
Diastolic Blood Pressure (DBP) | Lowest pressure in arteries during heart relaxation, recorded by sphygmomanometry |
Mean Arterial Pressure (MAP) | Average pressure in a patient’s arteries during one cardiac cycle |
Temperature in Degrees Fahrenheit (Temp) | Body temperature measured in degrees Fahrenheit |
Perfusion (Perf) | Ratio of pulsatile to nonpulsatile peripheral blood flow, estimated using PPG waveform |
Pain rating on 0-10 scale (Pain) | Patient’s self-reported pain level on a scale from 0 (no pain) to 10 (worst pain) |
Liters Per Minute of Supplemental Oxygen (LPM_O2) | Flow rate of supplemental oxygen administered to the patient, measured in liters per minute |
Heart Rate Variability over last 1 minute (1min_HRV) | Variability in the time interval between heartbeats over the last minute; indicates autonomic nervous system function |
Heart Rate Variability over last 5 minutes (5min_HRV) | Variability in the time interval between heartbeats over the last five minutes; indicates autonomic nervous system function |
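
numerics.csv is in long format, with one row per measurement, and its Temp values are in degrees Fahrenheit while Triage_Temp in visits.csv is in Celsius. The following sketch pivots a few rows into one record per visit and timestamp and converts Temp to Celsius using C = (F - 32) × 5/9. The rows are made up in the documented layout, and `to_wide` and the `Temp_C` key are hypothetical names.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in the numerics.csv layout.
SAMPLE = """CSN,Source,Measure,Value,Time
99833461,Monitor,HR,80.0,2247-05-09T06:40:42Z
99833461,Monitor,SpO2,100.0,2247-05-09T06:40:42Z
99833461,Chart,Temp,98.6,2247-05-09T06:45:00Z
"""


def fahrenheit_to_celsius(deg_f):
    return (deg_f - 32.0) * 5.0 / 9.0


def to_wide(handle):
    """Return {(CSN, Time): {Measure: Value}} with Temp converted to Celsius.

    If Chart and Monitor rows share a CSN, Time, and Measure, the later row
    overwrites the earlier one; keep Source in the key if that matters.
    """
    wide = defaultdict(dict)
    for row in csv.DictReader(handle):
        measure, value = row["Measure"], float(row["Value"])
        if measure == "Temp":
            measure, value = "Temp_C", fahrenheit_to_celsius(value)
        wide[(int(row["CSN"]), row["Time"])][measure] = value
    return wide


wide = to_wide(io.StringIO(SAMPLE))
```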
Train-Validation-Test Splits
MC-MED contains two training/validation/test splits for general use. Both splits target 80% of visits for the training set and 10% each for the validation and test sets; the chronological split deviates slightly from these targets, as noted in its description. All visits for a given patient are restricted to a single set to prevent data leakage between sets.
Random Patient-level Split
split_random_*.csv
– The files contain the visit IDs (CSN) for the visits in each split set.
Chronological Split
split_chrono_*.csv
– All visits in the validation set occur after the final visit in the training set, and all visits in the test set occur after the final visit in the validation set. To prevent patient data leakage between sets, each patient (MRN) is again restricted to only one of the training, validation, or test sets. As a result, 13,007 visits are removed from these sets, and the actual split proportions are 78%, 11%, and 11% for the training, validation, and test sets, respectively.
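
The patient-level restriction described above can be checked programmatically once the split files are combined with the MRN column of visits.csv. The sketch below is one way to do this under stated assumptions: the CSN-to-MRN mapping and the split contents are small made-up examples, and `split_leakage` is a hypothetical helper.

```python
def split_leakage(csn_to_mrn, splits):
    """Return the set of MRNs whose visits appear in more than one split set.

    csn_to_mrn: {CSN: MRN}, e.g. built from visits.csv.
    splits: {split name: iterable of CSNs}, e.g. read from the split files.
    """
    first_seen = {}
    leaked = set()
    for split_name, csns in splits.items():
        for csn in csns:
            mrn = csn_to_mrn[csn]
            # Remember the first split this patient was seen in.
            if first_seen.setdefault(mrn, split_name) != split_name:
                leaked.add(mrn)
    return leaked


# Made-up identifiers: patient 10 has two visits (CSNs 1 and 2).
csn_to_mrn = {1: 10, 2: 10, 3: 11, 4: 12}
clean = {"train": [1, 2], "val": [3], "test": [4]}
leaky = {"train": [1], "val": [2, 3], "test": [4]}

clean_result = split_leakage(csn_to_mrn, clean)
leaky_result = split_leakage(csn_to_mrn, leaky)
```

An empty result means that every patient's visits fall within a single set, which is the property both provided splits are described as having.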