Table of contents
  1. Multimodal Clinical Monitoring in the Emergency Department (MC-MED)
  2. Data Dictionary v1
    1. File Tree
    2. Physiological Waveforms
      1. Waveform Reference
    3. Tabular Data Descriptions
      1. Emergency Department Visits
      2. Orders
      3. Medications
      4. Past Medical History
      5. Lab Test Results
      6. Radiology
      7. Numeric Monitoring Data
      8. Train-Validation-Test Splits

Multimodal Clinical Monitoring in the Emergency Department (MC-MED)

Emergency department (ED) patients often present with undiagnosed complaints and can exhibit rapidly evolving physiology. Data from continuous physiologic monitoring, in addition to the electronic health record, are therefore essential to understanding the acute course of illness and responses to interventions. However, the complexity of ED care and the large amount of unstructured multimodal data it produces have limited the accessibility of detailed ED data for research. We release Multimodal Clinical Monitoring in the Emergency Department (MC-MED), a comprehensive, multimodal, de-identified clinical and physiological dataset. MC-MED includes 118,385 adult ED visits to an academic medical center from 2020 to 2022. Data include continuously monitored vital signs, physiologic waveforms (electrocardiogram, photoplethysmogram, respiration), patient demographics, medical histories, orders, medication administrations, laboratory and imaging results, and visit outcomes. MC-MED is the first dataset to combine detailed physiologic monitoring with clinical events and outcomes for a large, diverse ED population.

Dataset Summary

                                              All                With Waveforms
Visits                                        118,385            83,623
Unique Patients                               70,545             53,123
Age
  Mean                                        53                 55
  St. Dev.                                    20                 20
Sex
  Female                                      64,272 (54.29%)    45,500 (54.41%)
  Male                                        54,077 (45.68%)    38,098 (45.56%)
  Unknown                                     36 (0.03%)         25 (0.03%)
Race
  White                                       47,504 (40.25%)    34,320 (41.18%)
  Other                                       39,951 (33.85%)    27,409 (32.89%)
  Asian                                       19,430 (16.46%)    13,927 (16.71%)
  Black or African American                   7,653 (6.49%)      5,261 (6.31%)
  Native Hawaiian or Other Pacific Islander   2,468 (2.09%)      1,734 (2.08%)
  Declines to State                           551 (0.47%)        365 (0.44%)
  American Indian or Alaska Native            309 (0.26%)        231 (0.28%)
  Unknown                                     143 (0.12%)        95 (0.11%)
Ethnicity
  Non-Hispanic/Non-Latino                     84,557 (71.65%)    60,313 (72.37%)
  Hispanic/Latino                             32,494 (27.54%)    22,363 (26.83%)
  Declines to State                           633 (0.54%)        436 (0.52%)
  Unknown                                     325 (0.28%)        230 (0.28%)
Triage Acuity
  1-Resuscitation                             1,066 (0.90%)      876 (1.05%)
  2-Emergent                                  27,880 (23.66%)    22,185 (26.64%)
  3-Urgent                                    77,259 (65.55%)    56,552 (67.91%)
  4-Semi-Urgent                               10,973 (9.31%)     3,508 (4.21%)
  5-Non-Urgent                                680 (0.58%)        150 (0.18%)
ED Disposition
  Discharge                                   70,246 (59.34%)    45,139 (53.98%)
  Inpatient                                   29,483 (24.90%)    23,583 (28.20%)
  Observation                                 15,890 (13.42%)    12,631 (15.10%)
  ICU                                         2,766 (2.34%)      2,270 (2.71%)
Chief Complaint (Top 7)
  Abdominal Pain                              10,568 (8.93%)     8,079 (9.66%)
  Chest Pain                                  6,239 (5.27%)      5,197 (6.22%)
  Fall                                        5,204 (4.40%)      3,854 (4.61%)
  Shortness Of Breath                         4,507 (3.81%)      3,818 (4.57%)
  Back Pain                                   2,464 (2.08%)      1,510 (1.81%)
  Headache                                    2,173 (1.84%)      1,560 (1.87%)
  Fever                                       2,089 (1.77%)      1,648 (1.97%)

Data Dictionary v1

File Tree

./
└── mcmed-stanford-multi/
    └── v1/
        ├── labs.csv
        ├── meds.csv
        ├── numerics.csv
        ├── orders.csv
        ├── pmh.csv
        ├── rads.csv
        ├── split_*.csv
        ├── visits.csv
        ├── waveform_summary.csv
        └── waveforms/
            ├── 000/ (Last three digits of visit id)
            │   ├── 99013000/ (CSN - visit id)
            │   │   ├── II/
            │   │   │   ├── 99013000_1.dat
            │   │   │   └── 99013000_1.hea
            │   │   ├── Pleth/
            │   │   │   ├── 99013000_1.dat
            │   │   │   ├── 99013000_1.hea
            │   │   │   ├── 99013000_2.dat
            │   │   │   └── 99013000_2.hea
            │   │   └── Resp/
            │   │       ├── 99013000_1.dat
            │   │       └── 99013000_1.hea
            │   ...
            ├── 001/
            ...

Physiological Waveforms

This dataset contains three types of physiological waveforms, described in the Waveform Type Description table below. The waveform segments are stored as WaveForm DataBase (WFDB) files nested in the waveforms directory. Each segment follows this path pattern, where {type} is II, Pleth, or Resp:
./waveforms/{CSN[-3:]}/{CSN}/{type}/{CSN}_{segment_number}
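
As a minimal sketch of how this pattern resolves to a path, the Python helper below builds the record location from a visit ID. The function name `waveform_record_path` and the `root` argument are illustrative and not part of the dataset. A WFDB record consists of a header file (.hea) and a signal file (.dat) that share a base name, so the helper returns the base name without an extension.

```python
from pathlib import Path


def waveform_record_path(root, csn, wave_type, segment_number):
    """Build the extension-less WFDB record path for one waveform segment.

    `root` is the directory that contains `waveforms/` (an assumption about
    where the dataset was unpacked). `wave_type` is one of II, Pleth, Resp.
    """
    if wave_type not in {"II", "Pleth", "Resp"}:
        raise ValueError(f"unknown waveform type: {wave_type!r}")
    csn = str(csn)
    # The first directory level is the last three digits of the CSN.
    return Path(root) / "waveforms" / csn[-3:] / csn / wave_type / f"{csn}_{segment_number}"


record = waveform_record_path(".", 99013000, "Pleth", 2)
print(record.as_posix())  # waveforms/000/99013000/Pleth/99013000_2
```

With the `wfdb` Python package installed, passing `str(record)` to `wfdb.rdrecord` should load the segment; that call is left out here so the sketch runs without the data.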

Waveform Type Description

Type Description Common Uses
Electrocardiogram (ECG) Records the voltage and timing of the heart’s electrical activity Diagnosis of myocardial injury, arrhythmias, electrolyte derangements; assessment of medication responses
Photoplethysmogram (Pleth/PPG) Records changes in blood volume over time at the site of the sensor Estimation of heart rate, respiratory rate, blood pressure, blood oxygen saturation
Respiration (Resp) Estimates chest wall expansion and contraction Estimation of respiratory rate, tidal volume, respiratory function

Waveform Reference

waveform_summary.csv – This table is a reference for the waveform data. During a visit, a patient may be disconnected from the monitoring instruments for various reasons. Sections of the recording that showed no change in signal readings were removed, and the remaining sections of varying signal were stored as separate segment files, which is why a waveform type can have more than one segment per visit.

  • Rows: 188,315
  • Columns: 4

Waveform Summary Data Elements

Column Name Description Data Type Sample Data
CSN Visit identifier (Random ID mapped to original CSN) int 99633476
Type Waveform type, one of three values: II (electrocardiogram), Pleth (photoplethysmogram), Resp (respiration) string Pleth
Segments The number of waveform segments int 1
Duration The total duration for all segments for the waveform type, in seconds float 119.984
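
Because Duration is given in seconds per visit and waveform type, a quick way to gauge how much signal is available is to total it by type. The sketch below assumes the CSV has a header row with the column names listed above; the sample rows are made up for illustration.

```python
import csv
import io
from collections import defaultdict


def hours_by_type(rows):
    """Sum the Duration column (seconds) per waveform Type, returned in hours."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["Type"]] += float(row["Duration"])
    return {wave_type: seconds / 3600 for wave_type, seconds in totals.items()}


# Made-up rows in the documented column layout, for illustration only.
sample = io.StringIO(
    "CSN,Type,Segments,Duration\n"
    "99633476,Pleth,1,1800.0\n"
    "99633476,II,2,3600.0\n"
    "99000001,Pleth,1,5400.0\n"
)
print(hours_by_type(csv.DictReader(sample)))  # {'Pleth': 2.0, 'II': 1.0}
```

For the real file, the same function can be given `csv.DictReader(open("waveform_summary.csv", newline=""))`.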

Tabular Data Descriptions

Emergency Department Visits

visits.csv – This table describes high-level characteristics of each visit.

  • Rows: 118,385
  • Columns: 33

Visits Data Elements

Column Name Description Data Type Sample Data
MRN Patient identifier (Random ID mapped to original MRN) int 99940664
CSN Visit identifier (Random ID mapped to original CSN) int 98874959
Visit_no The visit number for this patient in the dataset int 1
Visits Total number of visits for this patient in the dataset int 1
Age Patient age in years at time of visit (Random perturbation of age +/- 2 years, Ages greater than 90 set to 90) int 90
Gender Patient gender string F
Race Patient race string White
Ethnicity Patient Hispanic ethnicity string Non-Hispanic/Non-Latino
Means_of_arrival Means of arrival to the ED string Self
Triage_Temp Temperature at triage (C) float 36.7
Triage_HR Heart rate at triage (bpm) float 80.0
Triage_RR Respiratory rate at triage (breaths per min.) float 18.0
Triage_SpO2 Oxygen saturation at triage (%) float 100.0
Triage_SBP Systolic blood pressure at triage (mmHg) float 128.0
Triage_DBP Diastolic blood pressure at triage (mmHg) float 78.0
Triage_acuity Emergency Severity Index (ESI) at triage (1-5) string 3-Urgent
CC Chief complaint(s) at triage string ABDOMINAL PAIN
ED_dispo Disposition of patient from the ED string Discharge
Hours_to_next_visit For patients with a subsequent visit, number of hours from departure of current visit to arrival of next visit float 40.0
Dispo_class_next_visit Disposition class of the patient's next visit string Discharge
ED_LOS Length of ED stay, hours float 4.82
Hosp_LOS Length of hospital stay (including post-ED admission), days float 1.0
DC_dispo Final disposition of patient from the hospital string Home/Work (includes foster care)
Payor_class Class of primary visit payor string Medicare
Admit_service For admitted patients, service admitting the patient from the ED string Emergency Medicine
Dx_ICD9 Primary visit diagnosis, ICD9 code string 786.50
Dx_ICD10 Primary visit diagnosis, ICD10 code string R07.9
Dx_name Name of ICD10 code string Chest pain, unspecified type
Arrival_time Time of arrival to ED (Random-shift date, keeping season constant) datetime 2262-01-09T03:16:07Z
Roomed_time Time of patient rooming (Random-shift date, keeping season constant) datetime 2283-03-02T07:36:59Z
Dispo_time Time of disposition decision (Random-shift date, keeping season constant) datetime 2247-09-22T10:54:42Z
Admit_time For admitted patients, time of admission order (Random-shift date, keeping season constant) datetime 2283-03-02T12:29:59Z
Departure_time Time of departure from ED (Random-shift date, keeping season constant) datetime 2209-08-12T11:31:38Z
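
The shifted timestamps fall in years well beyond the present, which matters when parsing them: nanosecond-resolution timestamp types (the pandas default, whose maximum is 2262-04-11) cannot represent many of these values, whereas Python's standard `datetime` can. The sketch below assumes every timestamp uses the `YYYY-MM-DDTHH:MM:SSZ` form shown in the sample data; the helper names are illustrative. It uses the sample Roomed_time and Admit_time values, which share a date, although the table does not state that they come from the same visit.

```python
from datetime import datetime, timezone


def parse_shifted(ts):
    """Parse a timestamp such as '2262-01-09T03:16:07Z' into an aware datetime."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def hours_between(start, end):
    """Elapsed hours between two timestamp strings."""
    return (parse_shifted(end) - parse_shifted(start)).total_seconds() / 3600


# Sample values from the Roomed_time and Admit_time rows above:
# 07:36:59 to 12:29:59 on the same date is 4 h 53 min.
elapsed = hours_between("2283-03-02T07:36:59Z", "2283-03-02T12:29:59Z")
print(round(elapsed, 2))  # 4.88
```

Intervals computed this way are only meaningful between timestamps that received the same random shift, which is an assumption to verify before comparing times across visits or patients.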

Orders

orders.csv – This table contains all orders placed by the ED physicians during the visit.

  • Rows: 3,288,744
  • Columns: 7

Orders Data Elements

Column Name Description Data Type Sample Data
CSN Visit identifier (Random ID mapped to original CSN) int 99139687
Order_time Time of order (Random-shift date, keeping season constant) datetime 2226-01-15T15:39:22Z
Order_type Type of order (lab, imaging, medication, etc.) string Lab
Procedure_name Name of order string CBC WITH DIFFERENTIAL
Procedure_ID Identifier for order (Mapped to CPT codes) string LABMETC
First_admin_time For medications, time of first administration (Random-shift date, keeping season constant) datetime 2212-11-03T16:51:00Z
Result_time For lab and imaging tests, time of result (Random-shift date, keeping season constant) datetime 2295-08-25T16:55:28Z

Medications

meds.csv – This table contains patient home medications.

  • Rows: 235,718
  • Columns: 11

Meds Data Elements

Column Name Description Data Type Sample Data
MRN Patient identifier (Random ID mapped to original MRN) int 99721983
Med_ID Medication identifier (mapped to NDC id) int 14113
NDC National Drug Code identifier string 69618-066-10
Name Medication name string ASPIRIN 81 MG PO TBEC
Generic_name Generic name string aspirin 81 mg tablet,delayed release
Med_class High-level classification of the medication string VITAMIN D PREPARATIONS
Med_subclass A more detailed classification string Vitamins - D Derivatives
Active Indicates whether a patient was thought to be using the medication at the time of the visit string Y
Entry_date Medication entry date (Random-shift date, keeping season constant) date 2270-07-20T00:00:00Z
Start_date Medication start date (Random-shift date, keeping season constant) date 2275-08-31T00:00:00Z
End_date Medication end date (Random-shift date, keeping season constant) date 2241-08-06T00:00:00Z

Past Medical History

pmh.csv – This table contains prior diagnoses.

  • Rows: 755,341
  • Columns: 7

Pmh Data Elements

Column Name Description Data Type Sample Data
MRN Patient identifier (Random ID mapped to original MRN) int 99084665
Noted_date Date when the diagnosis was recorded (Random-shift date, keeping season constant) date 2219-08-10T00:00:00Z
CodeType Whether code is ICD9 or ICD10 string Dx10
Code Diagnosis code string I10
Desc10 Text description of the code string Essential (primary) hypertension
CCS Clinical Classifications Software (CCS) category of the diagnosis float 259.0
DescCCS Text description of the CCS category string Residual codes; unclassified

Lab Test Results

labs.csv – This table gives the results for lab tests ordered during the ED visit.

  • Rows: 5,706,470
  • Columns: 12

Labs Data Elements

Column Name Description Data Type Sample Data
CSN Visit identifier (Random ID mapped to original CSN) int 99880957
Order_time Time of lab order (Random-shift date, keeping season constant) datetime 2221-07-03T18:37:12Z
Result_time Time of lab result (Random-shift date, keeping season constant) datetime 2233-07-07T04:37:14Z
Display_name Name of order string CBC with Differential (CBCD)
Abnormal Flag for abnormal or critical result string Abnormal
Component_name Name of lab component string SODIUM
Component_result Lab result (Removed results containing dates or names) string Negative
Component_value Lab value (Sometimes identical to result, sometimes different, e.g. result may be categorical and value numeric. Removed results containing dates or names) string 1
Component_units Units of component_value, where applicable string %
Component_abnormal Flag for abnormal component_value string Normal
Component_nml_low Low end of normal range for component float 0.0
Component_nml_high High end of normal range for component float 5.2
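
Since Component_value is stored as a string and may hold categorical results, numeric use requires a tolerant conversion. The following sketch converts a value when possible and compares it with the normal range. The helper names and the "Low"/"High"/"Normal" labels are illustrative; they are not the vocabulary of the Component_abnormal column.

```python
import math


def to_float(text):
    """Return the value as a float, or None when it is missing or not numeric."""
    try:
        value = float(text)
    except (TypeError, ValueError):
        return None
    return None if math.isnan(value) else value


def range_flag(value, low, high):
    """Classify a lab value against its normal range; None if not comparable."""
    v, lo, hi = to_float(value), to_float(low), to_float(high)
    if v is None or (lo is None and hi is None):
        return None
    if lo is not None and v < lo:
        return "Low"
    if hi is not None and v > hi:
        return "High"
    return "Normal"


print(range_flag("1", "0.0", "5.2"))         # Normal
print(range_flag("6.0", "0.0", "5.2"))       # High
print(range_flag("Negative", "0.0", "5.2"))  # None
```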

Radiology

rads.csv – This table contains the results of imaging studies ordered during each ED visit.

  • Rows: 156,866
  • Columns: 5

Rads Data Elements

Column Name Description Data Type Sample Data
CSN Visit identifier (Random ID mapped to original CSN) int 99002166
Order_time Time of imaging order (Random-shift date, keeping season constant) datetime 2205-07-07T11:11:36Z
Result_time Time of imaging result (Random-shift date, keeping season constant) datetime 2291-10-02T05:06:13Z
Study Imaging study name string XR CHEST 1 VIEW
Impression Imaging result impression (Used the Stanford AIMI De-identification tool: JAMIA, HuggingFace) string “1. No acute cardiopulmonary disease.”

Numeric Monitoring Data

numerics.csv – This table contains the non-waveform monitoring data measured during each ED visit.

  • Rows: 47,821,508
  • Columns: 5

Numerics Data Elements

Column Name Description Data Type Sample Data
CSN Visit identifier (Random ID mapped to original CSN) int 99833461
Source Indicates whether a value was recorded by nursing (Chart) or derived directly from the monitoring database (Monitor) string Monitor
Measure One of 12 measurements, listed below string SpO2
- HR Heart rate, heartbeats per minute
- RR Respiratory rate, breaths per minute
- SpO2 Oxygen saturation by pulse oximetry, %
- SBP Systolic blood pressure, mmHg
- DBP Diastolic blood pressure, mmHg
- MAP Mean arterial pressure, mmHg
- Temp Body temperature, degrees Fahrenheit
- Perf Mean last-minute perfusion index derived from the PPG waveform
- Pain Self-reported pain rating, 0 (no pain) to 10 (worst pain)
- LPM_O2 Flow rate of supplemental oxygen, liters per minute
- 1min_HRV or 5min_HRV Heart rate variability over the last 1 minute or 5 minutes, calculated as the standard deviation of the beat-to-beat R-R interval (the time between successive R peaks of the ECG waveform) over this period
Value Observation value float 100.0
Time Timestamp of the measurement. When underlying observations are made more frequently than once per minute, they are aggregated to the mean value over the 60 seconds preceding the timestamp. string 2247-05-09T06:40:42Z
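
The per-minute aggregation described for the Time column can be sketched as follows. This is an illustration of the idea and not the pipeline used to build the dataset: it assumes times are given in seconds, that output timestamps fall on multiples of 60, and that each window covers T - 60 < time <= T.

```python
import math
from collections import defaultdict


def minute_means(observations):
    """Aggregate (time_in_seconds, value) pairs into one mean per minute.

    Each output timestamp T is a multiple of 60 and summarizes the
    observations with T - 60 < time <= T (window convention assumed).
    """
    buckets = defaultdict(list)
    for t, value in observations:
        end = math.ceil(t / 60) * 60  # end of the minute window containing t
        buckets[end].append(value)
    return {end: sum(vals) / len(vals) for end, vals in sorted(buckets.items())}


# Made-up heart-rate samples every 20 seconds.
obs = [(20, 80.0), (40, 82.0), (60, 84.0), (80, 90.0), (100, 92.0)]
print(minute_means(obs))  # {60: 82.0, 120: 91.0}
```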

Numerics Measure Description

Type Description
Heart Rate (HR) Number of heartbeats per minute; measured using ECG or PPG waveform
Respiratory Rate (RR) Number of breaths per minute; typically measured using Resp waveform
Oxygen Saturation by Pulse Oximetry (SpO2) Percentage of oxygen-saturated hemoglobin in blood; measured using PPG waveform
Systolic Blood Pressure (SBP) Highest pressure in arteries during heart contraction, recorded by sphygmomanometry
Diastolic Blood Pressure (DBP) Lowest pressure in arteries during heart relaxation, recorded by sphygmomanometry
Mean Arterial Pressure (MAP) Average pressure in a patient’s arteries during one cardiac cycle
Temperature in Degrees Fahrenheit (Temp) Body temperature measured in degrees Fahrenheit
Perfusion (Perf) Ratio of pulsatile to nonpulsatile peripheral blood flow, estimated using PPG waveform
Pain rating on 0-10 scale (Pain) Patient’s self-reported pain level on a scale from 0 (no pain) to 10 (worst pain)
Liters Per Minute of Supplemental Oxygen (LPM_O2) Flow rate of supplemental oxygen administered to the patient, measured in liters per minute
Heart Rate Variability over last 1 minute (1min_HRV) Variability in the time interval between heartbeats over the last minute; indicates autonomic nervous system function
Heart Rate Variability over last 5 minutes (5min_HRV) Variability in the time interval between heartbeats over the last five minutes; indicates autonomic nervous system function
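
The HRV measures are described as the standard deviation of the beat-to-beat R-R interval over the window. A minimal sketch of that calculation from R-peak times is shown below. The beat detector, the output units (milliseconds here), and the choice of sample rather than population standard deviation are all assumptions, since the dictionary does not state them.

```python
from statistics import stdev


def hrv_sd(r_peak_times_s):
    """Standard deviation of successive R-R intervals, in milliseconds.

    Uses the sample standard deviation; the dataset's exact estimator,
    units, and beat detector are not stated, so these are assumptions.
    """
    intervals_ms = [
        (later - earlier) * 1000.0
        for earlier, later in zip(r_peak_times_s, r_peak_times_s[1:])
    ]
    if len(intervals_ms) < 2:
        raise ValueError("need at least three R peaks")
    return stdev(intervals_ms)


# Made-up R-peak times in seconds: intervals of 800, 800, 900, 800 ms.
peaks = [0.0, 0.8, 1.6, 2.5, 3.3]
print(round(hrv_sd(peaks), 1))  # 50.0
```

For a 1-minute or 5-minute value, the same function would be applied to the R peaks that fall within the preceding window.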

Train-Validation-Test Splits

MC-MED contains two training/validation/test splits for general use. Both splits target 80% of visits for training and 10% each for validation and testing; the chronological split departs slightly from these proportions, as described below. In both, all visits for a given patient are restricted to a single set in order to prevent data leakage between sets.

Random Patient-level Split split_random_*.csv – The files contain the visit IDs (CSN) for the visits in each split set.
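
For readers who want to build a comparable split of their own, the sketch below shows one way to keep every visit of a patient in the same set. It is not the procedure used to generate the released files: it divides patients (not visits) 80/10/10, so visit-level proportions are only approximate, and the function name and seed are illustrative.

```python
import random


def patient_level_split(visits, seed=0, fractions=(0.8, 0.1, 0.1)):
    """Assign every visit of a patient to the same split.

    `visits` is a list of (MRN, CSN) pairs. Patients, not visits, are
    shuffled and divided, so visit-level proportions are only approximate.
    """
    patients = sorted({mrn for mrn, _ in visits})
    random.Random(seed).shuffle(patients)
    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))
    split_of = {}
    for i, mrn in enumerate(patients):
        if i < n_train:
            split_of[mrn] = "train"
        elif i < n_train + n_val:
            split_of[mrn] = "val"
        else:
            split_of[mrn] = "test"
    out = {"train": [], "val": [], "test": []}
    for mrn, csn in visits:
        out[split_of[mrn]].append(csn)
    return out


# Ten made-up patients with two visits each.
visits = [(mrn, mrn * 10 + k) for mrn in range(1, 11) for k in range(2)]
splits = patient_level_split(visits)
print({name: len(csns) for name, csns in splits.items()})
# {'train': 16, 'val': 2, 'test': 2}
```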

Chronological Split split_chrono_*.csv – All visits in the validation set occur after the final visit in the training set, and all visits in the test set occur after the final visit in the validation set. To prevent patient data leakage between sets, each patient (MRN) is again restricted to only one of the training, validation, or test sets. This results in 13,007 visits being removed from these sets and in final proportions of 78%, 11%, and 11% for the training, validation, and test sets, respectively.
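
The logic of a chronological split with a one-set-per-patient restriction can be sketched as below. The rule for which visits to drop is an assumption: here a patient stays in the set of their earliest visit, and later visits that land in another set are removed. The dictionary does not say which rule was used, and because the released timestamps are date-shifted, this illustrates the idea without offering a way to regenerate the published files.

```python
def chronological_split(visits, fractions=(0.8, 0.1, 0.1)):
    """Chronological split that keeps each patient in one set.

    `visits` is a list of (MRN, CSN, arrival) tuples with sortable arrival
    values. Visits are ordered by arrival and cut by position; a patient
    stays in the set of their first visit, and later visits that fall in
    another set are dropped (this drop rule is an assumption).
    """
    ordered = sorted(visits, key=lambda v: v[2])
    n = len(ordered)
    cut1 = round(fractions[0] * n)
    cut2 = round((fractions[0] + fractions[1]) * n)
    names = ["train"] * cut1 + ["val"] * (cut2 - cut1) + ["test"] * (n - cut2)
    home = {}  # MRN -> set of the patient's earliest visit
    out = {"train": [], "val": [], "test": [], "dropped": []}
    for (mrn, csn, _), name in zip(ordered, names):
        if home.setdefault(mrn, name) == name:
            out[name].append(csn)
        else:
            out["dropped"].append(csn)
    return out


# Made-up visits: (MRN, CSN, arrival order). Patient 2 returns late.
demo = [(1, 101, 1), (2, 102, 2), (3, 103, 3), (1, 104, 4), (4, 105, 5),
        (5, 106, 6), (6, 107, 7), (7, 108, 8), (2, 109, 9), (8, 110, 10)]
result = chronological_split(demo)
print({name: len(csns) for name, csns in result.items()})
# {'train': 8, 'val': 0, 'test': 1, 'dropped': 1}
```

Dropping visits in this way is what makes the final proportions drift from the nominal 80/10/10 targets.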


Copyright © 2021-2023 Nightingale Open Science. All rights reserved.