Data Dictionary v1
File Tree
.
└── tamil-jpal-multi
    ├── echo
    │   ├── 00
    │   │   ├── 00ab...
    │   │   │   ├── 1.mpg
    │   ...     ...     ...
    │
    ├── ecg-12-lead
    │   └── waveform
    │       ├── 00
    │       │   ├── 00ab...npz
    │       ...     ...
    │
    ├── ecg-single-lead
    │   ├── 00
    │   │   ├── 00ab...npz
    │   ...     ...
    │
    ├── fundus
    │   ├── 00
    │   │   ├── 00ab...
    │   │   │   ├── 1.jpg
    │   ...     ...     ...
    │
    ├── xray
    │   ├── 00
    │   │   ├── 00ab...jpg
    │   ...     ...
    │
    └── v1
        ├── patient-metrics.csv
        ├── echo-filepath.csv
        ├── ecg-12-lead-filepath.csv
        ├── ecg-single-lead-filepath.csv
        ├── fundus-filepath.csv
        └── xray-filepath.csv
Participant Clinical Measurements and Survey Responses
patient-metrics.csv contains the following medical data:
- Vitals - Blood Pressure, Pulse rate, Respiratory rate, SPO2
- Physical measures - Height, Weight, Waist/Hip/Mid-arm Circumference
- Fitness measures - Endurance test, Grip test
- Eye Measures - Tonometry, Fundus observations
- Cognition Score (MMSE short-form)
- Smoking Status responses
- Blood test data
- PHQ-4 and direct loneliness responses
Column Name | Description | Data Type | Example |
---|---|---|---|
patient_ngsci_id | Unique identifier for each respondent | string | 24fc123… |
year | Year of survey | int | 2023 |
verbal_consent | Whether verbal consent was provided (Yes, No) | string | Yes |
age | Age of the respondent (value 999 represents age 90 and above) | int | 60 |
sex | Sex of the respondent (Female, Male) | string | Female |
bp | Whether blood pressure was measured (Yes, No) | string | Yes |
bp_systolic, bp_diastolic | Blood pressure in mmHg | int | 80 |
pulse | Whether pulse rate was measured (Yes, No) | string | Yes |
pulse_entry | Pulse rate in beats per min | int | 83 |
resp_rate | Whether respiratory rate was measured (Yes, No) | string | Yes |
resp_rate_entry | Respiratory rate in breaths per min | int | 22 |
spo2 | Whether SPO2 was measured (Yes, No) | string | Yes |
spo2_entry | SPO2 level in % | float | 98.0 |
rbs | Whether random blood sugar was measured (Yes, No) | string | Yes |
rbs_entry | Random blood sugar in mg/dL | float | 138.0 |
height | Whether height was measured (Yes, No) | string | Yes |
height_entry | Height in cm | float | 155.0 |
weight | Whether weight was measured (Yes, No) | string | Yes |
weight_entry | Weight in kg | float | 60.0 |
midarm_circum | Whether mid-upper arm circumference (MUAC) was measured (Yes, No) | string | Yes |
midarm_circum_entry | Mid-arm circumference in cm | float | 28.0 |
waist_circum | Whether waist circumference was measured (Yes, No) | string | Yes |
waist_circum_entry | Waist circumference in cm | float | 88.0 |
hip_circum | Whether hip circumference was measured (Yes, No) | string | Yes |
hip_circum_entry | Hip circumference in cm | float | 96.0 |
endurance_test | Whether the endurance test was conducted (Yes, No) | string | Yes |
endurance_test_entry | Endurance test result in terms of no. of unassisted stands | int | 11 |
grip_left, grip_right | Whether the grip test was conducted for the left and right hand (Yes, No) | string | Yes |
grip_left_entry, grip_right_entry | Grip test reading for left and right hand in kg | float | 16.2 |
tonometry_lefteye, tonometry_righteye | Whether tonometry was conducted for the left and right eye (Yes, No) | string | Yes |
tonometry_lefteye_entry, tonometry_righteye_entry | Tonometry reading for left and right eye in mmHg | float | 17.3 |
fundus_lefteye, fundus_righteye | Whether the fundus examination was conducted for the left and right eye (Yes, No) | string | Yes |
fundus_lefteye_obs, fundus_righteye_obs | Observations of the optometrist for left and right eye | string | … |
cognition_sf | Whether the MMSE short-form was conducted (Yes, No) | string | Yes |
cognition_sf_score | Score obtained on the MMSE short-form out of 16 | int | 13 |
cognit_impaired | Whether the MMSE short-form indicated cognitive impairment (Yes, No). A score below 8 indicates cognitive impairment. | string | Yes |
Hb | Hemoglobin in g/dl | float | 12.3 |
HbA1c | HbA1c in % | float | 6.0 |
triglycerides_mg_dl | Triglycerides in mg/dl | int | 177 |
tot_cholesterol_mg_dl | Total Cholesterol in mg/dl | int | 201 |
HDL_mg_dl | HDL Cholesterol in mg/dl | int | 42 |
LDL_mg_dl | LDL Cholesterol in mg/dl | float | 117.0 |
VLDL_mg_dl | VLDL Cholesterol in mg/dl | float | 35.4 |
totchol_by_hdl_ratio | Total Cholesterol/HDL ratio | float | 4.8 |
ldl_by_hdl_ratio | LDL/HDL ratio | float | 2.77 |
creatinine_mg_dl | Serum Creatinine in mg/dl | float | 0.9 |
literate | Whether the respondent is literate (Yes, No) | string | Yes |
smoking_1 | Do you currently smoke tobacco on a daily basis, less than daily, or not at all? (Daily, Less than daily, Not at all, Refused, Don’t know). Source: GATS | string | Not at all |
smoking_2 | Have you smoked tobacco daily in the past? (Yes, No, Refused, Don’t know). Source: GATS | string | Yes |
smoking_3 | In the past, have you smoked tobacco on a daily basis, less than daily, or not at all? (Daily, Less than daily, Not at all, Refused, Don’t know). Source: GATS | string | Not at all |
The next four fields are the responses to the 4-item Patient Health Questionnaire on anxiety and depression (Source: PHQ-4). Prompt: Over the last two weeks, how often have you been bothered by the following problems? (Not at all, Several days, More than half the days, Nearly everyday, Refused, Don't know) | | | |
phq_1 | Feeling nervous, anxious or on edge | string | Not at all |
phq_2 | Not being able to stop or control worrying | string | Not at all |
phq_3 | Feeling down, depressed or hopeless | string | Not at all |
phq_4 | Little interest or pleasure in doing things | string | Not at all |
direct_lonely | Over the last two weeks, how often have you been feeling lonely? (Not at all, Several days, More than half the days, Nearly everyday, Refused) | string | Several days |
- Rows: 4,448
- Columns: 62
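A few of the conventions in the table above are easy to miss when reading the file: `age` uses 999 as a top-code for 90 and above, each `*_entry` column is paired with a Yes/No flag column, and cognitive impairment is defined as an MMSE short-form score below 8. The following is a minimal sketch of handling these with the standard library. The helper names, the made-up sample rows, and the assumption that unmeasured entries appear as blank cells are illustrative and not confirmed by the dataset.

```python
import csv
import io

AGE_TOP_CODE = 999      # documented sentinel for "age 90 and above"
IMPAIRMENT_CUTOFF = 8   # documented rule: score < 8 means impaired


def parse_age(raw):
    """Return (age, is_90_plus); age is None when the value is top-coded."""
    age = int(raw)
    if age == AGE_TOP_CODE:
        return None, True
    return age, False


def entry_if_measured(row, flag_col, entry_col, cast=float):
    """Return the cast entry only when the paired flag column is 'Yes'."""
    # Assumption: unmeasured entries are blank strings in the CSV.
    if row.get(flag_col) != "Yes" or not row.get(entry_col):
        return None
    return cast(row[entry_col])


def is_impaired(score):
    """Apply the documented MMSE short-form cutoff."""
    return score < IMPAIRMENT_CUTOFF


# Tiny in-memory stand-in for patient-metrics.csv (made-up rows).
sample = io.StringIO(
    "patient_ngsci_id,age,spo2,spo2_entry,cognition_sf,cognition_sf_score\n"
    "aaaa,60,Yes,98.0,Yes,13\n"
    "bbbb,999,No,,Yes,7\n"
)
rows = list(csv.DictReader(sample))
ages = [parse_age(r["age"]) for r in rows]
spo2 = [entry_if_measured(r, "spo2", "spo2_entry") for r in rows]
scores = [
    entry_if_measured(r, "cognition_sf", "cognition_sf_score", int)
    for r in rows
]
```

Keeping the top-coded ages out of the numeric column avoids 999 inflating means or regression inputs, while the separate flag preserves the information that the respondent is 90 or older.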
Electrocardiograms
This dataset contains electrocardiograms from two different devices. The first device produces a 12-lead ECG with a sampling rate of 1000 Hz. The second device produces a single-lead ECG with a sampling rate of 300 Hz.
12 Lead ECG Waveforms
ecg-12-lead-filepath.csv
Column Name | Description | Data Type | Example |
---|---|---|---|
patient_ngsci_id | Unique identifier for each respondent | string | 00ab… |
filepath | Path to file | string | path/to/file |
- Rows: 3,827
- Columns: 2
The ECG filepaths have the following structure:
ecg-12-lead/waveform/{first two digits of patient_ngsci_id}/{patient_ngsci_id}.npz
The waveform data for each ECG is stored in a separate compressed NumPy file (see the numpy.load documentation). Each file contains two arrays:
- 'waveform': Raw waveform. Shape: (12, 10000)
- 'beat_waveform': Beat waveform derived from the raw waveform data. Shape: (12, various)
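As a rough illustration of the path structure and array names described above, the sketch below builds the path for one 12-lead recording and reads both arrays with `numpy.load`. The function names and the `root` argument are hypothetical, and the sketch assumes the shard directory is simply the first two characters of `patient_ngsci_id`.

```python
from pathlib import Path

import numpy as np


def ecg12_path(root, patient_ngsci_id):
    """Build the .npz path for one 12-lead ECG under a dataset root."""
    # Assumption: the shard directory is the first two characters of the ID.
    shard = patient_ngsci_id[:2]
    return (
        Path(root) / "ecg-12-lead" / "waveform" / shard
        / f"{patient_ngsci_id}.npz"
    )


def load_ecg12(root, patient_ngsci_id):
    """Return (waveform, beat_waveform) for one patient."""
    # NpzFile reads each array on access, so index inside the with-block.
    with np.load(ecg12_path(root, patient_ngsci_id)) as npz:
        return npz["waveform"], npz["beat_waveform"]
```

In practice the `filepath` column of ecg-12-lead-filepath.csv can be used directly instead of rebuilding the path; the helper only makes the directory convention explicit.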
Single Lead ECG Waveforms
ecg-single-lead-filepath.csv
Column Name | Description | Data Type | Example |
---|---|---|---|
patient_ngsci_id | Unique identifier for each respondent | string | 00ab… |
filepath | Path to file | string | path/to/file |
- Rows: 4,224
- Columns: 2
The ECG filepaths have the following structure:
ecg-single-lead/{first two digits of patient_ngsci_id}/{patient_ngsci_id}.npz
The waveform data for each ECG is stored in a separate compressed NumPy file (see the numpy.load documentation). Each file contains two arrays:
- 'raw_waveform': Raw waveform. Shape: (9000,)
- 'enhanced_waveform': Enhanced waveform derived from the raw waveform data. Shape: (9000,)
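The documented array lengths and sampling rates imply a recording duration for each device. The sketch below does that arithmetic and builds a time axis in seconds. It assumes the last array axis is time and that the stated sampling rates apply to the stored arrays; the constant and function names are illustrative.

```python
ECG_12_LEAD_HZ = 1000      # sampling rate stated for the 12-lead device
ECG_SINGLE_LEAD_HZ = 300   # sampling rate stated for the single-lead device


def duration_seconds(n_samples, sampling_rate_hz):
    """Length of a recording in seconds."""
    return n_samples / sampling_rate_hz


def time_axis(n_samples, sampling_rate_hz):
    """Timestamp in seconds of each sample, starting at 0."""
    return [i / sampling_rate_hz for i in range(n_samples)]


print(duration_seconds(10000, ECG_12_LEAD_HZ))     # 10.0
print(duration_seconds(9000, ECG_SINGLE_LEAD_HZ))  # 30.0
```

Under these assumptions the 12-lead recordings cover 10 seconds and the single-lead recordings cover 30 seconds.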
Echocardiogram Videos
echo-filepath.csv
Column Name | Description | Data Type | Example |
---|---|---|---|
patient_ngsci_id | Unique identifier for each respondent | string | 00ab… |
echo_view | TTE view shown in the video (see the list of views below) | int | 1 |
video_format | The format of the video file | string | MPEG |
filepath | Path to file | string | path/to/file |
- Rows: 52,099
- Columns: 4
The echocardiogram videos are not uniformly formatted. There are three formats: AVI, MPEG, and MP4. The filename of each video corresponds to one of 13 views, listed below.
Transthoracic Echocardiography (TTE) Views
1. PSLAX: 2D
2. PSLAX: color (for assessing aortic and mitral regurgitation)
3. PSAX at AOV level: 2D
4. PSAX at AOV level: color
5. PSAX at MV level (visualize MV orifice)
6. PSAX at MV level: color
7. PSAX at PAP level
8. Apical: Four-chamber: 2D (full visualization of atria and ventricles)
9. Apical: Four-chamber: color mitral regurgitation
10. Apical: Four-chamber: color tricuspid regurgitation
11. Apical: Five-chamber: color aortic regurgitation
12. Subcostal: 2D for interatrial septum
13. IVC: volume and IVC collapse
Acronyms: AOV, aortic valve; IVC, inferior vena cava; MV, mitral valve; PAP, papillary muscle; PSAX, parasternal short-axis view; PSLAX, parasternal long-axis view; TTE, transthoracic echocardiography; 2D, two-dimensional
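Because each video filename encodes its view, a lookup table can turn a path such as `.../1.mpg` into a readable label. The sketch below assumes that view numbers 1 to 13 follow the order of the list above, which this document implies but does not state outright; the dictionary and function names are hypothetical.

```python
from pathlib import Path

# Assumption: view numbers follow the order in which the views are listed.
TTE_VIEWS = {
    1: "PSLAX: 2D",
    2: "PSLAX: color",
    3: "PSAX at AOV level: 2D",
    4: "PSAX at AOV level: color",
    5: "PSAX at MV level",
    6: "PSAX at MV level: color",
    7: "PSAX at PAP level",
    8: "Apical: Four-chamber: 2D",
    9: "Apical: Four-chamber: color mitral regurgitation",
    10: "Apical: Four-chamber: color tricuspid regurgitation",
    11: "Apical: Five-chamber: color aortic regurgitation",
    12: "Subcostal: 2D for interatrial septum",
    13: "IVC: volume and IVC collapse",
}


def view_from_filepath(filepath):
    """Return (view_number, view_label) parsed from a video filename."""
    stem = Path(filepath).stem          # "1" for ".../1.mpg"
    if not stem.isdigit():
        return None, None               # filename does not encode a view
    number = int(stem)
    return number, TTE_VIEWS.get(number)
```

The `echo_view` column in echo-filepath.csv should carry the same number, so comparing it against the parsed filename is a cheap consistency check.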
Fundus Images
fundus-filepath.csv
Column Name | Description | Data Type | Example |
---|---|---|---|
patient_ngsci_id | Unique identifier for each respondent | string | 00ab… |
image_number | The image number | int | 1 |
filepath | Path to file | string | path/to/file |
- Rows: 18,445
- Columns: 3
Most patients have four images. The technicians were instructed to take and save four images per person in the following order: Left disc view, Left macular view, Right disc view, Right macular view.
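Since the four-image order is an instruction to technicians rather than a guarantee, it may help to label images tentatively and flag patients whose image count differs from four. The sketch below assumes `image_number` 1 to 4 follows the capture order given above, which is not confirmed here; the names and the sample rows are made up for illustration.

```python
from collections import Counter

# Assumption: image_number follows the instructed capture order.
FUNDUS_ORDER = {
    1: "Left disc view",
    2: "Left macular view",
    3: "Right disc view",
    4: "Right macular view",
}


def label_fundus(image_number):
    """Tentative view label, or None outside the expected 1-4 range."""
    return FUNDUS_ORDER.get(image_number)


def patients_without_four_images(rows):
    """IDs whose number of fundus rows differs from the expected four."""
    counts = Counter(row["patient_ngsci_id"] for row in rows)
    return sorted(pid for pid, n in counts.items() if n != 4)


# Made-up rows shaped like fundus-filepath.csv.
rows = [{"patient_ngsci_id": "aaaa", "image_number": n} for n in (1, 2, 3, 4)]
rows += [{"patient_ngsci_id": "bbbb", "image_number": n} for n in (1, 2)]
flagged = patients_without_four_images(rows)
```

Labels for flagged patients deserve extra caution, because a missing or repeated capture shifts every later image away from its expected position.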
Chest X-ray Images
xray-filepath.csv
Column Name | Description | Data Type | Example |
---|---|---|---|
patient_ngsci_id | Unique identifier for each respondent | string | 00ab… |
filepath | Path to file | string | path/to/file |
- Rows: 3,887
- Columns: 2
Each patient has at most one X-ray.
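The one-X-ray-per-patient property means xray-filepath.csv can be indexed directly by `patient_ngsci_id`. A minimal sketch, assuming a standard comma-separated file with a header row; the function name and sample rows are illustrative, and the duplicate check guards the documented property rather than trusting it.

```python
import csv
import io


def index_xrays(fileobj):
    """Map patient_ngsci_id -> filepath, failing loudly on a repeated ID."""
    index = {}
    for row in csv.DictReader(fileobj):
        pid = row["patient_ngsci_id"]
        if pid in index:
            raise ValueError(f"more than one X-ray row for {pid}")
        index[pid] = row["filepath"]
    return index


# Made-up rows shaped like xray-filepath.csv.
sample = io.StringIO(
    "patient_ngsci_id,filepath\n"
    "aaaa,path/to/aaaa.jpg\n"
    "bbbb,path/to/bbbb.jpg\n"
)
xray_index = index_xrays(sample)
```

Patients absent from the index simply have no X-ray, which is expected given that this file has fewer rows than patient-metrics.csv.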