Discover how to access the datasets through Globus. Follow the instructions here.

Welcome to Nightingale Open Science

The Datasets

Waveforms

Microscopy Images

X-ray Images

Multiple Diagnostics

What makes these datasets special

Our datasets are curated around medical mysteries (heart attack, cancer metastasis, cardiac arrest, bone aging, COVID-19) where machine learning can be transformative. We designed these datasets with four key principles in mind:

  1. The core of each dataset is a large collection of medical images: x-rays, ECG waveforms, digital pathology (and more to come). These rich, high-dimensional signals are too complex for humans to fully see or process—so machine vision can add huge value.

  2. Each image is linked to at least one ground truth outcome: data on what happened to the patient, not a doctor’s interpretation of the image. This allows researchers to build algorithms that learn from nature—not from humans.

  3. The data are diverse: we work with health systems across the US and the world, including under-resourced ones whose data aren’t usually represented in machine learning. This lets the resulting algorithms speak to the needs of diverse populations.

  4. Access is secure and ethical: all data are completely deidentified, and as an extra precaution, no download is allowed. Only non-commercial use is allowed, so the knowledge generated from the data benefits everyone.


Copyright © 2021-2025 Nightingale Open Science. All rights reserved.