Nightingale Open Science is a platform that connects researchers with world-class medical data. We work closely with health systems around the world to create and curate datasets of medical images linked to ground-truth labels. We carefully deidentify the data and make it available for non-profit research on our cloud infrastructure.

We focus on datasets that will help researchers make breakthroughs for unsolved medical problems.

Consider sudden cardiac death, which kills 300,000 Americans every year. Many papers have been written on factors that put people at higher risk, but even when we look back at the vast majority of these deaths, we still cannot find an identifiable cause. Or consider cancer: improved screening since the 1990s has helped us identify more small tumors, but we still haven’t been able to translate this into lower rates of late-stage diagnoses or death.

What is current clinical research missing?

We believe the key to solving these mysteries lies in the massive volumes of complex imaging data health systems produce every day: electrocardiogram waveforms, x-rays and CT scans, tissue biopsy images, and more. Today, these data are interpreted by humans, but our research is providing clues that machine learning can open up new ways of ‘seeing’ signals and patterns in the data that humans cannot.

Unfortunately, existing medical data with the potential to shed light on these patterns have historically been siloed. By making these data accessible to broad groups of interdisciplinary researchers, we can begin to surface previously unknown patterns of disease and unlock discoveries that save lives.

Our Approach

This is the vision underlying Nightingale Open Science: an open platform housing cutting-edge, deidentified medical datasets that are available to a diverse, global community of researchers.

Our goal is to foster researcher collaborations across disciplines, bringing together computer science researchers, clinicians, and economists around critical questions that will push the boundaries of medical research and spur the field of computational medicine.


What is computational medicine and why should I care about it?

Computational medicine is a new field at the intersection of medicine, statistics, and computation. But this field is being stymied by a lack of data.

Fields like computer vision and natural language processing have benefited from shared datasets, such as ImageNet for object recognition and MNIST for digit recognition, on which researchers can compete and collaborate on high-value questions and problems.

But computational researchers have no comparable datasets to answer critical questions in health and medicine. Making such datasets available is a key part of building this new field. Once researchers have the raw material they need to develop and apply new computational techniques to medicine, we expect to see leaps in understanding and capability similar to those that occurred in other fields.

What kinds of data do you work with?

We focus on high-dimensional data, such as imaging and waveforms, which are ideally suited to machine learning.

We also emphasize linking these imaging data to ground-truth outcomes: what happened to the patient’s health, not just what a doctor thought about an image. This allows researchers to develop algorithms that learn from nature—not from humans.

Who do you work with?

We work with a variety of health systems in the United States and internationally to define specific and compelling research questions. We collaboratively build a dataset around those research questions and help the institution conduct analysis and gather findings. We then feature some of the deidentified data variables on our platform for researcher collaboration and competition. 

Interested in partnering with us? Contact us here.

What clinical areas do you focus on?

We are open to focusing on a broad range of clinical areas with our partners. Here are examples of work from previous partnerships:

  • Sudden Cardiac Death
    How can we better understand and predict adverse cardiac risk states? We are working with our partners to link electrocardiogram (ECG) waveforms with clinical outcomes to help identify new, subtle patterns and markers indicative of risk for heart attack.
  • Cancer
    There is still much about cancer’s progression and staging that we do not understand. Many patients are subjected to unnecessary procedures, while others die from late-stage disease that should have been caught earlier. Linking biopsy specimen images to cancer registry data can enable us to better understand progression, therapeutic responses, and patient outcomes.
  • COVID-19
    Chest x-ray data linked to pulmonary outcomes will allow us to better understand and predict deterioration due to COVID-19 and similar respiratory illnesses.

What privacy safeguards do you provide?

All data featured on Nightingale are completely deidentified and contain no protected health information (PHI). Our platform hosts researchers inside a secure, monitored computing environment tailored for cutting-edge AI research.

All content on the Nightingale Open Science platform is strictly limited to non-commercial academic research.

Ready to get involved?