NeurIPS 2021 Workshop

Important Dates:

  • Submission Deadline: Monday, September 13, 11:59pm Anywhere on Earth (AoE)
  • Paper acceptance notifications: Wednesday, October 22, 11:59pm AoE
  • Camera ready deadline: Sunday, November 21, 11:59pm AoE
  • Workshop: Tuesday, December 14, Timezone TBD


Our workshop seeks research on critical medical questions that leverages ground-truth outcomes and the patient experience. Along with featured research, the workshop will launch a new data platform called Nightingale Open Science that provides researchers with massive new imaging datasets linked to ground-truth outcomes, curated around some of the biggest unsolved medical problems of our time.

Learning from ground truth
Today, algorithms routinely achieve ‘expert-level’ performance interpreting medical imaging. This is a blessing and a curse: by reproducing doctors’ judgments, algorithms also reproduce their limitations and biases. This workshop will highlight research that goes beyond the clinician interpretation to focus on hard outcomes and the patient experience. Machine Learning from Ground Truth solicits research submissions highlighting (1) novel datasets that release ground-truth patient data targeted towards any of the unsolved medical problems listed below, (2) machine learning applications or algorithms that similarly focus on these unsolved medical problems and leverage ground-truth patient data and labels (i.e., data and labels that are not subject to clinician interpretation), or (3) review articles, commentary, or descriptions of additional problems in medical datasets that motivate or caution against the use of ground-truth data over clinician-interpreted findings. Below, we offer examples of the kinds of submissions we encourage and discourage.

In parallel to featuring selected research submissions, the workshop will launch a new not-for-profit data platform, Nightingale Open Science, that provides researchers with new imaging and high-dimensional datasets linked to ground truth outcomes. These datasets, curated around unsolved medical problems like sudden cardiac death, cancer, pain, and aging, will raise a set of new, fascinating research questions and technical challenges: e.g., how to train algorithms on a set of noisy, potentially missing, selection-biased, and mis-measured labels; how to transfer learnings across geography and time; and how to develop new model interpretability methods that tie predictive features (e.g., in an ECG waveform) to underlying physiology (e.g., disturbances in cardiac conduction). 

Examples of Clinical Areas of Interest:

Please note that these are examples and not prescriptive -- submissions can include a range of critical clinical questions/problems.

  • Cancer metastasis. Each year, the United States performs 2 million breast biopsies, at a cost of $4 billion, to find out who needs treatment for early-stage breast cancer, the stage at which success rates are highest. But 98.6% of biopsies come back negative, and breast cancer remains the second leading cause of cancer death among women. Linking cancer biopsy slides to cancer registry data and Social Security mortality data would allow for the study of outliers: low-grade biopsies that progress, metastasize, or kill, and high-grade biopsies that do not. This work can yield new insights into which cancers will spread and which can be left alone, and also into tumor biology, by identifying potentially new features of the image (e.g., stromal tissue) linked to prognosis.
  • Sudden cardiac death. This year, 350,000 people will suddenly drop dead, the vast majority with no apparent warning signs, even in retrospect. If clinicians knew who was at high risk, based on electrophysiological signals in their ECG waveform, they could better understand why. More importantly, they could direct implantable cardiac defibrillators to the patients who need them: the majority of the 100,000 implanted every year in the United States, at a cost of $35,000 each, either never fire or misfire. Linking electrocardiogram (ECG) waveforms to death certificates and electronic health records will allow the study and possible prediction of sudden cardiac death.
  • Unexplained pain. The opioid epidemic has shown how widespread pain is, but what causes it? Decades of research have shown poor correlation between imaging findings and patients’ reports of pain (e.g., MRI findings and low back pain). Linking lower extremity x-rays (hips, legs) to patient reports of pain and to downstream outcomes (fractures, joint replacements, etc.) can enable researchers and clinicians to better understand the organic drivers of pain.
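
Each example above relies on linking a clinical record to a later outcome. As a rough illustration only, the biopsy case might be sketched as follows. The record types, field names, and grade values are hypothetical and do not describe any actual dataset or the Nightingale Open Science schema.

```python
from dataclasses import dataclass


@dataclass
class Biopsy:
    patient_id: str
    clinician_grade: str  # hypothetical field: "low" or "high", as read by a pathologist


@dataclass
class RegistryRecord:
    patient_id: str
    metastasized: bool  # hypothetical ground-truth outcome from a registry


def link_outcomes(biopsies, registry):
    """Attach a ground-truth outcome to each biopsy (None if no registry match)."""
    by_patient = {r.patient_id: r for r in registry}
    linked = []
    for b in biopsies:
        rec = by_patient.get(b.patient_id)
        outcome = rec.metastasized if rec is not None else None
        linked.append((b, outcome))
    return linked


def find_outliers(linked):
    """Low-grade biopsies that metastasized, and high-grade ones that did not."""
    return [
        b.patient_id
        for b, outcome in linked
        if outcome is not None
        and (
            (b.clinician_grade == "low" and outcome)
            or (b.clinician_grade == "high" and not outcome)
        )
    ]


# Toy data, invented purely for illustration.
biopsies = [
    Biopsy("p1", "low"),
    Biopsy("p2", "high"),
    Biopsy("p3", "low"),
    Biopsy("p4", "high"),  # no registry match: outcome unknown
    Biopsy("p5", "high"),
]
registry = [
    RegistryRecord("p1", True),
    RegistryRecord("p2", True),
    RegistryRecord("p3", False),
    RegistryRecord("p5", False),
]

linked = link_outcomes(biopsies, registry)
outliers = find_outliers(linked)
```

Unmatched records are kept with an unknown outcome rather than dropped, since missing and selection-biased labels are among the technical challenges these datasets raise.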

Examples of Encouraged and Discouraged Submissions:

Human radiologists may overlook causes of pain in disadvantaged groups. An algorithm could be developed to identify new signals of pain that are missed by physicians for various reasons, but how should the algorithm be trained?

The typical approach to this research question is to train the algorithm on human radiologists’ interpretations, as exemplified by the paper Automatic Knee Osteoarthritis Diagnosis from Plain Radiographs: A Deep Learning-Based Approach. This is an example of a research submission we discourage for the workshop. While certainly compelling, this example veers away from the focus of our workshop because the data are based entirely on clinician diagnoses rather than ground truth: the signals the algorithm unearths are limited to clinician knowledge and subject to clinician bias.

Rather than using the physician interpretation, the algorithm can instead be trained using hard outcomes and the patient experience (the patient-reported pain score) to unearth new patterns of pain that our current medical knowledge may not consider. This is an example of a research submission we encourage, and it is well captured in An algorithmic approach to reducing unexplained pain disparities in underserved populations, which squarely meets the workshop’s objectives by focusing on ground truth over the clinician interpretation. In this case, by training the algorithm to learn from the patient, the research team showed that radiologists indeed miss important causes of knee pain, particularly in black patients. The algorithm accounted for far more of all patients’ pain than radiologists’ measures did, and it was particularly good at explaining pain in black patients. Most strikingly, the algorithm could not be reduced to the elements in the radiologist’s detailed report: it had extracted rich new signal for pain, opening the door to future research.
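
The contrast between the two training targets can be illustrated with a small synthetic sketch. This is not the method of either paper: the data are simulated, the model is ordinary least squares, and every variable name is an assumption made for illustration. The simulation assumes pain depends on one image feature that clinicians grade and one that they miss.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Two simulated image features: one that clinicians grade, one that they miss.
graded = rng.normal(size=n)
missed = rng.normal(size=n)
X = np.column_stack([graded, missed, np.ones(n)])  # last column is an intercept

# Assumed data-generating process (purely illustrative).
clinician_grade = graded + 0.3 * rng.normal(size=n)
patient_pain = graded + missed + 0.5 * rng.normal(size=n)


def fit_predict(X_train, y_train, X_test):
    """Fit ordinary least squares on the training split, predict on the test split."""
    coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    return X_test @ coef


def r_squared(y_true, y_pred):
    """Share of variance in y_true accounted for by y_pred."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


train, test = slice(0, 1500), slice(1500, n)

# Same features, two different training labels.
pred_from_clinician_labels = fit_predict(X[train], clinician_grade[train], X[test])
pred_from_patient_labels = fit_predict(X[train], patient_pain[train], X[test])

# Both models are scored against what the patient actually reports.
r2_clinician = r_squared(patient_pain[test], pred_from_clinician_labels)
r2_patient = r_squared(patient_pain[test], pred_from_patient_labels)
```

Under these assumptions, the model trained on clinician grades can only recover the graded feature, so it explains less of the reported pain than the model trained on the patient's own report, which can also pick up the missed feature.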

Submission Guidelines:

We are participating in the ML4H Virtual Symposium shared submission system program. Under this program, to submit a work to our workshop, you should submit your work to the ML4H extended abstract track and in that submission form, indicate that you would like your work to be considered for the Machine Learning from Ground Truth workshop as well. Your work, reviews, rebuttal, and meta-review will then be forwarded to our decision committee for final decisions after the ML4H review process has concluded.

Please follow the style guidelines and instructions listed on the ML4H Call for Participation.
Note that decisions for ML4H will be made fully separately from decisions for our workshop, but your reviews and meta-reviews will be shared. We will contact authors directly when decisions are made.
Some of the accepted contributions will be invited to give a spotlight presentation. Accepted submissions will be linked from the workshop website, along with supplementary material and code. No formal proceedings will be provided.

Submission link:

Grants and Scholarship Opportunities:

One of our team’s central goals is to enable junior researchers to access meaningful data more easily. We will offer workshop attendee scholarships for researchers from underrepresented groups and institutions. Given the remote nature of the conference, we will also offer scholarships for wireless internet, as well as compute resources to conduct research. The workshop will feature unconference-style networking sessions, scheduled throughout the day, for junior researchers who want to explore new research areas. Details on these grants and scholarships will be provided to finalists.