Second High Risk Breast Cancer Prediction Contest Launched
February 3, 2023
Senthil Nachimuthu
Chief Medical Officer, Nightingale

Nightingale, the Association for Health Learning & Inference (AHLI), and Providence St. Joseph Health are pleased to announce the second High Risk Breast Cancer Prediction Contest, starting today, February 3, 2023. The contest will end on Wednesday, May 3, 2023. In addition to cash prizes worth $10,000, the winners will be invited to present at the ML4H conference in December 2023, co-located with NeurIPS in New Orleans. All participants will receive free compute credits.


Every year, 40 million women get a mammogram; some go on to have an invasive biopsy to better examine a concerning area. Underneath these routine tests lies a deep—and disturbing—mystery. Since the 1990s, we have found far more ‘cancers’, which has in turn prompted vastly more surgical procedures and chemotherapy. But death rates from metastatic breast cancer have hardly changed.

There is already evidence that algorithms can predict, from the biopsy image, which cancers will metastasize and harm patients. Fascinatingly, these algorithms also home in on features that humans neglect, such as the nature of the non-cancerous tissue surrounding the tumor. But to date, the datasets linking biopsy images to patient outcomes (metastasis, death) have been far smaller than what is needed to apply modern approaches.

To advance medical knowledge about identifying features of cancers that will metastasize, we launched our first machine learning contest, which asked participants to identify the cancer stage from breast cancer biopsy slides. That contest ended in January 2023; the results are published here. We are now launching the second High Risk Breast Cancer Prediction Contest, informed by what we learned from participants of the first contest as well as from other users of this dataset.

The second high risk breast cancer contest has the following updates:

  • We have selected a subset of the training data for this contest to balance the cancer stages, as well as the races and ethnicities of patients.
  • A baseline score using the CLAM pre-trained model is being published as the 'score to beat' in this contest.
  • The scorer uses AUC (area under the receiver operating characteristic curve) instead of MSE (mean squared error).
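
For readers unfamiliar with the difference between the two metrics, the change of scorer can be illustrated with a minimal sketch. It is not the contest's actual scorer: the binary labels (1 = higher stage, 0 = lower stage), the toy scores, and the function names roc_auc and mse are all assumptions made for illustration, and the way the contest computes AUC across multiple cancer stages is not described here.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Binary ROC AUC via pairwise comparison (Mann-Whitney formulation).

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mse(targets: Sequence[float], preds: Sequence[float]) -> float:
    """Mean squared error between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(targets, preds)) / len(targets)


# Hypothetical toy data (assumption): 1 = higher stage, 0 = lower stage.
labels = [0, 0, 1, 1, 0, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.20, 0.90]

# The same predictions squeezed into a narrow range; the ranking is unchanged.
squashed = [0.5 + 0.01 * s for s in scores]

auc_original = roc_auc(labels, scores)
auc_squashed = roc_auc(labels, squashed)
mse_original = mse(labels, scores)
mse_squashed = mse(labels, squashed)
```

In this sketch the two sets of predictions rank the cases identically, so their AUC is the same, while their MSE differs because MSE also penalizes how far each score sits from the label value. An AUC-based scorer therefore rewards models for ordering cases correctly rather than for the calibration of their raw outputs.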

We invite ML researchers worldwide to participate in this contest. Please share this message with others who might be interested. Click the link below to learn more about the contest and to participate.