Breast Cancer Metadata Collection

This study is the first to apply deep learning to analyze tumor-associated stroma, discovering novel diagnostic biomarkers for breast cancer.

Summary of Research

Histological assessments of biopsies form the basis of breast cancer diagnosis. Assessments by human pathologists are inherently subjective and misdiagnose 28% of cases, resulting in 40,000 American deaths annually. Moreover, the clinical standard for diagnosis has scarcely changed since 1928 and examines only epithelial cells. Breast biopsies have a 9.1% false-negative rate due to non-representative epithelial samples; hence, analyzing the surrounding tumor-associated stroma would make diagnoses robust to these sampling errors. Recent machine learning advances have made it possible to evaluate stroma morphology, creating new opportunities for more reliable, data-driven diagnostic methods.

This study is the first to apply deep learning to analyze tumor-associated stroma, discovering novel diagnostic biomarkers for breast cancer that are robust to sampling errors. With these biomarkers, I developed the first standalone stroma-based model that objectively diagnoses invasive ductal carcinomas. Using NCI data from 1057 women, I trained four state-of-the-art convolutional neural networks to classify images as either tumor-associated stroma or normal stroma. Each model underwent numerous iterations of hard negative mining and regularization to combat overfitting, resulting in an AUC score of 0.935. With stroma classifications for entire histopathology images, I implemented and optimized four different machine learning algorithms to generate patient diagnoses: logistic regression, support vector machines, artificial neural networks, and random forests.
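The hard negative mining step mentioned above can be illustrated with a minimal sketch. This is not the study's actual code: the function name `mine_hard_negatives`, the `threshold` and `max_keep` parameters, and the toy scoring function are all hypothetical, and patches are represented here as plain lists of numbers rather than images. The idea is simply to find normal-stroma patches that the current model wrongly scores as tumor-associated, so they can be added back into the next round of training.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in: a "patch" is just a short feature vector here,
# not a real histopathology image tile.
Patch = Sequence[float]


def mine_hard_negatives(
    score: Callable[[Patch], float],
    negatives: List[Patch],
    threshold: float = 0.5,
    max_keep: int = 100,
) -> List[Patch]:
    """Return the normal-stroma patches the model most confidently gets wrong.

    `score` is assumed to return the model's estimated probability that a
    patch is tumor-associated stroma. Every patch in `negatives` is known
    to be normal stroma, so any score at or above `threshold` is a false
    positive. The highest-scoring false positives come first.
    """
    scored = [(score(patch), patch) for patch in negatives]
    false_positives = [(s, p) for s, p in scored if s >= threshold]
    false_positives.sort(key=lambda pair: pair[0], reverse=True)
    return [patch for _, patch in false_positives[:max_keep]]


def toy_score(patch: Patch) -> float:
    """Toy model for illustration only: the mean of the patch values."""
    return sum(patch) / len(patch)


normal_patches = [[0.1, 0.2], [0.9, 0.8], [0.6, 0.5], [0.3, 0.1]]
hard = mine_hard_negatives(toy_score, normal_patches, threshold=0.5, max_keep=2)
print(hard)
```

In a real training loop, the returned patches would be appended to the training set (typically with the label "normal stroma") before the network is retrained, and the cycle repeated for several iterations.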

The random forest model achieved the highest diagnostic accuracy, attaining an AUC score of 0.921, a significant improvement over Ojansivu et al.’s (2013) AUC score of 0.84. These findings showcase the largely untapped clinical value of stroma-based biomarkers and their potential for guiding treatment decisions.
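For readers unfamiliar with the AUC scores quoted above, the following sketch shows one standard way the metric can be computed: as the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name `roc_auc` and the labels and scores below are made up for illustration and are not data from the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for a positive case (e.g. carcinoma) and 0 for a
    negative case; `scores` holds the model's predicted score for each
    case. The result equals the probability that a randomly chosen
    positive case outscores a randomly chosen negative case.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Illustrative values only: three positive and three negative cases.
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(example_labels, example_scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation. The pairwise loop is quadratic in the number of cases, so a rank-based implementation would be preferable for large datasets.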
