Preventing and curing cancer has long been a topic of immense research and discussion, but because of the disease's complexity, cancer remains a leading cause of death today. One rapidly growing avenue of research lies at the intersection of machine learning and medicine: what if we could use AI to detect cancer?

Spencer Ellis (left) and Steven Song (right)

Steven Song, a fourth-year M.D./Ph.D. candidate, is advised by Professor Robert Grossman, the Frederick H. Rawson Distinguished Service Professor in Medicine and Computer Science. Along with his co-first-authors Spencer Ellis and Derek Reiman, Song just published a paper pushing the boundaries of AI in medicine by predicting nonmelanoma skin cancer in a resource-limited setting. Ellis recently graduated with an M.S. from the UChicago Department of Computer Science, and Reiman is a Research Assistant Professor at the Toyota Technological Institute at Chicago (TTIC).

Their paper, “AI-assisted Diagnosis of Nonmelanoma Skin Cancer in Resource-Limited Settings,” is an interdisciplinary collaboration between computer science, computational biology, public health, and the departments of dermatology and pathology within UChicago Medicine. Song recently presented the work at the annual meeting of the American Association for Cancer Research. The data used in their study consists of skin biopsies of suspected cancers from Bangladeshi individuals.

This data was collected by UChicago’s Institute for Population and Precision Health to study prevention of and intervention for nonmelanoma skin cancer in Bangladesh, where incidence is elevated because of arsenic contamination of the drinking water. While extensive research has been done to advance machine learning for classifying and diagnosing cancer cases, much of it has taken place in the resource-plentiful environment of the lab, where computational power, data, and expertise are not limitations.

“We wanted the model to be useful, and to be feasible for further study under actual use cases, like on the ground in Bangladesh,” Song stated. “So, we came at it from this perspective of, can we develop these algorithms from a resource-constrained setting? That was the core idea of the paper that Spencer, Derek, and I had to tackle.”

For a machine-learning model to work in a resource-limited setting, it has to be adaptable and deployable without much effort or computational power. Song, Ellis, and Reiman deliberately limited themselves by avoiding computationally expensive methods that might otherwise excel at this task. Instead, they relied on existing digital pathology foundation models that had been developed on large datasets. In a zero-shot setting, without any additional model training, the authors showed that these foundation models could extract meaningful information from previously unseen data. They then used the embeddings extracted from their small dataset to fit smaller models, which can be more easily developed and deployed in a setting where both compute and data are limited.
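To make the "frozen foundation model plus small classifier" idea concrete, here is a minimal sketch in Python. It is not the authors' method: the embeddings are synthetic stand-ins for vectors that a frozen pathology foundation model would produce, the nearest-centroid classifier is just one simple example of a lightweight model, and every name in it (`CLASSES`, `synthetic_embeddings`, `fit_centroids`, `predict`) is hypothetical.

```python
import math
import random

# Hypothetical class names mirroring the four tissue categories in the article.
CLASSES = ["healthy", "basal_cell", "scc_in_situ", "invasive_scc"]
DIM = 8  # Assumed toy embedding size; real foundation-model embeddings are far larger.


def synthetic_embeddings(n_per_class, rng):
    """Create fake embeddings: Gaussian noise with each class shifted along its own axis."""
    embeddings, labels = [], []
    for index, name in enumerate(CLASSES):
        for _ in range(n_per_class):
            vec = [rng.gauss(0.0, 0.5) for _ in range(DIM)]
            vec[index] += 5.0  # class-specific offset so the classes are separable
            embeddings.append(vec)
            labels.append(name)
    return embeddings, labels


def fit_centroids(embeddings, labels):
    """Fit the lightweight model: one mean embedding (centroid) per class."""
    sums, counts = {}, {}
    for vec, label in zip(embeddings, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, vec):
    """Return the class whose centroid is closest to vec in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], vec))


rng = random.Random(0)
train_x, train_y = synthetic_embeddings(20, rng)  # small "training" set
test_x, test_y = synthetic_embeddings(5, rng)     # held-out examples

centroids = fit_centroids(train_x, train_y)
predictions = [predict(centroids, vec) for vec in test_x]
accuracy = sum(p == t for p, t in zip(predictions, test_y)) / len(test_y)
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The point of the sketch is the division of labor: the expensive model is only ever used to produce embeddings, while the part that is actually fit to the small dataset is cheap enough to train and run on modest hardware.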

“When you are using the foundation models for inference, you’re just applying the model to new data to see what representation of the data the model pulls out,” Song explained. “That requires much fewer resources than tuning the foundation model, and it is also something that could be abstracted into an online service. Two of the foundation models that we evaluated are currently hosted on cloud computing resources, so if a user has internet access, they can utilize the lightweight classification models that we have developed.”

Their models demonstrated strong discriminatory performance between healthy tissue and three different nonmelanoma skin cancer subtypes: basal cell carcinoma, squamous cell carcinoma in situ, and invasive squamous cell carcinoma. While all three foundation models evaluated could be used for classification, they found that one model, PRISM, was better at the challenging task of distinguishing between two highly related skin cancer subtypes: squamous cell carcinoma in situ and invasive squamous cell carcinoma. Their analysis revealed that PRISM in particular examines the entire tissue slide rather than a focused region, a necessary step for distinguishing in situ from invasive pathology.

Overall, Song believes that this research is important because of its feasibility and ease of implementation, and he hopes that access to these models may be a step towards improving the health of underserved populations.

“This research is an interdisciplinary collaboration between computer science and medicine,” Song reflected. “I think that UChicago is such a great place to be, where experts across domains want to work together and make all this research possible.”

As for next steps, active research is under way to build an optimized machine learning model that combines the individual strengths of different foundation models. Song highlights the importance of investigating the data collection process as well: creating digital tissue slides involves surgically removing part or all of the tumor, preparing the biopsy, mounting it on glass slides, and imaging the slides. This process is not readily available to many underserved communities. Song is also currently working on several other projects at the intersection of machine learning and biomedicine, involving collaborations with the emergency, trauma, and critical care medicine departments.

To learn more about what Song and his colleagues are researching at the intersection of medicine and computer science, please visit the Center for Translational Data Science’s website.
