New CS/Stats Prof. Rebecca Willett Explores Foundations, Applications of Data Science

In a world increasingly governed by data, it is critical that data be applied properly. As data-driven methods gain popularity in public policy, business operations, hiring, basic science, medical decisions, and virtually every other aspect of our lives, people need to understand the foundations of data science and apply them appropriately. If not, bias, spurious correlations, and other statistical landmines can distort results, with potentially grave human and societal consequences.

Joint Computer Science and Statistics Professor Rebecca Willett helps neuroscientists, physicians, astronomers, climate researchers, and even farmers avoid these missteps and maximize the discovery potential of data. Through a combination of fundamental research and diverse interdisciplinary collaboration, Willett has extended the practice of data science into new fields and toward deeper insights. After faculty positions at Duke University and the University of Wisconsin, she joined UChicago in summer 2018 both to continue her work and to help build new data science research and education initiatives.

“I was really excited that the university was investing in data science in a large scale way,” Willett said. “I thought there was a lot of opportunity for growth and building and I was enthusiastic to play a central role in those efforts.”

Shortly after her arrival at UChicago, Willett led a multi-university team awarded an NSF grant to find new “El Niño”-like weather patterns, applying data science to growing quantities of climate measurements to improve seasonal forecasting. The project is emblematic of her research portfolio, in which her group develops new fundamental methodology and theory inspired by the challenges of domain scientists facing difficult, data-intensive questions, including image analysis, signal processing, and the use of machine learning for prediction and optimization.

For example, she recently helped a neuroscience group develop algorithms to segment and parameterize images of neural tissue, in order to test a new method for controlling the growth of stem cells. In another project, she helped develop the image processing methods within a smartphone app farmers can use to quickly measure corn kernels for dairy cow feed and make critical, on-the-fly decisions about harvesting methods to improve cow nutrition.

Other projects help researchers deal with high-dimensional data, where there is an abundance of features associated with each data point. For example, data science methods help avoid false conclusions from linking medical and genetic data, where the sheer scale of possible connections can create misleading correlations.

“A pervasive theme is how to draw reliable conclusions from data,” Willett said, “especially when data are high-dimensional. For instance, we record vast quantities of data about each patient’s health history, including test results, treatments, demographic information, family history, imaging data, genetic information, and physician notes. Such large numbers of features makes it difficult to tease out risk factors for health conditions that were previously unrecognized. It becomes even more challenging as we strive to ensure methods are robust to errors in health records, lab tests not conducted, or treatments that were untried.”

“In general, mitigating the challenges associated with high-dimensional data is a key research thrust in data science, and relies upon developing novel geometric representations of data and incorporating physical models as much as possible.”

Willett’s co-appointment reflects the combination of skills needed to address these questions in a practical way. While many of the methods to analyze data are steeped in statistics, computer science helps her understand how effective methods can be computed in reasonable time, and what could potentially go wrong.

“For some projects, I have developed novel software and tools that practitioners or researchers in other fields can use on their data,” Willett said. “For others, I have examined methods that are already in use and developed theory to better characterize whether these methods are reasonable, whether there are some pitfalls we should be aware of, and where there may be room for improvement.”

That philosophy aligns with UChicago initiatives to launch new programming and collaborations in data science that combine efforts from the Departments of Statistics and Computer Science, including a course debuting this fall co-taught by department chairs Dan Nicolae and Michael Franklin.

“The fact that CS and Stats work so well together and the unified vision of the two departments means a lot,” Willett said. “I think that's going to allow us to establish ourselves as a world-class machine learning and data science group.”
