The flourishing discipline of data science extracts new discoveries and knowledge from the rapidly growing pools of data available in science, industry, and society. To do so, data science combines expertise from computer science, statistics, and mathematics, finding new approaches and best practices for working with massive, complex, and continuously changing datasets.
The Transdisciplinary Research In Principles Of Data Science (TRIPODS) program of the National Science Foundation supports this work by bridging these three disciplines for research and training. In its first round of Phase II funding, the TRIPODS program awarded $12.5 million to the Institute for Foundations of Data Science (IFDS), a four-university collaboration among the Universities of Washington, Wisconsin-Madison, California Santa Cruz, and Chicago.
IFDS research on complex data sets will “lead to improved accuracy and decreased bias in algorithmic decision making processes, as well as methods to cope with ever-changing data that may be corrupted by noise or even malicious intent,” the NSF said in an announcement. Its work helps establish the maturing field of data science while also advancing machine learning and other artificial intelligence approaches.
University of Chicago faculty involved with the institute include Rebecca Willett, Professor of Computer Science and Statistics; Rina Foygel Barber, Professor of Statistics; and Mary Silber, Director of the Committee on Computational and Applied Mathematics (CCAM) and Professor of Statistics.
Willett was one of the founding principal investigators of IFDS when the NSF first funded it in 2017 as part of the TRIPODS Phase I program. Over its first three years, she said, the group connected scientists, disciplines, and students to research critical topics such as complexity, robustness, and ethics in data science.
“This remarkable program incorporates statistics, theoretical computer science, and mathematics into data science and machine learning in a principled way,” Willett said. “Building robustness to corrupted data, understanding how much data is necessary for various machine learning systems to perform reliably, quantifying our confidence in predictions based on data, and ensuring that data-centric algorithms achieve ethical standards are all facilitated by this institute via cross-disciplinary joint efforts.”
In addition to supporting research, IFDS activities help train the next generation of data scientists through workshops, summer schools for high school students, and research assistant programs. The expanded role of UChicago in the second phase of the institute will provide students across computer science, statistics, and mathematics with new opportunities to participate in these programs and work with leading researchers at the partner institutions.
For more coverage of the TRIPODS Phase II funding, see announcements from the NSF, the University of Washington, and the University of Wisconsin. A second data science institute, the Foundations of Data Science Institute, was also funded as part of the $25 million in NSF awards.