Three projects led by UChicago CS faculty were part of a $6.5 million round of research funding on cybersecurity and secure critical infrastructure from the C3.ai Digital Transformation Institute.
Research proposals from Professors Nick Feamster, Ben Zhao, and Heather Zheng were among the 24 projects chosen by the institute in its third round of proposals. The projects tackle important challenges and create new tools around security vulnerabilities on the internet and in machine learning systems.
“Cybersecurity is an immediate existential issue,” said Thomas M. Siebel, chairman and CEO of C3 AI, a leading enterprise AI software provider. “We are equipping top scientists with the means to advance technology to help secure critical infrastructure.”
The University of Chicago is one of 10 member institutions in the C3.ai Digital Transformation Institute, which was formed to accelerate the benefits of artificial intelligence for business, government, and society. Previous funding rounds have supported research on COVID-19 and climate and energy.
Read about the projects below and the full cohort of funded research here.
Continuously and Automatically Discovering and Remediating Internet-Facing Security Vulnerabilities
Nick Feamster (UChicago), Zakir Durumeric (Stanford), Prateek Mittal (Princeton)
The project has two themes: (1) Developing and applying fingerprinting tools and techniques to automatically generate fingerprints for known vulnerabilities and other security weaknesses; and (2) Designing, implementing, and deploying large-scale scanning techniques to uncover these vulnerabilities in a broad array of settings (such as industrial control and other cyber-physical settings). The approaches that we propose to develop extend a rich body of previous work in supervised machine learning (to detect, fingerprint, and inventory vulnerable infrastructure), unsupervised machine learning (to detect anomalous device behavior), and large-scale Internet scanning.
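As a rough illustration of the fingerprint-and-scan idea described above, the following Python sketch matches service banners against a small set of fingerprints and builds an inventory of flagged hosts. It is a minimal sketch under stated assumptions, not the project's tooling: the product names (`ExampleHTTPd`, `ExamplePLC`), the fingerprint patterns, and the scan results are all hypothetical, and the project proposes to generate fingerprints automatically rather than write them by hand as done here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Fingerprint:
    """A named pattern that flags a banner as a known weakness (hypothetical)."""
    name: str
    pattern: re.Pattern


# Hand-written, made-up fingerprints for illustration only.
FINGERPRINTS = [
    Fingerprint(
        "example-httpd-outdated",
        re.compile(r"^Server: ExampleHTTPd/1\.[0-3]\b", re.MULTILINE),
    ),
    Fingerprint(
        "example-plc-default-banner",
        re.compile(r"ExamplePLC firmware v2\.0"),
    ),
]


def match_fingerprints(banner, fingerprints=FINGERPRINTS):
    """Return the names of all fingerprints whose pattern occurs in the banner."""
    return [fp.name for fp in fingerprints if fp.pattern.search(banner)]


def inventory(scan_results, fingerprints=FINGERPRINTS):
    """Map each host to its matched fingerprints, omitting hosts with no match."""
    flagged = {}
    for host, banner in scan_results.items():
        matches = match_fingerprints(banner, fingerprints)
        if matches:
            flagged[host] = matches
    return flagged


# Hypothetical scan output keyed by documentation-range IP addresses.
scan_results = {
    "192.0.2.10": "HTTP/1.1 200 OK\nServer: ExampleHTTPd/1.2\n",
    "192.0.2.11": "HTTP/1.1 200 OK\nServer: ExampleHTTPd/1.9\n",
    "192.0.2.12": "ExamplePLC firmware v2.0 ready",
}
flagged_hosts = inventory(scan_results)
```

Here `flagged_hosts` contains the first and third hosts only, since the second banner reports a version outside the range the hypothetical fingerprint covers.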
Fundamental Limits on the Robustness of Supervised Machine Learning Algorithms
Ben Zhao (UChicago), Daniel Cullina (Penn State), Arjun Nitin Bhagoji (UChicago)
Determining fundamental bounds on robustness for machine learning algorithms is of critical importance for securing cyberinfrastructure. Machine learning is ubiquitous but prone to severe vulnerabilities, particularly at deployment. Adversarial modifications of inputs can induce misclassification, with catastrophic consequences in safety-critical systems. This team will develop a framework to obtain lower bounds on robustness for any supervised learning algorithm (classifier), when the data distribution and adversary are specified. The framework will work with a general class of distributions and adversaries, encompassing most proposed in prior work. It can be extended to get lower bounds on robustness for any pre-trained feature extractor or family of classifiers and for multiple attackers operating in tandem. Its implications for training and deploying robust models are numerous and consequential. Perhaps the most important is enabling algorithm designers to get a robustness score for either a specific classifier or a family of classifiers. For any adversary, they can compute this score as the gap to the optimal performance possible. The optimal performance is the equilibrium of a classification game between the adversary and classifiers. Robustness scores can also be determined for pre-trained feature extractors, widely used in transfer learning, enabling designers to pick robust feature extractors. Robust training can also be improved via byproducts of the framework, which enables the identification of hard points, provides optimal soft labels for use during training, and enables better architecture search for robustness by identifying model layers and hyperparameters that impact robustness.
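The robustness score described above, the gap between a classifier's adversarial error and the best error any classifier could achieve, can be illustrated on a toy problem. The sketch below is not the team's framework: it assumes one-dimensional inputs, two classes, an adversary that can shift any input by at most `eps`, and deterministic classifiers only, and it finds the optimum by brute force, which is feasible only for a handful of points. All function names and data are hypothetical.

```python
from itertools import combinations


def optimal_robust_error(points, labels, eps):
    """Smallest adversarial 0-1 error any deterministic classifier can reach.

    Two differently labeled points "conflict" when their eps-intervals
    overlap, because no classifier can be robustly correct on both. The
    optimum keeps the largest conflict-free subset of a nonempty sample.
    """
    n = len(points)
    conflicts = [
        (i, j)
        for i, j in combinations(range(n), 2)
        if labels[i] != labels[j] and abs(points[i] - points[j]) <= 2 * eps
    ]
    # Try subset sizes from largest to smallest; the empty set always works.
    for size in range(n, -1, -1):
        for subset in combinations(range(n), size):
            chosen = set(subset)
            if not any(i in chosen and j in chosen for i, j in conflicts):
                return (n - size) / n


def threshold_robust_error(threshold, points, labels, eps):
    """Adversarial error of the rule: predict 1 if x >= threshold, else 0."""
    errors = 0
    for x, y in zip(points, labels):
        if y == 1:
            robust = x - eps >= threshold  # worst case: pushed left
        else:
            robust = x + eps < threshold   # worst case: pushed right
        errors += not robust
    return errors / len(points)


def robustness_score(threshold, points, labels, eps):
    """Gap between this classifier's adversarial error and the optimum."""
    return (threshold_robust_error(threshold, points, labels, eps)
            - optimal_robust_error(points, labels, eps))


# Hypothetical sample: the points at 1.0 and 1.2 have different labels
# and lie within 2 * eps of each other, so one of them must be lost.
points = [0.0, 1.0, 1.2, 3.0]
labels = [0, 0, 1, 1]
eps = 0.25

best = optimal_robust_error(points, labels, eps)
gap_good = robustness_score(2.0, points, labels, eps)
gap_poor = robustness_score(1.1, points, labels, eps)
```

On this sample the threshold at 2.0 has a gap of zero, meaning it is as robust as any deterministic classifier can be here, while the threshold at 1.1 leaves a positive gap. The equilibrium of the classification game mentioned above may also involve randomized strategies, which this sketch does not model.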
Robust and Scalable Forensics for Deep Neural Networks
Ben Zhao (UChicago), Heather Zheng (UChicago), Bo Li (University of Illinois at Urbana-Champaign)
For external-facing systems in real-world settings, few, if any, security measures offer full protection against all attacks. In practice, digital forensics and incident response (DFIR) provide a complementary security tool that focuses on using post-attack evidence to trace a successful attack back to its root cause. Not only can forensic tools help identify (and patch) the points of vulnerability responsible for successful attacks (e.g., breached servers, unreliable data-labeling services), but they also provide a strong deterrent against future attackers through the threat of post-attack identification. This is particularly attractive for machine learning systems, where defenses are routinely broken soon after release by more powerful attacks. This team plans to build forensic tools to boost the security of deployed ML systems using post-attack analysis to identify key factors leading to a successful attack. We consider two broad types of attacks: “poison” attacks, where corrupted training data embeds misbehaviors into a model during training, and “inference-time” attacks, where an input is augmented by a model-specific adversarial perturbation. For poison attacks, we propose two complementary methods to identify the training data responsible for the misbehavior, one using selective unlearning and one using computation of the Shapley value from game theory. For inference-time attacks, we will explore the use of hidden labels to shift feature representations, making it possible to identify the source model of an adversarial example. Given promising early results, our goal is both a principled understanding of these approaches and a suite of usable software tools.
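To make the Shapley-value idea for poison attacks concrete, the sketch below attributes a model's misbehavior to individual training points. It is a minimal illustration under stated assumptions, not the team's method: a 1-nearest-neighbor rule on one-dimensional data stands in for the trained model, the data and the attacker's target input are hypothetical, and the Shapley values are computed exactly by enumerating every ordering of the training set, which is only practical for a few points.

```python
from itertools import permutations


def nearest_label(train, x):
    """Label of the training point closest to x, or None if train is empty."""
    if not train:
        return None
    return min(train, key=lambda point: abs(point[0] - x))[1]


def misbehavior(train, trigger_x, attacker_label):
    """1.0 if a model trained on `train` gives the attacker's label, else 0.0."""
    return 1.0 if nearest_label(train, trigger_x) == attacker_label else 0.0


def shapley_values(data, utility):
    """Exact Shapley value of each training point under `utility`.

    Each point's value is its average marginal contribution to the utility
    over all orderings in which the training set could be assembled.
    """
    n = len(data)
    totals = [0.0] * n
    num_orders = 0
    for order in permutations(range(n)):
        subset = []
        previous = utility(subset)
        for i in order:
            subset.append(data[i])
            current = utility(subset)
            totals[i] += current - previous
            previous = current
        num_orders += 1
    return [total / num_orders for total in totals]


# Hypothetical training set of (x, label) pairs. The last point is the
# poison: it sits next to the attacker's target input with the wrong label.
data = [(0.0, 0), (1.0, 0), (4.0, 1), (5.0, 1), (1.4, 1)]
trigger_x, attacker_label = 1.5, 1

values = shapley_values(
    data, lambda subset: misbehavior(subset, trigger_x, attacker_label)
)
suspect = max(range(len(data)), key=lambda i: values[i])
```

In this toy setting the poisoned point receives the largest Shapley value, so `suspect` is its index, and the values sum to the misbehavior of the model trained on the full set. A deployed model would need an approximation, since exact enumeration grows factorially with the size of the training set.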