Date & Time:
February 4, 2025 2:00 pm – 3:00 pm
Location:
Crerar 390, 5730 S. Ellis Ave., Chicago, IL
Title: Sadhika Malladi (Princeton), Deep Learning Theory in the Age of Generative AI
Time zone: America/Chicago

Abstract: Large neural networks, like language models (LMs), have demonstrated remarkable success in executing complex tasks, but little is understood about why these models work and how various design choices affect model behavior. Performing thorough empirical ablations to understand modern-day training paradigms is generally computationally infeasible, underscoring the need for theory-driven insights and improvements. However, traditional theoretical analysis of deep networks usually requires restrictive assumptions that are far from practical settings.

In this talk, I will present flexible yet rigorous theoretical frameworks for understanding LM pre-training and fine-tuning, along with their algorithmic implications. For fine-tuning, I propose a formal account that motivates the design of MeZO, a zeroth-order optimizer that reduces memory consumption by up to 12x while preserving performance. I will also discuss recent work exposing surprising failure modes of preference learning, a specialized form of fine-tuning used to steer LMs toward desired behaviors. In the pre-training regime, I use stochastic differential equations (SDEs) to design principled and efficient hyperparameter selection algorithms for highly distributed training settings. I will conclude by exploring promising directions for co-developing deep learning theory and practice.

Speakers

Sadhika Malladi

PhD Candidate, Princeton University

Sadhika Malladi is a final-year PhD student in Computer Science at Princeton University advised by Sanjeev Arora. Her research advances deep learning theory to capture modern-day training settings, yielding practical training improvements and meaningful insights into model behavior. She has co-organized multiple workshops, including Mathematical and Empirical Understanding of Foundation Models at ICLR 2024 and Mathematics for Modern Machine Learning (M3L) at NeurIPS 2024. She was named a 2025 Siebel Scholar.
