Date & Time:
January 24, 2022 4:30 pm – 5:30 pm
Location:
Live Stream

Event Recap

A pre-trained model is any model that is trained on broad data at scale and can be adapted (e.g., fine-tuned) to a wide range of downstream tasks. The rise of pre-trained models (e.g., BERT, GPT-3, CLIP, Codex, MAE) has transformed applications across many domains and aligns with how humans learn: humans and animals first form concepts or impressions from different data domains and modalities, and these learned concepts then help them learn specific tasks with minimal external instruction. Accordingly, we argue that a pre-trained model follows a similar procedure through the lens of deep representation learning: 1) learn a data representation that filters out irrelevant information from the training tasks; 2) transfer that representation to downstream tasks using few labeled samples and simple models.

This talk establishes theoretical understanding of pre-trained models under different settings, ranging from supervised pretraining, meta-learning, and self-supervised learning to domain adaptation and domain generalization. I will discuss the sufficient (and sometimes necessary) conditions for pre-trained models to work based on the statistical relation between training and downstream tasks. The theoretical analyses partly answer how pre-trained models work and when they fail, guide technical decisions for future work, and inspire new methods.
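The two-step procedure described above can be sketched in miniature: a frozen "pretrained" feature map, followed by a simple linear head fit on only a handful of labeled downstream examples. The sketch below is purely illustrative (a random projection stands in for a learned representation, and the toy task is constructed so the labels depend on the input only through that representation); it is not a method from the talk:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained representation: a fixed feature map
# obtained independently of the downstream task (here, just sampled).
W = rng.normal(size=(10, 4))  # maps 10-d raw inputs to 4-d features

def represent(x):
    """Step 1: map raw inputs into the frozen representation space."""
    return np.tanh(x @ W)

# Downstream task with few labeled samples. By construction, the label
# depends on x only through represent(x), so a linear head suffices.
X_train = rng.normal(size=(20, 10))  # only 20 labeled examples
beta_true = rng.normal(size=4)       # hypothetical downstream weights
y_train = represent(X_train) @ beta_true

# Step 2: fit a simple (linear) model on top of the frozen features.
Z = represent(X_train)
beta_hat, *_ = np.linalg.lstsq(Z, y_train, rcond=None)

# With a good representation, the cheap head generalizes despite
# the small labeled set: test error is near zero on this toy task.
X_test = rng.normal(size=(100, 10))
err = np.abs(represent(X_test) @ (beta_hat - beta_true)).max()
```

The point of the toy setup is the division of labor: all sample complexity of the hard part (learning the representation) is paid upstream, so the downstream task needs only enough labels to fit a low-dimensional linear model.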

Host: Rebecca Willett

Theoretical Foundations of Pretrained Models

Watch via Live Stream.

Speakers

Qi Lei

Associate Research Scholar, Princeton University

Qi Lei is an associate research scholar in the ECE department at Princeton University. She received her Ph.D. from the Oden Institute for Computational Engineering & Sciences at UT Austin. She visited the Institute for Advanced Study (IAS) at Princeton for the Theoretical Machine Learning Program from 2019 to 2020. Before that, she was a research fellow in the Simons Institute's Foundations of Deep Learning program. Her research aims to develop sample- and computationally efficient machine learning algorithms and to bridge the gap between theory and practice in machine learning. Qi has received several awards, including an Outstanding Dissertation Award, the National Initiative for Modeling and Simulation Graduate Research Fellowship, a Computing Innovation Fellowship, and a Simons-Berkeley Research Fellowship.
