Samory Kpotufe (Columbia) – Some Recent Insights on Transfer Learning

Date & Time:

March 13, 2020 10:30 am – 11:30 am

Location:

Crerar 390, 5730 S. Ellis Ave., Chicago, IL

Some Recent Insights on Transfer Learning

A common situation in Machine Learning is one where training data is not fully representative of a target population due to bias in the sampling mechanism or high costs in sampling the target population; in such situations, we aim to ‘transfer’ relevant information from the training data (a.k.a. source data) to the target application. How much information is in the source data? How much target data should we collect, if any? These are all practical questions that depend crucially on ‘how far’ the source domain is from the target. However, how to properly measure ‘distance’ between source and target domains remains largely unclear.

In this talk we will argue that many of the traditional notions of ‘distance’ (e.g. KL-divergence, extensions of TV such as D_A discrepancy, density-ratios, Wasserstein distance) can yield an over-pessimistic picture of transferability. Instead, we show that some new notions of ‘relative dimension’ between source and target (which we simply term ‘transfer-exponents’) capture a continuum from easy to hard transfer. Transfer-exponents uncover a rich set of situations where transfer is possible even at fast rates, help answer questions such as the benefit of unlabeled or labeled target data, yield a sense of optimal vs suboptimal transfer heuristics, and have interesting implications for related problems such as multi-task learning.

Finally, transfer-exponents provide guidance as to *how* to efficiently sample target data so as to guarantee improvement over source data alone. We illustrate these new insights through various simulations on controlled data, and on the popular CIFAR-10 image dataset.

The talk is based on work with Guillaume Martinet, and ongoing work with Steve Hanneke.

Host: Eric Jonas

Samory Kpotufe

Associate Professor, Columbia University

I graduated (Sept 2010) in Computer Science at the University of California, San Diego, advised by Sanjoy Dasgupta. I was then a researcher at the Max Planck Institute for Intelligent Systems. At the MPI I worked in the department of Bernhard Schoelkopf, in the learning theory group of Ulrike von Luxburg. Following this, I spent a couple of years as an Assistant Research Professor at the Toyota Technological Institute at Chicago. I then spent 4 years at ORFE, Princeton University as an Assistant Professor.
