Kristina Toutanova (Google) – Advances and Limitations in Generalization via Self-Supervised Pretrained Representations

Date & Time:

October 18, 2021, 11:00 am – 12:00 pm (Central Time, America/Chicago)

Location:

Zoom

Advances and Limitations in Generalization via Self-Supervised Pretrained Representations

Pretrained neural representations, learned from unlabeled text, have recently led to substantial improvements across many natural language problems. Yet some components of these models are still brittle and heuristic, and a sizable amount of human-labeled data is typically needed to obtain competitive performance on end tasks.

I will first talk about recent advances from our team leading to (i) improved multilingual generalization and ease of use through tokenization-free pretrained representations and (ii) better few-shot generalization for underrepresented task categories via neural language model-based example extrapolation. I will then point to limitations of generic pretrained representations when tasked with handling both language variation and out-of-distribution compositional generalization, and discuss the relative performance of induced symbolic representations.

This talk is part of the TTIC Colloquium and will be presented on Zoom. Register here for details.

Host: Toyota Technology Institute at Chicago

Kristina Toutanova

Research Scientist, Google Research

Kristina Toutanova is a research scientist at Google Research in Seattle and an affiliate faculty member at the University of Washington. She obtained her Ph.D. from the Computer Science Department at Stanford University with Christopher Manning, and her MSc in Computer Science from Sofia University, Bulgaria. Prior to joining Google in 2017, she was a researcher at Microsoft Research, Redmond. Kristina focuses on modeling the structure of natural language using machine learning, most recently in the areas of representation learning, question answering, information retrieval, and semantic parsing. Kristina is a past co-editor-in-chief of TACL, a program co-chair for ACL 2014, and a general chair for NAACL 2021.
