Date & Time:
March 8, 2021 2:00 pm – 3:00 pm
Location:
Zoom
03/08/2021 02:00 PM 03/08/2021 03:00 PM America/Chicago Zhuoran Yang (Princeton) – Demystifying (Deep) Reinforcement Learning: The Pessimist, The Optimist, and Their Provable Efficiency Department of Statistics Zoom

Demystifying (Deep) Reinforcement Learning: The Pessimist, The Optimist, and Their Provable Efficiency

For Zoom information, join the Department of Statistics seminar listserv.

Coupled with powerful function approximators such as deep neural networks, reinforcement learning (RL) achieves tremendous empirical successes. However, its theoretical understandings lag behind. In particular, it remains unclear how to provably attain the optimal policy with a finite regret or sample complexity. In this talk, we will present the two sides of the same coin, which demonstrates an intriguing duality between pessimism and optimism.

– In the offline setting, we aim to learn the optimal policy based on a dataset collected a priori. Due to a lack of active interactions with the environment, we suffer from the insufficient coverage of the dataset. To maximally exploit the dataset, we propose a pessimistic least-squares value iteration algorithm, which achieves a minimax-optimal sample complexity.

– In the online setting, we aim to learn the optimal policy by actively interacting with an environment. To strike a balance between exploration and exploitation, we propose an optimistic least-squares value iteration algorithm, which achieves a sqrt{T} regret in the presence of linear, kernel, and neural function approximators.

Zhuoran Yang

PhD Student, Princeton University

Zhuoran Yang is a final-year Ph.D. student in the Department of Operations Research and Financial Engineering at Princeton University, advised by Professor Jianqing Fan and Professor Han Liu. Before attending Princeton, He obtained a Bachelor of Mathematics degree from Tsinghua University. His research interests lie in the interface between machine learning, statistics, and optimization. The primary goal of his research is to design a new generation of machine learning algorithms for large-scale and multi-agent decision-making problems, with both statistical and computational guarantees. Besides, he is also interested in the application of learning-based decision-making algorithms to real-world problems that arise in robotics, personalized medicine, and computational social science. 

Related News & Events

Students posing at competition
No Name

UChicago Undergrad Team Places Second Overall In Regionals For World’s Largest Programming Competition

Mar 17, 2023
Haifeng Xu
No Name

New CS and DSI Faculty Haifeng Xu Brings Strategic Intelligence to NeurIPS 2022

Nov 28, 2022
No Name

UChicago CS Research Finds New Angle on Database Query Processing with Geometry

Nov 08, 2022
No Name

Asst. Prof. Aloni Cohen Receives Award For Revealing Flaws in Deidentifying Data

Sep 09, 2022
No Name

UChicago Hosts NSF Workshop on Frontiers of Quantum Advantage

Aug 15, 2022
No Name

New 2022-23 Faculty Add Expertise in Linguistics, Visualization, Economics, and Data Science Education

Aug 11, 2022
No Name

UChicago Co-Leads $10 Million NSF Institute on Foundations of Data Science

Aug 09, 2022
No Name

Bill Fefferman Comments on New Standards for Quantum-Proof Cryptography

Jul 07, 2022
No Name

UChicago London Colloquium Features Data Science, Quantum Research

Jul 01, 2022
No Name

Faculty Bill Fefferman and Chenhao Tan Receive Google Research Scholar Awards

Jun 21, 2022
No Name

First-Year PhD Student Co-Authors Outstanding Paper Award Winner at TQC 2022

Apr 28, 2022
No Name

Quanta Magazine Features Prof. Bill Fefferman’s Work on Quantum Algorithms

Jan 20, 2022
arrow-down-largearrow-left-largearrow-right-large-greyarrow-right-large-yellowarrow-right-largearrow-right-smallbutton-arrowclosedocumentfacebookfacet-arrow-down-whitefacet-arrow-downPage 1CheckedCheckedicon-apple-t5backgroundLayer 1icon-google-t5icon-office365-t5icon-outlook-t5backgroundLayer 1icon-outlookcom-t5backgroundLayer 1icon-yahoo-t5backgroundLayer 1internal-yellowinternalintranetlinkedinlinkoutpauseplaypresentationsearch-bluesearchshareslider-arrow-nextslider-arrow-prevtwittervideoyoutube