Date & Time:
December 6, 2019 10:30 am – 11:30 am
Location:
TTIC 526, 6045 S. Kenwood Ave., Chicago, IL
Zhaoran Wang (Northwestern) – Computational and Statistical Efficiency of Policy Optimization
UChicago CS/Toyota Technological Institute of Chicago Machine Learning Seminar Series

On the Computational and Statistical Efficiency of Policy Optimization in (Deep) Reinforcement Learning

Coupled with powerful function approximators such as neural networks, policy optimization plays a key role in the tremendous empirical successes of deep reinforcement learning. In sharp contrast, the theoretical understanding of policy optimization remains rather limited from both the computational and statistical perspectives. From the perspective of computational efficiency, it remains unclear whether policy optimization converges to the globally optimal policy in a finite number of iterations, even given infinite data. From the perspective of statistical efficiency, it remains unclear how to attain the globally optimal policy with a finite regret or sample complexity.

To address the computational question, I will show that, under suitable conditions, natural policy gradient/proximal policy optimization/trust-region policy optimization (NPG/PPO/TRPO) converges to the globally optimal policy at a sublinear rate, even when it is coupled with neural networks. To address the statistical question, I will present an optimistic variant of NPG/PPO/TRPO, namely OPPO, which incorporates exploration in a principled manner and attains a √T-regret.

(Joint work with Qi Cai, Chi Jin, Jason Lee, Boyi Liu, Zhuoran Yang) 

Host: Mladen Kolar

Zhaoran Wang

Assistant Professor of Industrial Engineering and Management Sciences, Northwestern University

Zhaoran Wang is an assistant professor at Northwestern University, working at the interface of machine learning, statistics, and optimization. He is the recipient of the AISTATS (Artificial Intelligence and Statistics Conference) notable paper award, ASA (American Statistical Association) best student paper in statistical learning and data mining, INFORMS (Institute for Operations Research and the Management Sciences) best student paper finalist in data mining, and the Microsoft fellowship.
