Yuanhao Wang (Princeton) - Is RLHF more difficult than standard RL? A view from reductions

Date & Time:

October 25, 2023, 11:00 am – 12:00 pm (America/Chicago)

Reinforcement Learning from Human Feedback (RLHF) learns from preference signals, while standard Reinforcement Learning (RL) learns directly from reward signals. Preferences arguably contain less information than rewards, which makes preference-based RL seem more difficult. This paper proves theoretically that, for a wide range of preference models, preference-based RL can be solved directly with existing algorithms and techniques for reward-based RL, at little or no extra cost. Specifically, (1) for preferences drawn from reward-based probabilistic models, we reduce the problem to robust reward-based RL that can tolerate small errors in rewards; (2) for general, arbitrary preferences where the objective is to find the von Neumann winner, we reduce the problem to multiagent reward-based RL that finds Nash equilibria for factored Markov games under a restricted set of policies. The latter case can be further reduced to adversarial MDPs when preferences depend only on the final state. We instantiate all reward-based RL subroutines with concrete provable algorithms, and apply our theory to a large class of models, including tabular MDPs and MDPs with generic function approximation. We further provide guarantees when K-wise comparisons are available.
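
As a small illustration of the von Neumann winner mentioned in the abstract, the sketch below uses the standard definition from the dueling-bandits literature, simplified to a finite set of alternatives rather than policies in an MDP: a mixture over alternatives is a von Neumann winner if it beats every single alternative with probability at least 1/2. The names `win_probs`, `is_von_neumann_winner`, `CYCLIC`, and `UNIFORM` are hypothetical and chosen only for this example; nothing here is taken from the paper's algorithms.

```python
from typing import List

Matrix = List[List[float]]


def win_probs(mix: List[float], pref: Matrix) -> List[float]:
    """Probability that an alternative sampled from `mix` beats each fixed alternative j.

    pref[i][j] is the probability that alternative i is preferred to alternative j.
    """
    n = len(pref)
    return [sum(mix[i] * pref[i][j] for i in range(n)) for j in range(n)]


def is_von_neumann_winner(mix: List[float], pref: Matrix, tol: float = 1e-9) -> bool:
    """True if `mix` wins against every single alternative with probability >= 1/2."""
    return all(p >= 0.5 - tol for p in win_probs(mix, pref))


# Illustrative cyclic preferences (assumed numbers): 0 beats 1, 1 beats 2, 2 beats 0.
# No single alternative beats all others, so no reward function ranks them consistently.
CYCLIC: Matrix = [
    [0.5, 0.9, 0.1],
    [0.1, 0.5, 0.9],
    [0.9, 0.1, 0.5],
]
UNIFORM = [1 / 3, 1 / 3, 1 / 3]

if __name__ == "__main__":
    print(is_von_neumann_winner(UNIFORM, CYCLIC))          # uniform mixture
    print(is_von_neumann_winner([1.0, 0.0, 0.0], CYCLIC))  # always pick alternative 0
```

In this cyclic example the uniform mixture satisfies the condition while any single alternative does not, which is why the objective for arbitrary preferences is stated in terms of a randomized winner rather than a best reward-maximizing choice.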

Speakers

Yuanhao Wang

PhD Student

Yuanhao Wang is a fourth-year PhD student in the Computer Science Department at Princeton University, advised by Chi Jin. Prior to Princeton, he received his bachelor’s degree in Computer Science from the Yao Class at Tsinghua University. His research interests include reinforcement learning theory, learning in games, and minimax optimization. He received the best paper award at the ICLR 2022 Workshop on Gamification and Multiagent Solutions.
