Learning When Objectives Are Hard to Specify
Real-world deployment of learning systems that interact with humans requires aligning what these systems optimize for with the underlying human objectives and values. A major hurdle has been that it is hard for humans to precisely specify what it means to do the desired task well.
In the first part, we take a multi-criteria viewpoint on these underlying objectives and develop a framework for selecting value-aligned models when the data comprises pairwise comparisons across multiple different criteria. In the second part, I will present our work on understanding the consequences of over-optimizing a misspecified objective. We find evidence of phase transitions that could pose challenges to the safe deployment of such learning systems.
Host: Nati Srebro
I am a graduate student in the Computer Science department at UC Berkeley, where I am fortunate to be advised by Peter Bartlett and Anca Dragan. I am interested in problems at the intersection of statistics, optimization, and machine learning, in particular in aligning the objectives of ML models with those of humans.
Before coming to Berkeley, I spent two wonderful years at Microsoft Research Bangalore working with Prateek Jain and Manik Varma. I completed my undergraduate studies at IIT Delhi with a major in Computer Science, where I was advised by Parag Singla for my undergraduate thesis.