Algorithms play a large and ever-increasing role in daily life. Behind the scenes of casual activities such as hailing a rideshare, buying an item online, or deciding what movie to watch, machine learning algorithms set prices and offer recommendations. Models make major decisions every second in digital platforms, finance, security, medicine, and other industries. But at some point, these computational “agents” must interact with the intelligence of humans, leaving them open to manipulation and inefficiency.
Haifeng Xu, a new assistant professor in the Department of Computer Science and the Data Science Institute, wants to design better algorithms that strategically work with humans on the most challenging problems. His research has taken him into subjects such as the economics of data and machine learning, information design, and theoretical questions around uncertainty and complexity. Three co-authored papers accepted to this year’s edition of the prestigious Conference on Neural Information Processing Systems (NeurIPS) represent the breadth of this work, as Xu establishes his SIGMA research lab at UChicago.
The three papers describe new approaches for optimizing advertisers’ bidding strategy, determining the value of data used in machine learning models, and reverse engineering the motivations of economic actors in an adversarial struggle. Together, the findings connect Xu’s earlier work at the University of Virginia, Harvard, and USC on information design to his current interests in the economics of data.
“Information design is about how to use information in a strategic way that can improve a system’s total welfare,” Xu said. “If you think about a machine learning algorithm, it’s really trying to distill information from data. But when you have this information, how should you use it? How can you disseminate it to the people who really need it? One solution is to design a market for your information, for data or for learning algorithms.”
In “CS-Shapley: Class-wise Shapley Values for Data Valuation in Classification,” Xu and co-authors Stephanie Schoch and Yangfeng Ji explore one important question in creating these markets: what is the value of data to a machine learning model? Researchers often look at how the data improves the overall accuracy in a model that, for example, classifies photos of animals. Xu and colleagues developed a new approach looking at the data’s “in-class” influence – how much a particular cat photo improves the ability to identify cats, rather than how it is used for dogs or other animals. The approach they created not only provided a new method of valuing data, it actually improved the performance of these models.
“What really surprised us was when we applied it to real experiments, and it worked better,” Xu said. “We were able to significantly improve most of the benchmarks you find in the literature by giving a better evaluation about the value of the data.”
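To make the idea of “in-class” influence concrete, here is a minimal sketch of Shapley-style data valuation in which the utility of a training subset is its accuracy on test points of one class only. It is a toy illustration under stated assumptions, not the CS-Shapley estimator from the paper: the data, the one-nearest-neighbor classifier, and the names `in_class_accuracy` and `shapley_values` are all hypothetical.

```python
from itertools import permutations
from math import factorial

# Hypothetical toy data: (feature, label) pairs, not taken from the paper.
TRAIN = [(0.0, "cat"), (0.2, "cat"), (0.35, "dog"), (2.0, "dog")]
TEST = [(0.1, "cat"), (0.3, "cat"), (1.8, "dog"), (2.2, "dog")]


def in_class_accuracy(subset, target):
    """Accuracy of a 1-nearest-neighbor classifier on test points of class `target`."""
    if not subset:
        return 0.0  # assumption: an empty training set has zero utility
    test_points = [x for x, y in TEST if y == target]
    correct = 0
    for x in test_points:
        nearest = min(subset, key=lambda point: abs(point[0] - x))
        correct += nearest[1] == target
    return correct / len(test_points)


def shapley_values(target):
    """Exact Shapley value of each training point: its average marginal gain over all orderings."""
    n = len(TRAIN)
    values = [0.0] * n
    for order in permutations(range(n)):
        chosen = []
        previous = in_class_accuracy(chosen, target)
        for i in order:
            chosen.append(TRAIN[i])
            current = in_class_accuracy(chosen, target)
            values[i] += current - previous
            previous = current
    return [v / factorial(n) for v in values]


cat_values = shapley_values("cat")
dog_values = shapley_values("dog")
print("value for the cat class:", cat_values)
print("value for the dog class:", dog_values)
```

In this toy setup, the dog photo that sits close to the cats (feature 0.35) receives a negative value for the cat class and a positive value for the dog class, which is the kind of class-specific distinction a single overall accuracy score would blur. Enumerating every ordering is only feasible for a handful of points; larger datasets need sampling-based approximations.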
Another paper, “Incrementality Bidding via Reinforcement Learning under Mixed and Delayed Rewards” with Ashwinkumar Badanidiyuru and Zhe Feng of Google Research and Tianxi Li of the University of Virginia, tackled a different kind of strategic model. On popular websites, advertisers bid in auctions to show their ads to users, a complex sequential decision-making problem with delayed and uncertain rewards (whether the user makes a purchase). The researchers developed a new reinforcement learning algorithm for learning and estimating incrementality, which is the causal effect of showing an ad to a consumer, enabling smarter bidding strategies.
The work is both an example of Xu’s vision for designing algorithms that can act optimally in highly competitive environments and, in its collaboration with Google, of the unique strengths of an academic/industry partnership. “It’s something that I am very excited about: the innovation really comes from interdisciplinary thinking about a real-world problem,” he said.
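Incrementality, the causal effect of showing an ad, can be illustrated in its simplest form as the difference in purchase rates between users who were shown the ad and a comparable group who were not. The sketch below assumes a randomized holdout and immediate, fully observed outcomes. It is not the paper’s reinforcement learning algorithm, which addresses mixed and delayed rewards, and the data and the function name `incrementality` are hypothetical.

```python
def incrementality(shown_purchases, holdout_purchases):
    """Purchase rate of the group shown the ad minus that of the holdout group."""
    rate_shown = sum(shown_purchases) / len(shown_purchases)
    rate_holdout = sum(holdout_purchases) / len(holdout_purchases)
    return rate_shown - rate_holdout


# Hypothetical outcomes (1 = purchase, 0 = no purchase).
shown = [1, 0, 1, 1, 0, 0, 1, 0]    # 4 of 8 users purchased
holdout = [0, 0, 1, 0, 0, 1, 0, 0]  # 2 of 8 users purchased

lift = incrementality(shown, holdout)
print("estimated incrementality:", lift)
```

An advertiser that bids on this difference pays for the purchases the ad actually causes, rather than for purchases that would have happened anyway.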
Xu’s third NeurIPS paper, with his PhD student Jibang Wu at UChicago as well as Weiran Shen and Fei Fang, attacked a fundamental problem in economics involving Stackelberg games. Here, the goal is to observe the behavior of a system, such as an online shopping platform or a stock market, and use it to reverse engineer the motivations of its participants. Xu and his collaborators created a more efficient solution by moving away from the classical economic expectation that people always make the best possible decision.
“Bounded rationality has always been a stringent assumption for economic analysis; there has been a whole literature in behavioral economics to study boundedly rational agent behaviors,” Xu said. “Interestingly, we show that this relaxation actually kills two birds with one stone. First of all, it makes the model more realistic, and secondly, it actually speeds up the learning of the agent’s preferences from observing their boundedly rational behaviors.”
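One way to see why imperfect decisions can speed up learning is through the logit quantal response model, a standard model of bounded rationality in which an agent picks each action with probability proportional to the exponential of its utility. The article does not say which model the paper uses, so treat this as an assumption. A perfectly rational agent reveals only which action it likes best, while under this model the ratios of choice probabilities reveal how much better each action is. The sketch ignores the leader-follower structure of a Stackelberg game, assumes the rationality parameter is known, and uses hypothetical names and payoffs.

```python
import math


def quantal_response(utilities, rationality):
    """Logit choice probabilities: P(i) is proportional to exp(rationality * u_i)."""
    best = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp(rationality * (u - best)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def recover_utility_gaps(probabilities, rationality):
    """Invert the logit model: u_i - u_0 = log(p_i / p_0) / rationality."""
    base = probabilities[0]
    return [math.log(p / base) / rationality for p in probabilities]


true_utilities = [1.0, 0.4, -0.5]  # hypothetical payoffs of three actions
probs = quantal_response(true_utilities, rationality=2.0)
gaps = recover_utility_gaps(probs, rationality=2.0)

print("choice probabilities:", probs)
print("recovered utility gaps relative to action 0:", gaps)
```

In practice the probabilities would have to be estimated from observed choice frequencies, so the recovered gaps would be noisy rather than exact as they are here.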
The insights from this model could be used in various real-world situations, Xu said, including a cybersecurity scenario where the defenders are trying to learn the incentives of an attacker in order to predict and thwart their future activities.
The interdisciplinary methods and applications of such work helped draw Xu to the University of Chicago, where he hopes to form new bridges between UChicago CS, the Data Science Institute, and the Booth School of Business. One eventual aim of his interest in the economic science of data and AI is to build a market for AI algorithms, an “Uber for machine learning” where effective model design and data sharing are incentivized.
“UChicago really cares about creating new disciplines and foundational work, and these are the things that I have personally been very excited about as well,” Xu said. “The economic science of data is a very, very new discipline, and I feel like this is one of the best and most suitable places to make it happen.”