Please join us for our monthly Chicago Data Night, cohosted by ChiData and DSI. Practitioners, academics, and aficionados within the Chicago area are all invited to be part of a community at the intersection of industry and academia, brought together by a mutual interest in data. Each month, guest speakers will cover a specific data-related topic.
Hors d’oeuvres and drinks will be provided. Admission is free, and we strongly encourage filling out an RSVP here.
4:00pm: Doors Open
5:00pm: Welcome Remarks
5:10pm: Guest Speaker
The Auditorium at 1871
Merchandise Mart, #1212
222 W Merchandise Mart Plaza
Chicago, IL 60654
Abstract: Data shapes our social, economic, cultural, and technological environments. Data is valuable, so people seek it, inducing data to flow. The resulting dataflows distribute data and thus value. For example, large Internet companies profit from accessing data from their users, and engineers of large language models seek large and diverse data sources to train powerful models. It is possible to judge the impact of data in an environment by analyzing how the dataflows in that environment impact the participating agents. My research hypothesizes that it is also possible to design (better) data environments by controlling what dataflows materialize; not only can we analyze environments but also synthesize them. In this talk, I present the research agenda on “data ecology,” which seeks to build the principles, theory, algorithms, and systems to design beneficial data environments. I will also present examples of data environments my group has designed, including data markets for machine learning, data-sharing, and data integration. I will conclude by discussing the impact of dataflows in data governance and how the ideas are interwoven with the concepts of trust, privacy, and the elusive notion of “data value.” As part of the technical discussion, I will complement the data market designs with the design of a data escrow system that permits controlling dataflows.
In my research, I build high-performance systems for discovering, preparing, and processing data, often using techniques from data management, statistics, and machine learning. At MIT, I work with Professors Sam Madden and Mike Stonebraker. Before MIT, I completed my PhD at Imperial College London with Peter Pietzuch.