The modern Internet is driven by AI-centric services that determine how we interact with technology and society on a daily basis. The exponential rise in AI is largely fueled by the design, development, and deployment of domain-specific software and hardware that have yielded orders-of-magnitude improvements for deep learning. Despite these efforts, one important area remains under-studied: systems for deep learning-based personalized recommendation, which are the focus of this talk. Personalized recommendations form the backbone of our interaction with the Internet, including search, e-commerce, streaming, and social media. Systems play a crucial role in enabling accurate, efficient, and sustainable recommendation engines.
In this talk, I show how modern deep learning-based personalized recommendation engines not only consume the majority of AI training and inference cycles in production data centers, but also introduce unique system design challenges for efficient execution. To tackle these challenges, I design solutions across the software and hardware stack to optimize inference efficiency by jointly considering application-level characteristics, unique neural network model architectures, data-center scale implications, and the underlying hardware. Given the rapidly growing infrastructure demands posed by AI and recommendation engines, my work highlights that systems must go beyond performance, power, and energy efficiency to consider environmental footprint as a first-order design target to enable sustainable computing. Finally, I chart paths to designing future systems that enable emerging AI-driven applications by balancing performance, efficiency, sustainability, and privacy.
Udit Gupta is a PhD student at Harvard University and a visiting research scientist at Facebook AI Research. His research interests focus on enabling next-generation responsible AI platforms by designing novel computer systems and hardware. His recent work focuses on the optimization of data center-scale deep learning-based personalized recommendation engines (HPCA 2020, ISCA 2020, MICRO 2021, ASPLOS 2021) and enabling sustainable computing by considering the environmental impact of end-to-end hardware life cycles (HPCA 2021, MLSys 2022). Udit’s work has been evaluated at scale in production data centers and incorporated into standardized benchmarks and infrastructure used by the research community. His research received an IEEE MICRO Top Picks honorable mention in 2020 and an IEEE MICRO Top Picks award in 2021, and was nominated for best paper at PACT 2019 and DAC 2018. In addition to research, Udit is passionate about building interdisciplinary communities. He co-founded the PeRSonAl (personalized recommendation systems and algorithms) and CLEAR (computing landscapes with environmental accountability and responsibility) workshops, co-located with systems and machine learning conferences such as ASPLOS, ISCA, and MLSys. He is also the co-chair of the Computer Architecture Student Association.