Deqing Fu (USC) - Algorithmic Perspectives on Understanding Transformers
Abstract: In this talk, we explore how algorithmic insights and tools from optimization theory and Fourier transforms can shed light on the mechanisms underlying Transformers’ ability to solve fundamental computational tasks, including linear regression and addition. We will examine how architectural design and pre-training data interact to enable Transformers to learn these mechanisms effectively. Lastly, we will discuss recent advances in mapping numbers directly to their Fourier representations, which eliminates the tokenization step for numbers entirely to improve efficiency and accuracy.
Speakers

Deqing Fu
Deqing Fu is a third-year Ph.D. student in Computer Science at the University of Southern California (USC). His research focuses on deep learning theory, natural language processing, and the interpretability of AI systems. He is co-advised by Prof. Vatsal Sharan in the USC Theory Group and Prof. Robin Jia in the USC NLP Group. Prior to his Ph.D., Deqing earned his undergraduate and master’s degrees in mathematics and statistics from the University of Chicago.