Deming Chen (Urbana-Champaign) - Fully Automated PyTorch-to-Accelerator Design Flow
Abstract: In this talk, we introduce ScaleHLS, a new design flow that establishes a High-Level Synthesis (HLS) solution for automatically translating AI models described in PyTorch into customized AI accelerators. By adopting PyTorch as the input for AI designs (instead of the traditional C/C++ used for HLS), the lines of code and the design simulation time can be reduced by about 10× and 100×, respectively. Despite being fully automated and able to handle various applications, this new flow achieves 1.29× higher throughput than DNNBuilder, a state-of-the-art RTL-based neural network accelerator on FPGAs. Such AI model-to-RTL flows pave the way for a new wave of HLS that could drive high-productivity design of AI circuits with high density, high energy efficiency, low cost, and short design cycles. At the same time, such HLS solutions face both existing and new challenges, such as ensuring the correctness of the high-level design, accommodating accurate low-level timing/energy information, handling the complexity of 3D circuits and/or chiplet-based design flows, and achieving all of this in a highly scalable manner.
Speakers

Deming Chen
Deming Chen is the Abel Bliss Professor in the Grainger College of Engineering at the University of Illinois Urbana-Champaign. His research interests include hybrid cloud systems, machine learning and AI, reconfigurable and heterogeneous computing, security and confidential computing, and system-level design methodologies. He has published over 290 research papers, received 10 Best Paper Awards and an ACM/SIGDA TCFPGA Hall-of-Fame Paper Award, and delivered more than 160 invited talks. His work has had a significant impact, with open-source solutions such as FCUDA, DNNBuilder, CSRNet, SkyNet, ScaleHLS, and Medusa adopted by industry. Notably, Medusa has been integrated into Nvidia’s TensorRT-LLM, improving the speed of large language model (LLM) execution by 1.9-3.6×. He is an IEEE Fellow, an ACM Distinguished Speaker, and the Editor-in-Chief of ACM Transactions on Reconfigurable Technology and Systems (TRETS). Under his leadership, the impact factor of ACM TRETS has increased by 3.8 times. He serves as the Illinois Director of the IBM-Illinois Discovery Accelerator Institute and the Director of the AMD-Xilinx Center of Excellence. Additionally, he has been involved in several startup companies, including AutoESL and Inspirit IoT. He received his Ph.D. in Computer Science from UCLA in 2005.