NSF Grant Backs funcX — A Smart, Automated Delegator for Computational Research

Computational scientific research is no longer one-size-fits-all. The massive datasets created by today’s cutting-edge instruments and experiments — telescopes, particle accelerators, sensor networks and molecular simulations — aren’t best processed and analyzed by a single type of machine. For faster and more efficient discovery, data can be chopped up and shipped to specialized resources, including supercomputers, campus clusters, cloud data centers, and “accelerators” optimized for specific tasks such as machine learning or visualization.

But delegating chunks of data and analysis functions to their ideal destination isn’t trivial. A team led by UChicago CS researchers Ian Foster and Kyle Chard, along with Daniel S. Katz of the National Center for Supercomputing Applications at the University of Illinois, seeks to streamline the process with funcX, a new distributed “function-as-a-service” (FaaS) platform that lets researchers delegate their computational workload easily and automatically. With a pair of grants to the Universities of Chicago and Illinois from the National Science Foundation totalling $3.14 million, the team will work with several large science projects and cyberinfrastructure partners to build and test this new system for “computationally fluid” research.

“Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences,” said Ian Foster, Director of Argonne’s Data Science and Learning Division, Argonne Senior Scientist and Distinguished Fellow and the Arthur Holly Compton Distinguished Service Professor of Computer Science at the University of Chicago. “funcX makes it easy for scientists to run computations wherever it makes the most sense and move computations between resources. It will enable new science, reduce barriers and democratize access to advanced cyberinfrastructure, and enable researchers to compute wherever is most efficient.”

The scientific software used to process and extract discoveries from experimental data is typically made up of tens to thousands of smaller functions, blocks of code that handle individual jobs in the long pipeline of data analysis. These programs can be run in their entirety on a single system — be it a laptop, a campus cluster, or a supercomputer — but that uniform approach may not be optimal. Some complex tasks may need to run on high-performance computing resources, but some specialized functions may be better served by GPU accelerators, and the more routine jobs could be tackled by small, energy-efficient computers. 

Many obstacles prevent scientists from splitting their applications in this manner. Software is usually designed to run on a single type of machine or system, even if in parallel, and it is difficult to write code that splits off functions and sends them to different destinations. Beyond the programming, additional barriers exist for moving data to the desired computational endpoint, obtaining the proper security authorizations, and scheduling usage between different remote resources based on availability. 

funcX aims to remove these obstacles by adapting the serverless “function-as-a-service” (FaaS) model for the specialized needs and resources of the science and engineering research community. Many of today’s mobile apps and Internet of Things devices use this approach to meet their sporadic computational demand, offloading intensive tasks that can’t be handled on their smaller, local computers.
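
To make the function-as-a-service idea concrete, here is a minimal Python sketch of the pattern: a function is registered once, then sent to whichever named endpoint suits it. This is not the funcX API. The `ToyFaaS` class, its `register` and `run` methods, and the "edge" and "cluster" endpoint names are hypothetical, and local thread pools stand in for remote machines.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, Dict, Iterable


class ToyFaaS:
    """Hypothetical in-process stand-in for a function-as-a-service broker."""

    def __init__(self, endpoint_names: Iterable[str]) -> None:
        # Assumption: each "endpoint" is just a local thread pool here;
        # a real service would dispatch to remote computing resources.
        self._endpoints: Dict[str, ThreadPoolExecutor] = {
            name: ThreadPoolExecutor(max_workers=2) for name in endpoint_names
        }
        self._functions: Dict[str, Callable[..., Any]] = {}

    def register(self, func: Callable[..., Any]) -> str:
        """Store a function and return an opaque identifier for it."""
        function_id = str(uuid.uuid4())
        self._functions[function_id] = func
        return function_id

    def run(self, function_id: str, endpoint: str, *args: Any, **kwargs: Any) -> Future:
        """Submit a registered function to the named endpoint."""
        func = self._functions[function_id]
        return self._endpoints[endpoint].submit(func, *args, **kwargs)

    def shutdown(self) -> None:
        """Wait for outstanding work and release every endpoint."""
        for pool in self._endpoints.values():
            pool.shutdown(wait=True)


def calibrate(readings):
    """Routine step: subtract a fixed sensor offset from each reading."""
    return [r - 0.5 for r in readings]


def summarize(values):
    """Heavier step (in spirit): reduce the values to their mean."""
    return sum(values) / len(values)


service = ToyFaaS(["edge", "cluster"])
calibrate_id = service.register(calibrate)
summarize_id = service.register(summarize)

# Each step of the pipeline goes to a different endpoint.
calibrated = service.run(calibrate_id, "edge", [1.5, 2.5, 3.5]).result()
mean_value = service.run(summarize_id, "cluster", calibrated).result()
service.shutdown()

print(calibrated, mean_value)
```

The calling code names only a function and a destination. Moving a step to a different resource means changing one string, which is the kind of flexibility the FaaS model is meant to provide.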

“The research ecosystem comprises a range of existing systems: from HPC clusters to clouds and supercomputers,” said Chard, Research Assistant Professor at UChicago CS. “Researchers have varied workloads, large amounts of distributed data, and dynamic collaborations, so there isn't a one-size-fits-all mapping from workloads to systems. funcX will make it easy to distribute these functions to different computing systems without needing to think about their differences, and will integrate with the scientific ecosystem for authentication, data management, and heterogeneous computing infrastructure.”

The funcX platform builds upon two existing research technologies: Globus, a research data management platform created at the University of Chicago and Argonne National Laboratory, and Parsl, a Python library for executing parallel workflows created at the University of Chicago, Argonne National Laboratory, and the University of Illinois. funcX will also use Amazon Web Services for hosting management services, and integrate cyberinfrastructure from campuses and national laboratories, such as Blue Waters, Stampede2, and XSEDE.

The project will work with a wide range of scientific partners studying urban science, materials science, quantum chemistry, neuroanatomy, cosmology, and other data-intensive subjects. These partners will provide use cases and testbeds for the funcX platform, working with the designers to create powerful, flexible features and an ecosystem of users and computational endpoints. 

“We've got an amazing group of collaborators from different science domains, research computing centers and national cyberinfrastructure providers, and NSF software institutes that will help shape this project,” Chard said. “We see this engagement as vital for not only understanding needs across a range of domains but also for piloting the system and demonstrating value in their domain. This will help us ensure that funcX is not only valuable for our partners but also more broadly to all researchers.”

This project is a collaboration between the Universities of Chicago and Illinois, following their successful partnership in developing Parsl. 

“Since its creation by NSF and the state of Illinois in 1986, NCSA has been a leader in working with and supporting scientific and engineering communities through the development, deployment, and use of new computing and software technologies,” said Katz, Assistant Director for Scientific Software and Applications at the National Center for Supercomputing Applications (NCSA) at the University of Illinois. “This project exemplifies our effort to develop leading-edge software that enables new research on our systems.”

Follow the project online at funcx.org.
