NSF Grant Backs funcX — A Smart, Automated Delegator for Computational Research

Computational scientific research is no longer one-size-fits-all. The massive datasets created by today’s cutting-edge instruments and experiments — telescopes, particle accelerators, sensor networks and molecular simulations — aren’t best processed and analyzed by a single type of machine. For faster and more efficient discovery, data can be chopped up and shipped to specialized resources, including supercomputers, campus clusters, cloud data centers, and “accelerators” optimized for specific tasks such as machine learning or visualization.

But delegating chunks of data and analysis functions to their ideal destinations isn’t trivial. A team led by UChicago CS researchers Ian Foster and Kyle Chard, together with Daniel S. Katz of the National Center for Supercomputing Applications at the University of Illinois, seeks to streamline the process with funcX, a new distributed “function-as-a-service” (FaaS) platform that lets researchers delegate their computational workload easily and automatically. With a pair of grants to the Universities of Chicago and Illinois from the National Science Foundation totalling $3.14 million, the team will work with several large science projects and cyberinfrastructure partners to build and test this new system for “computationally fluid” research.

“Exploding data volumes and velocities, new computational methods and platforms, and ubiquitous connectivity demand new approaches to computation in the sciences,” said Ian Foster, Director of Argonne’s Data Science and Learning Division, Argonne Senior Scientist and Distinguished Fellow and the Arthur Holly Compton Distinguished Service Professor of Computer Science at the University of Chicago. “funcX makes it easy for scientists to run computations wherever it makes the most sense and move computations between resources. It will enable new science, reduce barriers and democratize access to advanced cyberinfrastructure, and enable researchers to compute wherever is most efficient.”

The scientific software used to process and extract discoveries from experimental data is typically made up of tens to thousands of smaller functions, blocks of code that handle individual jobs in the long pipeline of data analysis. These programs can be run in their entirety on a single system — be it a laptop, a campus cluster, or a supercomputer — but that uniform approach may not be optimal. Some complex tasks may need to run on high-performance computing resources, but some specialized functions may be better served by GPU accelerators, and the more routine jobs could be tackled by small, energy-efficient computers. 
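
As a rough illustration of matching tasks to resources, the Python sketch below assigns each step of a made-up pipeline to a resource class. The task names, the `needs_gpu` flag, and the 100 CPU-hour threshold are all assumptions invented for this example; they do not describe how funcX itself makes placement decisions.

```python
from dataclasses import dataclass

# Hypothetical resource classes; the labels are illustrative only.
ENDPOINTS = {
    "hpc": "supercomputer",
    "gpu": "gpu-accelerator",
    "light": "small-energy-efficient-node",
}


@dataclass
class Task:
    name: str
    needs_gpu: bool = False
    cpu_hours: float = 0.0


def choose_endpoint(task: Task) -> str:
    """Pick a resource class for a task using simple, made-up rules."""
    if task.needs_gpu:
        return ENDPOINTS["gpu"]
    if task.cpu_hours >= 100:  # assumed cutoff for "complex" work
        return ENDPOINTS["hpc"]
    return ENDPOINTS["light"]


# A toy three-step analysis pipeline.
pipeline = [
    Task("simulate", cpu_hours=5000),
    Task("train_model", needs_gpu=True, cpu_hours=20),
    Task("summarize", cpu_hours=0.1),
]

# Map each task name to the resource class chosen for it.
plan = {task.name: choose_endpoint(task) for task in pipeline}
```

A real placement decision would also weigh data location, queue times, and access rights, which this sketch leaves out.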

Many obstacles prevent scientists from splitting their applications in this manner. Software is usually designed to run on a single type of machine or system, even if in parallel, and it is difficult to write code that splits off functions and sends them to different destinations. Beyond the programming, additional barriers exist for moving data to the desired computational endpoint, obtaining the proper security authorizations, and scheduling usage between different remote resources based on availability. 

funcX solves this problem by adapting the serverless “function-as-a-service” (FaaS) model for the specialized needs and resources of the science and engineering research community. Many of today’s mobile apps and Internet of Things devices use this approach for their sporadic computational demand — offloading more intensive tasks that can’t be handled on their smaller, local computers. 
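
The function-as-a-service pattern can be sketched in a few lines of Python: a function is registered once, then invoked by its identifier on a named endpoint. The `ToyFaaS` class, its methods, and the endpoint name below are hypothetical and are not the funcX API; local thread pools stand in for remote computing resources.

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor


class ToyFaaS:
    """A toy, in-process stand-in for a function-as-a-service platform."""

    def __init__(self):
        self._functions = {}  # function id -> callable
        self._endpoints = {}  # endpoint name -> thread pool

    def add_endpoint(self, name, workers=2):
        # Each "endpoint" is just a local thread pool in this sketch.
        self._endpoints[name] = ThreadPoolExecutor(max_workers=workers)

    def register(self, func) -> str:
        # Store the function and hand back an opaque identifier.
        func_id = str(uuid.uuid4())
        self._functions[func_id] = func
        return func_id

    def run(self, func_id, endpoint, *args, **kwargs) -> Future:
        # Submit the registered function to the chosen endpoint.
        func = self._functions[func_id]
        return self._endpoints[endpoint].submit(func, *args, **kwargs)

    def shutdown(self):
        for pool in self._endpoints.values():
            pool.shutdown()


def mean(values):
    return sum(values) / len(values)


service = ToyFaaS()
service.add_endpoint("campus-cluster")  # hypothetical endpoint name
mean_id = service.register(mean)
future = service.run(mean_id, "campus-cluster", [1, 2, 3, 4])
result = future.result()  # blocks until the function has finished
service.shutdown()
```

The caller only ever holds a function identifier and an endpoint name, which is what lets the same function be redirected to a different resource without changing the analysis code.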

“The research ecosystem comprises a range of existing systems: from HPC clusters to clouds and supercomputers,” said Chard, Research Assistant Professor at UChicago CS. “Researchers have varied workloads, large amounts of distributed data, and dynamic collaborations, so there isn't a one-size-fits-all mapping from workloads to systems. funcX will make it easy to distribute these functions to different computing systems without needing to think about their differences, and will integrate with the scientific ecosystem for authentication, data management, and heterogeneous computing infrastructure.”

The funcX platform builds upon two existing research technologies: Globus, a research data management platform created at the University of Chicago and Argonne National Laboratory, and Parsl, a Python library for executing parallel workflows created at the University of Chicago, Argonne National Laboratory, and the University of Illinois. funcX will also use Amazon Web Services for hosting management services, and integrate cyberinfrastructure from campuses and national laboratories, such as Blue Waters, Stampede2, and XSEDE.

The project will work with a wide range of scientific partners studying urban science, materials science, quantum chemistry, neuroanatomy, cosmology, and other data-intensive subjects. These partners will provide use cases and testbeds for the funcX platform, working with the designers to create powerful, flexible features and an ecosystem of users and computational endpoints. 

“We've got an amazing group of collaborators from different science domains, research computing centers and national cyberinfrastructure providers, and NSF software institutes that will help shape this project,” Chard said. “We see this engagement as vital for not only understanding needs across a range of domains but also for piloting the system and demonstrating value in their domain. This will help us ensure that funcX is not only valuable for our partners but also more broadly to all researchers.”

This project is a collaboration between the Universities of Chicago and Illinois, following their successful partnership in developing Parsl. 

“Since its creation by NSF and the state of Illinois in 1986, NCSA has been a leader in working with and supporting scientific and engineering communities through the development, deployment, and use of new computing and software technologies,” said Katz, Assistant Director for Scientific Software and Applications at the National Center for Supercomputing Applications (NCSA) at the University of Illinois. “This project exemplifies our effort to develop leading-edge software that enables new research on our systems.”

Follow the project online at funcx.org.
