Asst. Prof. Aaron Elmore Receives CAREER Award for Resource-Efficient Databases

For decades, the number one mission of databases was speed. Researchers and software developers raced to design systems that sifted through ever larger and more complex datasets under the hood, returning answers to their users as quickly as possible. With Moore’s Law scaling up the available computation, resource use wasn’t usually a concern. But that luxury is coming to an end, and resource efficiency is a new priority as more computation shifts to the pay-for-compute cloud and remote devices.

Aaron Elmore, assistant professor at UChicago CS, develops database models that address this need, giving users the power to sacrifice speed for reduced resource use and cost. His approach, intermittent query processing (IQP), grafts machine learning prediction onto database processing, providing more efficient computation to systems working with bursty data or intermittent monitoring. As a new recipient of the CAREER award, the National Science Foundation’s most prestigious award in support of early-career faculty, Elmore will continue designing these innovative systems for data-driven applications.

[Read about the five UChicago CS faculty who received NSF CAREER awards in the 2021 cycle.]

Many modern technologies, such as Internet of Things devices, sensors, or e-commerce, produce “bursty data” — data that comes in spaced-out, unpredictable packets. But databases are typically designed for either relatively static datasets that are updated infrequently, or more recently, steady streams of data. For these database types, systems use batch processing or continuous query strategies, but bursty data is a poor fit for either framework.

“Database systems traditionally were designed such that when you ask them to do something, they do everything they can do to get you that answer right now, or if data is constantly coming in, as soon as they get data they update the answer,” Elmore said. “My vision was to think about what we can do when we know that either the data, or a user’s interest, is going to be bursty. In this case, how can we be smart about deferring work to save resources?”

An example might be a bank keeping track of customer accounts and wanting to monitor balances that are higher than the average. While the overall average can be easily updated as new records are added to the database, searching for outliers requires comparing every account to the average every time it changes, a more computationally intensive task. IQP analyzes and separates out these “easy” and “hard” tasks, and then schedules them at the frequency that meets the user’s expectations, whether that’s determined by time, cost, or available computation.
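The difference between the two kinds of work in the bank example can be sketched in a few lines of Python. This is an illustration only, not IQP's implementation: the `BalanceMonitor` class, its method names, and the sample accounts are all hypothetical. The running average is adjusted in constant time on every update, while the above-average scan touches every account and so is a natural candidate to run only when the user asks for it.

```python
from dataclasses import dataclass, field


@dataclass
class BalanceMonitor:
    """Hypothetical monitor: cheap running average, deferred above-average scan."""
    balances: dict = field(default_factory=dict)
    total: float = 0.0
    pending: int = 0    # updates received since the last full scan
    scans_run: int = 0  # how many times the expensive scan has executed

    def upsert(self, account: str, balance: float) -> None:
        # "Easy" task: adjust the running total in O(1) per update.
        self.total += balance - self.balances.get(account, 0.0)
        self.balances[account] = balance
        self.pending += 1

    @property
    def average(self) -> float:
        return self.total / len(self.balances) if self.balances else 0.0

    def above_average(self) -> list:
        # "Hard" task: compare every account with the current average, O(n).
        self.scans_run += 1
        self.pending = 0
        avg = self.average
        return sorted(a for a, b in self.balances.items() if b > avg)


monitor = BalanceMonitor()
for account, balance in [("a", 100.0), ("b", 300.0), ("c", 200.0), ("a", 500.0)]:
    monitor.upsert(account, balance)  # four updates, no scan triggered yet

result = monitor.above_average()      # one scan covers all four updates
print(result)                         # ['a']
```

Here four updates are absorbed by a single scan. How long to wait before scanning is the scheduling decision the article describes; this sketch simply leaves it to the caller.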

“It might be a case where you’re on an edge or a battery-powered device, and you have limited resources. Or you could be on a cloud, where you’re paying for everything that you do,” Elmore said. “We wanted a knob that somebody could turn and say, ‘if I slow things down, or if I’m willing to trade this off, can I save some amount of money?’”

Elmore’s system folds in machine learning to help make these decisions by predicting when new data will arrive and how long different tasks will take. Because the database language SQL is declarative — users specify what they want, not how to do it — these estimates aren’t simple, Elmore said. So IQP builds a model on previous runs of the query, predicting future runtimes and resource usage. Users can then make their choices about how much they prioritize cost versus performance, and the database will automatically find the level of operation that satisfies those high-level goals.
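The idea of learning from previous runs and then choosing an operating level can be illustrated with a deliberately simple stand-in. The sketch below assumes a straight-line least-squares fit of runtime against the number of new rows, a fixed per-run overhead, and a compute budget in seconds per hour; none of these details come from the article, and the function names and numbers are hypothetical. The actual IQP models are not described here and are likely more sophisticated.

```python
def fit_runtime_model(history):
    """Least-squares fit of seconds = slope * rows + intercept.

    `history` is a list of (rows_processed, seconds) pairs from earlier runs
    with at least two distinct row counts.
    """
    n = len(history)
    mean_x = sum(r for r, _ in history) / n
    mean_y = sum(s for _, s in history) / n
    sxx = sum((r - mean_x) ** 2 for r, _ in history)
    sxy = sum((r - mean_x) * (s - mean_y) for r, s in history)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def highest_affordable_frequency(history, rows_per_hour, budget_seconds_per_hour,
                                 candidates=(60, 12, 4, 1)):
    """Return the most frequent runs-per-hour whose predicted compute fits the budget."""
    slope, intercept = fit_runtime_model(history)
    for runs_per_hour in sorted(candidates, reverse=True):
        rows_per_run = rows_per_hour / runs_per_hour
        # Each run pays the fixed overhead (intercept) plus a per-row cost.
        predicted = runs_per_hour * (slope * rows_per_run + intercept)
        if predicted <= budget_seconds_per_hour:
            return runs_per_hour
    return None  # no candidate fits; the budget is too tight


# Hypothetical measurements from three earlier runs: (rows, seconds).
past_runs = [(1000, 3.0), (2000, 4.0), (4000, 6.0)]
generous = highest_affordable_frequency(past_runs, 6000, 40.0)
tight = highest_affordable_frequency(past_runs, 6000, 10.0)
print(generous, tight)
```

Under these assumptions, a tighter budget pushes the choice toward fewer, larger runs because the fixed per-run overhead is paid less often. That is the speed-for-cost trade the user's "knob" controls.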

Thus far, Elmore has created early versions of IQP in Spark and with the programming language Rust. The latter project, nicknamed CrustyDB, is used in his Introduction to Database Systems course, co-taught with Assistant Professor Raul Castro Fernandez. The IQP system is also part of a project with Sanjay Krishnan and Michael Franklin called CrocodileDB — so named because crocodiles remain motionless for long periods of time before they quickly strike, just like a database running intermittent queries.

The research involved in developing these systems goes hand in hand with Elmore’s teaching and mentoring work. The pedagogical databases he uses to teach database principles to computer science and Master’s in Computational Analysis and Public Policy (MS-CAPP) students include many IQP features, allowing students with an interest in database research to easily contribute to the project. Elmore has also taught younger students database basics as a mentor in the “BigDataX: From Theory to Practice in Big Data computing at eXtreme Scales” REU program and through the compileHer student group and their annual tech capstone event for middle school girls.

“As the value of data continues to grow, database systems are increasingly a critical part for teams working on data science, AI, big data, or IoT platforms,” Elmore said. “As faculty, it is essential to train the generation working with data to understand the solutions, principles, and opportunities of database systems.”
