An Automated Menu for LHC Data and the Search for Dark Matter

The Large Hadron Collider delivers an enormous smorgasbord of data. Normal operation of the particle collider at Europe’s CERN laboratory can generate up to 100 terabytes per second of heterogeneous, high-dimensional data. Because it wouldn’t be possible for scientists and their computers to store — never mind analyze — this firehose of physics results, they rely upon filters, or “trigger systems,” that only keep data containing what they think will be the most interesting scientific information. 

Different versions of these trigger algorithms have been used in particle physics experiments for decades. But as the amount of data they filter grows faster than the storage and analysis capabilities of computers, new approaches are needed. Funded by a Center for Data and Computing Discovery Grant, a new collaboration between physicist David Miller and computer scientist Yuxin Chen seeks to build upon advances in explainable AI and active learning to create a new, autonomously-driven trigger system that can aid one of today’s foremost scientific quests: the search for dark matter.

“This started from the ridiculous idea that you could have, instead of a self-driving car, a self-driving trigger, or event selection and filter system,” said Miller, Assistant Professor of Physics, The College, and the Enrico Fermi Institute. “Today, we throw away 99.99 percent of the data. So we asked, can we learn from what we’re throwing away? Can we learn how to do things better, and therefore, can the system that is doing the filtering learn over time what to keep and what to throw away, and whether it should update its own algorithm?”

Currently, the LHC operates with a “menu” of triggers that filter data according to rules set by physicists, drawing upon past results, theory, and informed guesses about where important results will be found in experiments. While these triggers perform well when physicists have a clear sense of what they’re looking for, they may be less capable of finding previously unobserved phenomena, such as dark matter. However, an algorithmic observer could alleviate these blindspots by making novel decisions on what data to keep, based on both previous decisions and its own, independent observations of the data as it goes zooming past. 

But putting such a valuable task in the “hands” of an AI decision-maker is controversial to many in the physics world, Miller said. 

“Nobody’s even written a concept paper of what this ridiculous idea even means,” Miller said. “None of my physics colleagues would go for it, at least at the moment, because no one wants to let some black box make your decisions for you.”

But a recent move in the machine learning world towards “white box” or “explainable AI” could help bridge this confidence gap. Here, an algorithm wouldn’t just recommend that a certain type of data should be kept, it would also explain how it arrived at that decision. To further reassure physicists, Miller and Chen’s project will start by attempting to recreate the current, human-crafted triggers with their new approach, training the model on both real and simulated data and observing whether it keeps the same results as the current filters. 

When it comes to creating new filters that haven’t already been proposed by its human counterparts, the model will draw upon another two promising threads of AI research: online learning and active learning. Instead of earlier machine learning models, which are trained once on a large dataset and then make decisions or predictions on new data, an online learning model constantly evolves, updating its decision rules as each new data point comes in. 

Meanwhile, active learning allows the model to autonomously decide which data points are the most promising to keep to help make future decisions. These characteristics make such techniques well suited for streaming data, such as the constant flow of information from financial markets or security cameras, but also from the LHC — the largest stream of all. 

The fact that LHC data pours in so fast and in such volume will test the limits of state-of-the-art stream-based active learning approaches, and likely lead to new innovations that can benefit other applications. 

“There are a lot of connections with other areas such as high frequency trading, where systems have to scan through all this financial data and make real time decisions, or video processing, making efficient summarization of useful traits of scenes with limited processing power,” Chen said. “So hopefully the techniques we develop for this project will be applicable for a lot of other domains.” 

The project also may intersect with other CDAC projects involving Miller and Chen, including a Winter 2019 CDAC Discovery grant to build neural networks for analysis that draw upon the physical laws governing subatomic particles, and an AI + Science grant that seeks to build a “self-driving telescope” for cosmic surveys. Combined, these research efforts show the exciting potential of AI for the large-scale physics projects that help us better understand our universe.

“[These projects] are where this interplay of the physical sciences and the computer sciences is really interesting,” Miller said. “We’re after answers to fundamental questions about what particles exist in the universe, under what conditions they are produced, and how they behave once we’ve created them. To be able to apply this new approach in a context where we have practically an unlimited amount of data, and where we plan on taking a lot more new data in the near and distant future, is really intriguing.”

Related News

More UChicago CS stories from this research area.
UChicago CS News

Chicago Public Schools Student Chris Deng Pursues Internet Equity with University of Chicago Faculty

May 16, 2023
UChicago CS News

PhD Student Kevin Bryson Receives NSF Graduate Research Fellowship to Create Equitable Algorithmic Data Tools

Apr 14, 2023
UChicago CS News

Computer Science Displays Catch Attention at MSI’s Annual Robot Block Party

Apr 07, 2023
UChicago CS News

UChicago / School of the Art Institute Class Uses Art to Highlight Data Privacy Dangers

Apr 03, 2023
UChicago CS News

UChicago, Stanford Researchers Explore How Robots and Computers Can Help Strangers Have Meaningful In-Person Conversations

Mar 29, 2023
Students posing at competition
UChicago CS News

UChicago Undergrad Team Places Second Overall In Regionals For World’s Largest Programming Competition

Mar 17, 2023
UChicago CS News

Postdoc Alum John Paparrizos Named ICDE Rising Star

Mar 15, 2023
UChicago CS News

New EAGER Grant to Asst. Prof. Eric Jonas Will Explore ML for Quantum Spectrometry

Mar 03, 2023
Young students on computers
UChicago CS News

UChicago and NYU Research Team Finds Edtech Tools Could Pose Privacy Risks For Students

Feb 21, 2023
UChicago CS News

Assistant Professor Chenhao Tan Receives Sloan Research Fellowship

Feb 15, 2023
UChicago CS News

UChicago Scientists Develop New Tool to Protect Artists from AI Mimicry

Feb 13, 2023
Garcia sitting in a jet engine
UChicago CS News

Student Spotlight: Gabi Garcia’s Bridge Between CS and Classics

Jan 30, 2023
arrow-down-largearrow-left-largearrow-right-large-greyarrow-right-large-yellowarrow-right-largearrow-right-smallbutton-arrowclosedocumentfacebookfacet-arrow-down-whitefacet-arrow-downPage 1CheckedCheckedicon-apple-t5backgroundLayer 1icon-google-t5icon-office365-t5icon-outlook-t5backgroundLayer 1icon-outlookcom-t5backgroundLayer 1icon-yahoo-t5backgroundLayer 1internal-yellowinternalintranetlinkedinlinkoutpauseplaypresentationsearch-bluesearchshareslider-arrow-nextslider-arrow-prevtwittervideoyoutube