Researcher John Paparrizos Wins ACM SIGKDD Dissertation Distinction

With more and more data pouring in from scientific collaborations, the internet, and sensor-equipped environments and machines, new systems and algorithms are needed to make sense of all that information. Many of these data streams take the form of time series, with values collected sequentially over periods of time, such as hourly weather data or stock market prices. But while time series are a very common format, researchers still lack the standards needed to automate their analysis.

As a PhD student at Columbia University and a postdoctoral researcher at the University of Chicago, John Paparrizos has worked to address this challenge. At this month’s 2019 ACM SIGKDD conference in Alaska, Paparrizos’ thesis, “Fast, Scalable, and Accurate Algorithms for Time-Series Analysis,” received an Honorable Mention for KDD’s Doctoral Dissertation Award.

Paparrizos’ dissertation describes a new set of algorithms and automated methods for analyzing time-series data, regardless of domain.

“The good thing is that currently we have the technological maturity to collect and store this data,” Paparrizos said. “We have different types of sensors for collecting data from natural processes and human-made artifacts, we have the computational infrastructure to store them, and we have large-scale dataflow systems to process them. But the fact is all of these systems, as well as most of the methods they support, have been designed for essentially static data. With the rapid growth of Internet-of-Things data volumes, we need to support applications for data that evolve over time.”

Typically, researchers analyzing time series need to perform the same analytic tasks as in other domains, such as similarity search, classification, and clustering. But due to several challenges, such as the broad range of domains that generate time series and the high dimensionality of datasets that can have millions of time points, the representations required for these analyses are usually created from scratch, one project or application at a time.

“What we were saying was, can we do something better than that? Can we essentially automate the process of constructing representations that preserve crucial characteristics to support time-series analytics?” Paparrizos said. “It’s not sustainable to have Ph.D. students working for five years in order to achieve these things again and again.”

Experiments in Paparrizos’ dissertation showed that the proposed methods achieve state-of-the-art performance on more than 80 different time-series datasets, and do so much more efficiently than prior work. That’s useful not only for saving scientists’ time in the future, but also for developing analytic systems capable of running on limited computational resources, which will be critical for the next wave of Internet of Things and edge computing applications.

The thesis also describes methods for two new scientific contexts. In one, Paparrizos helped create a model that predicts which scientific concepts will have long-term impact, to help guide the decisions of funding agencies. Another project created a system that detects when people search for symptoms that may be predictive of serious diseases such as pancreatic cancer, which could trigger warnings to seek medical testing.

At UChicago, Paparrizos continues his thesis work by integrating the methods into databases, so that users can perform their analyses without moving these large datasets to external software. He’s also extending his work to multivariate time series and exploring alternative approaches, such as neural networks. Last year, he received a fellowship from data services company NetApp to create new methods that enable the analysis of compressed, large-scale data.

“Companies and scientists are now measuring multiple things at the same time, and they want to perform analysis over multiple different sensors, which will require significant changes in current approaches,” Paparrizos said.
