Just beneath the surface of today’s internet, there’s a deep ocean of data. Many of the most popular websites for online discussion, maps, and shopping now sit on top of massive databases. Yet users still expect near instantaneous load times regardless of the complex queries running behind the scenes.

A new study from University of Chicago and University of Washington researchers provides assistance to web developers confronting these growing performance challenges. By detecting and analyzing hundreds of inefficiencies in how a common framework for web applications interacts with databases, the study found they could make sites run several times faster with just small changes to code.

Many developers today use Object Relational Mapping (ORM) frameworks such as Ruby on Rails, Django, or Hibernate to build web applications that run on databases. Instead of learning an additional language, such as SQL, to handle calls to the database where persistent data for the application is stored, developers can use the ORM framework within their favored language to translate those interactions. However, this abstraction can lead to costly mistakes.

“If you want to create a web application or you have an idea and you want to do it quickly, ORM is a kind of middleman that does this translation, that's why this is so popular,” said Shan Lu, associate professor at UChicago CS and co-author of the study. “The problem is, now you are no longer aware of all the operations that are going on, and these interactions with the database are usually the most expensive part.”

To look for misapplications of ORM that could slow performance, Lu’s graduate student Junwen Yang looked at a dozen of the most popular open source web applications using Ruby on Rails, the most popular ORM framework. The team, which also included Pranav Subramaniam of UChicago and Washington’s Cong Yan and Alvin Cheung, reviewed code and previously-reported bugs from Github for each project, and built synthetic databases for each application.

Their analysis examined more than 200 performance issues from past and current versions of the applications, which were then grouped into 9 “anti-patterns” of ORM misuse. Some of these stemmed from simple confusion over the appropriate API call, such as the choice between “any?” versus “exists?” when querying the database.

“The documentation describes them as doing the same thing, but the performance sometimes can be ten times different or more,” Lu said. “So it looks simple, but when you actually look at it, I understand why there are so many problems, because these options can be so deceiving for developers.”

“Sometimes this is due to abstraction,” said Cheung, assistant professor of computer science & engineering at Washington. “It's very hard for a library developer to anticipate all possible uses of their functions, so they might have used one version in their library implementation, only to find out later that programs that call their library function could run much faster if they implemented their function some other way.”

Other performances issues could be traced back to the design of the database, or even human decisions about the way content is displayed by the application, such as returning all results for a search instead of breaking results into pages. Even simple features that only require a few lines of code can create serious performance issues if the underlying database query is very complex.

Charitably, the project didn’t stop at identifying these ORM problems, but also provided fixes for the 64 new issues the research team discovered. On average, these fixes improved web page load time from over 4 seconds to less than a second; the most extreme upgrade loaded a problematic page 39 times faster than before. These fixes were submitted via Github to the web application teams, and in many cases, have already been implemented into the latest versions, Yang said.

The fixes also were merged into the research team’s Hyperloop project, an ongoing effort to discover and correct common ORM-related inefficiencies in both software and databases. Rails developers can now download a plug-in called PowerStation for their coding environment that highlights ORM misuse and offers suggestions on how to speed up application performance. Because of the popularity of Ruby on Rails in the web development community, the paper has already received widespread attention, including from the Morning Paper blog and Hacker News.

“These web applications are so popular, and to some extent reasonably well developed, so all of us were just amazed that it was so easy to find hundreds and hundreds of performance problems in them,” Lu said. “That was fascinating to me.”

Read more about the study and the Powerstation analyzer in a blog post from the University of Washington Database Group.

Related News

More UChicago CS stories from this research area.
No Name

UChicago Team Wins The NIH Long COVID Computational Challenge

Jun 28, 2023
No Name

UChicago Assistant Professor Raul Castro Fernandez Receives 2023 ACM SIGMOD Test-of-Time Award

Jun 27, 2023
No Name

PhD Student Kevin Bryson Receives NSF Graduate Research Fellowship to Create Equitable Algorithmic Data Tools

Apr 14, 2023
No Name

Computer Science Displays Catch Attention at MSI’s Annual Robot Block Party

Apr 07, 2023
No Name

UChicago / School of the Art Institute Class Uses Art to Highlight Data Privacy Dangers

Apr 03, 2023
Students posing at competition
No Name

UChicago Undergrad Team Places Second Overall In Regionals For World’s Largest Programming Competition

Mar 17, 2023
No Name

Postdoc Alum John Paparrizos Named ICDE Rising Star

Mar 15, 2023
Young students on computers
No Name

UChicago and NYU Research Team Finds Edtech Tools Could Pose Privacy Risks For Students

Feb 21, 2023
Garcia sitting in a jet engine
No Name

Student Spotlight: Gabi Garcia’s Bridge Between CS and Classics

Jan 30, 2023
No Name

UChicago Launches Transform Accelerator for Data Science & Emerging AI Startups

Jan 19, 2023
Two students looking at a wearable device
No Name

High School Students Find Their Place in Computing Through Wearables Workshop

Jan 13, 2023
No Name

UChicago CS Research Finds New Angle on Database Query Processing with Geometry

Nov 08, 2022
arrow-down-largearrow-left-largearrow-right-large-greyarrow-right-large-yellowarrow-right-largearrow-right-smallbutton-arrowclosedocumentfacebookfacet-arrow-down-whitefacet-arrow-downPage 1CheckedCheckedicon-apple-t5backgroundLayer 1icon-google-t5icon-office365-t5icon-outlook-t5backgroundLayer 1icon-outlookcom-t5backgroundLayer 1icon-yahoo-t5backgroundLayer 1internal-yellowinternalintranetlinkedinlinkoutpauseplaypresentationsearch-bluesearchshareslider-arrow-nextslider-arrow-prevtwittervideoyoutube