Bryan Pardo (Northwestern) – Audio Source Separation Models That Learn Without Ground Truth

Date & Time:

March 6, 2019 1:00 pm – 2:00 pm

Location:

Harper Center 219, 5807 S. Woodlawn Ave., Chicago, IL

Series:

Joint University of Chicago and Toyota Technological Institute at Chicago Machine Learning Seminar Series

Audio Source Separation Models that Learn Without Ground Truth and are Open to User Correction

Separating an audio scene into isolated sources is a fundamental problem in computer audition, analogous to image segmentation in visual scene analysis. It is an enabling technology for many tasks, such as automatic speech recognition, labeling sound objects in an acoustic scene, music transcription, and remixing of existing recordings. Source separation systems based on deep learning are currently the most successful approaches for solving the underdetermined separation problem, where there are more sound sources (e.g., instruments in a band) than channels (e.g., the two channels of a stereo recording).

Currently, deep learning systems that perform source separation are trained on many mixtures (e.g., tens of thousands) for which the ground-truth decompositions are already known. Since most real-world recordings have no such decomposition available, developers train systems on artificial mixtures created from isolated individual recordings. Although there are large databases of isolated speech, it is impractical to find or build large databases of isolated recordings for every arbitrary sound. This fundamentally limits the range of sounds that deep models can learn to separate. Moreover, once learned, a deep model’s output is take-it-or-leave-it: it can be difficult for the end user either to affect the current output or to give corrective feedback for the future.

In this talk, Prof. Pardo discusses recent work in two areas. The first is bootstrapping the learning of a scene segmentation model using an acoustic cue known to be used in human audition. This allows a model to be learned without access to ground-truth decompositions of acoustic scenes. The second is ongoing work to provide an interface through which an end user can interact with a deep model, both to affect the current separation and to improve future separation by retraining the model from corrective feedback.

Bryan Pardo

Associate Professor, Northwestern University

Bryan Pardo is an associate professor in the Northwestern University Department of Electrical Engineering and Computer Science. Prof. Pardo received an M.Mus. in Jazz Studies in 2001 and a Ph.D. in Computer Science in 2005, both from the University of Michigan. He has authored over 100 peer-reviewed publications. He has developed speech analysis software for the Speech and Hearing department of the Ohio State University and statistical software for SPSS, and has worked as a machine learning researcher for General Dynamics. While finishing his doctorate, he taught in the Music Department of Madonna University.
