MS Presentation: Zhi Hong – Department of Computer Science

Date & Time:

November 8, 2019 1:30 pm – 2:30 pm

Location:

Crerar 298, 5730 S. Ellis Ave., Chicago, IL

Enabling Generalizable Scientific Named Entity
Recognition

Over the past decades, we have witnessed explosive growth in computer
hardware capabilities. Machine learning and deep learning models,
whose theoretical foundations were established long ago, are finally
computationally feasible. The effects are not limited to computer
science. In fact, more and more disciplines are
turning into “data sciences”, with cheaper, safer, easier data-based
simulations providing insights and guidance for traditional
experiments. These data-based methods require large amounts of
data, especially structured data that can be easily understood and
processed by computers. Yet scientists have relied on written papers,
not digital databases, to disseminate their discoveries for several
centuries. Scientific papers are intended to be read by humans, and
most adequately convey not only discoveries, but the conditions and
methods by which those discoveries were made. Unfortunately, the
ambiguity and variability inherent in natural language make the
automated extraction of claims from scientific papers very difficult.
Even apparently simple tasks, such as isolating reported values for
physical quantities (e.g., “the melting point of X is Y”), can be
complicated by such factors as domain-specific conventions about how
named entities (the X in the example) are referenced. Although there
are domain-specific toolkits that can handle such complications in
certain areas, a generalizable, adaptable model for scientific texts
is still lacking. In this thesis, we present our first step towards
automating this process. We have designed, implemented, and
evaluated models based on classifiers and neural networks for
recognizing scientific entities in free text in multiple domains.
Experiments show that our neural network model outperforms a leading
domain-specific extraction toolkit by up to 50%, as measured by F1
score, while also being easily adapted to new domains.
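
For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed for entity recognition. It assumes exact-match scoring over (start, end, label) spans, which is one common convention; the abstract does not state which matching criterion the thesis uses. The function name `entity_f1`, the labels, and the spans are all hypothetical and chosen only for illustration.

```python
def entity_f1(predicted, gold):
    """Return (precision, recall, F1) for two collections of entity spans.

    Assumption: an entity counts as correct only if its (start, end, label)
    tuple matches a gold entity exactly. Other matching rules exist.
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical spans: token offsets plus an illustrative label.
gold = {(0, 2, "ENTITY"), (5, 6, "VALUE"), (9, 11, "ENTITY")}
predicted = {(0, 2, "ENTITY"), (5, 6, "VALUE"), (12, 13, "ENTITY")}

precision, recall, f1 = entity_f1(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 combines precision and recall, a model cannot score well by predicting very few entities (high precision, low recall) or by predicting almost everything (high recall, low precision).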

Zhi Hong

M.S. Candidate, University of Chicago

Zhi's advisor is Prof. Ian Foster
