Description

Word Sense Disambiguation (WSD) is a well-researched problem in Natural Language Processing, with decades of published work and new techniques appearing every year. However, it remains under-explored in certain domains, such as scientific texts. Scientific papers are a particularly interesting use case because each paper carries a wealth of associated information that can potentially improve existing disambiguation approaches, mainly its title, abstract, and place in the citation graph. A related area of study that works in this direction is Acronym Disambiguation (AD). We believe that the two problems are related and that similar techniques could perform strongly in both settings. However, no large dataset exists for WSD in scientific texts, which motivates creating one artificially, without spending an exorbitant amount of resources. We therefore turn to pseudowords as a means of creating this dataset. We demonstrate that using paper information can lead to improvements in AD and WSD, and we present a new dataset to further research in Scientific WSD.
