Hedwig: A named entity linker

Research output: Chapter in book/report/conference proceeding; Paper in conference proceeding; Peer-reviewed

Abstract

Named entity linking is the task of identifying mentions of named things in text, such as “Barack Obama” or “New York”, and linking these mentions to unique identifiers. In this paper, we describe Hedwig, an end-to-end named entity linker, which uses a combination of word and character BiLSTM models for mention detection, a Wikidata- and Wikipedia-derived knowledge base with global information aggregated over nine language editions, and a PageRank algorithm for entity linking. We evaluated Hedwig on the TAC2017 dataset, consisting of news texts and discussion forums, and we obtained a final score of 59.9% on CEAFmC+, an improvement over our previous-generation linker, Ugglan, and a trilingual entity link score of 71.9%.
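The abstract names PageRank as the entity linking step but does not give details. The following is a minimal sketch of the general idea, not Hedwig's implementation: candidate entities for each mention become graph nodes, knowledge-base links become edges, and the highest-ranked candidate per mention is selected. The `pagerank` function, the candidate labels, and the edges are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a directed graph given as (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = defaultdict(list)
    for src, dst in edges:
        out_links[src].append(dst)

    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly over all nodes.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical candidates per mention; the labels are illustrative, not real KB identifiers.
candidates = {
    "Barack Obama": ["Barack_Obama_(politician)"],
    "New York": ["New_York_City", "New_York_(magazine)"],
}

# Hypothetical knowledge-base links between the candidate entities.
edges = [
    ("Barack_Obama_(politician)", "New_York_City"),
    ("New_York_City", "Barack_Obama_(politician)"),
    ("New_York_(magazine)", "New_York_City"),
]

ranks = pagerank(edges)

# For each mention, keep the candidate with the highest PageRank score.
linked = {mention: max(cands, key=ranks.get) for mention, cands in candidates.items()}
```

In this toy graph the city candidate receives links from both other nodes, so it outranks the magazine candidate for the mention “New York”. A real linker would presumably build the graph from a much larger knowledge base and combine the ranking with other signals.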

Original language: English
Title of host publication: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings
Editors: Nicoletta Calzolari, Frederic Bechet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Helene Mazo, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association
Pages: 4501-4508
Number of pages: 8
ISBN (electronic): 9791095546344
Status: Published - 2020
Event: 12th International Conference on Language Resources and Evaluation, LREC 2020 - Marseille, France
Duration: 11 May 2020 to 16 May 2020

Publication series

Name: LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings

Conference

Conference: 12th International Conference on Language Resources and Evaluation, LREC 2020
Country/Territory: France
City: Marseille
Period: 2020/05/11 to 2020/05/16

Subject classification (UKÄ)

  • Computer Sciences
  • Language Technology (Computational Linguistics)
