PAL, a tool for pre-annotation and active learning

Research output: Contribution to journal › Article

Abstract

Many natural language processing systems rely on machine learning models that are trained on large amounts of manually annotated text data. The lack of sufficient amounts of annotated data is, however, a common obstacle for such systems, since manual annotation of text is often expensive and time-consuming. The aim of “PAL, a tool for Pre-annotation and Active Learning” is to provide a ready-made package that can be used to simplify annotation and to reduce the amount of annotated data required to train a machine learning classifier. The package provides support for two techniques that have been shown to be successful in previous studies, namely active learning and pre-annotation. The output of the pre-annotation is provided in the annotation format of the annotation tool BRAT, but PAL is a stand-alone package that can be adapted to other formats.

Details

Authors
Organisations
External organisations
  • Gagavai AB
  • Linnaeus University
Research areas and keywords

Subject classification (UKÄ)

  • Language Technology (Computational Linguistics)
Original language: English
Pages (from-to): 91-110
Number of pages: 19
Journal: Journal for Language Technology and Computational Linguistics
Volume: 31
Issue number: 1
Publication status: Published - 2017
Publication category: Research
Peer-reviewed: Yes

Related projects

Carita Paradis, Andreas Kerren, Magnus Sahlgren, Kostiantyn Kucher, Maria Skeppstedt & Vasiliki Simaki

Swedish Research Council

2013/01/01 to 2018/01/02

Project: Research
