Icelandic data driven part of speech tagging

Mark Dredze, Joel Wallenberg

Research output: Contribution to journal › Conference article › peer-review

Abstract

Data-driven POS tagging has achieved good performance for English, but can still lag behind linguistic rule-based taggers for morphologically complex languages, such as Icelandic. We extend a statistical tagger to handle fine-grained tagsets and improve over the best Icelandic POS tagger. Additionally, we develop a case tagger for non-local case and gender decisions. An error analysis of our system suggests future directions.

Original language: English (US)
Pages (from-to): 33-36
Number of pages: 4
Journal: Proceedings of the Annual Meeting of the Association for Computational Linguistics
State: Published - 2008
Externally published: Yes
Event: 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL 2008 - Columbus, United States
Duration: Jun 16 2008 to Jun 17 2008

ASJC Scopus subject areas

  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics
