Icelandic data driven part of speech tagging

Mark Dredze, Joel Wallenberg

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Data-driven POS tagging has achieved good performance for English, but can still lag behind linguistic rule-based taggers for morphologically complex languages, such as Icelandic. We extend a statistical tagger to handle fine-grained tagsets and improve over the best Icelandic POS tagger. Additionally, we develop a case tagger for non-local case and gender decisions. An error analysis of our system suggests future directions.

Original language: English (US)
Title of host publication: ACL-08
Subtitle of host publication: HLT - 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference
Publisher: Association for Computational Linguistics (ACL)
Pages: 33-36
Number of pages: 4
ISBN (Print): 9781932432046
State: Published - 2008
Externally published: Yes
Event: 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-08: HLT - Columbus, OH, United States
Duration: Jun 15 2008 to Jun 20 2008

Publication series

Name: ACL-08: HLT - 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Proceedings of the Conference

Conference

Conference: 46th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, ACL-08: HLT
Country/Territory: United States
City: Columbus, OH
Period: 6/15/08 to 6/20/08

ASJC Scopus subject areas

  • Language and Linguistics
  • Computer Networks and Communications
  • Linguistics and Language
