TY - GEN
T1 - Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT
T2 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, EMNLP-IJCNLP 2019
AU - Wu, Shijie
AU - Dredze, Mark
N1 - Publisher Copyright:
© 2019 Association for Computational Linguistics
PY - 2019
Y1 - 2019
AB - Pretrained contextual representation models (Peters et al., 2018; Devlin et al., 2019) have pushed forward the state-of-the-art on many NLP tasks. A new release of BERT (Devlin, 2018) includes a model simultaneously pretrained on 104 languages with impressive performance for zero-shot cross-lingual transfer on a natural language inference task. This paper explores the broader cross-lingual potential of mBERT (multilingual) as a zero-shot language transfer model on 5 NLP tasks covering a total of 39 languages from various language families: NLI, document classification, NER, POS tagging, and dependency parsing. We compare mBERT with the best-published methods for zero-shot cross-lingual transfer and find mBERT competitive on each task. Additionally, we investigate the most effective strategy for utilizing mBERT in this manner, determine to what extent mBERT generalizes away from language-specific features, and measure factors that influence cross-lingual transfer.
UR - http://www.scopus.com/inward/record.url?scp=85081201019&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081201019&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85081201019
T3 - EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference
SP - 833
EP - 844
BT - EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference
PB - Association for Computational Linguistics
Y2 - 3 November 2019 through 7 November 2019
ER -