TY - GEN
T1 - Named entity recognition for Chinese social media with jointly trained embeddings
AU - Peng, Nanyun
AU - Dredze, Mark
N1 - Publisher Copyright: © 2015 Association for Computational Linguistics.
PY - 2015
Y1 - 2015
N2 - We consider the task of named entity recognition for Chinese social media. The long line of work in Chinese NER has focused on formal domains, and NER for social media has been largely restricted to English. We present a new corpus of Weibo messages annotated for both name and nominal mentions. Additionally, we evaluate three types of neural embeddings for representing Chinese text. Finally, we propose a joint training objective for the embeddings that makes use of both (NER) labeled and unlabeled raw text. Our methods yield a 9% improvement over a state-of-the-art baseline.
AB - We consider the task of named entity recognition for Chinese social media. The long line of work in Chinese NER has focused on formal domains, and NER for social media has been largely restricted to English. We present a new corpus of Weibo messages annotated for both name and nominal mentions. Additionally, we evaluate three types of neural embeddings for representing Chinese text. Finally, we propose a joint training objective for the embeddings that makes use of both (NER) labeled and unlabeled raw text. Our methods yield a 9% improvement over a state-of-the-art baseline.
UR - http://www.scopus.com/inward/record.url?scp=84959875172&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84959875172&partnerID=8YFLogxK
U2 - 10.18653/v1/d15-1064
DO - 10.18653/v1/d15-1064
M3 - Conference contribution
AN - SCOPUS:84959875172
T3 - Conference Proceedings - EMNLP 2015: Conference on Empirical Methods in Natural Language Processing
SP - 548
EP - 554
BT - Conference Proceedings - EMNLP 2015
PB - Association for Computational Linguistics (ACL)
T2 - Conference on Empirical Methods in Natural Language Processing, EMNLP 2015
Y2 - 17 September 2015 through 21 September 2015
ER -