TY - GEN
T1 - Using Author Embeddings to Improve Tweet Stance Classification
AU - Benton, Adrian
AU - Dredze, Mark
N1 - Publisher Copyright: © 2018 Association for Computational Linguistics.
PY - 2018
Y1 - 2018
AB - Many social media classification tasks analyze the content of a message, but do not consider the context of the message. For example, in tweet stance classification – where a tweet is categorized according to a viewpoint it espouses – the expressed viewpoint depends on latent beliefs held by the user. In this paper we investigate whether incorporating knowledge about the author can improve tweet stance classification. Furthermore, since author information and embeddings are often unavailable for labeled training examples, we propose a semi-supervised pre-training method to predict user embeddings. Although the neural stance classifiers we learn are often outperformed by a baseline SVM, author embedding pre-training yields improvements over a non-pre-trained neural network on four out of five domains in the SemEval 2016 6A tweet stance classification task. In a tweet gun control stance classification dataset, improvements from pre-training are only apparent when training data is limited.
UR - http://www.scopus.com/inward/record.url?scp=85064229028&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85064229028&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85064229028
T3 - 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop
SP - 184
EP - 194
BT - 4th Workshop on Noisy User-Generated Text, W-NUT 2018 - Proceedings of the Workshop
PB - Association for Computational Linguistics (ACL)
T2 - 4th Workshop on Noisy User-Generated Text, W-NUT 2018
Y2 - 1 November 2018
ER -