TY - GEN
T1 - Challenges of using text classifiers for causal inference
AU - Wood-Doughty, Zach
AU - Shpitser, Ilya
AU - Dredze, Mark
N1 - Funding Information:
This work was in part supported by the National Institute of General Medical Sciences under grant number 5R01GM114771 and by the National Institute of Allergy and Infectious Diseases under grant number R01 AI127271-01A1. We thank the anonymous reviewers for their helpful comments.
Publisher Copyright:
© 2018 Association for Computational Linguistics
PY - 2018
Y1 - 2018
N2 - Causal understanding is essential for many kinds of decision-making, but causal inference from observational data has typically only been applied to structured, low-dimensional datasets. While text classifiers produce low-dimensional outputs, their use in causal inference has not previously been studied. To facilitate causal analyses based on language data, we consider the role that text classifiers can play in causal inference through established modeling mechanisms from the causality literature on missing data and measurement error. We demonstrate how to conduct causal analyses using text classifiers on simulated and Yelp data, and discuss the opportunities and challenges of future work that uses text data in causal inference.
AB - Causal understanding is essential for many kinds of decision-making, but causal inference from observational data has typically only been applied to structured, low-dimensional datasets. While text classifiers produce low-dimensional outputs, their use in causal inference has not previously been studied. To facilitate causal analyses based on language data, we consider the role that text classifiers can play in causal inference through established modeling mechanisms from the causality literature on missing data and measurement error. We demonstrate how to conduct causal analyses using text classifiers on simulated and Yelp data, and discuss the opportunities and challenges of future work that uses text data in causal inference.
UR - http://www.scopus.com/inward/record.url?scp=85081737795&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081737795&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85081737795
T3 - Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
SP - 4586
EP - 4598
BT - Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
A2 - Riloff, Ellen
A2 - Chiang, David
A2 - Hockenmaier, Julia
A2 - Tsujii, Jun'ichi
PB - Association for Computational Linguistics
T2 - 2018 Conference on Empirical Methods in Natural Language Processing, EMNLP 2018
Y2 - 31 October 2018 through 4 November 2018
ER -