The Problem of Semantic Shift in Longitudinal Monitoring of Social Media: A Case Study on Mental Health During the COVID-19 Pandemic

Keith Harrigian, Mark Dredze

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Social media allows researchers to track societal and cultural changes over time based on language analysis tools. Many of these tools rely on statistical algorithms which need to be tuned to specific types of language. Recent studies have shown that the absence of appropriate tuning, specifically in the presence of semantic shift, can hinder the robustness of the underlying methods. However, little is known about the practical effect this sensitivity may have on downstream longitudinal analyses. We explore this gap in the literature through a timely case study: understanding shifts in depression during the course of the COVID-19 pandemic. We find that inclusion of only a small number of semantically unstable features can promote significant changes in longitudinal estimates of our target outcome. At the same time, we demonstrate that a recently introduced method for measuring semantic shift may be used to proactively identify failure points of language-based models and, in turn, improve predictive generalization.

Original language: English (US)
Title of host publication: WebSci 2022 - Proceedings of the 14th ACM Web Science Conference
Publisher: Association for Computing Machinery
Pages: 208-218
Number of pages: 11
ISBN (Electronic): 9781450391917
State: Published - Jun 26 2022
Event: 14th ACM Web Science Conference, WebSci 2022 - Virtual, Online, Spain
Duration: Jun 26 2022 to Jun 29 2022

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 14th ACM Web Science Conference, WebSci 2022
Country/Territory: Spain
City: Virtual, Online
Period: 6/26/22 to 6/29/22

Keywords

  • longitudinal monitoring
  • mental health
  • semantic shift

ASJC Scopus subject areas

  • Software
  • Human-Computer Interaction
  • Computer Vision and Pattern Recognition
  • Computer Networks and Communications
