Using Large Language Models to Support Content Analysis: A Case Study of ChatGPT for Adverse Event Detection

Eric C. Leas, John W. Ayers, Nimit Desai, Mark Dredze, Michael Hogarth, Davey M. Smith

Research output: Contribution to journal › Article › peer-review

Abstract

This study explores the potential of using large language models to assist content analysis by conducting a case study to identify adverse events (AEs) in social media posts. The case study compares ChatGPT’s performance with that of human annotators in detecting AEs associated with delta-8-tetrahydrocannabinol, a cannabis-derived product. Using the identical instructions given to human annotators, ChatGPT closely approximated human results, with a high degree of agreement noted: 94.4% (9436/10,000) for any AE detection (Fleiss κ=0.95) and 99.3% (9931/10,000) for serious AEs (κ=0.96). These findings suggest that ChatGPT has the potential to replicate human annotation accurately and efficiently. The study recognizes possible limitations, including concerns about generalizability due to ChatGPT’s training data, and prompts further research with different models, data sources, and content analysis tasks. The study highlights the promise of large language models for enhancing the efficiency of biomedical research.
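The workflow the abstract describes, giving a large language model the same written instructions as the human annotators and then measuring agreement on the resulting labels, can be prototyped in a few lines. The sketch below is a hypothetical illustration rather than the study's code: it assumes the OpenAI Python SDK and statsmodels, and the prompt wording, model name, and helper function names are placeholders introduced for this example.

```python
# Hypothetical sketch: label posts with an LLM using annotator-style
# instructions, then compare the labels with a human annotator's.
# Prompt text, model name, and label scheme are illustrative assumptions.
from openai import OpenAI
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ANNOTATION_INSTRUCTIONS = (
    "You are annotating social media posts about delta-8-THC. "
    "Answer YES if the post describes an adverse event experienced by the "
    "author or someone they know; otherwise answer NO."
)

def classify_post(post_text: str) -> int:
    """Return 1 if the model labels the post as describing an adverse event."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed model identifier for this sketch
        messages=[
            {"role": "system", "content": ANNOTATION_INSTRUCTIONS},
            {"role": "user", "content": post_text},
        ],
        temperature=0,
    )
    answer = response.choices[0].message.content.strip().upper()
    return 1 if answer.startswith("YES") else 0

def percent_agreement(labels_a: list[int], labels_b: list[int]) -> float:
    """Share of posts on which two raters assign the same label."""
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)

def kappa(labels_a: list[int], labels_b: list[int]) -> float:
    """Fleiss kappa over two raters' binary labels."""
    table, _ = aggregate_raters(list(zip(labels_a, labels_b)))
    return fleiss_kappa(table)
```

Running classify_post over a sample of posts and comparing the output with human labels via percent_agreement and kappa yields the kind of agreement statistics reported above; the study itself may have computed these differently.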

Original language: English (US)
Article number: e52499
Journal: Journal of Medical Internet Research
Volume: 26
DOIs
State: Published - 2024
Externally published: Yes

Keywords

  • AI
  • ChatGPT
  • LLM
  • adverse events
  • annotation
  • artificial intelligence
  • cannabis
  • delta-8-THC
  • delta-8-tetrahydrocannabinol
  • large language model
  • text analysis

ASJC Scopus subject areas

  • Health Informatics
