Natural Language Processing Approaches for Retrieval of Clinically Relevant Genomic Information in Cancer

Taxiarchis Botsis, Joseph Murray, Alessandro Leal, Doreen Palsgrove, Wei Wang, James R. White, Victor E. Velculescu, Valsamo Anagnostou

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution


The accelerating impact of genomic data on clinical decision-making has generated a paradigm shift from treatment based on the anatomic origin of the tumor to the incorporation of key genomic features to guide therapy. Assessing the clinical validity and utility of the genomic background of a patient's cancer is one of the emerging challenges in oncology practice, and it demands automated platforms for extracting clinically relevant genomic information from medical texts. We developed PubMiner, a natural language processing tool to extract and interpret cancer type, therapy, and genomic information from biomedical abstracts. Our initial focus has been the retrieval of gene names, variants, and negations, where PubMiner achieved a high overall recall (91.7%) with a precision of 79.7%. Our next steps include developing a web-based interface to promote personalized treatment based on each tumor's unique genomic fingerprints.
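For readers less familiar with the two metrics reported above, the following is a minimal sketch of how recall and precision are commonly computed for an extraction task. It is not PubMiner's evaluation code: the function name `precision_recall` and the toy gene-variant pairs are hypothetical and chosen purely for illustration.

```python
from typing import Set, Tuple

Mention = Tuple[str, str]  # (gene, variant); a hypothetical representation


def precision_recall(predicted: Set[Mention], gold: Set[Mention]) -> Tuple[float, float]:
    """Return (precision, recall) for predicted mentions against gold annotations."""
    true_positives = len(predicted & gold)
    # Precision: fraction of extracted mentions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of annotated mentions that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Toy data, invented for illustration only (not from the study).
gold = {("EGFR", "L858R"), ("KRAS", "G12C"), ("BRAF", "V600E")}
predicted = {("EGFR", "L858R"), ("KRAS", "G12C"), ("BRAF", "V600E"), ("TP53", "R175H")}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

In this toy case every annotated mention is found but one extra mention is extracted, so recall exceeds precision, the same direction as the figures reported in the abstract.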

Original language: English (US)
Title of host publication: Advances in Informatics, Management and Technology in Healthcare
Editors: John Mantas, Parisis Gallos, Emmanouil Zoulias, Arie Hasman, Mowafa S. Househ, Marianna Diomidous, Joseph Liaskos, Martha Charalampidou
Publisher: IOS Press BV
Number of pages: 4
ISBN (Electronic): 9781643682907
State: Published - 2022

Publication series

Name: Studies in Health Technology and Informatics
ISSN (Print): 0926-9630
ISSN (Electronic): 1879-8365


Keywords

  • Natural language processing
  • actionable genomic alterations
  • cancer

ASJC Scopus subject areas

  • Health Information Management
  • Health Informatics
  • Biomedical Engineering

