A large-scale quantitative analysis of latent factors and sentiment in online doctor reviews

Byron C. Wallace; Michael J. Paul; Urmimala Sarkar; Thomas A. Trikalinos; Mark Dredze

doi:10.1136/amiajnl-2014-002711

Research output: Contribution to journal › Article › peer-review

Abstract

Online physician reviews are a massive and potentially rich source of information capturing patient sentiment regarding healthcare. We analyze a corpus comprising nearly 60 000 such reviews with a state-of-the-art probabilistic model of text. We describe a probabilistic generative model that captures latent sentiment across aspects of care (eg, interpersonal manner). We target specific aspects by leveraging a small set of manually annotated reviews. We perform regression analysis to assess whether model output improves correlation with state-level measures of healthcare. We report both qualitative and quantitative results. Model output correlates with state-level measures of quality healthcare, including patient likelihood of visiting their primary care physician within 14 days of discharge (p=0.03), and using the proposed model better predicts this outcome (p=0.10). We find similar results for healthcare expenditure. Generative models of text can recover important information from online physician reviews, facilitating large-scale analyses of such reviews.

Original language	English (US)
Pages (from-to)	1098-1103
Number of pages	6
Journal	Journal of the American Medical Informatics Association
Volume	21
Issue number	6
DOIs	https://doi.org/10.1136/amiajnl-2014-002711
State	Published - Jun 10 2014
Externally published	Yes

ASJC Scopus subject areas

Health Informatics

Cite this

@article{76a73c0af1f6403ab506ee1566c8f16c,
  title = "A large-scale quantitative analysis of latent factors and sentiment in online doctor reviews",
  abstract = "Online physician reviews are a massive and potentially rich source of information capturing patient sentiment regarding healthcare. We analyze a corpus comprising nearly 60 000 such reviews with a state-of-the-art probabilistic model of text. We describe a probabilistic generative model that captures latent sentiment across aspects of care (eg, interpersonal manner). We target specific aspects by leveraging a small set of manually annotated reviews. We perform regression analysis to assess whether model output improves correlation with state-level measures of healthcare. We report both qualitative and quantitative results. Model output correlates with state-level measures of quality healthcare, including patient likelihood of visiting their primary care physician within 14 days of discharge (p=0.03), and using the proposed model better predicts this outcome (p=0.10). We find similar results for healthcare expenditure. Generative models of text can recover important information from online physician reviews, facilitating large-scale analyses of such reviews.",
  author = "Wallace, {Byron C.} and Paul, {Michael J.} and Sarkar, Urmimala and Trikalinos, {Thomas A.} and Dredze, Mark",
  year = "2014",
  month = jun,
  day = "10",
  doi = "10.1136/amiajnl-2014-002711",
  language = "English (US)",
  volume = "21",
  pages = "1098--1103",
  journal = "Journal of the American Medical Informatics Association",
  issn = "1067-5027",
  publisher = "Oxford University Press",
  number = "6",
}

TY - JOUR

T1 - A large-scale quantitative analysis of latent factors and sentiment in online doctor reviews

AU - Wallace, Byron C.

AU - Paul, Michael J.

AU - Sarkar, Urmimala

AU - Trikalinos, Thomas A.

AU - Dredze, Mark

PY - 2014/6/10

Y1 - 2014/6/10

N2 - Online physician reviews are a massive and potentially rich source of information capturing patient sentiment regarding healthcare. We analyze a corpus comprising nearly 60 000 such reviews with a state-of-the-art probabilistic model of text. We describe a probabilistic generative model that captures latent sentiment across aspects of care (eg, interpersonal manner). We target specific aspects by leveraging a small set of manually annotated reviews. We perform regression analysis to assess whether model output improves correlation with state-level measures of healthcare. We report both qualitative and quantitative results. Model output correlates with state-level measures of quality healthcare, including patient likelihood of visiting their primary care physician within 14 days of discharge (p=0.03), and using the proposed model better predicts this outcome (p=0.10). We find similar results for healthcare expenditure. Generative models of text can recover important information from online physician reviews, facilitating large-scale analyses of such reviews.

AB - Online physician reviews are a massive and potentially rich source of information capturing patient sentiment regarding healthcare. We analyze a corpus comprising nearly 60 000 such reviews with a state-of-the-art probabilistic model of text. We describe a probabilistic generative model that captures latent sentiment across aspects of care (eg, interpersonal manner). We target specific aspects by leveraging a small set of manually annotated reviews. We perform regression analysis to assess whether model output improves correlation with state-level measures of healthcare. We report both qualitative and quantitative results. Model output correlates with state-level measures of quality healthcare, including patient likelihood of visiting their primary care physician within 14 days of discharge (p=0.03), and using the proposed model better predicts this outcome (p=0.10). We find similar results for healthcare expenditure. Generative models of text can recover important information from online physician reviews, facilitating large-scale analyses of such reviews.

UR - http://www.scopus.com/inward/record.url?scp=84901906820&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84901906820&partnerID=8YFLogxK

U2 - 10.1136/amiajnl-2014-002711

DO - 10.1136/amiajnl-2014-002711

M3 - Article

C2 - 24918109

AN - SCOPUS:84901906820

SN - 1067-5027

VL - 21

SP - 1098

EP - 1103

JO - Journal of the American Medical Informatics Association

JF - Journal of the American Medical Informatics Association

IS - 6

ER -
