TY - JOUR
T1 - Electronic medical records for genetic research
T2 - Results of the eMERGE consortium
AU - Kho, Abel N.
AU - Pacheco, Jennifer A.
AU - Peissig, Peggy L.
AU - Rasmussen, Luke
AU - Newton, Katherine M.
AU - Weston, Noah
AU - Crane, Paul K.
AU - Pathak, Jyotishman
AU - Chute, Christopher G.
AU - Bielinski, Suzette J.
AU - Kullo, Iftikhar J.
AU - Li, Rongling
AU - Manolio, Teri A.
AU - Chisholm, Rex L.
AU - Denny, Joshua C.
PY - 2011/4/20
Y1 - 2011/4/20
AB - Clinical data in electronic medical records (EMRs) are a potential source of longitudinal clinical data for research. The Electronic Medical Records and Genomics Network (eMERGE) investigates whether data captured through routine clinical care using EMRs can identify disease phenotypes with sufficient positive and negative predictive values for use in genome-wide association studies (GWAS). Using data from five different sets of EMRs, we have identified five disease phenotypes with positive predictive values of 73 to 98% and negative predictive values of 98 to 100%. Most EMRs captured key information (diagnoses, medications, laboratory tests) used to define phenotypes in a structured format. We identified natural language processing as an important tool to improve case identification rates. Efforts and incentives to increase the implementation of interoperable EMRs will markedly improve the availability of clinical data for genomics research.
UR - http://www.scopus.com/inward/record.url?scp=79955035027&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79955035027&partnerID=8YFLogxK
U2 - 10.1126/scitranslmed.3001807
DO - 10.1126/scitranslmed.3001807
M3 - Article
C2 - 21508311
AN - SCOPUS:79955035027
SN - 1946-6234
VL - 3
JO - Science Translational Medicine
JF - Science Translational Medicine
IS - 79
M1 - 79re1
ER -