Abstract
Research on high-throughput identification of patients with a specific phenotype is still in its infancy. There is an urgent need for a general, automatic approach to patient identification.
OBJECTIVE: Using Mayo Clinic electronic clinical notes, we proposed a novel method that combines natural language processing (NLP), machine learning, and ontology for automatic patient identification. We also investigated the benefits of incorporating existing SNOMED semantic knowledge into the patient identification task.
METHODS: A support vector machine (SVM) algorithm was applied to SNOMED concept units extracted from the clinical notes of type 2 diabetes mellitus (T2DM) cases and controls. Precision, recall, and F-score were calculated to evaluate performance.
RESULTS: This approach achieved an F-score above 0.950 for both groups (cases and controls) when all identified concept units were used as features. Concept units of the semantic type Disease or Syndrome contained the most important information for patient identification. Our results also implied that coarse-level concepts contain enough information to classify T2DM cases and controls.
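The evaluation metrics named in the abstract can be illustrated with a minimal sketch. It computes precision, recall, and F-score separately for each group, treating that group as the positive class. The function name `precision_recall_f1` and the toy labels are assumptions made for illustration only; they are not the study's code or data, and the numbers do not reproduce the reported results.

```python
from typing import Sequence, Tuple


def precision_recall_f1(
    y_true: Sequence[str], y_pred: Sequence[str], positive: str
) -> Tuple[float, float, float]:
    """Return (precision, recall, F-score) for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against empty denominators (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-score is the harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return precision, recall, f1


# Hypothetical gold labels and classifier outputs (illustrative only).
y_true = ["case", "case", "case", "case", "control", "control"]
y_pred = ["case", "case", "case", "control", "control", "case"]

case_scores = precision_recall_f1(y_true, y_pred, positive="case")
control_scores = precision_recall_f1(y_true, y_pred, positive="control")
print("case:", case_scores)
print("control:", control_scores)
```

Reporting the metrics per group, as the abstract does, shows whether a classifier performs well on cases and controls alike rather than on only the larger class.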
Original language | English (US) |
---|---|
Pages (from-to) | 857-861 |
Number of pages | 5 |
Journal | AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium |
Volume | 2010 |
State | Published - 2010 |
Externally published | Yes |
ASJC Scopus subject areas
- General Medicine