Sampling strategies in a statistical approach to clinical classification.

Y. Yang, C. G. Chute

Research output: Contribution to journal, Article, peer-review

2 Scopus citations


This paper studies sampling strategies for the Expert Network (ExpNet), a statistical learning system used for patient record classification at the Mayo Clinic. The goal is to achieve high-accuracy classification at an affordable computational cost in very large applications. The learning curves of ExpNet were observed with respect to the choice of training resources; the size, vocabulary coverage, and category coverage of a training set; and the category distribution over training instances. A method combining the advantages of different sampling strategies is proposed and evaluated using a large training corpus. As a result, ExpNet achieved nearly optimal classification accuracy (measured by average precision) using a relatively small training set, with a fast real-time response that satisfies the needs of human-machine interaction.

Original language: English (US)
Pages (from-to): 32-36
Number of pages: 5
Journal: Proceedings / the ... Annual Symposium on Computer Application [sic] in Medical Care. Symposium on Computer Applications in Medical Care
State: Published - 1995
Externally published: Yes

