TY - JOUR
T1 - Reproducibility of the diagnosis of dysplasia in Barrett esophagus
T2 - A reaffirmation
AU - Montgomery, Elizabeth
AU - Bronner, Mary P.
AU - Goldblum, John R.
AU - Greenson, Joel K.
AU - Haber, Marian M.
AU - Hart, John
AU - Lamps, Laura W.
AU - Lauwers, Gregory Y.
AU - Lazenby, Audrey J.
AU - Lewin, David N.
AU - Robert, Marie E.
AU - Toledano, Alicia Y.
AU - Shyr, Yu
AU - Washington, Kay
N1 - Copyright 2017 Elsevier B.V., All rights reserved.
PY - 2001
Y1 - 2001
AB - Morphologic assessment of dysplasia in Barrett esophagus, despite limitations, remains the basis of treatment. We rigorously tested modified 1988 criteria, assessing intraobserver and interobserver reproducibility. Participants submitted slides of Barrett mucosa negative (BE) and indefinite (IND) for dysplasia, with low-grade dysplasia (LGD) and high-grade dysplasia (HGD), and with carcinoma. Two hundred fifty slides were divided into 2 groups. The first 125 slides were reviewed, without knowledge of the prior diagnoses, on 2 occasions by 12 gastrointestinal pathologists without prior discussion of criteria. Results were analyzed by κ statistics, which correct for agreement by chance. A consensus meeting was then held, establishing, by group review of the index 125 slides, the criteria outlined herein. The second 125-slide set was then reviewed twice by each of the same 12 pathologists, and follow-up κ statistics were calculated. When statistical analysis was performed using 2 broad diagnostic categories (BE, IND, and LGD vs HGD and carcinoma), intraobserver agreement was near perfect both before and after the consensus meeting (mean κ = 0.82 and 0.80). Interobserver agreement was substantial (κ = 0.66) and improved after the consensus meeting (κ = 0.70; P = .02). When statistical analysis was performed using 4 clinically relevant separations (BE; IND and LGD; HGD; carcinoma), mean intraobserver κ improved from 0.64 to 0.68 (both substantial) after the consensus meeting, and mean interobserver κ improved from 0.43 to 0.46 (both moderate agreement). When statistical analysis was performed using 4 diagnostic categories that required distinction between LGD and IND (BE; IND; LGD; HGD and carcinoma), the pre-consensus meeting mean intraobserver κ was 0.60 (substantial agreement), improving to 0.65 after the meeting (P < .05). Interobserver agreement was poorer, with premeeting and postmeeting mean values unchanged (κ = 0.43 at both times). Interobserver agreement was substantial for HGD/carcinoma (κ = 0.65), moderate to substantial for BE (κ = 0.58), fair for LGD (κ = 0.32), and slight for IND (κ = 0.15). The intraobserver reproducibility for the diagnosis of dysplasia in BE was substantial. Interobserver reproducibility was substantial at the ends of the spectrum (BE and HGD/carcinoma) but slight for IND. Both intraobserver and interobserver variation improved overall after the application of a modified grading system developed at a consensus conference but not in separation of BE, IND, and LGD. The criteria used by the group are presented.
KW - Barrett esophagus
KW - Dysplasia
KW - Interobserver
KW - Variability
UR - http://www.scopus.com/inward/record.url?scp=0035028767&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0035028767&partnerID=8YFLogxK
U2 - 10.1053/hupa.2001.23510
DO - 10.1053/hupa.2001.23510
M3 - Article
C2 - 11331953
AN - SCOPUS:0035028767
SN - 0046-8177
VL - 32
SP - 368
EP - 378
JO - Human Pathology
JF - Human Pathology
IS - 4
ER -