Intraobserver and Interobserver Agreement of Structural and Functional Software Programs for Measuring Glaucoma Progression
Moreno-Montañés J (1), Antón V (1), Antón A (2), Larrosa JM (3), Martinez-de-la-Casa JM (4), Rebolleda G (5), Ussa F (6), García-Granero M (7).
It is important to evaluate intraobserver and interobserver agreement using visual field (VF) testing and optical coherence tomography (OCT) software in order to understand whether the use of this software is sufficient to detect glaucoma progression and to make decisions regarding its treatment.
To evaluate agreement in VF and OCT software among 5 glaucoma specialists.
DESIGN, SETTING AND PARTICIPANTS:
The printout pages from VF progression software and OCT progression software from 100 patients were randomized, and the 5 glaucoma specialists subjectively and independently evaluated them for glaucoma. Each image was classified as having no progression, questionable progression, or progression.
The principal investigator classified the patients previously as without variability (normal) or with high variability among tests (difficult). Using both software, the specialists also evaluated whether the glaucoma damage had progressed and if treatment change was needed. One month later, the same observers reevaluated the patients in a different order to determine intraobserver reproducibility.
MAIN OUTCOMES AND MEASURES:
Intraobserver and interobserver agreement was estimated using κ statistics and Gwet second-order agreement coefficient. The agreement was compared with other factors.
Of the 100 observed patients, half were male and all were white; the mean (SD) age was 69.7 (14.1) years. Intraobserver agreement was substantial to almost perfect for VF software (overall κ [95% CI], 0.59 [0.46-0.72] to 0.87 [0.79-0.96]) and similar for OCT software (overall κ [95% CI], 0.59 [0.46-0.71] to 0.85 [0.76-0.94]). Interobserver agreement among the 5 glaucoma specialists with the VF progression software was moderate (κ, 0.48; 95% CI, 0.41-0.55) and similar to OCT progression software (κ, 0.52; 95% CI, 0.44-0.59).
Interobserver agreement was substantial in images classified as having no progression but only fair in those classified as having questionable glaucoma progression or glaucoma progression. Interobserver agreement was fair regarding questions about glaucoma progression (κ, 0.39; 95% CI, 0.32-0.48) and consideration about treatment changes (κ, 0.39; 95% CI, 0.32-0.48). The factors associated with agreement were the glaucoma stage and case difficulty.
CONCLUSIONS AND RELEVANCE:
There was substantial intraobserver agreement but moderate interobserver agreement among glaucoma specialists using 2 glaucoma progression software packages. These data suggest that these glaucoma progression software packages are insufficient to obtain high interobserver agreement in both devices except in patients with no progression.
The low agreement regarding progression or treatment changes suggests that both software programs used in isolation are insufficient for decision making.