Page 83 - 18F-FDG PET as biomarker in aggressive lymphoma; technical and clinical validation

                                PET/CT interobserver agreement in DLBCL
were anonymized and uploaded to a database server hosted by Keosys (Imagys), allowing reviewers to read the images in their own workspaces. Seven percent of the PET scans performed in the HOVON84 trial were done with dedicated PET scanners, but this analysis of interobserver agreement was limited to PET/CT examinations. Reviewers were experienced nuclear medicine physicians (>5 y of experience with response evaluation of lymphoma in academic or large peripheral hospitals), actively participating in the HOVON Imaging Working Group. They were masked to clinical follow-up and randomization arm. Reviewers had access to all baseline imaging data (electronic case records containing clinical and imaging staging information provided by local clinicians and image reviewers). For the trial, discrepancies between the 2 reviewers were adjudicated by a third reviewer.
Reviewers used an electronic case record with prespecified nodal localizations (specifying regions as Waldeyer’s ring, cervical, supraclavicular, axillary, mediastinum, hilar, paraaortic, mesenteric, spleen, iliac, inguinal, and other) and extranodal locations (gastrointestinal, central nervous system, skin, liver, lung, pleural, skeletal, and other). Open text fields were available for explanation of difficulties in reading. Reviewers assigned a DS for individual nodal and extranodal localizations together with a final patient-based score (highest lesional DS). We analyzed the DS of I-PET and EoT-PET as ordinal as well as dichotomized scores (DS 1–3 considered negative, DS 4–5 positive) [2].
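The scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (patient-based score = highest lesional DS; DS 1–3 negative, DS 4–5 positive), not the electronic case record software used in the trial. The function names, the dictionary layout, and the example scores are hypothetical.

```python
from typing import Dict

# Threshold taken from the text: DS 1-3 is negative, DS 4-5 is positive.
POSITIVE_THRESHOLD = 4


def patient_score(lesion_scores: Dict[str, int]) -> int:
    """Return the patient-based DS: the highest DS over all scored localizations."""
    if not lesion_scores:
        raise ValueError("at least one localization must be scored")
    for site, ds in lesion_scores.items():
        if ds not in (1, 2, 3, 4, 5):
            raise ValueError(f"invalid DS {ds!r} for localization {site!r}")
    return max(lesion_scores.values())


def is_positive(ds: int) -> bool:
    """Dichotomize an ordinal DS into positive (True) or negative (False)."""
    return ds >= POSITIVE_THRESHOLD


# Hypothetical scores from two reviewers for the same scan.
reviewer_a = {"mediastinum": 4, "cervical": 2, "skeletal": 3}
reviewer_b = {"mediastinum": 3, "cervical": 2}

score_a = patient_score(reviewer_a)  # highest lesional DS for reviewer A
score_b = patient_score(reviewer_b)  # highest lesional DS for reviewer B
discrepant = is_positive(score_a) != is_positive(score_b)
```

In this made-up case the reviewers differ by only one point on the ordinal scale, but the dichotomized scores disagree because the difference straddles the DS 3/4 boundary.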
Statistical Analysis
We performed patient- and region-based analyses. Besides the percentage overall agreement (OA), we calculated the percentage specific agreement, separating positive agreement (PA) from negative agreement (NA). PA and NA were defined as the probability that, if one reviewer assigns a positive or negative score, respectively, a second reviewer scores positive or negative as well [18]. The prevalence of positive scans was calculated as the number of scans that both reviewers scored positive plus half the number of scans with discrepant scores, divided by the total number of scans. We analyzed the following potential sources of observer variation: I-PET and EoT-PET; availability of a baseline PET, PET/CT, or CT scan for reference; and residual 18F-FDG uptake in different nodal and extranodal localizations. Discrepancies in these specific sites were related to baseline lymphoma prevalence, to assess which localizations were most difficult to read. In addition, we checked the assumption that there was no difference in