Page 108 - 18F-FDG PET as biomarker in aggressive lymphoma; technical and clinical validation
Chapter 5
To determine ease of use for both workflows, each observer noted the total analysis time per patient (including loading of the scan, performing the analysis, and saving results).
In addition, the success of all semi-automatically generated VOIs was rated by each observer according to the following definitions:
• Failed: generated VOI is unrealistic or does not contain complete lesion
• Poor: generated VOI takes into account physiological uptake or contains a lot of background and manual modification is needed
• Acceptable: only minimal manual modification needed for good VOI
• Good: generated VOI is comparable to what you consider to be lymphoma
A mean “success rate” (all acceptable and good ratings) was calculated for each method. Finally, observers had to choose one “preferred segmentation” for the generated VOIs. The MV2 and MV3 consensus methods were rated by one experienced observer according to the same success definitions. As these MV methods were assessed afterwards, they could not be chosen as “preferred segmentation.”
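The success rate described above can be illustrated with a minimal sketch. It assumes that the rate is computed per observer as the fraction of VOIs rated acceptable or good and then averaged across observers; the text does not state the exact averaging procedure, so this is one plausible reading. The function names and the example ratings are hypothetical and are not data from the study.

```python
from statistics import mean

# Ratings counted as a success, following the definitions listed above.
SUCCESSFUL = {"acceptable", "good"}


def success_rate(ratings):
    """Return the fraction of ratings that are 'acceptable' or 'good'."""
    if not ratings:
        raise ValueError("no ratings supplied")
    return sum(r in SUCCESSFUL for r in ratings) / len(ratings)


def mean_success_rate(ratings_by_observer):
    """Average the per-observer success rates for one segmentation method.

    Assumption: the mean is taken across observers; the source does not
    specify this.
    """
    return mean(success_rate(r) for r in ratings_by_observer.values())


# Hypothetical ratings for one method (illustration only, not study data).
example = {
    "observer_1": ["good", "acceptable", "poor", "failed"],
    "observer_2": ["good", "good", "acceptable", "poor"],
}
example_rate = mean_success_rate(example)
```

Under this reading, "failed" and "poor" ratings both count as unsuccessful, so a method is penalized equally for either outcome.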
Image Analysis Workflow C
The observers used the same fully automated method as in Workflow B to analyze the same twelve scans (Workflow C1). These analyses were performed 3 months later to minimize recall bias. In addition to the interactive deletion of physiological uptake regions, as in Workflow B, the observers were allowed in Workflow C to manually modify the generated VOIs by adding missed lesions (with the A50%P option or manually) and removing physiological uptake with an “eraser” tool. The manually modified MTVs and TLGs were checked for correct delineation and identification of tumor sites (and changed if needed) by independent nuclear medicine physicians (NM, one per observer) with more than 10 years of experience with [18F]FDG PET/CT evaluation in lymphoma (OSH, SFB, SM; Workflow C2).
Statistical Analysis
Success rates of generated VOIs were analyzed descriptively. Interobserver reliability was expressed as intraclass correlation coefficients (ICCs) and coefficients of variation (CoVs). ICC estimates and their 95 % confidence