Page 133 - Quantitative Imaging of Small Tumours with Positron Emission Tomography

Chapter 6

inherent dependency of train-test iterations. Still, to be able to compare the mean AUCs of radiomics versus standard PET metrics, we used a framework developed by Van De Wiel et al. (30): in each fold, the AUCs of the two models were compared statistically using the DeLong test (31), and the median of the p-values over the different folds was reported as the final p-value. A disadvantage of this method is that each p-value is based on the test set of a single fold only, resulting in rather low power to detect true differences.

Intraclass correlation coefficients (ICC: 2-way mixed model, absolute agreement) were calculated for each radiomic feature between original and PVC images (per delineation threshold), and between delineation thresholds. ICCs were categorized as poor (ICC<0.5), moderate (0.5<ICC<0.75), good (0.75<ICC<0.9), or excellent (ICC>0.9) (32).

Results

Patients

We included 76 patients (Table 6.1), of whom 71 ultimately underwent surgery. Six patients had uptake suspicious for distant metastases on PET (n=2 nodal, n=1 bone, n=3 both), all of which were biopsied. In 4 of these patients, biopsies confirmed malignancy and surgery was omitted; in 2 patients (n=1 bone, n=1 nodal lesion), biopsy did not confirm malignancy and surgery was performed as planned. Additionally, 1 patient had biopsy-proven LNI within the ePLND template, but surgery was omitted due to additional PSMA-positive nodal metastases outside the ePLND template. The final pathology findings are listed in Table 6.2.

Impact of PVC and Delineation Threshold

Delineated tumor volumes for each delineation threshold, with and without PVC, are shown in Supplemental Fig. Most radiomic features showed moderate agreement between original and PVC data (Fig. 6.2A). Delineation thresholds mainly affected morphological features, while intensity and textural features were less affected (Fig. 6.2B).
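
As an illustration of the agreement analysis, the following is a minimal Python sketch of a two-way, absolute-agreement ICC together with the four categories given above. It is not the analysis code used in this chapter. The single-measure form of the ICC, the assignment of values lying exactly on a category boundary to the higher category, and the function names `icc_absolute_agreement` and `categorize_icc` are assumptions made for the example.

```python
def icc_absolute_agreement(rows):
    """Two-way, absolute-agreement, single-measure ICC.

    rows: list of per-lesion measurements, e.g. [[original, pvc], ...].
    Assumption: single-measure form; the text does not state which form was used.
    """
    n, k = len(rows), len(rows[0])
    grand = sum(sum(r) for r in rows) / (n * k)
    row_means = [sum(r) / k for r in rows]
    col_means = [sum(r[j] for r in rows) / n for j in range(k)]

    # Two-way ANOVA sums of squares (lesions = rows, conditions = columns)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for r in rows for v in r)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-lesion mean square
    msc = ss_cols / (k - 1)              # between-condition mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def categorize_icc(icc):
    """Map an ICC to the categories used in the text.

    Assumption: a value exactly on a boundary goes to the higher category.
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc < 0.9:
        return "good"
    return "excellent"


# Hypothetical feature values for three lesions: original vs. PVC image
feature_values = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
icc = icc_absolute_agreement(feature_values)
print(categorize_icc(icc))
```

In this toy input the PVC values are perfectly correlated with the original values but shifted by a constant. Because absolute agreement penalizes such systematic offsets, the ICC stays well below 1.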
For LNI and any metastasis prediction, PVC and a higher delineation threshold tended to improve model stability, reducing the width of the cross-validation AUC distributions (Figs. 6.3A-B). For GS and ECE predictions, there was no optimal delineation threshold and PVC had no apparent benefit (Figs. 6.3C-D).
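
The fold-wise model comparison described in the statistical analysis (a DeLong test per fold, with the median p-value reported) can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation used in this chapter. The function names, the `folds` input structure, and the handling of the degenerate zero-variance case are assumptions made for the example.

```python
import math
from statistics import median, variance


def _placements(pos, neg):
    """Per-positive and per-negative placement values (ties count 0.5)."""
    def psi(p, q):
        return 1.0 if p > q else 0.5 if p == q else 0.0
    v10 = [sum(psi(p, q) for q in neg) / len(neg) for p in pos]
    v01 = [sum(psi(p, q) for p in pos) / len(pos) for q in neg]
    return v10, v01


def delong_p_value(y_true, scores_a, scores_b):
    """Two-sided DeLong p-value for two models scored on the same test set.

    Needs at least two positive and two negative cases in the test set.
    """
    pos_idx = [i for i, y in enumerate(y_true) if y == 1]
    neg_idx = [i for i, y in enumerate(y_true) if y == 0]
    m, n = len(pos_idx), len(neg_idx)

    v10_a, v01_a = _placements([scores_a[i] for i in pos_idx],
                               [scores_a[i] for i in neg_idx])
    v10_b, v01_b = _placements([scores_b[i] for i in pos_idx],
                               [scores_b[i] for i in neg_idx])
    auc_diff = sum(v10_a) / m - sum(v10_b) / m

    # Var(AUC_a - AUC_b): the variance of the paired placement differences
    # equals var(a) + var(b) - 2 cov(a, b) in the DeLong covariance matrix.
    d10 = [a - b for a, b in zip(v10_a, v10_b)]
    d01 = [a - b for a, b in zip(v01_a, v01_b)]
    var_diff = variance(d10) / m + variance(d01) / n

    if var_diff <= 0:
        # Assumption: degenerate case, handled by convention in this sketch
        return 1.0 if auc_diff == 0 else 0.0
    z = auc_diff / math.sqrt(var_diff)
    return math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value


def median_p_over_folds(folds):
    """folds: iterable of (y_true, scores_a, scores_b), one tuple per test fold."""
    return median(delong_p_value(y, a, b) for y, a, b in folds)


# Hypothetical test-set labels and model scores for two folds
folds = [
    ([0, 0, 1, 1, 0, 1], [0.1, 0.4, 0.35, 0.8, 0.2, 0.9], [0.5, 0.3, 0.6, 0.2, 0.7, 0.4]),
    ([0, 1, 0, 1, 1, 0], [0.2, 0.7, 0.3, 0.6, 0.9, 0.4], [0.3, 0.5, 0.6, 0.4, 0.8, 0.2]),
]
print(median_p_over_folds(folds))
```

Each fold contributes a p-value computed from its own small test set, which is why the procedure has rather low power: the median summarizes several individually weak tests without pooling their data.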

































































































