

Chapter 7

Feature design and selection

Based on the existing force/torque and movement data, additional variables, so-called 'features', can be computed. These features are best compared to the (independent) 'variables' known from traditional statistics. They were designed in multiple brainstorming sessions between computer scientists and OMF surgeons. An effort was made to design clinically interpretable features, e.g., rotational velocity or peak forces/torques in every direction. For a complete overview of all features, see Appendix Table 1. Each of these features has its own predictive power to distinguish between different classes of teeth. Teeth were grouped together as 'classes' to optimize model performance for a small dataset. To ease the clinical interpretability of the model, four classes were chosen as its output. These classes were the same for both the upper (U) and lower (L) jaw: incisors (U1/U2, L1/L2), cuspids (U3, L3), bicuspids (U4/U5, L4/L5), and molars (U6/U7, L6/L7).

The goal of feature selection is to determine which features should be included in order to classify tooth removal procedures optimally with a minimum set of features [15, 16]. Several approaches are available to select the most important features, of which 'regularization' is one [17]. A model including a regularization term trades off simplicity against performance by weighting the different features. The model is simplified by discarding uninformative features at the cost of a reduction in classification accuracy; this way, only features with high importance remain. For this study, logistic regression with L2 (or 'ridge') regularization was used. L2 regularization was chosen over L1 (or 'lasso') regularization because it is better suited to avoid overfitting of the model. In contrast to L1 regularization, L2 does not remove features from the model, but it reduces extreme weights, leading to a more even distribution of weight across the features. The actual selection is then performed by applying a threshold on feature importance, which in our study was chosen to be the mean of the overall feature importance [10].

Designing a classification model

Because features can differ in terms of scale, standardization (i.e., variance scaling) of the features was performed to even out their scales. In the standardization process, every feature is rescaled to a mean of zero and a standard deviation of one. This prevents the algorithm from mistakenly assigning importance to features that simply have larger scales.
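The steps described above (standardization, L2-regularized logistic regression, and thresholding at the mean feature importance) can be sketched as follows. This is a minimal illustration on synthetic data with a two-class problem rather than the study's four tooth classes; the feature values, labels, regularization strength, and gradient-descent fit are illustrative assumptions, not the study's actual data or implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "features" for two classes of procedures (hypothetical data):
# feature 0 is informative, features 1-2 are noise, feature 3 is noise
# recorded on a much larger scale (e.g., a different physical unit).
n = 200
y = rng.integers(0, 2, n)                      # 0/1 class labels
X = rng.normal(size=(n, 4))
X[:, 0] += 2.0 * y                             # informative feature
X[:, 3] = rng.normal(scale=1000.0, size=n)     # uninformative, large scale

# 1) Standardization: rescale every feature to mean 0 and standard
#    deviation 1, so large-scale features cannot dominate the weights.
Xs = (X - X.mean(axis=0)) / X.std(axis=0)

# 2) L2-regularized ('ridge') logistic regression, fitted here with a
#    plain gradient-descent loop for self-containment.
lam = 0.1                                      # regularization strength (assumed)
w = np.zeros(Xs.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(Xs @ w)))        # predicted probabilities
    grad = Xs.T @ (p - y) / n + lam * w        # log-loss gradient + L2 term
    w -= 0.5 * grad

# 3) Feature selection: keep features whose importance (absolute weight)
#    exceeds the mean importance, mirroring the thresholding rule above.
importance = np.abs(w)
selected = importance > importance.mean()
print(selected)   # the informative feature should be retained
```

Note how L2 regularization shrinks all weights without setting any exactly to zero, so the explicit mean-importance threshold is what actually discards the uninformative features.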