R code underlying the analysis reported in the manuscript "Machine learning approach for pitch type classification based on pelvis and trunk kinematics captured with wearable sensors"

doi:10.4121/e339176b-0ecd-48e5-bc7e-9b587c0a8959.v1
The doi above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
doi: 10.4121/e339176b-0ecd-48e5-bc7e-9b587c0a8959
Datacite citation style:
Gomaz, Larisa (2023): R code underlying the analysis reported in the manuscript "Machine learning approach for pitch type classification based on pelvis and trunk kinematics captured with wearable sensors". Version 1. 4TU.ResearchData. software. https://doi.org/10.4121/e339176b-0ecd-48e5-bc7e-9b587c0a8959.v1
Software

The study used classifiers available through the caret R package: K-Nearest Neighbours (KNN), Naive Bayes (NB), Random Forest (RF) and Support Vector Machine (SVM). We investigated the performance of these classifiers in both binary and multiclass classification, and additionally included Logistic Regression (LOGREG) for the binary task and Multinomial Logistic Regression (MNOM) for the multiclass task.
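
To illustrate the difference between the binary and the multiclass task, the following is a minimal sketch on invented data, not the code from the deposited files. It calls `glm()` and `nnet::multinom()` directly, the functions behind caret's `glm` and `multinom` methods, rather than going through caret. The feature names, the pitch-type labels and the 0.5 decision threshold are all assumptions made for the example.

```r
library(nnet)  # recommended package shipped with R; provides multinom()

set.seed(42)
n <- 150

# Hypothetical features and pitch-type labels, invented for illustration.
toy <- data.frame(
  pelvis = rnorm(n),
  trunk  = rnorm(n),
  sep    = rnorm(n)
)
toy$pitch3 <- factor(sample(c("changeup", "curveball", "fastball"), n, replace = TRUE))
toy$pitch2 <- factor(ifelse(toy$pitch3 == "fastball", "fastball", "other"))

# Binary task: logistic regression (LOGREG).
# With a factor response, glm() models the probability of the second level.
logreg  <- glm(pitch2 ~ pelvis + trunk + sep, data = toy, family = binomial)
p_other <- predict(logreg, type = "response")
pred2   <- factor(ifelse(p_other > 0.5, "other", "fastball"),
                  levels = levels(toy$pitch2))

# Multiclass task: multinomial logistic regression (MNOM).
mnom  <- multinom(pitch3 ~ pelvis + trunk + sep, data = toy, trace = FALSE)
pred3 <- predict(mnom, type = "class")
```

Because the labels here are random, the fitted models carry no real signal; the sketch only shows how the two tasks are set up.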


We used a database created by PITCHPERFECT that characterises each pitch with three features taken directly from the system: pelvis peak angular velocity, trunk peak angular velocity and the separation time between them. Data were pre-processed and analysed using the R programming language (version 4.3.1). All continuous features were centred and scaled.
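
Centring and scaling can be sketched in base R as follows. This is an illustration on invented data, not the deposited code: the column names and values are hypothetical, and reusing the training constants on new data is a common convention rather than something stated above.

```r
# Hypothetical stand-in for the three features; the column names and the
# values are invented for illustration only.
set.seed(1)
pitches <- data.frame(
  pelvis_peak_vel = rnorm(50, mean = 600, sd = 80),
  trunk_peak_vel  = rnorm(50, mean = 900, sd = 100),
  separation_time = rnorm(50, mean = 0.04, sd = 0.01)
)

# Centre each column on its mean and scale it to unit standard deviation.
scaled <- scale(pitches, center = TRUE, scale = TRUE)

# scale() stores the constants it used as attributes, so the same
# transformation can be re-applied to data not seen during fitting.
centres <- attr(scaled, "scaled:center")
spreads <- attr(scaled, "scaled:scale")

new_pitch  <- data.frame(pelvis_peak_vel = 650,
                         trunk_peak_vel  = 950,
                         separation_time = 0.05)
new_scaled <- scale(new_pitch, center = centres, scale = spreads)
```

In caret, the equivalent step is usually requested with `preProcess = c("center", "scale")` in `train()`.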


We split the data into 80% for training and 20% for testing. To obtain a fair estimate of how well the classifiers generalise, Leave-One-Group-Out Cross-Validation (LOGO-CV) was carried out within the training set. Classifier performance was evaluated using four criteria: Accuracy, Sensitivity, Precision and F1-score. Hyperparameters were tuned using grid search, the default method for optimising tuning parameters in the caret package. Feature selection was performed using correlation analysis. Since the correlation between the features was low, the models were trained and tested using all variables derived from the PITCHPERFECT system.
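
The split, the LOGO-CV loop and the four metrics can be sketched in base R as below. This is a hedged illustration on invented data, not the deposited analysis: the grouping variable (assumed here to be a pitcher ID), the random split, the labels and the use of a plain logistic regression are all assumptions. In caret, folds of this kind can be built with `groupKFold()` and passed to `trainControl()` through its `index` argument.

```r
set.seed(7)
n <- 120

# Hypothetical data: three features, a binary label and a grouping variable.
d <- data.frame(
  pelvis = rnorm(n), trunk = rnorm(n), sep = rnorm(n),
  group  = rep(paste0("P", 1:6), each = n / 6)
)
d$label <- factor(ifelse(d$pelvis + 0.5 * d$trunk + rnorm(n) > 0,
                         "fastball", "other"))

# 80/20 split (a simple random split is assumed here).
test_rows <- sample(nrow(d), size = round(0.2 * nrow(d)))
train_set <- d[-test_rows, ]
test_set  <- d[test_rows, ]

# Leave-One-Group-Out: each group is held out once as the validation fold.
pred_all <- factor(rep(NA, nrow(train_set)), levels = levels(d$label))
for (g in unique(train_set$group)) {
  held_out <- train_set$group == g
  fit <- glm(label ~ pelvis + trunk + sep,
             data = train_set[!held_out, ], family = binomial)
  # glm() models the probability of the second level, "other".
  p <- predict(fit, newdata = train_set[held_out, ], type = "response")
  pred_all[held_out] <- ifelse(p > 0.5, "other", "fastball")
}

# Pooled confusion matrix, with "fastball" treated as the positive class.
conf <- table(pred = pred_all, truth = train_set$label)
tp <- conf["fastball", "fastball"]
fp <- conf["fastball", "other"]
fn <- conf["other", "fastball"]

accuracy    <- sum(diag(conf)) / sum(conf)
sensitivity <- tp / (tp + fn)
precision   <- tp / (tp + fp)
f1          <- 2 * precision * sensitivity / (precision + sensitivity)
```

The held-out `test_set` is left untouched during cross-validation and would only be used for a final evaluation of the chosen model.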

history
  • 2023-11-09 first online, published, posted
publisher
4TU.ResearchData
format
.R
organizations
TU Delft, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft Institute of Applied Mathematics (DIAM)

DATA

files (2)