R code underlying the analysis reported in the manuscript "Machine learning approach for pitch type classification based on pelvis and trunk kinematics captured with wearable sensors" (software)

DOI:10.4121/e339176b-0ecd-48e5-bc7e-9b587c0a8959.v1

The DOI displayed above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
DOI: 10.4121/e339176b-0ecd-48e5-bc7e-9b587c0a8959

Datacite citation style

Gomaz, Larisa (2023): R code underlying the analysis reported in the manuscript "Machine learning approach for pitch type classification based on pelvis and trunk kinematics captured with wearable sensors". Version 1. 4TU.ResearchData. software. https://doi.org/10.4121/e339176b-0ecd-48e5-bc7e-9b587c0a8959.v1

Software

Usage statistics

309 views, 141 downloads

Keywords

baseball classification kinematics pitch type sensors wearables

Licence

CC BY 4.0

by Larisa Gomaz

The study used classifiers available through the caret R package: K-Nearest Neighbours (KNN), Naive Bayes (NB), Random Forest (RF) and Support Vector Machine (SVM). We evaluated these classifiers on both binary and multiclass classification tasks, and additionally included Logistic Regression (LOGREG) for the binary task and Multinomial Logistic Regression (MNOM) for the multiclass task.
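As an illustration of how the classifier sets for the two tasks could be organised, the sketch below maps the abbreviations used above to caret method strings. The specific strings (for example "svmRadial" rather than "svmLinear", or "naive_bayes" rather than "nb") are assumptions made for this example and are not confirmed by this description; the helper name `models_for_task` is hypothetical.

```r
# Assumed mapping from the abbreviations in the text to caret method strings.
# The exact methods used in PitchTypeClassification.R may differ.
caret_methods <- c(
  KNN    = "knn",
  NB     = "naive_bayes",
  RF     = "rf",
  SVM    = "svmRadial",
  LOGREG = "glm",
  MNOM   = "multinom"
)

# Hypothetical helper: return the methods evaluated for a given task.
# The four shared classifiers are used in both tasks; logistic regression
# is added for the binary task and multinomial regression for multiclass.
models_for_task <- function(task = c("binary", "multiclass")) {
  task <- match.arg(task)
  shared <- c("KNN", "NB", "RF", "SVM")
  extra <- if (task == "binary") "LOGREG" else "MNOM"
  caret_methods[c(shared, extra)]
}

binary_models <- models_for_task("binary")
multiclass_models <- models_for_task("multiclass")
```

Each returned string could then be passed as the `method` argument of `caret::train()`.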

We used a database created by PITCHPERFECT that characterises each pitch with three features taken directly from the system: pelvis peak angular velocity, trunk peak angular velocity and the separation time between them. Data were pre-processed and analysed in the R programming language (version 4.3.1). All continuous features were scaled and centred.
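A minimal sketch of centring and scaling the three features in base R is shown below. The data are synthetic and the column names are hypothetical; they are not taken from the PITCHPERFECT database, and the uploaded script may instead use caret's own pre-processing options.

```r
set.seed(1)

# Synthetic stand-in for the three features (hypothetical names and values).
pitches <- data.frame(
  pelvis_peak_vel = rnorm(20, mean = 600, sd = 80),
  trunk_peak_vel  = rnorm(20, mean = 900, sd = 100),
  separation_time = rnorm(20, mean = 0.04, sd = 0.01)
)

# Centre each column to mean 0 and scale it to standard deviation 1.
scaled <- scale(pitches, center = TRUE, scale = TRUE)

# After the transformation every column has mean ~0 and sd 1.
col_means <- colMeans(scaled)
col_sds <- apply(scaled, 2, sd)
```

Putting the features on a common scale matters most for distance-based and margin-based classifiers such as KNN and SVM, where a feature measured in large units would otherwise dominate.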

We split the data into training (80%) and testing (20%) sets. To obtain a fair estimate of how well the classifiers generalize, Leave-One-Group-Out Cross-Validation (LOGO-CV) was carried out within the training set. Classifier performance was evaluated using four criteria: Accuracy, Sensitivity, Precision and F1-score. Hyperparameters were tuned using grid search, the default method for optimizing tuning parameters in the caret package. Feature selection was performed using correlation analysis. Since the correlation between the features was low, the models were trained and tested using all variables derived from the PITCHPERFECT system.
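The split, the LOGO-CV folds and the four evaluation criteria can be sketched in base R as follows. This is an illustration under stated assumptions, not the code in PitchTypeClassification.R: the data are synthetic, the grouping variable is assumed to be the pitcher, the split is a plain random sample, and the metrics are computed for a single "positive" class (how they were aggregated in the multiclass case is not stated here).

```r
set.seed(42)

# Synthetic data: 6 hypothetical pitchers (groups), 10 pitches each.
n <- 60
pitches <- data.frame(
  pitcher    = rep(paste0("P", 1:6), each = 10),
  pitch_type = factor(rep(c("fastball", "curveball"), times = 30))
)

# 80/20 split (simple random sample of row indices; an assumption).
train_idx <- sort(sample(n, size = round(0.8 * n)))
train_set <- pitches[train_idx, ]
test_set  <- pitches[-train_idx, ]

# LOGO-CV: one fold per group. Each element holds the training-set rows
# used for fitting when that group is held out.
rows <- seq_len(nrow(train_set))
logo_index <- lapply(split(rows, train_set$pitcher),
                     function(held_out) setdiff(rows, held_out))

# Accuracy, Sensitivity, Precision and F1 for one positive class.
classification_metrics <- function(predicted, reference, positive) {
  predicted <- as.character(predicted)
  reference <- as.character(reference)
  tp <- sum(predicted == positive & reference == positive)
  fp <- sum(predicted == positive & reference != positive)
  fn <- sum(predicted != positive & reference == positive)
  sensitivity <- tp / (tp + fn)
  precision   <- tp / (tp + fp)
  c(Accuracy    = mean(predicted == reference),
    Sensitivity = sensitivity,
    Precision   = precision,
    F1          = 2 * precision * sensitivity / (precision + sensitivity))
}

# Toy check with hand-made labels (not model output).
reference <- c("A", "A", "A", "B", "B")
predicted <- c("A", "A", "B", "A", "B")
toy_metrics <- classification_metrics(predicted, reference, positive = "A")
```

With caret, a list shaped like `logo_index` could be supplied through the `index` argument of `trainControl()`, so that every resampling iteration holds out exactly one group.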

History

2023-11-09 first online, published, posted

Publisher

4TU.ResearchData

Organizations

TU Delft, Faculty of Electrical Engineering, Mathematics and Computer Science, Delft Institute of Applied Mathematics (DIAM)

DATA

Files (2)

PitchTypeClassification.R (15,811 bytes, MD5: de6a4afe628b07188eaf5684fc926f9a)
SessionInfo.txt (3,810 bytes, MD5: c6f7f1621babeecbb9c4ab67bfa0023f)
Total size: 19,621 bytes unzipped