Code and data underlying the publication: Data-driven Semi-supervised Machine Learning with Safety Indicators for Abnormal Driving Behavior Detection

DOI:10.4121/b60dfda0-055a-4046-a615-e0166a356c95.v1
The DOI displayed above is for this specific version of this dataset, which is currently the latest. Newer versions may be published in the future. For a link that will always point to the latest version, please use
DOI: 10.4121/b60dfda0-055a-4046-a615-e0166a356c95
Datacite citation style:
Dong, Yongqi; Zhang, Lanxin; Farah, Haneen; Zgonnikov, Arkady; van Arem, Bart (2025): Code and data underlying the publication: Data-driven Semi-supervised Machine Learning with Safety Indicators for Abnormal Driving Behavior Detection. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/b60dfda0-055a-4046-a615-e0166a356c95.v1

Dataset

This is the code and processed data related to the publication:

Dong, Y., Zhang, L., Farah, H., Zgonnikov, A., & van Arem, B. (2023). Data-driven Semi-supervised Machine Learning with Surrogate Safety Measures for Abnormal Driving Behavior Detection. arXiv preprint arXiv:2312.04610. https://arxiv.org/abs/2312.04610


The original data is from https://github.com/UCF-SST-Lab/UCF-SST-CitySim1-Dataset

The code makes use of open-source implementations of HELM and other semi-supervised learning algorithms.


After setting up the folder and fetching the data, one can simply run the code, calling the specific function (each identified by its name) to get the relevant results.

Details about the implementation are described in the paper.


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Detecting abnormal driving behaviour is critical for road traffic safety and the evaluation of drivers' behaviour. With the advancement of machine learning (ML) algorithms and the accumulation of naturalistic driving data, many ML models have been adopted to detect abnormal driving behaviours (also referred to as anomalies). Most existing ML-based detectors rely on supervised methods, which require substantial labelled data. However, ground truth labels are not always available in the real world, and labelling large amounts of data is tedious. Thus, there is a need to explore unsupervised or semi-supervised methods to make the anomaly detection process more feasible and efficient. To fill this research gap, this study analyzes large-scale real-world data revealing several abnormal driving behaviours (e.g., sudden acceleration, rapid lane-changing) and develops a Hierarchical Extreme Learning Machines (HELM) based semi-supervised ML method that uses partly labelled data to accurately detect the identified abnormal driving behaviours. Moreover, previous ML-based approaches predominantly utilized basic vehicle motion features (e.g., velocity and acceleration) to label and detect abnormal driving behaviours, while this study introduces event-level safety indicators as input features for ML models to improve detection performance. Results from extensive experiments demonstrate the effectiveness of the proposed semi-supervised ML model with the introduced safety indicators serving as important features. The proposed semi-supervised ML method outperforms other baseline semi-supervised or unsupervised methods on various metrics, e.g., delivering the best accuracy (99.58%) and the best F-1 measure (0.9913). The ablation study further highlights the significance of safety indicators for advancing the detection performance.
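
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how accuracy and the F-1 measure are commonly computed for a binary normal/abnormal task. It is not taken from the dataset's code: the function names and the toy labels are hypothetical and purely illustrative, and the paper's own evaluation may differ in detail.

```python
# Illustrative only: the function names and toy labels below are assumptions,
# not part of the published code or data.

def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary task, treating `positive` as abnormal."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def accuracy(y_true, y_pred, positive=1):
    """Fraction of samples whose predicted label matches the true label."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def f1_measure(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall: 2*TP / (2*TP + FP + FN)."""
    tp, fp, _, fn = confusion_counts(y_true, y_pred, positive)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy labels (1 = abnormal, 0 = normal), made up for this example.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]

print(accuracy(y_true, y_pred))
print(f1_measure(y_true, y_pred))
```

Because abnormal events are usually much rarer than normal driving, accuracy alone can look high even when few anomalies are caught, which is why the F-1 measure is commonly reported alongside it.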


History

  • 2025-02-20 first online, published, posted

Publisher

4TU.ResearchData

Format

py; txt; csv; png; h5; word

Funding

  • Safe and efficient operation of AutoMated and human drivEN vehicles in mixed traffic (grant code 17187), Applied and Technical Sciences (TTW), a subdomain of the Dutch Institute for Scientific Research (NWO)

Organizations

TU Delft, Faculty of Civil Engineering and Geosciences, Department of Transport and Planning
