Code and dataset for malicious cloud service traffic detection based on multi-feature fusion
DOI: 10.4121/ae5d2dd8-20c7-46ec-a1da-d97a8122343f
Dataset
Licence MIT
With the rapid growth of cloud computing, malicious attacks targeting cloud services have become more prevalent. We propose a method for detecting malicious cloud service traffic based on multi-feature fusion, which addresses two weaknesses of traditional methods: reliance on a single type of extracted feature and weak generalization. By analyzing the attack patterns of malicious traffic, our model extracts features from both the field attributes and the statistical attributes of malicious requests. To improve the generalization of the extracted features, field features are fused with an attention-based feature fusion algorithm, and statistical features are selected with an algorithm based on the Gini coefficient and random forest. To balance the contribution of the two feature types during training, we propose a dual-branch malicious request detection model that processes and trains field feature vectors and statistical feature vectors in separate branches. After comparing the currently available datasets for cloud service attack detection, we selected the HTTP dataset CSIC 2010 and a real-world cloud service log dataset to test and validate the method. Experimental results show that the proposed method achieves better classification performance than the other models it was compared against.
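The Gini-based selection of statistical features mentioned in the description can be illustrated with a minimal, standard-library sketch. It is not the code shipped in this deposit: it scores each feature by the Gini impurity decrease of its best single threshold split, which is a simplified stand-in for the impurity-based importance that a random forest averages over many trees. The feature names (`url_length`, `special_char_count`, `param_count`), the toy rows, and the helper names are all hypothetical.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity 1 - sum(p_k^2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split_gain(values, labels):
    """Largest Gini impurity decrease over all threshold splits of one feature."""
    parent = gini_impurity(labels)
    n = len(labels)
    best = 0.0
    # Every distinct value except the largest is a candidate threshold.
    for threshold in sorted(set(values))[:-1]:
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        weighted = (
            len(left) * gini_impurity(left) + len(right) * gini_impurity(right)
        ) / n
        best = max(best, parent - weighted)
    return best


def select_features(rows, labels, names, k):
    """Keep the k feature names with the highest single-split Gini gain."""
    gains = {
        name: best_split_gain([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    ranked = sorted(names, key=lambda name: gains[name], reverse=True)
    return ranked[:k], gains


# Toy data: hypothetical statistical features of six requests (1 = malicious).
names = ["url_length", "special_char_count", "param_count"]
rows = [
    [20, 0, 1],
    [25, 1, 2],
    [22, 0, 3],
    [90, 7, 1],
    [85, 9, 2],
    [95, 6, 3],
]
labels = [0, 0, 0, 1, 1, 1]

selected, gains = select_features(rows, labels, names, k=2)
print(selected)
```

In this toy example `param_count` takes the same values in both classes, so no split reduces impurity and it is the feature dropped. A real pipeline would more likely rely on a random forest's aggregated importances than on a single split per feature.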
History
- 2025-03-04 first online, published, posted
Publisher
4TU.ResearchData
Format
script/.py text/.txt image/.png dataset/.csv
Organizations
School of Computer Science and Engineering, Southeast University;
University of Electronic Science and Technology of China, Chengdu, China;
The 30th Research Institute of China Electronics Technology Group Corporation, Chengdu, China
DATA
Files (18)
- readme.md, 4,053 bytes, MD5: 0821e6d137926c8f1377dd7b508023fb
- acc.png, 16,070 bytes, MD5: 2049ca43fcf07bca13a853df6ee433d8
- anomalousTrafficTest.txt, 16,090,299 bytes, MD5: d03503ed45d198b4cebdefec1f540131
- config.py, 1,894 bytes, MD5: 67c72bfacf2f9815fc913bc87f8c3092
- csic.csv, 54,535,457 bytes, MD5: 2477f1539ef48160f6f45b550ae7b6a5
- csic_data_pre.py, 6,939 bytes, MD5: 21445fe69b6a773ea1eeeaeb55b8d497
- csic_sf.csv, 9,028,620 bytes, MD5: 65acb5c64a544bcf9d2243a811302725
- csic_tf.csv, 10,478,789 bytes, MD5: e03d6746ac5d7cfbac05d7d90f9b5be4
- data.py, 2,671 bytes, MD5: b96ae570f4501d2ad6e9c09e4f1adbd1
- data_utils.py, 3,180 bytes, MD5: d2a2c0dc2b76429f09fb777a7acdb804
- main.py, 5,423 bytes, MD5: 2fab070f80fa61aacdd5924358734d83
- model.py, 4,004 bytes, MD5: 6fdfb29db29ddea178c10d5b0a492128
- normalTrafficTest.txt, 20,599,204 bytes, MD5: 758e97272ca23ad5ea60ccddfdc72905
- normalTrafficTraining.txt, 20,640,988 bytes, MD5: 80dc393c73afd08df28351e1470e3bbf
- requirements.txt, 697 bytes, MD5: e69b25422c0df7e40468938b3146e13f
- RF_IM.py, 4,784 bytes, MD5: a16e69313cbe41010621b14f44ba6be0
- teloss.png, 17,430 bytes, MD5: 8ef509eaec17cbc318d588fb7b92e1d3
- trloss.png, 21,265 bytes, MD5: 38b803263ba133e5d05b2ceb88392a85
131,461,767 bytes unzipped
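The MD5 checksums in the file listing can be used to check a download for corruption. The following is a minimal sketch, not part of the deposit: the helper names `md5_of_file` and `verify` are hypothetical, the current directory is assumed to hold the unzipped files, and only three of the eighteen checksums are copied in for brevity.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(directory, expected):
    """Map each file name to 'ok', 'mismatch', or 'missing'."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        if not path.is_file():
            results[name] = "missing"
        elif md5_of_file(path) == checksum.lower():
            results[name] = "ok"
        else:
            results[name] = "mismatch"
    return results


# Subset of the checksums from the file listing; extend with the remaining entries.
EXPECTED_MD5 = {
    "readme.md": "0821e6d137926c8f1377dd7b508023fb",
    "config.py": "67c72bfacf2f9815fc913bc87f8c3092",
    "requirements.txt": "e69b25422c0df7e40468938b3146e13f",
}

if __name__ == "__main__":
    # Assumption: the unzipped files sit in the current working directory.
    for name, status in verify(".", EXPECTED_MD5).items():
        print(f"{name}: {status}")
```

MD5 is adequate here for detecting accidental corruption during transfer; it should not be treated as protection against deliberate tampering.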