Abstract
With the ever-increasing reach of the internet and its growing accessibility through various types of devices, the spread of malware, phishing attempts and similar threats has steadily increased, along with their level of sophistication. It is therefore important to research methods that prevent such harmful attacks on systems and users. Malicious URLs are a common way for hackers to attack a system; thus, to accommodate the variety of attack vectors used by malicious websites, 21 features were extracted from 651,191 URLs to train the proposed model. A two-stage stacked ensemble learning model, based on gradient boosting methods and random forest, was trained and tested on a 70:30 split of the 651,191 URLs and achieved an accuracy of 97%. Explainable AI (XAI) was then used to explain the working of the model and to study the impact of each of the 21 features on the four class predictions (benign, defacement, phishing and malware).
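To illustrate what extracting features from a URL can look like in practice, the following is a minimal sketch in Python. The abstract does not list the 21 features used in the paper, so the eight lexical features below, the function name `extract_features`, and the dictionary keys are all hypothetical choices made for illustration and should not be read as the authors' feature set.

```python
import re
from urllib.parse import urlparse

# Matches a leading scheme such as "http://" or "ftp://".
_SCHEME_RE = re.compile(r"^[A-Za-z][A-Za-z0-9+.\-]*://")
# Matches a dotted-quad host such as "192.168.0.1" (no range check on octets).
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def extract_features(url: str) -> dict:
    """Return a few illustrative lexical features for one URL.

    These features are assumptions for the sake of the example,
    not the 21 features described in the abstract.
    """
    # urlparse only fills in the host when a scheme or a leading "//" is
    # present, so scheme-less inputs get "//" prepended before parsing.
    parsed = urlparse(url if _SCHEME_RE.match(url) else "//" + url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": url.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_special": sum(ch in "@?=&%-_" for ch in url),
        "has_ip_host": int(bool(_IPV4_RE.fullmatch(host))),
        "uses_https": int(parsed.scheme == "https"),
        "path_depth": len([part for part in parsed.path.split("/") if part]),
    }


if __name__ == "__main__":
    for name, value in extract_features("http://192.168.0.1/login.php?id=1").items():
        print(f"{name}: {value}")
```

Each URL in a data set would be mapped to such a feature vector before being passed to a classifier; the choice of which features to compute is where a study of this kind makes its main design decisions.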
Original language | English |
---|---|
Title of host publication | 2022 IEEE International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom) |
Number of pages | 7 |
Place of Publication | Los Alamitos, CA |
Publisher | IEEE |
Publication date | 2023 |
Pages | 1266-1272 |
ISBN (Print) | 9781665494267 |
ISBN (Electronic) | 9781665494250 |
Publication status | Published - 2023 |
Event | 21st IEEE International Conference on Trust, Security and Privacy in Computing and Communications. IEEE TrustCom 2022 - Wuhan, China; Duration: 9 Dec 2022 → 11 Dec 2022; Conference number: 21; http://www.ieee-hust-ncc.org/2022/TrustCom/index.html |
Conference
Conference | 21st IEEE International Conference on Trust, Security and Privacy in Computing and Communications. IEEE TrustCom 2022 |
---|---|
Number | 21 |
Country/Territory | China |
City | Wuhan |
Period | 09/12/2022 → 11/12/2022 |
Series | IEEE International Conference on Trust Security and Privacy in Computing and Communications |
---|---|
ISSN | 2324-898X |
Keywords
- Malicious URL detection
- Ensemble-learning
- Random forest
- Gradient boosting
- Explainable AI