Ensemble Feature Selection for Network Intrusion Detection: Combining Information Gain and Random Forest with Recursive Feature Elimination

Wanjau, Stephen Kahara; Kamau, Gabriel Ndung’u

View/Open

Ensemble+Feature+Selection+for+Network+Intrusion.pdf (471.3Kb)

Date

2024

Author

Wanjau, Stephen Kahara

Kamau, Gabriel Ndung’u

Metadata

Show full item record

Abstract

Network intrusion detection systems (NIDS) are essential for protecting computer networks against cyberattacks. The selection of a nominal set of essential features that may adequately discriminate malicious traffic from the normal traffic is indispensable while developing a NIDS. As such, a more reliable and accurate detection result may be realized when intrusion detection is carried out on a dataset based on an inclusive feature representation. This work presents the pre-processing and feature selection workflow as well as its results in the case of the CIC-IDS-2017 dataset with a focus on two cyber-attacks namely Denial-of-Service (DoS) and PortScan. The study applied an ensemble feature selection method based on information gain and Random Forest to filter out important features. Recursive Feature Elimination method was then applied to the reduced features to optimize the selected feature subset. The selected feature subset was experimented with using two classification algorithms, namely support vector machine and multi-layer perceptron. In the evaluation process, four widely used performance metrics were considered. The study results demonstrated the efficacy of the proposed ensemble approach to optimize the selected feature subset for detecting PortScan and DoS attacks in network traffic. Experimental results revealed that the support vector machine had a slight advantage in accuracy and could train more quickly. According to the study's evaluation, the NIDS may be able to shorten processing times without sacrificing the ability to detect PortScan and DoS attacks accurately by choosing a narrow subset of informative features. This suggests the approach might be applicable to real-world NIDS scenarios involving these attacks. The study also provides encouraging perspectives on how ensemble feature selection utilizing MLP and SVM can enhance the effectiveness of NIDS. Building on these findings, more research can create NIDS solutions that are even more reliable and efficient for the dynamic field of cybersecurity.

URI

http://repository.mut.ac.ke:8080/xmlui/handle/123456789/6531

Collections

Journal Articles (CI) [132]