Intelligent Intrusion Detection System Based on
MLP, RBF and SVM Classification Algorithms: A
Comparative Study
Ismail M. Keshta
ismailk@dcc.kfupm.edu.sa
dr.ismail.keshta@gmail.com
Abstract—An effective approach for tackling network security
problems is the intrusion detection system (IDS). Such systems play a
key role in network security as they can detect different types of
attacks in networks, including DoS, U2R, Probe and R2L. In addition,
IDS are an increasingly important part of a system’s defense. Various
approaches to IDS are now in use, but they are unfortunately
relatively ineffective. Data mining techniques and artificial
intelligence play an important role in security services. In this
paper, we present a comparative study of three well-known intelligent
algorithms: Multilayer Perceptrons (MLP), Radial Basis Functions (RBF)
and Support Vector Machine (SVM). The main aim of this work is to
benchmark the performance of these three algorithms. This is done by
using a dataset of about 9,000 connections, randomly chosen from the
KDD'99 10% dataset. We investigate the algorithms’ performance in
terms of their attack classification accuracy, and then analyze and
discuss the simulation results. We observe that SVM with a linear
kernel (Linear-SVM) performs better than MLP and RBF in terms of
detection accuracy and processing speed.
Keywords- Intrusion detection system; Network security; Machine
learning; Anomaly detection; KDD Cup 99
I. INTRODUCTION
Network security is fast becoming a major challenge. As
interconnections among computer systems grow rapidly,
computer networks need to be protected against the
unauthorized disclosure of information, denial-of-service (DoS)
attacks and the modification or destruction of data [1].
Attack detection techniques have become critical for securing
networks. Making a network secure is difficult for many reasons,
including the complexity of computers and networks, a lack of
awareness of the various risks and threats, increasing internet usage
and the vulnerabilities of computer systems [2][3]. Detection
techniques have therefore become an important open research problem
that is receiving additional attention from the research community.
Furthermore, the complex properties of network attacks are key issues
that work against these detection techniques [4][5].
The traditional techniques, including avoiding programming errors and
using firewalls, have not succeeded in fully protecting networks and
systems from the dangers of malware, while attacks are becoming
increasingly sophisticated [6]. Peddabachigari et al. [7] showed that
programming errors can no longer be avoided as the complexity of
systems and application software is rapidly increasing, leaving
weaknesses that can be exploited. Jamali et al. [8] state that
firewalls are not sufficient to give the network total security
because they only throttle attacks that come from outside and have no
effect on the risk of inside attacks. It is likely that computer
systems will remain unsecured in the near future.
Therefore, IDS have now become a vital and indispensable part of the
security infrastructure, used to detect sophisticated attacks and
malware early, before they can inflict widespread damage [7]-[9]. An
IDS is, therefore, needed as an extra wall to protect systems in
addition to these prevention techniques. Intrusion detection is useful
for detecting successful intrusions, as well as for monitoring
attempts to break security [10]-[12]. An IDS protects computer systems
against malicious operations by detecting violations of security
policies and by supporting active defense, for example by alerting
operators [13]. It particularly helps the network to resist external
attacks [14].
Figure 1: Organization of a generalized IDS
Many issues need to be considered when building an IDS, including
data collection, data preprocessing, intrusion recognition (which is
at the heart of it), reporting and response. The organization of an
IDS is illustrated in Figure 1.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 16, No. 5, May 2018
23 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
Existing IDS can generally be divided into two categories according
to their detection approach: anomaly detection and misuse detection
[15] [16]. Misuse-based IDS can detect known attacks efficiently, but
fails to find new attacks that do not match the rules in the database
[17]. Therefore, the database has to be continuously updated to store
the signatures of every known attack. This IDS type is unable to
detect new attacks unless it is trained to [18]. Anomaly-based IDS
builds a model of normal behavior and then treats any major deviation
from the model as an intrusion. This IDS type is able to detect new or
unknown attacks, but it has a high rate of false alarms [19]-[20].
Research efforts have been made to reduce these false alarms by
proposing intelligent IDS based on machine learning. A number of
anomaly detection systems based on different kinds of machine learning
techniques have been developed in the literature [21]-[25]. Some of
these studies apply a single learning technique, while other systems
are based on a combination of different learning techniques. Machine
learning classification algorithms provide a very promising solution
and are able to discover novel attacks based on their features [26].
In addition, they can be used to study and identify correlated data,
make decisions, make predictions and classify data [24]-[26].
Multilayer Perceptron (MLP), Radial Basis Functions (RBF) and Support
Vector Machine (SVM) are examples of algorithms that are well known,
widely adopted and have been investigated in neural networks, machine
learning and artificial intelligence. MLP can successfully perform the
classification operation [27][28], although training an MLP neural
network is hard because of the complexity of its structure [28]. SVM
is also a very strong algorithm in data mining, which has been applied
successfully in a number of scientific applications [29].
Despite the importance of machine learning algorithms for intrusion
detection systems [21]-[25], little attention has been given to
comparative studies between these algorithms, particularly when it
comes to designing an effective IDS for both computer and network
systems. Furthermore, little has been done to specify an intelligent
IDS that would reduce the false alarm rate of anomaly-based detection.
In this work, we conducted a comprehensive and detailed comparative
study of three intelligent classification algorithms: MLP, RBF and SVM
with a linear kernel. The linear kernel is a polynomial kernel with
exponent 1, and we chose it for SVM because Linear-SVM is both
efficient and fast. Unlike MLP and RBF, Linear-SVM consumes less
energy in the course of the learning process and the deployment phase
[30] [31]. Moreover, Gupta and Ramanathan [32], as well as Magno et
al. [33], stated that Linear-SVM is a low-complexity classifier. Magno
et al. [33] also highlighted that Linear-SVM gives a good balance
between the computational and memory cost and the percentage of
correctly classified data. Sazonov et al. [34] identified two powerful
characteristics of SVM, which are high generalization and robustness.
In addition, Bal et al. [35] found that SVM with a linear kernel is a
very promising algorithm in the machine learning field. Yuan et al.
[36] also concluded that an SVM classifier, especially one with a
linear kernel, can learn and build the knowledge that is needed from
fewer training samples and still provide a high level of
classification accuracy, unlike a number of other classifiers such as
MLP and RBF.
The following are our major contributions in this work. Firstly, we
review a number of state-of-the-art IDS models that are based on
intelligent machine learning algorithms. Secondly, we undertake a
comprehensive comparison between three intelligent classifiers using a
real benchmark dataset. Thirdly, the performance of all three is
examined using the confusion matrix. Lastly, we propose an intelligent
IDS framework for effective and efficient intrusion detection in
computer and network systems. The framework is addressed at the
classification level, and Linear-SVM, used as an intelligent
classifier, is a core element in building it. We also discuss an
evaluation of the proposed framework and provide simulation results
for detecting malicious attacks, namely Denial of Service (DoS), User
to Root (U2R), Remote to Local (R2L) and Probing attacks.
The rest of this paper is organized in the following way. Section II
gives a literature review of recent approaches that have been proposed
for IDS based on intelligent classification algorithms. Section III
gives some background on the classification algorithms used in the
work, which are MLP, RBF and SVM. It also provides an overview of the
experimental dataset. The paper’s main contribution is discussed in
Section IV, while the simulation experiments and the way they were set
up are described in Section V. This section also summarizes and
discusses the results of the simulation. Lastly, Section VI concludes
the paper and highlights future research directions.
II. RELATED WORK
There has been a lot of research into anomaly-based intrusion
detection, and some of it has used machine learning and data mining
techniques. Decision trees, neural networks, clustering and Bayesian
parameter estimation are some of the techniques that have been used to
detect intrusive behavior in computer networks.
Chandolikar et al. [37] evaluated the performance of two
classification algorithms, Bayes net and J48, which are both used for
detecting computer attacks. The KDD cup dataset was used as the
benchmark in the evaluation. The results reveal that the J48 learning
algorithm was more accurate than the Bayes net algorithm and had a
lower error rate. The authors emphasized that the higher accuracy of
J48 helps to increase the efficiency of the IDS.
Panda et al. [38] employed Principal Component Analysis and the Naive
Bayes classifier to detect intrusions using machine learning
algorithms. Their experiments were carried out on the KDD’99 cup
dataset, an intrusion detection dataset. The dimensionality of the
dataset was reduced by principal component analysis, and the Naive
Bayes classifier was then used to classify the dataset into normal and
attack classes. They described their approach as a network intrusion
detection system framework that uses two algorithms, Naive Bayes and
Principal Component Analysis. The results they obtained showed that
their approach was faster than a number of other existing systems.
An intrusion detection system based on the C4.5 decision tree
algorithm was proposed by Wang et al. [39]. The results revealed that
the intrusion detection system was effective and feasible, and had a
high rate of accuracy. All of their experiments were conducted on the
KDD CUP1999 dataset, a test set that is widely used in the intrusion
detection field. The tree generated by the C4.5 classification
algorithm was used to build rules, which then form the knowledge base
of the IDS. In other words, the rules indicate whether a new network
behavior is normal or abnormal, based on the built knowledge.
Chandolikar et al. [40] used the J48 intelligent algorithm in their
experiments to build an IDS. Their results show that J48 is an
effective and efficient classification algorithm on the KDD CUP1999
dataset.
Yogita et al. [41] proposed an IDS that used SVM as a data mining
technique. SVM is a very popular classification algorithm. However,
they highlighted its main drawback, which is that SVM takes a very
long time to train. Their experiments were done using the NSL-KDD
dataset, an improved version of the KDD Cup’99 dataset. They used the
Gaussian RBF as the kernel function and 10-fold cross-validation as
the test option for SVM. In addition, they pointed out that the
proposed SVM-based method was able to increase the accuracy of
intrusion detection and cut down on the time taken to build the
classification model.
The aim of Mohammadreza et al. [42] was to use data mining
techniques, including SVM and the classification tree, for IDS. Their
experiments were carried out on the KDD CUP 99 dataset, and the
results reveal that the C4.5 algorithm is better than SVM at detecting
network intrusions. Das et al. [43] looked at the IDS at its
preprocessing level, which is the level before the classification
process, and proposed a divide and conquer algorithm. The aim of this
was to reduce the feature set of the large KDD 99 dataset. The
proposed algorithm successfully reduced the overhead of the IDS for
analyzing the entire KDD dataset. This was done by selecting the most
important features and then classifying with a maximized
classification rate. The authors describe it as a generic algorithm
that can be applied to any dataset. They used LDA, KNN, C4.5, SVM and
a number of other classification algorithms to classify the various
feature sets that had been obtained.
III. PRELIMINARIES
This section gives a brief background about the three
intelligent algorithms used in this study, as well as about the
dataset for the experimental comparison.
A. Classification Algorithms
The various classification algorithms that were used in the
research project are described in brief below.
1) Multilayer Perceptron (MLP)
This is composed of a large number of widely interconnected neurons
that work in parallel to solve a particular problem. An MLP is
organized as a series of layers with a feed-forward information flow.
Signals flow sequentially through these layers, from the input layer
through to the output layer. Between these two layers are a number of
intermediate layers, which are also known as hidden layers because
they cannot be seen at either the input or the output. Each unit first
calculates the weighted sum of the outputs of the previous layer,
using its vector of weights. To generate the next layer’s input, a
transfer function, also called an activation function, is applied to
the result [44]. RBF, unipolar sigmoid and bipolar sigmoid are
examples of well-known and commonly used activation functions [45].
The main steps of the training phase in an MLP network are the
following. Firstly, an input pattern from the dataset is
forward-propagated to the MLP network’s output, which is then compared
with the desired output. Secondly, the error signal between the
network’s output and the desired response is back-propagated through
the network. Lastly, the synaptic weights are adjusted [46]. The
process is then repeated for the next input vector, and this continues
until all of the training patterns have been passed through the
network.
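The training steps above can be illustrated with a minimal sketch of a single-hidden-layer MLP written in plain Python. It is not the Weka implementation used in the experiments: the unipolar sigmoid activation, the squared-error loss, the learning rate and every function name below are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Unipolar sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    """Forward-propagate one input vector through one hidden layer."""
    # Each unit takes the weighted sum of the previous layer's outputs
    # (the last weight in each row acts as a bias) and applies the activation.
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = [sigmoid(sum(w * h for w, h in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output


def train_step(x, target, w_hidden, w_out, lr=0.5):
    """One forward pass, error back-propagation and weight adjustment."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output layer (squared-error loss, sigmoid units).
    delta_out = [(o - t) * o * (1.0 - o) for o, t in zip(output, target)]
    # Back-propagate the error signal to the hidden layer.
    delta_hidden = [
        h * (1.0 - h) * sum(d * w_out[k][j] for k, d in enumerate(delta_out))
        for j, h in enumerate(hidden)
    ]
    # Adjust the synaptic weights by gradient descent.
    for k, d in enumerate(delta_out):
        for j, h in enumerate(hidden + [1.0]):
            w_out[k][j] -= lr * d * h
    for j, d in enumerate(delta_hidden):
        for i, v in enumerate(x + [1.0]):
            w_hidden[j][i] -= lr * d * v
    return sum((o - t) ** 2 for o, t in zip(output, target))


def make_weights(rows, cols, rng):
    """Small random initial weights."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def train(data, n_hidden=2, epochs=2000, seed=0):
    """Repeat the training step over all patterns for a number of epochs."""
    rng = random.Random(seed)
    n_in, n_out = len(data[0][0]), len(data[0][1])
    w_hidden = make_weights(n_hidden, n_in + 1, rng)
    w_out = make_weights(n_out, n_hidden + 1, rng)
    errors = []
    for _ in range(epochs):
        errors.append(sum(train_step(x, t, w_hidden, w_out) for x, t in data))
    return w_hidden, w_out, errors
```

Calling `train` on a small list of `(input, target)` pairs returns the trained weights together with the summed squared error per epoch, which should fall as the weights are adjusted.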
2) Radial Basis Functions (RBF)
This involves a total of three layers. The first is the input layer,
which is made up of source nodes (or sensory units). The number of
source nodes is equal to the dimension of the input vector. The second
is the hidden layer, which consists of nonlinear units that are
directly connected to every one of the sensory units in the input
layer. The RBF network has only a single hidden layer, which uses RBF
activation functions. Lastly, the output layer linearly combines the
hidden layer’s outputs to give the network’s response to the input
data [47].
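As a rough sketch of this three-layer structure, the following plain-Python forward pass uses Gaussian hidden units and a linear output layer. The Gaussian form, the shared `width` parameter and the function names are assumptions made for illustration, not details taken from the Weka RBF network used in the experiments.

```python
import math


def gaussian_rbf(x, center, width):
    """Gaussian radial basis function of the distance between x and a center."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def rbf_forward(x, centers, width, out_weights, out_bias):
    """Input layer -> single hidden layer of RBF units -> linear output layer."""
    # Every hidden unit is connected to all components of the input vector.
    hidden = [gaussian_rbf(x, c, width) for c in centers]
    # The output layer is a linear combination of the hidden outputs.
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(out_weights, out_bias)
    ]
```

With centers placed at representative training points, an input close to a center activates that hidden unit strongly, and the output weights decide which class that activation votes for.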
3) Support Vector Machine (SVM)
This splits the dataset into two different classes. The normal and
attack classes are separated by placing a linear boundary between them
in a way that maximizes the margin. SVM finds the hyperplane that
provides the maximum distance between the hyperplane and the closest
of the positive and negative samples [48][49]. The basic structure of
the SVM network is similar to that of the ordinary RBF network.
However, a kernel function is applied instead of the exponential
activation function (which is generally Gaussian). This kernel
function can be a polynomial kernel, a Gaussian radial basis kernel,
or a two-layer feed-forward neural network kernel [49].
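The role of the kernel can be sketched with a small decision function in plain Python. The support vectors, multipliers and bias are passed in as given values, since finding them is the job of the SVM training procedure, which is not shown here. All names and default parameters are illustrative assumptions.

```python
import math


def linear_kernel(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def polynomial_kernel(u, v, degree=1, coef0=0.0):
    """Polynomial kernel; with degree=1 and coef0=0 it equals the linear kernel."""
    return (linear_kernel(u, v) + coef0) ** degree


def gaussian_kernel(u, v, gamma=0.5):
    """Gaussian radial basis kernel."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def decision(x, support_vectors, labels, alphas, bias, kernel=linear_kernel):
    """Signed score: sum_i alpha_i * y_i * K(sv_i, x) + b."""
    return sum(
        a * y * kernel(sv, x)
        for sv, y, a in zip(support_vectors, labels, alphas)
    ) + bias


def classify(x, support_vectors, labels, alphas, bias, kernel=linear_kernel):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if decision(x, support_vectors, labels, alphas, bias, kernel) >= 0 else -1
```

For example, with support vectors (1, 1) labeled +1 and (-1, -1) labeled -1, multipliers of 0.25 each and a zero bias, the linear decision score is exactly +1 and -1 at the two support vectors, which is the margin condition for the maximum-margin hyperplane between them.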
B. Dataset
This section gives a brief description of the dataset that is used in
the work. The KDDCUP’99 dataset was prepared by the 1998 DARPA
Intrusion Detection Evaluation program at MIT Lincoln Laboratories
[50]. The literature shows that this dataset has been used widely for
the evaluation of anomaly-based IDS. Many researchers use the
KDDCUP’99 dataset as it is the only publicly available dataset for the
ID problem, and also because it is possible to extract useful
information from it [51] [52]. The full dataset contains around 5
million instances (records), where each data row is a connection
record. A connection is defined in many references as a sequence of
TCP packets that start and end at some time between a source and a
destination under a well-defined protocol [50]-[52].
This dataset contains a number of different attack types, which are
classified into 4 major categories: DoS, R2L, U2R and Probing. The KDD
cup 99 set has a total of 41 attributes (features) for each instance
(sample), plus 1 class label, giving 42 attributes in total. The 41
attributes include destination bytes, count, dst host count, diff srv
rate, wrong fragment and urgent. The 42nd field is a label that can be
generalized as either normal or anomaly (U2R, DoS, Probing and R2L)
[50] [53] (see Table 1).
TABLE 1: TYPES OF ATTACKS IN KDD’99 DATASET
Classification | Short Description | Name of Attacks
DoS | Attacker attempts to deny or prevent legitimate users from using a service. | smurf, land, pod, teardrop, neptune, back
R2L | Attacker attempts to send packets to the victim machine in order to gain access because he does not have an account on it. | ftp_write, phf, spy, warezmaster, warezclient, imap, guess_passwd, multihop
U2R | The attacker tries to exploit some vulnerability to gain root/super user access to the system. | perl, buffer_overflow, rootkit, loadmodule
Probe | The attacker attempts to gather information about a computer network. | portsweep, nmap, ipsweep, satan
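The grouping in Table 1 can be expressed as a simple lookup that maps each attack name to its category. This is a sketch only: the `categorize` helper is hypothetical, and the handling of a trailing period on raw labels is an assumption about how the data file is formatted.

```python
# Attack names and categories exactly as listed in Table 1.
ATTACK_CATEGORY = {
    "smurf": "DoS", "land": "DoS", "pod": "DoS",
    "teardrop": "DoS", "neptune": "DoS", "back": "DoS",
    "ftp_write": "R2L", "phf": "R2L", "spy": "R2L", "warezmaster": "R2L",
    "warezclient": "R2L", "imap": "R2L", "guess_passwd": "R2L", "multihop": "R2L",
    "perl": "U2R", "buffer_overflow": "U2R", "rootkit": "U2R", "loadmodule": "U2R",
    "portsweep": "Probe", "nmap": "Probe", "ipsweep": "Probe", "satan": "Probe",
}


def categorize(label: str) -> str:
    """Map a raw class label to Normal, DoS, R2L, U2R or Probe."""
    # Assumption: raw labels may carry whitespace or a trailing period.
    name = label.strip().rstrip(".").lower()
    if name == "normal":
        return "Normal"
    return ATTACK_CATEGORY.get(name, "Unknown")
```

Applying `categorize` to the 42nd field of each record reduces the individual attack names to the five classes used in the confusion matrices.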
Tavallaee et al. [26] highlighted that the features of KDD’99 can be
categorized into three groups: Basic features, Content features and
Traffic features. Basic features encapsulate all of the attributes
that are extracted from a TCP/IP connection. The majority of these
features can help to detect the major causes of network delays.
The second class is the Traffic features. These are computed over a
window interval and are divided into two groups, “same host” features
and “same service” features. They are, therefore, called time-based
features. “Same host” features examine the connections in the previous
two seconds that have the same target host as the current connection.
“Same service” features examine the connections in the previous two
seconds that have the same service as the current connection. The last
class is the Content features, which help to detect U2R and R2L
attacks. This is because these types of attacks do not have a
well-defined structure or pattern. Content features therefore include
features that enable an IDS to detect suspicious behavior in the data
portion, such as the number of failed login attempts [26].
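The two-second window behind the time-based Traffic features can be sketched as follows. The `Connection` record and its field names are hypothetical, and the code only counts matching connections. It does not reproduce the full set of derived KDD’99 features.

```python
from typing import List, NamedTuple, Tuple


class Connection(NamedTuple):
    time: float      # connection timestamp in seconds (hypothetical field)
    dst_host: str    # target host of the connection
    service: str     # network service, e.g. "http"


def window_counts(history: List[Connection], current: Connection,
                  window: float = 2.0) -> Tuple[int, int]:
    """Count earlier connections within `window` seconds that share the
    current connection's target host or service."""
    recent = [c for c in history if 0.0 <= current.time - c.time <= window]
    same_host = sum(1 for c in recent if c.dst_host == current.dst_host)
    same_service = sum(1 for c in recent if c.service == current.service)
    return same_host, same_service
```

A sudden jump in the same-host count within the window is the kind of pattern such features are meant to expose, for example during a flood of connections to one target.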
IV. THE PROPOSED SYSTEM
The focus of this research work is on the original “10% KDD 99”
dataset because of limited memory capacity. The system flow for the
proposed IDS is shown in Figure 2. The original “10% KDD 99” dataset
is first loaded into the system. The next step is pre-processing, in
which the input file is properly prepared.
Figure 2: Block diagram of the proposed IDS
Figure 3: Original KDD'99 10% dataset distributions
In this step, a total of 9000 instances are randomly selected from the
KDD CUP 1999 10% dataset with nearly the same class distribution as
the full 10% dataset, as Figure 3 and Figure 4 illustrate. Two phases
are performed after that. The first is the training (learning) phase,
which enables the intelligent system to build up the right knowledge
base: the IDS learns the relationships that exist in the training
dataset. This training phase is seen as an adaptation process that
prepares the IDS to give the best response during the next phase,
which is the testing phase.
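One way to draw a random subset that keeps nearly the same class distribution is proportional (stratified) sampling, sketched below. The paper does not state the exact selection procedure, so this is only an assumed approach, and the function name and the rule of keeping at least one record per class are illustrative choices.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(labels, n_samples, seed=0):
    """Return indices of about n_samples records, keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    total = len(labels)
    chosen = []
    for label, indices in by_class.items():
        # Proportional share, with at least one record per class (an assumption).
        k = max(1, round(n_samples * len(indices) / total))
        chosen.extend(rng.sample(indices, min(k, len(indices))))
    return chosen
```

Because each class share is rounded separately, the result can differ from `n_samples` by a few records when the proportions do not divide evenly.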
In this phase, the intelligent system receives a different dataset,
the testing dataset, and processes it to produce an output. To test
and evaluate the MLP, RBF and Linear-SVM algorithms, 5-fold
cross-validation is used as the test option. The dataset is split into
5 subsets and, in each run, one of these five subsets is used as the
test set and the other four subsets as the training set.
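The following sketch shows the conventional k-fold scheme, in which each subset is held out once for testing while the remaining subsets are used for training. It illustrates the idea only, and the function name and shuffling step are assumptions rather than a description of Weka's internal procedure.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k nearly equal folds.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test
```

For 9000 instances and five folds, each run trains on 7200 instances and tests on the remaining 1800, and every instance is tested exactly once across the five runs.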
In order to evaluate the algorithms’ effectiveness for IDS, three
experiments are carried out. The WEKA simulator version 3.6 [54] is
used for the classification process. That is, the RBF, MLP and SVM
algorithms available in the Weka simulator are employed. The Weka
system’s default settings are used for the parameters of the
algorithms, except for the number of cross-validation folds, which we
set to five.
V. DISCUSSION OF RESULTS
The confusion matrix is used to measure the performance of the three
intelligent algorithms [55][56]. It provides a visualization of how
the classifier performs on the input dataset. A number of different
performance metrics, including recall, accuracy and specificity, are
derived from the confusion matrix. Table 2 shows the structure of this
matrix. The 4 possible outcomes are true positive (TP), false positive
(FP), false negative (FN) and true negative (TN) [51][57].
TABLE 2: CONFUSION MATRIX
Actual class | Predicted Positive | Predicted Negative
Positive | TP | FN
Negative | FP | TN
Figure 4: Selected dataset distributions
We evaluated these algorithms by using accuracy as the
performance metric in this study. Accuracy in this instance
represents the overall correctness of the intelligent
classification of the dataset. It is given by:
$$\text{Accuracy} = \frac{TP + TN}{TN + TP + FP + FN}$$
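As a small illustration of how accuracy, recall and specificity are derived from the four confusion-matrix outcomes, the sketch below computes them from raw counts. The function name is hypothetical, and any counts passed to it in an example are made up rather than taken from the experiments.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Derive common metrics from the four confusion-matrix outcomes."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,      # overall correctness
        "recall": tp / (tp + fn),           # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
    }
```

For instance, 90 true positives, 5 false positives, 10 false negatives and 95 true negatives give an accuracy of 185/200, a recall of 90/100 and a specificity of 95/100.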
As shown in Figure 5, the results obtained from our dataset compare
the three classifiers.
Figure 5: Accuracy comparison between MLP, RBF and Linear-SVM as
classifiers, with 5-fold cross-validation as the test option, over our
selected dataset
If we compare RBF, MLP and SVM (linear kernel) under the 5-fold
cross-validation test option, SVM with a linear kernel has the highest
percentage of correctly identified instances:
(1817+7010+114+37+8)/9000*100 = 99.84%. The second highest is RBF, at
around 99.64%. MLP has the lowest, at 98.98%.
It is worth noting that, in terms of the average time to build the
model, RBF proves to be much faster than MLP, as its hidden layer is
computed through a single function rather than a series of weights, as
is the case with MLP. We can therefore conclude that Linear-SVM
provides the highest accuracy and the lowest error rate, and that it
performed best of the three classifiers in detecting these attacks. In
addition, Linear-SVM is the quickest classifier in terms of building
the detection model, compared to both RBF and MLP.
Figure 6 and Figure 7 show the accuracy and error rates, respectively,
for the three algorithms. Tables 3, 4 and 5 show the confusion
matrices for MLP, RBF and Linear-SVM, respectively.
Figure 6: Overall accuracy rate for the three intelligent algorithms
Figure 7: Overall error rate for the three intelligent algorithms
TABLE 3: CONFUSION MATRIX FOR MLP AS CLASSIFIER OVER OUR SELECTED DATASET
Class | Normal | DoS | Probe | R2L | U2R | Accuracy %
Normal | 1815 | 4 | 1 | 0 | 1 | 99.67
DoS | 7 | 7004 | 2 | 0 | 0 | 99.87
Probe | 24 | 22 | 68 | 0 | 0 | 59.65
R2L | 9 | 7 | 9 | 16 | 0 | 39.02
U2R | 1 | 4 | 0 | 0 | 6 | 54.55
The confusion matrices show the number of instances that have been
assigned to each class, that is, how many instances of each class
received each classification. The sum of the diagonal entries is the
number of samples that are correctly classified. For example, the
total number of samples that MLP classified correctly is the sum of
1815, 7004, 68, 16 and 6.
TABLE 4: CONFUSION MATRIX FOR RBF NETWORK AS CLASSIFIER OVER OUR SELECTED DATASET
Class | Normal | DoS | Probe | R2L | U2R | Accuracy %
Normal | 1809 | 1 | 5 | 5 | 1 | 99.34
DoS | 3 | 7010 | 0 | 0 | 0 | 99.96
Probe | 8 | 0 | 106 | 0 | 0 | 92.98
R2L | 3 | 0 | 0 | 35 | 3 | 85.37
U2R | 2 | 0 | 0 | 1 | 8 | 72.73
TABLE 5: CONFUSION MATRIX FOR LINEAR-SVM AS CLASSIFIER OVER OUR SELECTED DATASET
Class | Normal | DoS | Probe | R2L | U2R | Accuracy %
Normal | 1817 | 3 | 0 | 0 | 1 | 99.78
DoS | 3 | 7010 | 0 | 0 | 0 | 99.96
Probe | 0 | 0 | 114 | 0 | 0 | 100
R2L | 3 | 0 | 0 | 37 | 1 | 90.24
U2R | 2 | 0 | 0 | 1 | 8 | 72.73
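The overall and per-class accuracies in Table 5 can be rechecked directly from the matrix entries. The sketch below copies the counts from Table 5 and assumes that each row holds the instances of one class, which is consistent with the row totals. The function names are illustrative.

```python
CLASSES = ["Normal", "DoS", "Probe", "R2L", "U2R"]

# Counts copied from Table 5 (Linear-SVM).
SVM_MATRIX = [
    [1817, 3, 0, 0, 1],
    [3, 7010, 0, 0, 0],
    [0, 0, 114, 0, 0],
    [3, 0, 0, 37, 1],
    [2, 0, 0, 1, 8],
]


def overall_accuracy(matrix):
    """Percentage of instances on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


def per_class_accuracy(matrix):
    """Percentage of each row's instances that fall on the diagonal."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]
```

Rounded to two decimals, `overall_accuracy(SVM_MATRIX)` gives 99.84 and `per_class_accuracy(SVM_MATRIX)` gives 99.78, 99.96, 100.0, 90.24 and 72.73, matching the reported values.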
VI. CONCLUSIONS AND FUTURE WORK
Network intrusion detection has recently become an area of rapid
advancement. Similar advances in intelligent computing have led to
several classification techniques being introduced to identify network
traffic and separate it into anomalous and normal traffic. Intrusion
detection based on computational intelligence has been attracting much
interest from the research community. Its characteristics, including
adaptation, high computational speed, fault tolerance and error
resilience in the face of noisy information, fit the requirements for
building a good intrusion detection system.
In this paper, we have explained the need to apply intelligent
algorithms to network events in order to classify network attack
events. In particular, the performance of three intelligent
algorithms, MLP, RBF and Linear-SVM, was evaluated on an adapted KDD
1999 dataset. This was done through both simulation and a comparison
study. The results obtained reveal that SVM with a linear kernel
performs better than MLP and the RBF network at detecting attacks,
achieving better accuracy and a lower error rate.
The experiments show that Linear-SVM is an efficient algorithm that
is able to detect various kinds of network intrusions and attacks,
such as DoS, Probe, U2R and R2L. Linear-SVM has the best detection
accuracy across the different types of attacks and, therefore, the
lowest error rate of the three algorithms.
As future work, we intend to evaluate SVM on other benchmark
datasets. In addition, we will conduct a performance comparison
between SVM with different kernels, such as Gaussian or sigmoid
kernels. This will be done to find the kernel function for SVM that
gives the best attack detection rate for building an IDS.
REFERENCES
[1] Depren, O., Topallar, M., Anarim, E., & Ciliz, M. K. (2005). An
intelligent intrusion detection system (IDS) for anomaly and misuse
detection in computer networks. Expert systems with
Applications, 29(4), 713-722.
[2] MeeraGandhi, G. (2010). Machine learning approach for attack
prediction and classification using supervised learning algorithms. Int. J.
Comput. Sci. Commun, 1(2).
[3] Nguyen, H. A., & Choi, D. (2008). Application of data mining to
network intrusion detection: classifier selection model. In Asia-Pacific
Network Operations and Management Symposium (pp. 399-408).
Springer Berlin Heidelberg.
[4] Subramanian, S., Srinivasan, V. B., & Ramasa, C. (2012). Study on
classification algorithms for network intrusion systems. Journal of
Communication and Computer, 9(11), 1242-1246.
[5] Li, M., & Dongliang, W. (2009). Anormaly intrusion detection based on
SOM. In Information Engineering, 2009. ICIE'09. WASE International
Conference on (Vol. 1, pp. 40-43). IEEE.
[6] Summers, R. C. (1997). Secure computing: threats and safeguards.
McGraw-Hill, Inc.
[7] Peddabachigari, S., Abraham, A., Grosan, C., & Thomas, J. (2007).
Modeling intrusion detection system using hybrid intelligent
systems. Journal of network and computer applications, 30(1), 114-132.
[8] Jamali, S., & Jafarzadeh, P. (2011). An intelligent intrusion detection
system by using hierarchically structured learning automata. Neural
Computing and Applications, 1-8.
[9] Wu, S. X., & Banzhaf, W. (2010). The use of computational intelligence
in intrusion detection systems: A review. Applied Soft Computing, 10(1),
1-35.
[10] Sundaram, A. (1996). An introduction to intrusion
detection. Crossroads, 2(4), 3-7.
[11] Chimedtseren, E., Iwai, K., Tanaka, H., & Kurokawa, T. (2014,
December). Intrusion detection system using Discrete Fourier
Transform. In Computational Intelligence for Security and Defense
Applications (CISDA), 2014 Seventh IEEE Symposium on (pp. 1-5).
IEEE.
[12] Igbe, O., Darwish, I., & Saadawi, T. (2016). Distributed Network
Intrusion Detection Systems: An Artificial Immune System Approach.
In Connected Health: Applications, Systems and Engineering
Technologies (CHASE), 2016 IEEE First International Conference
on (pp. 101-106). IEEE.
[13] Di Pietro, R., & Mancini, L. V. (Eds.). (2008). Intrusion detection
systems (Vol. 38). Springer Science & Business Media.
[14] Kulothungan, K., Ganapathy, S., Yogesh, P., & Kannan, A. An Agent
based Intrusion Detection System for Wireless Sensor Networks Using
Multilevel Classification. International Journal of Modern Engineering
Research (IJMER), 1(2), 55-60.
[15] Anderson, J. A. (1995). An introduction to neural networks. MIT
Press.
[16] Rhodes, B. C., Mahaffey, J. A., & Cannady, J. D. (2000). Multiple self-
organizing maps for intrusion detection. In Proceedings of the 23rd
national information systems security conference (pp. 16-19).
[17] Al-Yaseen, W. L., Othman, Z. A., & Nazri, M. Z. A. (2017). Multi-level
hybrid support vector machine and extreme learning machine based on
modified K-means for intrusion detection system. Expert Systems with
Applications, 67, 296-303.
[18] Chen, C. M., Chen, Y. L., & Lin, H. C. (2010). An efficient network
intrusion detection. Computer Communications, 33(4), 477-484.
[19] Deepa, A. J., & Kavitha, V. (2012). A comprehensive survey on
approaches to intrusion detection system. Procedia Engineering, 38,
2063-2069.
[20] Thaseen, S., & Kumar, C. A. (2013). An analysis of supervised tree
based classifiers for intrusion detection system. In Pattern Recognition,
Informatics and Mobile Engineering (PRIME), 2013 International
Conference on (pp. 294-299). IEEE.
[21] Feng, W., Zhang, Q., Hu, G., & Huang, J. X. (2014). Mining network
data for intrusion detection through combining SVMs with ant colony
networks. Future Generation Computer Systems, 37, 127-140.
[22] Kuang, F., Xu, W., & Zhang, S. (2014). A novel hybrid KPCA and SVM
with GA model for intrusion detection. Applied Soft Computing, 18,
178-184.
[23] Horng, S. J., Su, M. Y., Chen, Y. H., Kao, T. W., Chen, R. J., Lai, J. L.,
& Perkasa, C. D. (2011). A novel intrusion detection system based on
hierarchical clustering and support vector machines. Expert Systems with
Applications, 38(1), 306-313.
[24] Hasan, M., Nasser, M., Pal, B., & Ahmad, S. (2013). Intrusion detection
using combination of various kernels based support vector
machine. International Journal of Scientific & Engineering
Research, 4(9), 1454-1463.
[25] Mukkamala, S., Sung, A. H., & Abraham, A. (2003). Intrusion detection
using ensemble of soft computing paradigms. In Intelligent systems
design and applications (pp. 239-248). Springer Berlin Heidelberg.
[26] Tavallaee, M., Bagheri, E., Lu, W., & Ghorbani, A. A. (2009). A
detailed analysis of the KDD CUP 99 data set. In Computational
Intelligence for Security and Defense Applications, 2009. CISDA 2009.
IEEE Symposium on (pp. 1-6). IEEE.
[27] Hassim, Y. M. M., & Ghazali, R. (2012). Training a functional link
neural network using an artificial bee colony for solving a classification
problems. arXiv preprint arXiv:1212.6922.
[28] Pal, A. K., & Pal, S. (2013). Classification model of prediction for
placement of students. International Journal of Modern Education and
Computer Science, 5(11), 49.
[29] Purnami, S. W., Zain, J. M., & Heriawan, T. (2011). An alternative
algorithm for classification large categorical dataset: k-mode clustering
reduced support vector machine. International Journal of Database
Theory and Application, 4(1), 19-30.
[30] Barnawi, A. Y., & Keshta, I. M. (2014). Energy management of wireless
sensor networks based on multi-layer perceptrons. In European Wireless
2014; 20th European Wireless Conference; Proceedings of (pp. 1-6).
VDE.
[31] Barnawi, A. Y., & Keshta, I. M. (2016). Energy Management in
Wireless Sensor Networks Based on Naive Bayes, MLP, and SVM
Classifications: A Comparative Study. Journal of Sensors, 2016.
[32] Gupta, G. R., & Ramanathan, P. (2007). Level set estimation using
uncoordinated mobile sensors. In International Conference on Ad-Hoc
Networks and Wireless (pp. 101-114). Springer Berlin Heidelberg.
[33] Magno, M., Brunelli, D., Zappi, P., & Benini, L. (2010). Energy
efficient cooperative multimodal ambient monitoring. In European
Conference on Smart Sensing and Context (pp. 56-70). Springer Berlin
Heidelberg.
[34] Sazonov, E. S., & Fontana, J. M. (2012). A sensor system for automatic
detection of food intake through non-invasive monitoring of
chewing. IEEE Sensors Journal, 12(5), 1340-1348.
[35] Bal, M., Amasyali, M. F., Sever, H., Kose, G., & Demirhan, A. (2014).
Performance evaluation of the machine learning algorithms used in
inference mechanism of a medical decision support system. The
Scientific World Journal, 2014.
[36] Yuan, S., Liang, D., Qiu, L., & Liu, M. (2012). Mobile multi-agent
evaluation method for wireless sensor networks-based large-scale
structural health monitoring. International Journal of Distributed Sensor
Networks.
[37] Chandolikar, N. S., & Nandavadekar, V. D. (2012). Comparative
Analysis of Two Algorithms for Intrusion Attack Classification Using
KDD CUP Dataset. International Journal of Computer Science and
Engineering (IJCSE), 1(1), 81-88.
[38] Panda, M., & Patra, M. R. (2007). Network intrusion detection using
naive Bayes. International Journal of Computer Science and Network
Security (IJCSNS), 7(12), 258-263.
[39] Wang, J., Yang, Q., & Ren, D. (2009). An intrusion detection algorithm
based on decision tree technology. In Information Processing, 2009.
APCIP 2009. Asia-Pacific Conference on (Vol. 2, pp. 333-335). IEEE.
[40] Chandolikar, N. S., & Nandavadekar, V. D. (2012). Efficient algorithm
for intrusion attack classification by analyzing KDD Cup 99. In Wireless
and Optical Communications Networks (WOCN), 2012 Ninth
International Conference on (pp. 1-5). IEEE.
[41] Bhavsar, Y. B., & Waghmare, K. C. (2013). Intrusion detection system
using data mining technique: Support vector machine. International
Journal of Emerging Technology and Advanced Engineering, 3(3), 581-586.
[42] Ektefa, M., Memar, S., Sidi, F., & Affendey, L. S. (2010). Intrusion
detection using data mining techniques. In Information Retrieval &
Knowledge Management (CAMP), 2010 International Conference
on (pp. 200-203). IEEE.
[43] Das, A., & Nayak, R. B. (2012). A divide and conquer feature reduction
and feature selection algorithm in KDD intrusion detection dataset.
In Sustainable Energy and Intelligent Systems (SEISCON 2012), IET
Chennai 3rd International on (pp. 1-4). IET.
[44] Battiti, R., Brunato, M., & Mascia, F. (2008). Reactive search and
intelligent optimization (Vol. 45). Springer Science & Business Media.
[45] Karlik, B., & Olgac, A. V. (2011). Performance analysis of various
activation functions in generalized MLP architectures of neural
networks. International Journal of Artificial Intelligence and Expert
Systems, 1(4), 111-122.
[46] Bouzgou, H., & Benoudjit, N. (2011). Multiple architecture system for
wind speed prediction. Applied Energy, 88(7), 2463-2471.
[47] Haykin, S. S. (2001). Neural networks: a comprehensive foundation.
Tsinghua University Press.
[48] Vapnik, V. (2013). The nature of statistical learning theory. Springer
Science & Business Media.
[49] Bennett, K. P., & Campbell, C. (2000). Support vector machines: hype
or hallelujah?. ACM SIGKDD Explorations Newsletter, 2(2), 1-13.
[50] KDD Cup 1999 Data, Information and Computer Science, University of
California, Irvine. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[51] Bijone, M. (2016). A Survey on Secure Network: Intrusion Detection &
Prevention Approaches. American Journal of Information Systems, 4(3),
69-88.
[52] Ashfaq, R. A. R., Wang, X. Z., Huang, J. Z., Abbas, H., & He, Y. L.
(2017). Fuzziness based semi-supervised learning approach for intrusion
detection system. Information Sciences, 378, 484-497.
[53] Htun, P. T., & Khaing, K. T. (2012). Anomaly Intrusion Detection
System using Random Forests and k-Nearest
Neighbor. Probe, 41102(4107), 2377.
[54] http://www.cs.waikato.ac.nz/ml/weka/
[55] Weiss, S. M., & Indurkhya, N. (1998). Predictive data mining: a
practical guide. Morgan Kaufmann.
[56] Ahmim, A., & Ghoualmi-Zine, N. (2013). A new fast and high
performance intrusion detection system. International Journal of
Security and Its Applications, 7(5), 67-80.
[57] Kim, J., Shin, N., Jo, S. Y., & Kim, S. H. (2017). Method of intrusion
detection using deep neural network. In Big Data and Smart Computing
(BigComp), 2017 IEEE International Conference on (pp. 313-316).
IEEE.
AUTHOR PROFILE
Ismail Keshta is an assistant professor in the Department of Computer
and Information Technology at Dammam Community College (DCC),
King Fahd University of Petroleum and Minerals (KFUPM), Dhahran,
Saudi Arabia. He received the B.Sc. and M.Sc. degrees in computer
engineering and the Ph.D. degree in computer science and engineering
from the King Fahd University of Petroleum and Minerals (KFUPM),
Dhahran, Saudi Arabia, in 2009, 2011, and 2016, respectively. He was a
Lecturer with the Computer Engineering Department, KFUPM, from
2012 to 2016. Prior to that, in 2011, he was a Lecturer with Princess
Nora Bint Abdulrahman University (PNU) and Imam Muhammad ibn
Saud Islamic University (IMAMU), Riyadh, Saudi Arabia. His
research interests include software process improvement, modeling, and
intelligent systems.

More Related Content

Intelligent Intrusion Detection System Based on MLP, RBF and SVM Classification Algorithms: A Comparative Study

  • 1. Intelligent Intrusion Detection System Based on MLP, RBF and SVM Classification Algorithms: A Comparative Study Ismail M. Keshta ismailk@dcc.kfupm.edu.sa dr.ismail.keshta@gmail.com Abstract—An effective approach for tackling network security problems is Intrusion detection systems (IDS). These kind of systems play a key role in network security as they can detect different types of attacks in networks, including DoS, U2R Probe and R2L. In addition, IDS are an increasingly key part of the system’s defense. Various approaches to IDS are now being used, but are unfortunately relatively ineffective. Data mining techniques and artificial intelligence play an important role in security services. We will present a comparative study of three well- known intelligent algorithms in this paper. These are Radial Basis Functions (RBF), Multilayer Perceptrons (MLP) and Support Vector Machine (SVM).This work’s main interest is to benchmark the performance of these3 intelligent algorithms. This is done by using a dataset of about 9,000 connections, randomly chosen from KDD'99’s 10% dataset. In addition, we investigate these algorithms’ performance in terms of their attack classification accuracy. The Simulation results are also analyzed and the discussion is then presented. It has been observed that SVM with a linear kernel (Linear-SVM) gives a better performance than MLP and RBF in terms of its detection accuracy and processing speed. Keywords- Intrusion detection system; Network security; Machine learning; Anomaly detection; KDD Cup 99 I. INTRODUCTION Network security is fast becoming a big challenge. As interconnections among computer systems grow rapidly Computer networks need to be protected against the unauthorized disclosure of information, denial-of-service (DoS) attacks and the modifying or destroying of data [1]. Attack detection techniques have become a critical issue that are being used to secure networks. 
Making a network secure is so difficult for many reasons, including the complexity of computers and networks, a lack of awareness of the various risks and threats, increasing internet usage and the computer system’s vulnerabilities [2][3]. It is vital to note here that detection techniques have become a vital difficulty of open research and so they get given the additional attention of the research community. Furthermore, it is important to state that the network attacks’ complex properties are key issues that work against these detection techniques [4][5]. The traditional techniques, including avoiding any programming errors and firewalls, have not succeeded in fully protecting networks and systems from the dangers of malware and so attacks are becoming increasingly sophisticated [6]. Peddabachigari et al. [7] showed that programming errors can no longer be avoided as the system’s complexity and application software is rapidly evolving, leaving weaknesses that can be exploited. Jamali et al. [8] state that firewalls are not sufficient to give the network total security because they just throttle attacks that come from outside and do not have any effect on the risk of inside attacks. It is likely that computer systems will remain unsecured in the near future. Therefore, IDS have now become a vital and indispensable part of security infrastructure that are used to detect any sophisticated attacks and malware early before they can inflict any wide spread damage [7]-[9]. IDS is, therefore, needed as an extra wall to protect systems despite these prevention techniques. Detection of intrusion is useful in the detection of intrusions that are successful, as well as monitoring bids to break security [10]-[12]. IDS protects computer systems against hateful operations by detecting the violation of security policies and active defenders, including by alarming operators [13]. It particularly helps the network to provide resistance against external attacks [14]. 
Figure 1: Organization of a generalized IDS It is vital to state that many issues need to be considered when building an IDS, including data collection, response, data preprocessing, reporting and intrusion recognition, which is at the heart of it. The organization of an IDS is illustrated in Figure 1. International Journal of Computer Science and Information Security (IJCSIS), Vol. 16, No. 5, May 2018 23 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 2. Existing IDS systems are able to be divided into two categories in general, according to each of the detection approaches, which are anomaly detection and misuse detection [15] [16]. Misuse-based IDS is able to detect known attacks efficiently, but fails to find new attacks which fail to embody the rules in the database [17]. Therefore, a database has to be continuously updated to store the signatures of every attack that is known. This IDS type is obviously unable to detect new attacks unless it is trained to [18]. Anomaly-based IDS can build a normal behavior model and it then distinguishes any major deviations from the model as being an intrusion. This IDS type is able to detect new attacks or unknown ones but it features a high rate of false alarms [19]-[20]. Research efforts have been made to reduce these false alarms by proposing an intelligent IDS that is based on machine learning. A number of anomaly detection systems are developed in literature that are based on a lot of different kinds of machine learning techniques[21]-[25].Some of these studies can apply single learning techniques. But some systems are based on a combination of different learning techniques. Machine learning classification algorithms provide a very promising solution and are able to discover novel attacks that are based on their own features[26]. In addition, they can be utilized to study and then identify correlated data, make decisions, make predictions and classify data[24]-[26]. Algorithms like Multilayer Perceptron (MLP), Radial Basis Functions (RBF) and Support Vector Machine (SVM) are all examples of algorithms which are well-known, widely adopted and have been investigated in neural networks, machine learning and artificial intelligence. MLP can, for a start, successfully perform the classification operation[27][28], while MLP neural network training is hard because of its structure’s complexity [28]. 
SVM is also a very strong algorithm in data mining, which has been applied successfully in a number of scientific applications [29]. Despite how vital machine learning algorithms are for intrusion detection systems[21]-[25], more could be done to provide comparison studies between the algorithms, as little attention has been given to this, particularly when it comes to the designing of an effective IDS for both computer and network systems. Furthermore, little has been done to specify an intelligent IDS that would reduce the anomaly-based detection’s false alarm rate. We conducted in this work a comprehensive and detailed comparative study across a total of 3 intelligent classification algorithms, which are RBF, SVM and MLP, with linear kernel. This is a polynomial kernel with exponent 1, and we chose it to be linear for SVM as Linear-SVM is both efficient and fast. Linear-SVM is able to consume less energy in the course of the learning process in the deployment phase, unlike MLP and RBF [30] [31]. Moreover Gupta and Ramanathan [32], as well as Magno et al. [33], stated that Linear-SVM is a low complexity classifier. Magno et al. [33] also highlighted that Linear-SVM gives a good balance between the computational and memory cost and the percentage of correctly classified data. Sazonov et al. [34] said there were two powerful SVM characteristics, which are high generalization and robustness. In addition, Bal et al. [35] found that SVM with linear kernel is a very promising algorithm that exists in the machine learning field. Yuan et al. [36] also concluded that an SVM classifier, especially one with linear kernel, can both learn and build the knowledge that is needed from less training samples and yet can still provide a high level of classification accuracy, unlike a number of other classifiers such as MLP and RBF. The following are our major contributions in this work. 
Firstly, we provided a number of detailed and state-of-the-art related IDS models, which were based on the intelligent machine learning algorithms. Secondly, we undertook a comprehensive comparison between three intelligent classifiers by using a real benchmark dataset. Thirdly, the performance of all three was examined by utilizing confusion matrix. Lastly, we were able to propose an intelligent IDS framework for effective and efficient IDS management computer and network systems. The framework was addressed at classification level. Utilizing Linear-SVM as an intelligent classifier, it is considered a core element in the building of the framework. We also discussed an evaluation of the proposed framework, and the simulation of results for detecting malicious attacks, like Remote to Local (R2L), Denial of Service (DoS) Attacks, Remote to User (R2L) Attacks and Probing attacks, are all provided. The rest of this paper is organized in the following way. Section II gives a literature review of the recent approaches that have been proposed for IDS that is based on intelligent classification algorithms. Section III highlights some background into the classification algorithms that were utilized in the work, which is RBF, MLB and SVM. It also provides a useful overview of the experimental dataset. The paper’s main contribution is discussed in SectionIV, while simulation experiments and the ways they were setup is illustrated in SectionV. This section also summarizes and discusses the results of the simulation. Lastly, Section VI provides the conclusion of the paper and also highlights any future research directions. II. RELATED WORK There has been a lot of researches into anomaly-based intrusion detection, and some of them have used machine learning, as well as data mining techniques. Decision tree, neural networks, clustering and Bayesian parameter estimation are some techniques that have been used to detect any intrusive behaviors in the computer network. 
Chandolikar et al. [37] evaluated the performance of 2 classification algorithms. These were Bayes net and J48 algorithm, which are both used for detecting computer attacks. The results reveal that J48 learning algorithm was more accurate than Bayes net algorithm in terms of achieving better accuracy and it had a lower error rate. A benchmark was used in the evaluation. This was the KDD cup dataset. It was emphasized that J48 algorithm had a higher accuracy which helps to increase the IDS’ efficiency. The Principal Component Analysis and Naive Bayes classifier was employed by Panda et al. [38] to give them a way of detecting intrusion by using machine learning algorithms. These experiments were carried out ontheKDD’99 cup dataset, an intrusion detection dataset. The dimensionality International Journal of Computer Science and Information Security (IJCSIS), Vol. 16, No. 5, May 2018 24 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 3. of the dataset was reduced by utilizing principal component analysis, as well as the Naïve Bayes classifier classification of the dataset. This was done in both the normal and attack classes. They concluded that the approach they used was a description of a Network Intrusion detection system framework which used two algorithms, Naive Bayes and Principal Component Analysis. The result they obtained showed that their approach was faster compared to a number of the other existing systems. An intrusion detection system was proposed by Wang et al. [39]. This was based on C4.5 decision tree, one of the algorithm-based neural networks. The result revealed that the intrusion detection system was effective and feasible, and had a high rate of accuracy. All of their experiments were conducted on a KDD CUP1999 dataset, a test set that is widely used for intrusion detection fields. The tree that is generated by the C4.5 neural network classification algorithm for intrusion detection was used to build rules. These then can use the knowledge base of IDS. In other words, the rules are able to give an indication if a new network behavior is either normal or abnormal, based on the built knowledge. The J48 intelligent algorithm was utilized by Chandolikar et al. [40] in the experiments they did to make IDS. Their results show that J48 is an effective and efficient algorithm of the classification in the KDD CUP1999 dataset. Yogita et al. [41] proposed IDS that used SVM as a data mining technique. It is vital to mention here that SVM is a very popular classification algorithm. However, they highlighted the main drawback, which is that SVM takes a very long time to train the neural network. These experiments were done by utilizing the NSL-KDD Cup’99 dataset’s improved version of the KDD Cup’99 dataset. They used the Gaussian RBF as the kernel function and a 10-fold cross validation as the test option parameter that was used for SVM. 
In addition, they pointed out that the method based SVM that was proposed was able to increase the accuracy of intrusion detection and cut down on the time taken to build this classification model. The aim of Mohammadreza et al. [42] was to use data mining techniques, which included SVM and the classification tree for IDS. The results reveal that the C4.5 algorithm is better than SVM at detection of any network intrusions. These experiments were carried out on a KDD CUP 99 dataset. Das et al. [43] looked at the IDS at its preprocessing level, which is the level before the classification process, and proposed what is called a divide and conquer algorithm. The aim of this was to reduce the feature set from the large KDD 99 dataset. The proposed algorithm successfully reduced the IDS’s overhead for analyzing the entire KDD dataset. This was done by selecting the vital features and then classifying them all with a maximized rate of classification. It was a generic algorithm and it could be applied to absolutely any dataset. The authors used LDA, KNN, C4.5, SVM and a number of classification algorithms in order to classify the various feature sets that had been obtained. III. PRELIMINARIES This section gives a brief background about the three intelligent algorithms used in this study, as well as about the dataset for the experimental comparison. A. Classification Algorithms The various classification algorithms that were used in the research project are described in brief below. 1) Multilayer Perceptron (MLP) This is composed of a big amount of widely interconnected neurons that all work in parallel in order to solve a particular problem. MLP is organized in a series of layers that have a feed-forward information flow. An MLP network’s main architecture consists of a number of signals which flow sequentially through these various layers, starting with the input layer, through to the output layer. 
Between these two layers are a number of intermediate layers, which are also known as hidden layers because you cannot see them at either the input or the output. Each of the units is first utilized to calculate what the difference is between a vector of weights and a vector provided by the outputs of the previous layer. In order to generate the next layer’s input, a transfer function, which is also called activation, was applied to the result [44]. RBF, unipolar sigmoid and bipolar sigmoid are all examples of activation functions that are both well-known and commonly used.[45]. The training phase’s main steps in an MLP network are the following: Firstly, after being given the dataset’s input pattern, this particular pattern is forward-propagated to the MLP network’s output and it is then compared with the output desired. Secondly, the error signal that exists between the network’s output and the desired response is then back- propagated to the network. Lastly, a number of adjustments are made to the synaptic weights [46]. The process is the repeated for the next input vector and this continues until all of the training patterns have been passed right through the network. 2) Radial Basis Functions (RBF) This involves a total of three layers. The first is called the input layer and it is made up of source nodes (or sensory units). The amount of these source nodes is equal to the input vector’s dimension. The second is the hidden layer, which consists of nonlinear units. These are directly connected to every one of the sensory units in the input layer. The RBF network has only a single hidden layer that has RBF activation functions. Lastly, the output layer is utilized to linearly combine the hidden layer’s outputs and give the network’s response to the input data [47]. 3) Support Vector Machine (SVM) This splits the dataset into two different classes. 
These are separated by placing a linear boundary between both the normal and attack classes in a way that maximizes the margin. SVM finds the hyperplane that is able to provide the maximum distance there is between the hyperplane and the closest of the positive and negative samples [48][49]. The SVM network’s basic structure is similar to the structure of the ordinary RBF network. However, the kernel activating function is applied instead of the exponential activating function (which is generally Gaussian activation functions).This kernel activating function can be either a polynomial kernel, a Gaussian radial International Journal of Computer Science and Information Security (IJCSIS), Vol. 16, No. 5, May 2018 25 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 4. basis kernel, or two layer feed-forward neural network kernels [49]. B. Dataset This section gives a brief description of the dataset that is used in the work. The KDDCUP’99 dataset was prepared by the 1998 DARPA Intrusion Detection Evaluation program by MIT Lincoln Laboratories [50]. It can be seen from the literature that this dataset has been used widely for the evaluation of anomaly based IDS. A lot of researchers are using the KDDCUP’99 dataset as it is the only publicly available dataset for the ID problem, and also because it is possible to extract useful information from it [51] [52]. The full dataset contained around 5 million instance/records. This is where each data raw has its connection records. Connection is defined in many references as a sequence of TCP packets that start and end at some time between a source and a destination under a protocol that is well-defined [50]-[52]. This dataset contains a number of different attack types, which are classified into 4 major categories. These are R2L, DOS, Probing and U2R. The KDD cup 99 set has a total of 41 attributes or features for each instance, or a sample plus 1 class label. The total number is, therefore, 42 attributes. The 41 attributes are destination bytes, count, dst host count, diff srv rate, wrongfragment and urgent. The 42nd field is a label that can be generalized as either normal or anomaly (U2R, DoS, Probing and R2L) [50] [53] (see Table 1). TABLE 1: TYPES OF ATTACKS IN KDD’99 DATASET Classification Short Description Name of Attacks DoS Attacker attempts to deny or prevent legitimate users from using a service. smurf, land, pod, teardrop, neptune, back R2L Attacker attempts to send packets to the victim machine in order to gain access because he does not have an account on it. ftp_write, phf, spy, warezmaster, warezclient, imap, guess_passwd, multihop U2R The attacker tries to exploit some vulnerability to gain root/super user access to the system. 
perl, buffer_overflow, rootkit, loadmodule Probe The attacker attempts to gather information about a computer network. portsweep, nmap, ipsweep, satan Tavallaee et al. [26] highlighted that the features of KDD’99 can be categorize into three different groups. These are Basic features, Content features and Traffic features. Basic features are utilized to encapsulate all of the attributes that have been extracted from the TCP/IP connection. The majority of these features can help to detect the major causes of network delays. There is then a second class, which is the Traffic features. These depend on window interval and they can be divided into 2 major features, which are “same host” features and also “same service” features. They are, therefore, called time-based features. “Same host” features are used to carry out an examination of network connections in the previous two seconds, and they have the same target host as the current connection. "Same service” features are utilized to test the network’s connections and have the same service as the current connection in the previous two seconds. The last of these classes is called Content features, which helps to detect U2R and R2L attacks. This is because these types of attacks do not have either a well-defined structured feature or well- defined pattern. Therefore, Content features have some features that enable IDS to detect any intrusion that is tending to cause or create suspicion in the data portion, like a number of failed log on attempts [26]. IV. THE PROPOSED SYSTEM The focus of this research work is on the original “10% KDD 99” dataset because of the limited memory capacity. The system flow for the proposed IDS is shown in Figure 2. The original “10% KDD 99” dataset is firstly loaded into the system. The next step is pre-processing, in which the input file is properly prepared. Figure 2: Block diagram of the proposed IDS International Journal of Computer Science and Information Security (IJCSIS), Vol. 16, No. 
5, May 2018 26 https://sites.google.com/site/ijcsis/ ISSN 1947-5500
  • 5. Figure 3: Original KDD'99 10% dataset distributions In this step, a total of 9000 instances are randomly selected from 10% of the KDD CUP 1999 dataset with nearly the same distribution as the KDD'99 10% dataset. Figure 3 and Figure 4 clearly highlighted this point. Two phases are performed after that. Firstly, the training/learning phase, which enables the intelligent system to build up the right knowledge base. IDS learns about relationships that exist in the built training dataset. This training phase is seen as an adaptation process to IDS in order to give the best response during the next phase, which is the testing phase. In this phase, the intelligent system will receive different dataset, testing data set and processes it to produce an output. To test and evaluate MLP RBF and Linear-SVM algorithms, a 5-fold cross validation is utilized as a test option. The dataset is split into 5 subsets, and for each running time, one of these five subsets are used as the training set and then the other subsets as the test set. In order to evaluate the algorithms’ effectiveness for IDS, three experiments are carried out. The WEKA simulator version 3.6 [54] is utilized in the classification process. That is, the available algorithms for RBF, MLP and SVM on the Weka simulator are employed. For the Weka parameters of the algorithms, the Weka system’s default settings are utilized, except for the fold cross validation, where we utilized value five. V. DISCUSSION OF RESULTS The confusion matrix is used to measure the three intelligent algorithms’ performance [55][56]. This provides visualization of how the classifier performs on the input dataset. A number of different performance metrics, including recall, accuracy and specificity, are derived from the confusion matrix. Table 2 shows the structure of this matrix. The 4 possible outcomes/cases are true positive (TP), false positive (FP), and false negative (FN) and true negative (TN) [51][57]. 
TABLE 2: CONFUSION MATRIX

                 Predicted class
Actual class     Positive   Negative
Positive         TP         FN
Negative         FP         TN

Figure 4: Selected dataset distributions

In this study, we evaluate the algorithms using accuracy as the performance metric. Accuracy represents the overall correctness of the classification of the dataset. It is given by:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

Figure 5 compares the results obtained on our dataset by the three intrusion detection systems.

Figure 5: Accuracy comparison graph between MLP, RBF and Linear-SVM as classifiers and cross-validation (5 folds) as test option over our selected dataset

Comparing RBF, MLP and SVM (linear kernel) under the 5-fold cross-validation test option, SVM with a linear kernel correctly classifies the largest share of instances: (1817 + 7010 + 114 + 37 + 8) / 9000 × 100 = 99.84%. RBF is second with about 99.64%, and MLP is lowest with 98.98%. It is worth noting that, in terms of the average time to build the model, RBF proves to be much faster than MLP, as its hidden layer is computed through a single function rather than a series of weights, as is the case with MLP. It can therefore be concluded that Linear-SVM provides the highest accuracy and the lowest error rate, and that SVM with a linear kernel performed best of the three classifiers in detecting these attacks; it is more accurate than either RBF or MLP.
  • 6. In addition, Linear-SVM is the quickest classifier in terms of building the detection model, compared with both RBF and MLP. Figure 6 and Figure 7 show the accuracy and error rates, respectively, for the three algorithms. Tables 3, 4 and 5 give the confusion matrices for MLP, RBF and Linear-SVM, respectively.

Figure 6: Overall accuracy rate for the three intelligent algorithms

Figure 7: Overall error rate for the three intelligent algorithms

TABLE 3: CONFUSION MATRIX FOR MLP AS CLASSIFIER OVER OUR SELECTED DATASET

          Normal   DoS    Probe   R2L   U2R   Accuracy %
Normal    1815     4      1       0     1     99.67
DoS       7        7004   2       0     0     99.87
Probe     24       22     68      0     0     59.65
R2L       9        7      9       16    0     39.02
U2R       1        4      0       0     6     54.55

The confusion matrices show the number of instances that have been assigned to each class; for each class, they show how many of its instances received each classification. The sum of the diagonal entries is the number of samples that are correctly classified. For example, the total number of samples correctly classified by MLP is 1815 + 7004 + 68 + 16 + 6 = 8909.

TABLE 4: CONFUSION MATRIX FOR RBF NETWORK AS CLASSIFIER OVER OUR SELECTED DATASET

          Normal   DoS    Probe   R2L   U2R   Accuracy %
Normal    1809     1      5       5     1     99.34
DoS       3        7010   0       0     0     99.96
Probe     8        0      106     0     0     92.98
R2L       3        0      0       35    3     85.37
U2R       2        0      0       1     8     72.73

TABLE 5: CONFUSION MATRIX FOR LINEAR-SVM AS CLASSIFIER OVER OUR SELECTED DATASET

          Normal   DoS    Probe   R2L   U2R   Accuracy %
Normal    1817     3      0       0     1     99.78
DoS       3        7010   0       0     0     99.96
Probe     0        0      114     0     0     100
R2L       3        0      0       37    1     90.24
U2R       2        0      0       1     8     72.73

VI. CONCLUSIONS AND FUTURE WORK
Network intrusion detection has recently become an area of rapid advancement. There have been similar advances in intelligent computing, which have led to the introduction of several classification techniques that classify network traffic as anomalous or normal.
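The per-class and overall figures reported in Table 5 can be cross-checked with a short sketch. It assumes that each row of the confusion matrix is an actual class and each column a predicted class, which is consistent with the row-wise "Accuracy %" column; the function names `per_class_accuracy` and `overall_accuracy` are hypothetical.

```python
CLASSES = ["Normal", "DoS", "Probe", "R2L", "U2R"]

# Counts copied from Table 5 (Linear-SVM).
# Assumed layout: rows = actual class, columns = predicted class.
LINEAR_SVM = [
    [1817, 3, 0, 0, 1],
    [3, 7010, 0, 0, 0],
    [0, 0, 114, 0, 0],
    [3, 0, 0, 37, 1],
    [2, 0, 0, 1, 8],
]


def per_class_accuracy(matrix):
    """Percentage of each actual class (row) that lands on the diagonal."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_accuracy(matrix):
    """Percentage of all instances that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


for name, value in zip(CLASSES, per_class_accuracy(LINEAR_SVM)):
    print(f"{name:<7}{value:7.2f}")
print(f"Overall{overall_accuracy(LINEAR_SVM):7.2f}")
```

The same functions can be applied to the counts in Tables 3 and 4 to compare the three classifiers class by class.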
Intrusion detection based on computational intelligence has attracted much interest from the research community. Its characteristics, including adaptation, high computational speed, fault tolerance, and error resilience in the face of noisy information, fit the requirements for building a good intrusion detection system.
In this paper, we have explained the need to apply intelligent algorithms to network events in order to classify network attack events. In particular, the performance of three intelligent algorithms, MLP, RBF and Linear-SVM, on an adapted KDD 1999 dataset was evaluated through simulation and a comparative study. The results obtained reveal that SVM with a linear kernel performs better than MLP and the RBF network at detecting attacks, achieving higher accuracy and a lower error rate. The experiments show that Linear-SVM is an efficient algorithm that is able to detect various kinds of intrusions/attacks in a network, such as DoS, Probe, U2R and R2L. Linear-SVM has the best overall detection accuracy when detecting different types of attacks and, therefore, the lowest error rate of the three.
As future work, we intend to evaluate SVM on other benchmark datasets. In addition, we will conduct a performance comparison of SVM with different kernels, such as Gaussian or sigmoid kernels, in order to find the kernel that gives the best attack detection rate for building an IDS.
REFERENCES
[1] Depren, O., Topallar, M., Anarim, E., & Ciliz, M. K. (2005). An intelligent intrusion detection system (IDS) for anomaly and misuse detection in computer networks. Expert Systems with Applications, 29(4), 713-722.
  • 7. [2] MeeraGandhi, G. (2010). Machine learning approach for attack prediction and classification using supervised learning algorithms. Int. J. Comput. Sci. Commun, 1(2).
[3] Nguyen, H. A., & Choi, D. (2008). Application of data mining to network intrusion detection: classifier selection model. In Asia-Pacific Network Operations and Management Symposium (pp. 399-408). Springer Berlin Heidelberg.
[4] Subramanian, S., Srinivasan, V. B., & Ramasa, C. (2012). Study on classification algorithms for network intrusion systems. Journal of Communication and Computer, 9(11), 1242-1246.
[5] Li, M., & Dongliang, W. (2009). Anormaly intrusion detection based on SOM. In Information Engineering, 2009. ICIE'09. WASE International Conference on (Vol. 1, pp. 40-43). IEEE.
[6] Summers, R. C. (1997). Secure computing: threats and safeguards. McGraw-Hill, Inc.
[7] Peddabachigari, S., Abraham, A., Grosan, C., & Thomas, J. (2007). Modeling intrusion detection system using hybrid intelligent systems. Journal of Network and Computer Applications, 30(1), 114-132.
[8] Jamali, S., & Jafarzadeh, P. (2011). An intelligent intrusion detection system by using hierarchically structured learning automata. Neural Computing and Applications, 1-8.
[9] Wu, S. X., & Banzhaf, W. (2010). The use of computational intelligence in intrusion detection systems: A review. Applied Soft Computing, 10(1), 1-35.
[10] Sundaram, A. (1996). An introduction to intrusion detection. Crossroads, 2(4), 3-7.
[11] Chimedtseren, E., Iwai, K., Tanaka, H., & Kurokawa, T. (2014, December). Intrusion detection system using Discrete Fourier Transform. In Computational Intelligence for Security and Defense Applications (CISDA), 2014 Seventh IEEE Symposium on (pp. 1-5). IEEE.
[12] Igbe, O., Darwish, I., & Saadawi, T. (2016). Distributed Network Intrusion Detection Systems: An Artificial Immune System Approach.
In Connected Health: Applications, Systems and Engineering Technologies (CHASE), 2016 IEEE First International Conference on (pp. 101-106). IEEE.
[13] Di Pietro, R., & Mancini, L. V. (Eds.). (2008). Intrusion detection systems (Vol. 38). Springer Science & Business Media.
[14] Kulothungan, K., Ganapathy, S., Yogesh, P., & Kannan, A. An Agent based Intrusion Detection System for Wireless Sensor Networks Using Multilevel Classification. International Journal of Modern Engineering Research (IJMER), 1(2), 55-60.
[15] Anderson, J. A. (1995). An Introduction to Neural Networks. MIT Press.
[16] Rhodes, B. C., Mahaffey, J. A., & Cannady, J. D. (2000). Multiple self-organizing maps for intrusion detection. In Proceedings of the 23rd National Information Systems Security Conference (pp. 16-19).
[17] Al-Yaseen, W. L., Othman, Z. A., & Nazri, M. Z. A. (2017). Multi-level hybrid support vector machine and extreme learning machine based on modified K-means for intrusion detection system. Expert Systems with Applications, 67, 296-303.
[18] Chen, C. M., Chen, Y. L., & Lin, H. C. (2010). An efficient network intrusion detection. Computer Communications, 33(4), 477-484.
[19] Deepa, A. J., & Kavitha, V. (2012). A comprehensive survey on approaches to intrusion detection system. Procedia Engineering, 38, 2063-2069.
[20] Thaseen, S., & Kumar, C. A. (2013). An analysis of supervised tree based classifiers for intrusion detection system. In Pattern Recognition, Informatics and Mobile Engineering (PRIME), 2013 International Conference on (pp. 294-299). IEEE.
[21] Feng, W., Zhang, Q., Hu, G., & Huang, J. X. (2014). Mining network data for intrusion detection through combining SVMs with ant colony networks. Future Generation Computer Systems, 37, 127-140.
[22] Kuang, F., Xu, W., & Zhang, S. (2014). A novel hybrid KPCA and SVM with GA model for intrusion detection. Applied Soft Computing, 18, 178-184.
[23] Horng, S. J., Su, M. Y., Chen, Y. H., Kao, T. W., Chen, R. J., Lai, J.
L., & Perkasa, C. D. (2011). A novel intrusion detection system based on hierarchical clustering and support vector machines. Expert Systems with Applications, 38(1), 306-313.
[24] Hasan, M., Nasser, M., Pal, B., & Ahmad, S. (2013). Intrusion detection using combination of various kernels based support vector machine. International Journal of Scientific & Engineering Research, 4(9), 1454-1463.
[25] Mukkamala, S., Sung, A. H., & Abraham, A. (2003). Intrusion detection using ensemble of soft computing paradigms. In Intelligent systems design and applications (pp. 239-248). Springer Berlin Heidelberg.
[26] Tavallaee, M., Bagheri, E., Lu, W., & Ghorbani, A. A. (2009). A detailed analysis of the KDD CUP 99 data set. In Computational Intelligence for Security and Defense Applications, 2009. CISDA 2009. IEEE Symposium on (pp. 1-6). IEEE.
[27] Hassim, Y. M. M., & Ghazali, R. (2012). Training a functional link neural network using an artificial bee colony for solving a classification problems. arXiv preprint arXiv:1212.6922.
[28] Pal, A. K., & Pal, S. (2013). Classification model of prediction for placement of students. International Journal of Modern Education and Computer Science, 5(11), 49.
[29] Purnami, S. W., Zain, J. M., & Heriawan, T. (2011). An alternative algorithm for classification large categorical dataset: k-mode clustering reduced support vector machine. International Journal of Database Theory and Application, 4(1), 19-30.
[30] Barnawi, A. Y., & Keshta, I. M. (2014). Energy management of wireless sensor networks based on multi-layer perceptrons. In European Wireless 2014; 20th European Wireless Conference; Proceedings of (pp. 1-6). VDE.
[31] Barnawi, A. Y., & Keshta, I. M. (2016). Energy Management in Wireless Sensor Networks Based on Naive Bayes, MLP, and SVM Classifications: A Comparative Study. Journal of Sensors, 2016.
[32] Gupta, G. R., & Ramanathan, P. (2007). Level set estimation using uncoordinated mobile sensors.
In International Conference on Ad-Hoc Networks and Wireless (pp. 101-114). Springer Berlin Heidelberg.
[33] Magno, M., Brunelli, D., Zappi, P., & Benini, L. (2010). Energy efficient cooperative multimodal ambient monitoring. In European Conference on Smart Sensing and Context (pp. 56-70). Springer Berlin Heidelberg.
[34] Sazonov, E. S., & Fontana, J. M. (2012). A sensor system for automatic detection of food intake through non-invasive monitoring of chewing. IEEE Sensors Journal, 12(5), 1340-1348.
[35] Bal, M., Amasyali, M. F., Sever, H., Kose, G., & Demirhan, A. (2014). Performance evaluation of the machine learning algorithms used in inference mechanism of a medical decision support system. The Scientific World Journal, 2014.
[36] Yuan, S., Liang, D., Qiu, L., & Liu, M. (2012). Mobile multi-agent evaluation method for wireless sensor networks-based large-scale structural health monitoring. International Journal of Distributed Sensor Networks.
[37] Chandolikar, N. S., & Nandavadekar, V. D. (2012). Comparative Analysis of Two Algorithms for Intrusion Attack Classification Using KDD CUP Dataset. International Journal of Computer Science and Engineering (IJCSE), 1(1), 81-88.
[38] Panda, M., & Patra, M. R. (2007). Network intrusion detection using naive bayes. International Journal of Computer Science and Network Security (IJCSNS), 7(12), 258-263.
[39] Wang, J., Yang, Q., & Ren, D. (2009). An intrusion detection algorithm based on decision tree technology. In Information Processing, 2009. APCIP 2009. Asia-Pacific Conference on (Vol. 2, pp. 333-335). IEEE.
[40] Chandolikar, N. S., & Nandavadekar, V. D. (2012). Efficient algorithm for intrusion attack classification by analyzing KDD Cup 99. In Wireless and Optical Communications Networks (WOCN), 2012 Ninth International Conference on (pp. 1-5). IEEE.
[41] Bhavsar, Y. B., & Waghmare, K. C. (2013). Intrusion detection system using data mining technique: Support vector machine.
International Journal of Emerging Technology and Advanced Engineering, 3(3), 581-586.
[42] Ektefa, M., Memar, S., Sidi, F., & Affendey, L. S. (2010). Intrusion detection using data mining techniques. In Information Retrieval & Knowledge Management (CAMP), 2010 International Conference on (pp. 200-203). IEEE.
  • 8. [43] Das, A., & Nayak, R. B. (2012). A divide and conquer feature reduction and feature selection algorithm in KDD intrusion detection dataset. In Sustainable Energy and Intelligent Systems (SEISCON 2012), IET Chennai 3rd International on (pp. 1-4). IET.
[44] Battiti, R., Brunato, M., & Mascia, F. (2008). Reactive search and intelligent optimization (Vol. 45). Springer Science & Business Media.
[45] Karlik, B., & Olgac, A. V. (2011). Performance analysis of various activation functions in generalized MLP architectures of neural networks. International Journal of Artificial Intelligence and Expert Systems, 1(4), 111-122.
[46] Bouzgou, H., & Benoudjit, N. (2011). Multiple architecture system for wind speed prediction. Applied Energy, 88(7), 2463-2471.
[47] Haykin, S. S. (2001). Neural networks: a comprehensive foundation. Tsinghua University Press.
[48] Vapnik, V. (2013). The nature of statistical learning theory. Springer Science & Business Media.
[49] Bennett, K. P., & Campbell, C. (2000). Support vector machines: hype or hallelujah? ACM SIGKDD Explorations Newsletter, 2(2), 1-13.
[50] KDD Cup 1999 Data, Information and Computer Science, University of California, Irvine. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[51] Bijone, M. (2016). A Survey on Secure Network: Intrusion Detection & Prevention Approaches. American Journal of Information Systems, 4(3), 69-88.
[52] Ashfaq, R. A. R., Wang, X. Z., Huang, J. Z., Abbas, H., & He, Y. L. (2017). Fuzziness based semi-supervised learning approach for intrusion detection system. Information Sciences, 378, 484-497.
[53] Htun, P. T., & Khaing, K. T. (2012). Anomaly Intrusion Detection System using Random Forests and k-Nearest Neighbor. Probe, 41102(4107), 2377.
[54] http://www.cs.waikato.ac.nz/ml/weka/
[55] Weiss, S. M., & Indurkhya, N. (1998). Predictive data mining: a practical guide. Morgan Kaufmann.
[56] Ahmim, A., & Ghoualmi-Zine, N. (2013). A new fast and high performance intrusion detection system. International Journal of Security and Its Applications, 7(5), 67-80.
[57] Kim, J., Shin, N., Jo, S. Y., & Kim, S. H. (2017). Method of intrusion detection using deep neural network. In Big Data and Smart Computing (BigComp), 2017 IEEE International Conference on (pp. 313-316). IEEE.
AUTHOR PROFILE
Ismail Keshta is an assistant professor in the Department of Computer and Information Technology at Dammam Community College (DCC), King Fahd University of Petroleum and Minerals (KFUPM), Dhahran, Saudi Arabia. He received the B.Sc. and M.Sc. degrees in computer engineering and the Ph.D. degree in computer science and engineering from KFUPM in 2009, 2011, and 2016, respectively. He was a Lecturer with the Computer Engineering Department, KFUPM, from 2012 to 2016. Prior to that, in 2011, he was a Lecturer with Princess Nora Bint Abdulrahman University (PNU) and Imam Muhammad ibn Saud Islamic University (IMAMU), Riyadh, Saudi Arabia. His research interests include software process improvement, modeling, and intelligent systems.