Intelligent Intrusion Detection System Based on
MLP, RBF and SVM Classification Algorithms: A
Comparative Study
Ismail M. Keshta
ismailk@dcc.kfupm.edu.sa
dr.ismail.keshta@gmail.com
Abstract—An effective approach for tackling network security
problems is the use of intrusion detection systems (IDS). These kinds of
systems play a key role in network security as they can detect
different types of attacks in networks, including DoS, U2R, Probe
and R2L. In addition, IDS are an increasingly key part of the
system’s defense. Various approaches to IDS are now being used,
but are unfortunately relatively ineffective. Data mining techniques
and artificial intelligence play an important role in security
services. In this paper, we present a comparative study of three
well-known intelligent algorithms. These are Radial Basis
Functions (RBF), Multilayer Perceptrons (MLP) and Support
Vector Machine (SVM). The main aim of this work is to benchmark
the performance of these three intelligent algorithms. This is done by
using a dataset of about 9,000 connections, randomly chosen from
the KDD'99 10% dataset. In addition, we investigate these
algorithms’ performance in terms of their attack classification
accuracy. The simulation results are analyzed and
discussed. It has been observed that SVM with a
linear kernel (Linear-SVM) gives a better performance than MLP
and RBF in terms of its detection accuracy and processing speed.
Keywords- Intrusion detection system; Network security; Machine
learning; Anomaly detection; KDD Cup 99
I. INTRODUCTION
Network security is fast becoming a big challenge. As
interconnections among computer systems grow rapidly,
computer networks need to be protected against the
unauthorized disclosure of information, denial-of-service (DoS)
attacks and the modification or destruction of data [1].
Attack detection techniques have become critical for
securing networks. Making a network
secure is difficult for many reasons, including the
complexity of computers and networks, a lack of awareness of
the various risks and threats, increasing internet usage and the
computer system’s vulnerabilities [2][3]. Detection techniques
have therefore become an important open research problem
and are receiving additional attention from the
research community. Furthermore, the complex properties
of network attacks are key issues that
work against these detection techniques [4][5].
Traditional techniques, such as firewalls and the avoidance
of programming errors, have not succeeded in fully
protecting networks and systems from the dangers of malware,
while attacks are becoming increasingly sophisticated [6].
Peddabachigari et al. [7] showed that programming errors can
no longer be avoided as the complexity of systems and
application software is rapidly growing, leaving weaknesses
that can be exploited. Jamali et al. [8] state that firewalls are
not sufficient to give the network total security because they
only block attacks that come from outside and do not have any
effect on the risk of inside attacks. It is likely that computer
systems will remain unsecured in the near future.
Therefore, IDS have now become a vital and indispensable
part of the security infrastructure, used to detect
sophisticated attacks and malware early, before they can inflict
any widespread damage [7]-[9]. An IDS is, therefore, needed as an
extra wall to protect systems in addition to these prevention
techniques. Intrusion detection is useful both for detecting
intrusions that are successful and for monitoring attempts to
break security [10]-[12]. An IDS protects computer systems
against malicious operations by detecting violations of security
policies and by acting as an active defender, for example by
alerting operators [13]. It particularly helps the network to resist
external attacks [14].
Figure 1: Organization of a generalized IDS
Many issues need to be considered
when building an IDS, including data collection, data
preprocessing, intrusion recognition (which is at
the heart of it), reporting and response. The organization of an IDS is illustrated in
Figure 1.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 16, No. 5, May 2018
23 https://sites.google.com/site/ijcsis/
ISSN 1947-5500
Existing IDS can, in general, be divided into two
categories according to their detection
approach: anomaly detection and misuse detection
[15] [16]. A misuse-based IDS is able to detect known attacks
efficiently, but fails to find new attacks that do not match
the rules in its database [17]. Therefore, the database has to be
continuously updated to store the signatures of every known
attack. This IDS type is obviously unable to detect new
attacks unless it is trained to [18]. An anomaly-based IDS
builds a model of normal behavior and then flags any
major deviation from the model as an intrusion. This
IDS type is able to detect new or unknown attacks, but it
has a high rate of false alarms [19]-[20].
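The contrast between the two detection approaches can be illustrated with a minimal Python sketch. It is only a toy illustration and not the method of any cited system: the signature set, the single numeric feature and the three-standard-deviation threshold are all assumptions chosen for the example.

```python
import statistics

def misuse_detect(event_signature, known_signatures):
    """Misuse detection: flag an event only if it matches a stored signature."""
    return event_signature in known_signatures

def anomaly_detect(value, normal_values, threshold=3.0):
    """Anomaly detection: flag a value that deviates strongly from normal behavior.

    The 'model' here is just the mean and standard deviation of one feature
    observed during normal operation (an assumption for illustration).
    """
    mean = statistics.mean(normal_values)
    stdev = statistics.pstdev(normal_values)
    if stdev == 0:
        return value != mean
    return abs(value - mean) / stdev > threshold

# Hypothetical data: a signature database and connection counts per second.
signatures = {"smurf", "neptune"}
normal_counts = [10, 12, 11, 9, 10]

known_hit = misuse_detect("smurf", signatures)          # known attack is matched
novel_miss = misuse_detect("new_attack", signatures)    # unseen attack is missed
burst_flag = anomaly_detect(100, normal_counts)         # large deviation is flagged
usual_flag = anomaly_detect(11, normal_counts)          # ordinary value is not flagged
```

The sketch shows why a signature database must be kept up to date, and why an anomaly model can catch unseen attacks at the cost of depending on a threshold that may also produce false alarms.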
Research efforts have been made to reduce these false
alarms by proposing intelligent IDS based on
machine learning. A number of anomaly detection systems
based on different kinds of machine learning techniques
have been developed in the literature [21]-[25]. Some of these studies
apply a single learning technique, while other systems are
based on a combination of different learning techniques.
Machine learning classification algorithms provide a very
promising solution and are able to discover novel attacks
based on their features [26]. In addition, they can be
utilized to study and then identify correlated data, make
decisions, make predictions and classify data [24]-[26].
Multilayer Perceptron (MLP), Radial Basis
Functions (RBF) and Support Vector Machine (SVM) are all
examples of algorithms that are well-known, widely adopted
and have been investigated in neural networks, machine
learning and artificial intelligence. MLP can
successfully perform the classification operation [27][28], although
training an MLP neural network is hard because of the
complexity of its structure [28]. SVM is also a very strong algorithm in data
mining, which has been applied successfully in a number of
scientific applications [29].
Despite how vital machine learning algorithms are for
intrusion detection systems [21]-[25], little attention
has been given to comparison studies between the algorithms,
particularly when it comes to
designing an effective IDS for both computer and
network systems. Furthermore, little has been done to specify
an intelligent IDS that would reduce the false alarm rate of
anomaly-based detection.
In this work, we conducted a comprehensive and detailed
comparative study across three intelligent classification
algorithms: RBF, MLP and SVM with a linear kernel.
The linear kernel is a polynomial kernel with exponent 1, and we chose it
for SVM because Linear-SVM is both efficient and fast.
Unlike MLP and RBF, Linear-SVM consumes less energy during the
learning process and the deployment phase
[30] [31]. Moreover, Gupta and Ramanathan [32], as well
as Magno et al. [33], stated that Linear-SVM is a
low-complexity classifier. Magno et al. [33] also highlighted that
Linear-SVM gives a good balance between the computational
and memory cost and the percentage of correctly classified
data. Sazonov et al. [34] identified two powerful SVM
characteristics, which are high generalization and robustness.
In addition, Bal et al. [35] found that SVM with a linear kernel is
a very promising algorithm in the machine learning
field. Yuan et al. [36] also concluded that an SVM classifier,
especially one with a linear kernel, can learn and build the
knowledge that is needed from fewer training samples and yet
still provide a high level of classification accuracy, unlike a
number of other classifiers such as MLP and RBF.
The following are our major contributions in this work.
Firstly, we review a number of state-of-the-art
IDS models that are based on intelligent
machine learning algorithms. Secondly, we undertake a
comprehensive comparison between three intelligent classifiers
by using a real benchmark dataset. Thirdly, the performance of
all three is examined by utilizing the confusion matrix. Lastly,
we propose an intelligent IDS framework for the
effective and efficient management of IDS in computer and network
systems. The framework is addressed at the classification level,
with Linear-SVM as the intelligent classifier that forms
its core element.
We also discuss an evaluation of the proposed framework
and provide simulation results for detecting malicious attacks,
namely Denial of Service (DoS), Remote to Local (R2L),
User to Root (U2R) and Probing attacks.
The rest of this paper is organized in the following way.
Section II gives a literature review of the recent approaches that
have been proposed for IDS based on intelligent
classification algorithms. Section III gives some
background on the classification algorithms that were utilized
in this work, which are RBF, MLP and SVM. It also provides a
useful overview of the experimental dataset. The paper’s main
contribution is discussed in Section IV, while the simulation
experiments and the way they were set up are described in
Section V. That section also summarizes and discusses the
results of the simulation. Lastly, Section VI concludes
the paper and highlights future research
directions.
II. RELATED WORK
There has been a lot of research into anomaly-based
intrusion detection, and some of it has used machine
learning as well as data mining techniques. Decision trees,
neural networks, clustering and Bayesian parameter estimation
are some of the techniques that have been used to detect intrusive
behaviors in computer networks.
Chandolikar et al. [37] evaluated the performance of two
classification algorithms, Bayes net and J48,
which are both used for detecting computer attacks.
The results reveal that the J48 learning algorithm achieved
better accuracy and a lower error rate than the Bayes net
algorithm. The KDD cup dataset was used as the benchmark
in the evaluation. It was
emphasized that the higher accuracy of the J48 algorithm
helps to increase the efficiency of the IDS.
Principal Component Analysis and the Naive Bayes
classifier were employed by Panda et al. [38] as a
way of detecting intrusions by using machine learning
algorithms. Their experiments were carried out on the KDD’99
cup dataset, an intrusion detection dataset. The dimensionality
of the dataset was reduced by utilizing principal component
analysis, and the Naive Bayes classifier was then used to classify
the dataset into normal and attack
classes. They described their approach as a
network intrusion detection system framework
that combines two algorithms, Naive Bayes and Principal
Component Analysis. The result they obtained showed that
their approach was faster than a number of the other
existing systems.
An intrusion detection system was proposed by Wang et al.
[39]. This was based on the C4.5 decision tree
algorithm. The result revealed that the
intrusion detection system was effective and feasible, and had a
high rate of accuracy. All of their experiments were conducted
on the KDD CUP 1999 dataset, a test set that is widely used in the
intrusion detection field. The tree that is generated by the C4.5
classification algorithm for intrusion detection
was used to build rules, which then form the knowledge base
of the IDS. In other words, the rules are able to indicate
whether a new network behavior is normal or abnormal, based
on the built knowledge.
The J48 intelligent algorithm was utilized by Chandolikar
et al. [40] in their experiments to build an IDS. Their
results show that J48 is an effective and efficient classification
algorithm on the KDD CUP 1999 dataset.
Yogita et al. [41] proposed an IDS that used SVM as a data
mining technique. SVM is a very
popular classification algorithm. However, they highlighted its
main drawback, which is that SVM takes a very long time to
train. Their experiments were done by
utilizing the NSL-KDD dataset, an improved version of
the KDD Cup’99 dataset. They used the Gaussian RBF as the
kernel function and 10-fold cross validation as the test option
for SVM. In addition, they pointed out
that the proposed SVM-based method was able to
increase the accuracy of intrusion detection and cut down on
the time taken to build the classification model.
The aim of Mohammadreza et al. [42] was to use data
mining techniques, including SVM and classification
trees, for IDS. Their results reveal that the C4.5 algorithm is better
than SVM at detecting network intrusions. These
experiments were carried out on the KDD CUP 99 dataset. Das et
al. [43] looked at the IDS at its preprocessing level, which is
the level before the classification process, and proposed what is
called a divide and conquer algorithm. The aim of this was to
reduce the feature set of the large KDD 99 dataset. The
proposed algorithm successfully reduced the IDS’s overhead
of analyzing the entire KDD dataset. This was done by
selecting the vital features and then classifying them with a
maximized rate of classification. It is a generic algorithm that
could be applied to any dataset. The authors used
LDA, KNN, C4.5, SVM and a number of other classification
algorithms in order to classify the various feature sets that had
been obtained.
III. PRELIMINARIES
This section gives a brief background about the three
intelligent algorithms used in this study, as well as about the
dataset for the experimental comparison.
A. Classification Algorithms
The various classification algorithms that were used in the
research project are described in brief below.
1) Multilayer Perceptron (MLP)
An MLP is composed of a large number of widely interconnected
neurons that all work in parallel in order to solve a particular
problem. It is organized in a series of layers that have a
feed-forward information flow: signals flow
sequentially through the layers, starting with the
input layer, through to the output layer. Between these two
layers are a number of intermediate layers, which are also
known as hidden layers because they cannot be seen at either
the input or the output. Each unit first computes a weighted
sum, that is, the inner product of its vector of weights
and the vector provided by the outputs of the previous layer. In
order to generate the next layer’s input, a transfer function,
which is also called an activation function, is applied to the result [44].
RBF, unipolar sigmoid and bipolar sigmoid are all examples of
activation functions that are both well-known and commonly
used [45]. The main steps of the training phase in an MLP network
are the following. Firstly, an input pattern from the dataset
is forward-propagated to the
MLP network’s output, which is then compared with the desired
output. Secondly, the error signal between the
network’s output and the desired response is
back-propagated through the network. Lastly, the synaptic
weights are adjusted [46]. The process is then repeated
for the next input vector and this continues until all of the
training patterns have been passed through the network.
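The three training steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Weka implementation used in this study: the network has one hidden layer with two sigmoid units and a single sigmoid output, the loss is the squared error, and the initial weights, learning rate and training pattern are hypothetical values.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Forward-propagate x; each weight list ends with a bias term."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row[:-1], x)) + row[-1])
              for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out[:-1], hidden)) + w_out[-1])
    return hidden, output

def train_step(x, desired, w_hidden, w_out, lr=0.5):
    """One forward pass, error back-propagation and weight adjustment."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output unit (squared error through the sigmoid).
    delta_out = (output - desired) * output * (1.0 - output)
    # Back-propagate the error signal to the hidden units (old output weights).
    delta_hidden = [delta_out * w_out[j] * h * (1.0 - h)
                    for j, h in enumerate(hidden)]
    # Adjust the output-layer weights and bias.
    for j, h in enumerate(hidden):
        w_out[j] -= lr * delta_out * h
    w_out[-1] -= lr * delta_out
    # Adjust the hidden-layer weights and biases.
    for j, row in enumerate(w_hidden):
        for i, v in enumerate(x):
            row[i] -= lr * delta_hidden[j] * v
        row[-1] -= lr * delta_hidden[j]
    return 0.5 * (output - desired) ** 2

# Hypothetical initial weights and a single hypothetical training pattern.
w_hidden = [[0.1, -0.2, 0.0], [0.3, 0.4, 0.0]]
w_out = [0.2, -0.1, 0.0]
pattern, desired = [1.0, 0.0], 1.0
errors = [train_step(pattern, desired, w_hidden, w_out) for _ in range(200)]
```

Repeating the step on the same pattern should drive the squared error down, which is the behavior that the error back-propagation and weight adjustment steps are meant to achieve. In practice the loop would iterate over all training patterns rather than a single one.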
2) Radial Basis Functions (RBF)
An RBF network involves a total of three layers. The first is called the
input layer and it is made up of source nodes (or sensory units).
The number of these source nodes is equal to the input vector’s
dimension. The second is the hidden layer, which consists of
nonlinear units. These are directly connected to every one of
the sensory units in the input layer. The RBF network has only
a single hidden layer, which uses RBF activation functions. Lastly,
the output layer linearly combines the hidden
layer’s outputs and gives the network’s response to the input
data [47].
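A forward pass through such a three-layer network can be sketched as follows. The Gaussian form of the basis function, the shared width and all numeric values are assumptions for illustration; how the centers and output weights are learned is not shown.

```python
import math

def gaussian_rbf(x, center, width):
    """Hidden unit: the response depends only on the distance from x to the center."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * width ** 2))

def rbf_network(x, centers, width, out_weights, bias=0.0):
    """Input layer -> single hidden RBF layer -> linear output layer."""
    hidden = [gaussian_rbf(x, c, width) for c in centers]
    return sum(w * h for w, h in zip(out_weights, hidden)) + bias

# Hypothetical network with two hidden units in a two-dimensional input space.
centers = [[0.0, 0.0], [10.0, 10.0]]
response = rbf_network([0.0, 0.0], centers, width=1.0, out_weights=[2.0, -3.0])
```

Because the input coincides with the first center and is far from the second, the response is dominated by the first output weight.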
3) Support Vector Machine (SVM)
SVM splits the dataset into two different classes. These are
separated by placing a linear boundary between the
normal and attack classes in a way that maximizes the margin.
SVM finds the hyperplane that provides the maximum
distance between the hyperplane and the closest of the
positive and negative samples [48][49]. The SVM network’s
basic structure is similar to the structure of the ordinary RBF
network. However, a kernel function is applied
instead of the exponential activation function (which is
generally a Gaussian function). This kernel
function can be either a polynomial kernel, a Gaussian radial
basis kernel, or a two-layer feed-forward neural network kernel
[49].
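The role of the hyperplane and the margin can be made concrete with a short sketch. It does not train an SVM; it only evaluates a given hyperplane. The weight vector, the bias and the two labelled samples are hypothetical, and the polynomial kernel is written so that degree 1 reduces to the plain inner product, that is, the linear kernel.

```python
import math

def decision_value(w, b, x):
    """Signed score w.x + b; its sign gives the side of the hyperplane."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(w, b, x):
    """Return +1 (e.g. attack) or -1 (e.g. normal)."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def geometric_margin(w, b, samples):
    """Smallest distance from any labelled sample (x, y) to the hyperplane."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(y * decision_value(w, b, x) / norm for x, y in samples)

def polynomial_kernel(u, v, degree=1, coef0=0.0):
    """Polynomial kernel; degree=1 and coef0=0 give the linear kernel."""
    return (sum(a * c for a, c in zip(u, v)) + coef0) ** degree

# Hypothetical separating hyperplane and two labelled samples.
w, b = [1.0, 1.0], -3.0
samples = [([1.0, 1.0], -1), ([2.0, 2.0], 1)]
margin = geometric_margin(w, b, samples)
label = classify(w, b, [2.0, 2.0])
linear_k = polynomial_kernel([1.0, 2.0], [3.0, 4.0])
```

Training an SVM amounts to choosing the hyperplane for which this margin is as large as possible while the samples stay on their correct sides.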
B. Dataset
This section gives a brief description of the dataset that is
used in this work. The KDDCUP’99 dataset is based on the data of
the 1998 DARPA Intrusion Detection Evaluation program, which was
prepared by MIT Lincoln Laboratories [50]. It can be seen from the
literature that this dataset has been used widely for the
evaluation of anomaly-based IDS. A lot of researchers
use the KDDCUP’99 dataset as it is the only publicly
available dataset for the intrusion detection problem, and also because it is
possible to extract useful information from it [51] [52]. The full
dataset contains around 5 million instances (records), where
each data row is a connection record. A connection is
defined in many references as a sequence of TCP packets that
starts and ends at well-defined times, between a source and a
destination, under a well-defined protocol [50]-[52].
This dataset contains a number of different attack types,
which are classified into 4 major categories. These are R2L,
DoS, Probing and U2R. The KDD Cup 99 set has a total of 41
attributes (features) for each instance, plus 1 class
label. The total number is, therefore, 42 fields. Examples of the 41
attributes are destination bytes, count, dst host count, diff srv
rate, wrong fragment and urgent. The 42nd field is a label that
can be generalized as either normal or anomaly (U2R, DoS,
Probing and R2L) [50] [53] (see Table 1).
TABLE 1: TYPES OF ATTACKS IN KDD’99 DATASET
Classification | Short Description | Name of Attacks
DoS | Attacker attempts to deny or prevent legitimate users from using a service. | smurf, land, pod, teardrop, neptune, back
R2L | Attacker attempts to send packets to the victim machine in order to gain access because he does not have an account on it. | ftp_write, phf, spy, warezmaster, warezclient, imap, guess_passwd, multihop
U2R | The attacker tries to exploit some vulnerability to gain root/super user access to the system. | perl, buffer_overflow, rootkit, loadmodule
Probe | The attacker attempts to gather information about a computer network. | portsweep, nmap, ipsweep, satan
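The grouping in Table 1 can be turned into a simple lookup that generalizes a record's label into one of the five classes used later in the confusion matrices. The sketch below is illustrative only: the function name is hypothetical, and the handling of a trailing period on labels is an assumption about how the raw label strings are written.

```python
# Attack names per category, copied from Table 1.
ATTACKS_BY_CATEGORY = {
    "DoS": ["smurf", "land", "pod", "teardrop", "neptune", "back"],
    "R2L": ["ftp_write", "phf", "spy", "warezmaster", "warezclient",
            "imap", "guess_passwd", "multihop"],
    "U2R": ["perl", "buffer_overflow", "rootkit", "loadmodule"],
    "Probe": ["portsweep", "nmap", "ipsweep", "satan"],
}

# Invert the table: attack name -> category.
CATEGORY_OF = {name: category
               for category, names in ATTACKS_BY_CATEGORY.items()
               for name in names}

def categorize(label):
    """Map a raw label to Normal, DoS, R2L, U2R, Probe or Unknown."""
    name = label.strip().rstrip(".").lower()  # assumed label clean-up
    if name == "normal":
        return "Normal"
    return CATEGORY_OF.get(name, "Unknown")

smurf_category = categorize("smurf.")
normal_category = categorize("normal.")
```

Labels that are not listed in Table 1 are reported as "Unknown" rather than guessed.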
Tavallaee et al. [26] highlighted that the features of
KDD’99 can be categorized into three different groups. These
are Basic features, Content features and Traffic features. Basic
features encapsulate all of the attributes that can
be extracted from a TCP/IP connection.
The majority of these features can help to detect the major
causes of network delays. The second class is
the Traffic features. These are computed over a time window and
can be divided into two groups, which are “same host”
features and “same service” features. They are, therefore, also
called time-based features. “Same host” features
examine only the network connections in the
previous two seconds that have the same target host as the
current connection. “Same service” features examine only
the connections in the previous two seconds that have the
same service as the current connection. The last of
these classes is the Content features, which help to detect
U2R and R2L attacks. This is because these types of attacks do
not have a well-defined structure or
pattern. Therefore, Content features look for
suspicious behavior in the data portion of a connection,
such as a number of failed
login attempts [26].
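The idea behind the time-based "same host" and "same service" features can be sketched as a count over a two-second window. The record layout (dictionaries with `time`, `dst_host` and `service` keys), the function name and the log itself are hypothetical, and the sketch counts only earlier connections; whether the original feature definitions also count the current connection is not specified here.

```python
def count_in_window(connections, index, key, window=2.0):
    """Count earlier connections within `window` seconds that share `key`
    (for example the destination host or the service) with the current one."""
    current = connections[index]
    count = 0
    for earlier in connections[:index]:
        age = current["time"] - earlier["time"]
        if 0.0 <= age <= window and earlier[key] == current[key]:
            count += 1
    return count

# Hypothetical connection log, ordered by time in seconds.
log = [
    {"time": 0.0, "dst_host": "10.0.0.5", "service": "http"},
    {"time": 1.0, "dst_host": "10.0.0.5", "service": "ftp"},
    {"time": 2.5, "dst_host": "10.0.0.9", "service": "http"},
    {"time": 3.0, "dst_host": "10.0.0.5", "service": "http"},
]
same_host = count_in_window(log, 3, "dst_host")    # only the connection at t=1.0 qualifies
same_service = count_in_window(log, 3, "service")  # only the connection at t=2.5 qualifies
```

A sudden jump in such counts for one host or service is the kind of signal that helps to separate DoS and probing traffic from normal traffic.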
IV. THE PROPOSED SYSTEM
The focus of this research work is on the original “10%
KDD 99” dataset because of the limited memory capacity. The
system flow for the proposed IDS is shown in Figure 2. The
original “10% KDD 99” dataset is firstly loaded into the
system. The next step is pre-processing, in which the input file
is properly prepared.
Figure 2: Block diagram of the proposed IDS
Figure 3: Original KDD'99 10% dataset distributions
In this step, a total of 9000 instances are randomly selected
from the 10% KDD CUP 1999 dataset with nearly the same
class distribution as the KDD'99 10% dataset, as Figure 3 and Figure 4
illustrate. Two phases are performed after
that. The first is the training/learning phase, which enables the
intelligent system to build up the right knowledge base. The IDS
learns the relationships that exist in the training dataset.
This training phase is seen as an adaptation process for the IDS so
that it gives the best response during the next phase, which is
the testing phase.
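One way to draw a random subset that keeps nearly the same class distribution is stratified sampling, sketched below. This is an assumed procedure for illustration and not necessarily the exact selection method used in this study; the function name, the seed and the toy records are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(records, label_of, sample_size, seed=0):
    """Randomly pick about `sample_size` records, keeping class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for record in records:
        by_class[label_of(record)].append(record)
    total = len(records)
    sample = []
    for group in by_class.values():
        # Proportional share, but at least one record so rare classes survive.
        share = max(1, round(sample_size * len(group) / total))
        sample.extend(rng.sample(group, min(share, len(group))))
    rng.shuffle(sample)
    return sample

# Hypothetical records of the form (features, class label).
records = [("x", "DoS")] * 800 + [("y", "Normal")] * 200
subset = stratified_sample(records, lambda r: r[1], 100)
dos_count = sum(1 for r in subset if r[1] == "DoS")
```

Because each class share is rounded separately, the returned sample can differ from the requested size by a few records when there are many small classes.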
In this phase, the intelligent system receives a different
dataset, the testing dataset, and processes it to produce an output.
To test and evaluate the MLP, RBF and Linear-SVM algorithms,
5-fold cross validation is utilized as the test option. The dataset is
split into 5 subsets and, in each run, one of these five
subsets is used as the test set and the other four subsets as
the training set.
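In the usual k-fold scheme, each subset is held out once for testing while the remaining subsets are used for training, so every instance is tested exactly once. The sketch below generates such splits by index; the function name and the seed are hypothetical, and the shuffling step is an assumption (stratification by class is not shown).

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

# 9000 instances split into 5 folds, as in the experimental setup.
splits = list(k_fold_indices(9000, k=5))
fold_sizes = [(len(train), len(test)) for train, test in splits]
```

With 9000 instances and 5 folds, each run trains on 7200 instances and tests on the remaining 1800.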
In order to evaluate the algorithms’ effectiveness for IDS,
three experiments are carried out. The WEKA simulator
version 3.6 [54] is utilized in the classification process. That is,
the algorithms for RBF, MLP and SVM that are available in the Weka
simulator are employed. For the parameters of the
algorithms, the Weka system’s default settings are utilized,
except for the number of cross-validation folds, which we set to
five.
V. DISCUSSION OF RESULTS
The confusion matrix is used to measure the performance of the three
intelligent algorithms [55][56]. It provides a
visualization of how the classifier performs on the input
dataset. A number of different performance metrics, including
recall, accuracy and specificity, are derived from the confusion
matrix. Table 2 shows the structure of this matrix. The 4
possible outcomes are true positive (TP), false positive
(FP), false negative (FN) and true negative (TN) [51][57].
TABLE 2: CONFUSION MATRIX
Actual class | Predicted class: Positive | Predicted class: Negative
Positive | TP | FN
Negative | FP | TN
Figure 4: Selected dataset distributions
We evaluated these algorithms by using accuracy as the
performance metric in this study. Accuracy in this instance
represents the overall correctness of the intelligent
classification of the dataset. It is given by:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
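The accuracy formula, together with the recall and specificity metrics mentioned earlier, can be computed directly from the four outcome counts. The counts in the example are hypothetical and are not results from this study.

```python
def accuracy(tp, tn, fp, fn):
    """Share of all instances that were classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total

def recall(tp, fn):
    """Share of actual positives that were detected."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of actual negatives that were correctly left alone."""
    return tn / (tn + fp)

# Hypothetical counts for a binary attack/normal classifier.
acc = accuracy(tp=50, tn=40, fp=5, fn=5)
rec = recall(tp=50, fn=5)
spec = specificity(tn=40, fp=5)
```

Accuracy alone can hide poor detection of rare classes, which is why the per-class figures in the confusion matrices are also worth inspecting.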
As shown in Figure 5, the results obtained on our
dataset allow a comparison between the three intrusion
detection classifiers.
Figure 5: Accuracy comparison graph between MLP, RBF and Linear-SVM
as classifiers and cross Validation (folds-5) as Test Option over our selected
dataset
If we compare RBF, MLP and SVM (linear kernel), we can
see that under the 5-fold cross validation test
option, it is SVM with a linear kernel that has the highest
proportion of correctly identified instances, at 99.84%
((1817+7010+114+37+8)/9000*100 = 99.84%). The second
highest is RBF, at around 99.64%.
MLP has the lowest with 98.98%. It is worth noting that,
when it comes to the average time to build the model, RBF
proves to be much faster than MLP as the hidden layer is
computed through a single function, rather than a series of
weights, as is the case with MLP. It can therefore be concluded
that Linear-SVM provides the highest accuracy and the lowest
error rate. In general, the performance of
SVM with a linear kernel was better than that of the other
classifiers in detecting these attacks, so it is more accurate than
either RBF or MLP. In addition, Linear-SVM is the quickest
classifier in terms of building the detection model, compared to
both RBF and MLP.
Figure 6 and Figure 7 show the overall accuracy and
error rates for the three algorithms, respectively. Tables 3, 4
and 5 give the confusion matrices for
MLP, RBF and Linear-SVM, respectively.
Figure 6: Overall accuracy rate for the three intelligent algorithms
Figure 7: Overall error rate for the three intelligent algorithms
TABLE 3: CONFUSION MATRIX FOR MLP AS CLASSIFIER OVER OUR
SELECTED DATASET
Actual class | Normal | DoS | Probe | R2L | U2R | Accuracy %
Normal | 1815 | 4 | 1 | 0 | 1 | 99.67
DoS | 7 | 7004 | 2 | 0 | 0 | 99.87
Probe | 24 | 22 | 68 | 0 | 0 | 59.65
R2L | 9 | 7 | 9 | 16 | 0 | 39.02
U2R | 1 | 4 | 0 | 0 | 6 | 54.55
The confusion matrices show the number of instances that
have been assigned to each class, that is, how many
instances of each class received each classification. The
sum of the diagonal entries is the number of samples that
are correctly classified. For example, the total number of
samples that MLP classified correctly is the
sum of 1815, 7004, 68, 16 and 6.
TABLE 4: CONFUSION MATRIX FOR RBF NETWORK AS CLASSIFIER
OVER OUR SELECTED DATASET
Actual class | Normal | DoS | Probe | R2L | U2R | Accuracy %
Normal | 1809 | 1 | 5 | 5 | 1 | 99.34
DoS | 3 | 7010 | 0 | 0 | 0 | 99.957
Probe | 8 | 0 | 106 | 0 | 0 | 92.98
R2L | 3 | 0 | 0 | 35 | 3 | 85.37
U2R | 2 | 0 | 0 | 1 | 8 | 72.73
TABLE 5: CONFUSION MATRIX FOR LINEAR-SVM AS CLASSIFIER
OVER OUR SELECTED DATASET
Actual class | Normal | DoS | Probe | R2L | U2R | Accuracy %
Normal | 1817 | 3 | 0 | 0 | 1 | 99.78
DoS | 3 | 7010 | 0 | 0 | 0 | 99.96
Probe | 0 | 0 | 114 | 0 | 0 | 100
R2L | 3 | 0 | 0 | 37 | 1 | 90.24
U2R | 2 | 0 | 0 | 1 | 8 | 72.73
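The overall and per-class accuracy figures can be recomputed from a confusion matrix by summing the diagonal and dividing each diagonal entry by its row total. The sketch below uses the Linear-SVM counts from Table 5 and assumes that each row holds the instances of one actual class; the function names are hypothetical.

```python
CLASSES = ["Normal", "DoS", "Probe", "R2L", "U2R"]

# Counts copied from Table 5 (rows: actual class, columns: predicted class).
LINEAR_SVM = [
    [1817, 3, 0, 0, 1],
    [3, 7010, 0, 0, 0],
    [0, 0, 114, 0, 0],
    [3, 0, 0, 37, 1],
    [2, 0, 0, 1, 8],
]

def overall_accuracy(matrix):
    """Percentage of all instances that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total

def per_class_accuracy(matrix):
    """Percentage of each actual class that was classified correctly."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]

overall = round(overall_accuracy(LINEAR_SVM), 2)            # 99.84
per_class = dict(zip(CLASSES,
                     (round(a, 2) for a in per_class_accuracy(LINEAR_SVM))))
# per_class: Normal 99.78, DoS 99.96, Probe 100.0, R2L 90.24, U2R 72.73
```

The same two functions can be applied to the MLP and RBF matrices to compare the classifiers class by class.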
VI. CONCLUSIONS AND FUTURE WORK
Network intrusion detection has recently become an area of
rapid advancement. Similar advances in intelligent
computing have led to several classification techniques
being introduced to identify network traffic and differentiate it
into anomalous and normal. Intrusion detection that is based on
computational intelligence has been attracting much interest
from the research community. Its characteristics,
including adaptation, high computational speed, fault tolerance,
and error resilience in the face of noisy information, fit the
requirements for building a good intrusion detection
system.
In this paper, we have explained the need to apply
intelligent algorithms to network events in order to classify
network attack events. In particular, the performance of three
intelligent algorithms, which are MLP, RBF and Linear-SVM,
on an adapted KDD 1999 dataset was evaluated. This was
done by both simulation and a comparison study. The results
obtained reveal that SVM with a linear kernel performs better
than MLP and the RBF network for detecting attacks in terms
of achieving better accuracy and a lower error rate.
Experiments show that Linear-SVM proves to be an
efficient algorithm that is able to detect various kinds of
intrusions/attacks in a network, such as DoS, Probe, U2R and
R2L. Linear-SVM has the best detection accuracy when it
comes to detecting different types of attacks. It, therefore, has
the lowest error rate of all.
As future work, we intend to evaluate SVM on
other benchmark datasets. In addition, we will conduct a
performance comparison between SVMs with different kernels,
such as Gaussian or sigmoid kernels. This will be done to find
the best kernel function for SVM, that is, the one which gives
the best attack detection rate for building IDS.
REFERENCES
[1] Depren, O., Topallar, M., Anarim, E., & Ciliz, M. K. (2005). An
intelligent intrusion detection system (IDS) for anomaly and misuse
detection in computer networks. Expert Systems with
Applications, 29(4), 713-722.
[2] MeeraGandhi, G. (2010). Machine learning approach for attack
prediction and classification using supervised learning algorithms. Int. J.
Comput. Sci. Commun, 1(2).
[3] Nguyen, H. A., & Choi, D. (2008). Application of data mining to
network intrusion detection: classifier selection model. In Asia-Pacific
Network Operations and Management Symposium (pp. 399-408).
Springer Berlin Heidelberg.
[4] Subramanian, S., Srinivasan, V. B., & Ramasa, C. (2012). Study on
classification algorithms for network intrusion systems. Journal of
Communication and Computer, 9(11), 1242-1246.
[5] Li, M., & Dongliang, W. (2009). Anormaly intrusion detection based on
SOM. In Information Engineering, 2009. ICIE'09. WASE International
Conference on (Vol. 1, pp. 40-43). IEEE.
[6] Summers, R. C. (1997). Secure computing: threats and safeguards.
McGraw-Hill, Inc.
[7] Peddabachigari, S., Abraham, A., Grosan, C., & Thomas, J. (2007).
Modeling intrusion detection system using hybrid intelligent
systems. Journal of network and computer applications, 30(1), 114-132.
[8] Jamali, S., & Jafarzadeh, P. (2011). An intelligent intrusion detection
system by using hierarchically structured learning automata. Neural
Computing and Applications, 1-8.
[9] Wu, S. X., & Banzhaf, W. (2010). The use of computational intelligence
in intrusion detection systems: A review. Applied Soft Computing, 10(1),
1-35.
[10] Sundaram, A. (1996). An introduction to intrusion
detection. Crossroads, 2(4), 3-7.
[11] Chimedtseren, E., Iwai, K., Tanaka, H., & Kurokawa, T. (2014,
December). Intrusion detection system using Discrete Fourier
Transform. In Computational Intelligence for Security and Defense
Applications (CISDA), 2014 Seventh IEEE Symposium on (pp. 1-5).
IEEE.
[12] Igbe, O., Darwish, I., & Saadawi, T. (2016). Distributed Network
Intrusion Detection Systems: An Artificial Immune System Approach.
In Connected Health: Applications, Systems and Engineering
Technologies (CHASE), 2016 IEEE First International Conference
on (pp. 101-106). IEEE.
[13] Di Pietro, R., & Mancini, L. V. (Eds.). (2008). Intrusion detection
systems (Vol. 38). Springer Science & Business Media.
[14] Kulothungan, K., Ganapathy, S., Yogesh, P., & Kannan, A. An Agent
based Intrusion Detection System for Wireless Sensor Networks Using
Multilevel Classification. International Journal of Modern Engineering
Research (IJMER), 1(2), 55-60.
[15] Anderson, J. A. (1995). An introduction to neural networks. MIT
Press.
[16] Rhodes, B. C., Mahaffey, J. A., & Cannady, J. D. (2000). Multiple self-
organizing maps for intrusion detection. In Proceedings of the 23rd
national information systems security conference (pp. 16-19).
[17] Al-Yaseen, W. L., Othman, Z. A., & Nazri, M. Z. A. (2017). Multi-level
hybrid support vector machine and extreme learning machine based on
modified K-means for intrusion detection system. Expert Systems with
Applications, 67, 296-303.
[18] Chen, C. M., Chen, Y. L., & Lin, H. C. (2010). An efficient network
intrusion detection. Computer Communications, 33(4), 477-484.
[19] Deepa, A. J., & Kavitha, V. (2012). A comprehensive survey on
approaches to intrusion detection system. Procedia Engineering, 38,
2063-2069.
[20] Thaseen, S., & Kumar, C. A. (2013). An analysis of supervised tree
based classifiers for intrusion detection system. In Pattern Recognition,
Informatics and Mobile Engineering (PRIME), 2013 International
Conference on (pp. 294-299). IEEE.
[21] Feng, W., Zhang, Q., Hu, G., & Huang, J. X. (2014). Mining network
data for intrusion detection through combining SVMs with ant colony
networks. Future Generation Computer Systems, 37, 127-140.
[22] Kuang, F., Xu, W., & Zhang, S. (2014). A novel hybrid KPCA and SVM
with GA model for intrusion detection. Applied Soft Computing, 18,
178-184.
[23] Horng, S. J., Su, M. Y., Chen, Y. H., Kao, T. W., Chen, R. J., Lai, J. L.,
& Perkasa, C. D. (2011). A novel intrusion detection system based on
hierarchical clustering and support vector machines. Expert Systems with
Applications, 38(1), 306-313.
[24] Hasan, M., Nasser, M., Pal, B., & Ahmad, S. (2013). Intrusion detection
using combination of various kernels based support vector
machine. International Journal of Scientific & Engineering
Research, 4(9), 1454-1463.
[25] Mukkamala, S., Sung, A. H., & Abraham, A. (2003). Intrusion detection
using ensemble of soft computing paradigms. In Intelligent systems
design and applications (pp. 239-248). Springer Berlin Heidelberg.
[26] Tavallaee, M., Bagheri, E., Lu, W., & Ghorbani, A. A. (2009). A
detailed analysis of the KDD CUP 99 data set. In Computational
Intelligence for Security and Defense Applications, 2009. CISDA 2009.
IEEE Symposium on (pp. 1-6). IEEE.
[27] Hassim, Y. M. M., & Ghazali, R. (2012). Training a functional link
neural network using an artificial bee colony for solving a classification
problems. arXiv preprint arXiv:1212.6922.
[28] Pal, A. K., & Pal, S. (2013). Classification model of prediction for
placement of students. International Journal of Modern Education and
Computer Science, 5(11), 49.
[29] Purnami, S. W., Zain, J. M., & Heriawan, T. (2011). An alternative
algorithm for classification large categorical dataset: k-mode clustering
reduced support vector machine. International Journal of Database
Theory and Application, 4(1), 19-30.
[30] Barnawi, A. Y., & Keshta, I. M. (2014). Energy management of wireless
sensor networks based on multi-layer perceptrons. In European Wireless
2014; 20th European Wireless Conference; Proceedings of (pp. 1-6).
VDE.
[31] Barnawi, A. Y., & Keshta, I. M. (2016). Energy Management in
Wireless Sensor Networks Based on Naive Bayes, MLP, and SVM
Classifications: A Comparative Study. Journal of Sensors, 2016.
[32] Gupta, G. R., & Ramanathan, P. (2007). Level set estimation using
uncoordinated mobile sensors. In International Conference on Ad-Hoc
Networks and Wireless (pp. 101-114). Springer Berlin Heidelberg.
[33] Magno, M., Brunelli, D., Zappi, P., & Benini, L. (2010). Energy
efficient cooperative multimodal ambient monitoring. In European
Conference on Smart Sensing and Context (pp. 56-70). Springer Berlin
Heidelberg.
[34] Sazonov, E. S., & Fontana, J. M. (2012). A sensor system for automatic
detection of food intake through non-invasive monitoring of
chewing. IEEE Sensors Journal, 12(5), 1340-1348.
[35] Bal, M., Amasyali, M. F., Sever, H., Kose, G., & Demirhan, A. (2014).
Performance evaluation of the machine learning algorithms used in
inference mechanism of a medical decision support system. The
Scientific World Journal, 2014.
[36] Yuan, S., Liang, D., Qiu, L., & Liu, M. (2012). Mobile multi-agent
evaluation method for wireless sensor networks-based large-scale
structural health monitoring. International Journal of Distributed Sensor
Networks.
[37] Chandolikar, N. S., & Nandavadekar, V. D. (2012). Comparative
Analysis of Two Algorithms for Intrusion Attack Classification Using
KDD CUP Dataset. International Journal of Computer Science and
Engineering (IJCSE), 1(1), 81-88.
[38] Panda, M., & Patra, M. R. (2007). Network intrusion detection using
naive bayes. International Journal of Computer Science and Network
Security (IJCSNS), 7(12), 258-263.
[39] Wang, J., Yang, Q., & Ren, D. (2009). An intrusion detection algorithm
based on decision tree technology. In Information Processing, 2009.
APCIP 2009. Asia-Pacific Conference on (Vol. 2, pp. 333-335). IEEE.
[40] Chandolikar, N. S., & Nandavadekar, V. D. (2012). Efficient algorithm
for intrusion attack classification by analyzing KDD Cup 99. In Wireless
and Optical Communications Networks (WOCN), 2012 Ninth
International Conference on (pp. 1-5). IEEE.
[41] Bhavsar, Y. B., & Waghmare, K. C. (2013). Intrusion detection system
using data mining technique: Support vector machine. International
Journal of Emerging Technology and Advanced Engineering, 3(3),
581-586.
[42] Ektefa, M., Memar, S., Sidi, F., & Affendey, L. S. (2010). Intrusion
detection using data mining techniques. In Information Retrieval &
Knowledge Management (CAMP), 2010 International Conference
on (pp. 200-203). IEEE.
[43] Das, A., & Nayak, R. B. (2012). A divide and conquer feature reduction
and feature selection algorithm in KDD intrusion detection dataset.
In Sustainable Energy and Intelligent Systems (SEISCON 2012), IET
Chennai 3rd International on (pp. 1-4). IET.
[44] Battiti, R., Brunato, M., & Mascia, F. (2008). Reactive search and
intelligent optimization (Vol. 45). Springer Science & Business Media.
[45] Karlik, B., & Olgac, A. V. (2011). Performance analysis of various
activation functions in generalized MLP architectures of neural
networks. International Journal of Artificial Intelligence and Expert
Systems, 1(4), 111-122.
[46] Bouzgou, H., & Benoudjit, N. (2011). Multiple architecture system for
wind speed prediction. Applied Energy, 88(7), 2463-2471.
[47] Haykin, S. S. (2001). Neural networks: a comprehensive foundation.
Tsinghua University Press.
[48] Vapnik, V. (2013). The nature of statistical learning theory. Springer
Science & Business Media.
[49] Bennett, K. P., & Campbell, C. (2000). Support vector machines: hype
or hallelujah?. ACM SIGKDD Explorations Newsletter, 2(2), 1-13.
[50] KDD Cup 1999 Data, Information and Computer Science, University of
California, Irvine.
http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
[51] Bijone, M. (2016). A Survey on Secure Network: Intrusion Detection &
Prevention Approaches. American Journal of Information Systems, 4(3),
69-88.
[52] Ashfaq, R. A. R., Wang, X. Z., Huang, J. Z., Abbas, H., & He, Y. L.
(2017). Fuzziness based semi-supervised learning approach for intrusion
detection system. Information Sciences, 378, 484-497.
[53] Htun, P. T., & Khaing, K. T. (2012). Anomaly Intrusion Detection
System using Random Forests and k-Nearest
Neighbor. Probe, 41102(4107), 2377.
[54] http://www.cs.waikato.ac.nz/ml/weka/
[55] Weiss, S. M., & Indurkhya, N. (1998). Predictive data mining: a
practical guide. Morgan Kaufmann.
[56] Ahmim, A., & Ghoualmi-Zine, N. (2013). A new fast and high
performance intrusion detection system. International Journal of
Security and Its Applications, 7(5), 67-80.
[57] Kim, J., Shin, N., Jo, S. Y., & Kim, S. H. (2017). Method of intrusion
detection using deep neural network. In Big Data and Smart Computing
(BigComp), 2017 IEEE International Conference on (pp. 313-316).
IEEE.
AUTHOR PROFILE
Ismail Keshta is an assistant professor in the Department of Computer
and Information Technology at Dammam Community College (DCC),
King Fahd University of Petroleum and Minerals (KFUPM), Dhahran,
Saudi Arabia. He received the B.Sc. and M.Sc. degrees in computer
engineering and the Ph.D. degree in computer science and engineering
from the King Fahd University of Petroleum and Minerals (KFUPM),
Dhahran, Saudi Arabia, in 2009, 2011, and 2016, respectively. He was a
Lecturer with the Computer Engineering Department, KFUPM, from
2012 to 2016. Prior to that, in 2011, he was a Lecturer with Princess
Nora Bint Abdulrahman University (PNU) and Imam Muhammad ibn
Saud Islamic University (IMAMU), Riyadh, Saudi Arabia. His
research interests include software process improvement, modeling, and
intelligent systems.
International Journal of Computer Science and Information Security (IJCSIS),
Vol. 16, No. 5, May 2018
30 https://sites.google.com/site/ijcsis/
ISSN 1947-5500