Rationalized Red Alert Connection For Collective Intrusion


5.1 Introduction

A major threat to the reliability of Internet services is the growth in stealthy, coordinated attacks such as scans, worms and distributed denial-of-service (DDoS) attacks. How do modern IDSs differ from traditional IDSs? While intrusion detection systems can detect a wide variety of attacks, traditional IDSs focus on monitoring a single network, which limits their ability to correlate evidence from multiple networks. A central problem in intrusion detection research is how to correlate evidence from multiple networks efficiently (Cheung et al 2003). Collective intrusion detection systems (CIDSs) aim to address this challenge. Briefly, a CIDS consists of a set of individual IDSs from different network administrative domains or organizations that cooperate to detect coordinated attacks. How does it function? Each IDS reports alerts about suspicious behavior observed on its locally monitored network, and the CIDS then correlates these alerts to identify coordinated attacks that target multiple sub-networks. A key component of a CIDS is the alert correlation algorithm, which clusters similar incidents and identifies false alerts generated by individual IDSs. Our first contribution to this problem is to constrain the search space for multi-dimensional alert patterns by using knowledge of the attack categories of interest.

The second contribution is to arrange these patterns into an index, with the most general patterns at the top and the most specific patterns at the bottom. This index structure provides a bias that constrains the search space of the correlate-and-filter algorithm when finding frequent, non-redundant patterns of alerts in our proposed CIDS.

5.2 Multi-Layered Alert Correlation

To achieve high detection accuracy in the proposed CIDS without excessive computational overhead, a multi-layered alert correlation algorithm was developed for use in the CIDS. The raw alerts reported by the participating IDSs are the inputs to the alert correlation process. Each raw alert corresponds to a suspicious flow identified by an IDS and contains five main features: srcIP (the source IP address), srcPrt (the source port), dstIP (the destination IP address), dstPrt (the destination port) and protocol. The aim of correlation is the aggregation of alerts that share common feature values. Predefined sets of features are used to determine which features must have common values, thereby guiding the correlation process. A pattern is a combination of features, for example {srcIP, protocol}. The algorithm searches for frequent instances of these patterns among the input alerts. A pattern instance is a combination of feature values corresponding to a specific pattern, for example {srcIP = 192.240.134.12, protocol = UDP}.
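For illustration, a minimal Java sketch of a raw alert five-tuple and its projection onto a pattern is given below (Java is the language of our simulation in Section 5.5; the class and method names here are ours and are not part of the proposed system):

import java.util.LinkedHashMap;
import java.util.Map;

// A raw alert as reported by a participating IDS: the five-tuple described
// above. Class, field and method names are illustrative only.
record RawAlert(String srcIP, int srcPrt, String dstIP, int dstPrt, String protocol) {

    // Project the alert onto a pattern, i.e. a chosen combination of features
    // such as {srcIP, protocol}, yielding a pattern instance such as
    // {srcIP=192.240.134.12, protocol=UDP}.
    Map<String, String> project(String... features) {
        Map<String, String> instance = new LinkedHashMap<>();
        for (String f : features) {
            switch (f) {
                case "srcIP"    -> instance.put(f, srcIP());
                case "srcPrt"   -> instance.put(f, Integer.toString(srcPrt()));
                case "dstIP"    -> instance.put(f, dstIP());
                case "dstPrt"   -> instance.put(f, Integer.toString(dstPrt()));
                case "protocol" -> instance.put(f, protocol());
                default         -> throw new IllegalArgumentException("unknown feature: " + f);
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        RawAlert a = new RawAlert("192.240.134.12", 53124, "10.0.0.5", 1434, "UDP");
        System.out.println(a.project("srcIP", "protocol")); // {srcIP=192.240.134.12, protocol=UDP}
    }
}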

5.2.1 Open Challenges

There are two challenges when a CIDS constructs multi-layered pattern instances. The first is choosing the appropriate dimensionality of pattern instances: the more alert dimensions a pattern includes, the more specific its instances become. The second is the compactness of the inferred pattern instances, since different patterns can overlap by having one or more features in common (Dain et al 2001).

Such overlap means that multi-layered patterns may produce a large number of redundant pattern instances that originate from the same attack, and these redundant instances in turn generate many false alarms in the CIDS.

For example, suppose a two-dimensional pattern is built from the protocol (protocol) and destination port (dstPrt) of raw alerts. There are three possible combinations: {protocol}, {dstPrt} and {protocol + dstPrt}. During the SQL-Slammer worm outbreak (CERT Coordination Center, 2003a), the combination {UDP + 1434} (i.e., protocol + dstPrt) would be reported as suspicious, but the pattern instances {UDP} (i.e., {protocol}) and {1434} (i.e., {dstPrt}) would be reported as well. If the majority of the UDP traffic goes to port 1434, then the instances {UDP} and {1434} are redundant.

5.2.2 Defining Pattern Instances

Based on the above considerations, our multi-layered pattern instances are built from the five features of the alert five-tuple: source IP address (srcIP), source port (srcPrt), destination IP address (dstIP), destination port (dstPrt), and protocol (protocol). Pattern instances are defined by sets of values for each of these features; the supported values are a single value, all values (*), or a subset of possible values.

For example, a pattern instance with the values (srcIP = 192.240.134.0/24, srcPrt = *, dstIP = 192.240.152.0, dstPrt = 1434, protocol = UDP) represents UDP traffic from the network whose IP addresses start with 192.240.134, sent to port 1434 on a monitored network. Attack flows may originate either outside or inside the monitored networks. Our definition is close to the multi-dimensional cluster proposal in (Estan et al 2003; Hu et al 2006).
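A minimal sketch of how such an instance (single value, wildcard *, or address prefix per field) might be matched against a raw alert follows; the string-based prefix test is our own simplification, and true CIDR arithmetic is omitted for brevity:

import java.util.Map;

// A minimal matcher for pattern-instance fields as defined above: each field
// is either a concrete value, the wildcard "*", or (for IP addresses) a dotted
// prefix such as "192.240.134." standing in for 192.240.134.0/24.
class PatternInstanceMatcher {

    static boolean fieldMatches(String patternValue, String alertValue) {
        if (patternValue.equals("*")) return true;        // all values
        if (patternValue.endsWith(".")) {                  // address prefix
            return alertValue.startsWith(patternValue);
        }
        return patternValue.equals(alertValue);            // single value
    }

    // instance: feature name -> pattern value; alert: feature name -> concrete value
    static boolean matches(Map<String, String> instance, Map<String, String> alert) {
        return instance.entrySet().stream()
                .allMatch(e -> fieldMatches(e.getValue(), alert.get(e.getKey())));
    }

    public static void main(String[] args) {
        Map<String, String> instance = Map.of(
                "srcIP", "192.240.134.", "srcPrt", "*",
                "dstIP", "192.240.152.10", "dstPrt", "1434", "protocol", "UDP");
        Map<String, String> alert = Map.of(
                "srcIP", "192.240.134.77", "srcPrt", "4532",
                "dstIP", "192.240.152.10", "dstPrt", "1434", "protocol", "UDP");
        System.out.println(matches(instance, alert));      // true
    }
}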

5.2.3 Search Space of Patterns

Patterns are the basic templates that guide the alert correlation process, so they should be symptomatic of potential intrusion activities. The search space of the alert correlation process can only be restricted by controlling the set of possible patterns. Thirteen significant combinations of the five key features were identified from an analysis of real attack scenarios and are listed in Table 5.1. Here srcIP is considered more important than the other four features, for the reasons discussed below.

Table 5.1 Index patterns (alert signatures) and attack types

  No.  Pattern (alert signature)                     Attack type
  1    srcIP                                         Most scans
  2    srcIP + srcPrt + dstIP                        Flash crowd response (Jung et al., 2002)
  3    srcIP + dstPrt + dstIP                        DDoS by Trinoo (CERT Coordination Center, 1999)
  4    srcIP + dstIP + protocol                      Most worms
  5    srcIP + srcPrt + dstPrt + dstIP               Distributed reflector DoS (Gibson, 2002)
  6    srcIP + srcPrt + dstIP + protocol             SYN flood response (CERT Coordination Center, 1996)
  7    srcIP + dstPrt + dstIP + protocol             W32/Blaster worm (CERT Coordination Center, 2003b)
  8    srcIP + srcPrt + dstPrt + dstIP + protocol    SQL-Slammer worm (CERT Coordination Center, 2003a)
  9    srcIP + dstIP                                 Most portscans ([10])
  10   srcIP + dstPrt                                W32/Blaster worm (CERT Coordination Center, 2003b)
  11   srcIP + srcPrt + dstPrt                       MS-SQL server worm (CERT Coordination Center, 2003a)
  12   srcIP + dstPrt + protocol                     N/A
  13   srcIP + srcPrt + dstPrt + protocol            Non-IP-spoofing SYN attack (CERT Coordination Center, 2004)

Most coordinated attacks, such as worms or coordinated scans, share the same source IP address. Moreover, many attack signatures include the srcIP field, as shown in the analysis above (Jung et al 2002). It is therefore no surprise that this field plays a central role in the alert correlation process compared to the other fields. A possible concern about using srcIP as a key attribute is source spoofing, i.e., a source that uses a false address. However, spoofing is unlikely to create a significant pattern instance, since spoofed source addresses are usually chosen randomly (Katti et al 2005).
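The thirteen srcIP-anchored feature combinations of Table 5.1 can be encoded directly as data, so that the correlation stage can iterate over them for each raw alert. An illustrative Java encoding (the class name is ours) is:

// The thirteen pattern templates of Table 5.1, each a combination of alert
// features anchored on srcIP. Encoding them as data lets the correlation
// stage loop over them when parsing each raw alert.
final class PatternTemplates {
    static final String[][] PATTERNS = {
        {"srcIP"},                                           // 1: most scans
        {"srcIP", "srcPrt", "dstIP"},                        // 2: flash crowd response
        {"srcIP", "dstPrt", "dstIP"},                        // 3: DDoS by Trinoo
        {"srcIP", "dstIP", "protocol"},                      // 4: most worms
        {"srcIP", "srcPrt", "dstPrt", "dstIP"},              // 5: distributed reflector DoS
        {"srcIP", "srcPrt", "dstIP", "protocol"},            // 6: SYN flood response
        {"srcIP", "dstPrt", "dstIP", "protocol"},            // 7: W32/Blaster worm
        {"srcIP", "srcPrt", "dstPrt", "dstIP", "protocol"},  // 8: SQL-Slammer worm
        {"srcIP", "dstIP"},                                  // 9: most portscans
        {"srcIP", "dstPrt"},                                 // 10: W32/Blaster worm
        {"srcIP", "srcPrt", "dstPrt"},                       // 11: MS-SQL server worm
        {"srcIP", "dstPrt", "protocol"},                     // 12
        {"srcIP", "srcPrt", "dstPrt", "protocol"},           // 13: non-IP-spoofing SYN attack
    };
}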

5.3 Proposed method and Correlation Algorithm

Although the possible patterns are restricted to thirteen combinations, there is still an impractically large number of possible pattern instances: in principle there are up to 2^32 possible instances for the first pattern alone. Therefore, a correlation algorithm is required that reports only pattern instances whose frequency exceeds a certain threshold as potentially interesting intrusion patterns.

In addition, there is some overlap between these thirteen patterns. For example, suppose an instance of the seventh pattern in Table 5.1 has been identified as significant (i.e., its frequency exceeds a given threshold). The corresponding instances of the first, third and fourth patterns may then be redundant if the already reported seventh pattern accounts for a significant proportion of their suspicious alerts; in other words, they are triggered by the same underlying cause. A filtering process is therefore required to reduce the redundant pattern instances by inferring the most appropriate pattern to generalize the raw alerts.

Before describing our correlation algorithm, it is useful to define the context in which alert correlation occurs, i.e., within the CIDS. The proposed CIDS correlates alerts reported by a set of m IDSs, denoted as D,

i.e., D = { d_i | i = 1, 2, ..., m },

where each IDS d_i is responsible for monitoring incoming traffic to its own sub-network and for generating raw alerts.

In a centralized CIDS, a central node C runs the correlation algorithm. Each IDS periodically reports the set of raw alerts collected from its monitored sub-network to the central server C for correlation, within a time interval Δ. At the end of a given interval of length Δ, let n_i denote the number of raw alerts reported by IDS d_i. Then the total set of alerts received by the central server C in that period is denoted as RA,

i.e., RA = { r_ij | i = 1, 2, ..., m; j = 1, 2, ..., n_i }.

The next milestone is a correlate-and-filter algorithm, which first correlates these raw alerts and then filters out any insignificant or redundant alert pattern instances. Correlation of the raw alerts is the first stage of the algorithm. Each raw alert r_ij received by the correlation stage on the central server C is first parsed into instances of the thirteen patterns defined in Table 5.1, based on the different combinations of alert features. These thirteen pattern instances of r_ij are then inserted into an index of pattern instances in memory.

Algorithm 2. Correlate-and-Filter Algorithm

1   INPUT: raw alerts RA
2   INPUT: minimum support threshold S
3   OUTPUT: set NRSP of non-redundant, significant pattern instances
4   // initialize the set of patterns indexed by srcIP: Pattern = { Pattern_ip | ip ∈ IP }
5   Pattern ← { }
6   // correlating process
7   for each r_ij ∈ RA do
8       ip ← get_srcIP(r_ij)
9       if Pattern_ip ∉ Pattern then
10          Pattern_ip ← create_Pattern(ip)
11      end if
12      for k = 1 to 13 do
13          PP ← parse_pattern_k(r_ij)
14          // update the support of pattern instance PP in Pattern_ip
15          Pattern_ip.PP.support ← (++Pattern_ip.PP.count) / |RA|
16      end for
17  end for
18  // filtering process
19  for each Pattern_ip ∈ Pattern do
20      for each PP ∈ Pattern_ip do
21          if PP.support < S then
22              delete PP from Pattern_ip
23          end if
24      end for
25  end for
26  // filtering redundant patterns
27  // initialize the non-redundant significant pattern instance set
28  NRSP ← { }
29  for each Pattern_ip ∈ Pattern do
30      // compress the revised Pattern_ip using threshold S
31      NRSP ← NRSP ∪ compress_Pattern(Pattern_ip, S)
32  end for
33  return NRSP

This index is based on the pattern structure: a separate index pattern is kept for each source IP address received by C. For each pattern instance derived from r_ij, the count of that instance is updated if the instance is already in the index; otherwise, a new node corresponding to the new pattern instance is added to the index pattern. This process is repeated for each raw alert received during the current update interval Δ. The second stage of the algorithm filters any insignificant or redundant pattern instances from the set of all pattern instances P, based on the support of each instance among the set of raw alerts RA, defined as support_RA(P) = count_RA(P) / |RA|, where count_RA(P) is the number of raw alerts in RA that match the pattern instance P.

For example, given a minimum support threshold of S = 0.7, suppose there are thirteen significant pattern instances in the index: these are pattern instances that occurred in at least 158 raw alerts out of the original set of |RA| = 210 alerts. Although the pattern instances in Ps are significant in terms of their frequency, they may still overlap considerably in terms of the raw alerts covered by each instance. For example, the four significant generalizations of the most specific pattern instance carry considerable redundancy. A method for compressing Ps is therefore needed to remove redundant instances; we adopt an approach to compression initially proposed in (Estan et al 2003).
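As a concrete illustration of the correlate-and-filter stages described above, the following Java sketch counts pattern instances, filters them by a minimum support threshold, and applies a simplified redundancy check. Alerts are represented as maps from feature name to value, the class and method names are ours, the pattern list is shortened to three templates, and the final subsumption check is a simplification rather than the full compression method of (Estan et al 2003):

import java.util.*;

// Illustrative correlate-and-filter sketch. Pattern instances are keyed by
// "feature=value;" strings so that a more specific instance's key extends
// the key of a more general one.
class CorrelateAndFilter {

    // Templates from Table 5.1 (shortened here to three for brevity).
    static final String[][] PATTERNS = {
        {"srcIP"}, {"srcIP", "dstPrt"}, {"srcIP", "dstPrt", "protocol"}
    };

    static String instanceKey(Map<String, String> alert, String[] pattern) {
        StringBuilder sb = new StringBuilder();
        for (String f : pattern) sb.append(f).append('=').append(alert.get(f)).append(';');
        return sb.toString();
    }

    // Correlation: count how many raw alerts fall into each pattern instance.
    // Filtering: keep instances whose support count/|RA| >= S, then drop a
    // general instance if a more specific significant instance explains it.
    static Map<String, Integer> run(List<Map<String, String>> rawAlerts, double s) {
        Map<String, Integer> counts = new HashMap<>();
        for (Map<String, String> alert : rawAlerts)
            for (String[] p : PATTERNS)
                counts.merge(instanceKey(alert, p), 1, Integer::sum);

        int total = rawAlerts.size();
        Map<String, Integer> significant = new LinkedHashMap<>();
        counts.forEach((k, c) -> { if ((double) c / total >= s) significant.put(k, c); });

        // Naive redundancy removal: a general instance is redundant if a more
        // specific significant instance (whose key extends it) has the same count.
        significant.keySet().removeIf(k -> significant.keySet().stream()
                .anyMatch(other -> !other.equals(k) && other.startsWith(k)
                        && significant.get(other).equals(significant.get(k))));
        return significant;
    }

    public static void main(String[] args) {
        List<Map<String, String>> ra = List.of(
            Map.of("srcIP", "192.240.134.12", "dstPrt", "1434", "protocol", "UDP"),
            Map.of("srcIP", "192.240.134.12", "dstPrt", "1434", "protocol", "UDP"),
            Map.of("srcIP", "10.1.1.1",       "dstPrt", "80",   "protocol", "TCP"));
        // Only the most specific instance of the repeated Slammer-like alerts survives.
        System.out.println(run(ra, 0.5));
    }
}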

5.4 Correlation Architecture

There are two possible approaches to implementing this correlate-and-filter algorithm: a single-stage correlation scheme and a two-stage approach.

5.4.1 Single Stage Approach

Intuitively, an easy approach to collective detection is to use a centralized server to correlate all information. In this approach, each IDS acts as a detection unit in the CIDS, collecting alerts locally. The alerts are then reported to a central server, which works as a correlation unit for analysis. However, this centralized approach introduces the following problems. (1) Central point of failure: although this approach is straightforward to implement, the central server creates a single point of failure and can be a target for DDoS attacks. (2) Poor scalability: with increasing numbers of participants, the QoS of the central server degrades dramatically due to the server's limited bandwidth and computational power [12].

5.4.2 Two Stage Approach

In this approach each IDS runs the correlate-and-filter algorithm independently. The preprocessed, locally significant pattern instances Pn are sent to the central server C for further examination. Based on this information from the participants, the central server C first reconstructs index patterns from the pattern instances received, and then runs the correlate-and-filter algorithm over these patterns to generate the globally significant pattern instances Pg ⊆ Pn. These global pattern instances Pg can then be passed back to the participating IDSs D so that further action can be taken as required.

In this paper the two-stage approach was adopted, for the following reasons. First, the number of alert messages sent to the central server is significantly reduced compared to the single-stage approach, so the two-stage approach requires less communication bandwidth. Second, the computational load of the two-stage approach can be shared by the participants. The local support threshold Sl determines the number of messages sent to the server C; the local threshold therefore affects the performance of the two-stage approach in terms of (i) detection accuracy and (ii) bandwidth savings compared to the single-stage scheme.
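A minimal sketch of the two-stage message flow is shown below, assuming that pattern instances are exchanged as instance-key/count pairs (a representation chosen here for illustration; the actual message format is not specified in this section):

import java.util.*;

// Illustrative two-stage flow: each IDS filters locally with threshold Sl and
// forwards only its locally significant pattern instances; the central server
// merges the forwarded counts and applies the global threshold Sg.
class TwoStageSketch {

    // Stage 1 (at IDS d_i): keep instances whose local support reaches Sl.
    static Map<String, Integer> localStage(Map<String, Integer> localCounts,
                                           int localAlertTotal, double sl) {
        Map<String, Integer> significant = new HashMap<>(localCounts);
        significant.values().removeIf(c -> (double) c / localAlertTotal < sl);
        return significant;
    }

    // Stage 2 (at central server C): merge reports from all IDSs and keep
    // instances whose global support reaches Sg.
    static Map<String, Integer> globalStage(List<Map<String, Integer>> reports,
                                            int globalAlertTotal, double sg) {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> r : reports) r.forEach((k, c) -> merged.merge(k, c, Integer::sum));
        merged.values().removeIf(c -> (double) c / globalAlertTotal < sg);
        return merged;
    }

    public static void main(String[] args) {
        // Two sub-networks, 100 local alerts each; Sl = Sg = 0.05.
        Map<String, Integer> ids1 = localStage(Map.of("srcIP=1.2.3.4;dstPrt=1434;", 8), 100, 0.05);
        Map<String, Integer> ids2 = localStage(Map.of("srcIP=1.2.3.4;dstPrt=1434;", 6), 100, 0.05);
        System.out.println(globalStage(List.of(ids1, ids2), 200, 0.05));
    }
}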

5.4.3 Red Alert Collective Intrusion Exposure

In order to evade detection, a stealthy attacker aims to spread its accesses uniformly over multiple sites simultaneously, so that in any given monitoring interval the attack evidence is not concentrated in any single network. With the global support threshold Sg, the scan pattern instances launched by such an attacker would be detected if all the alerts from the participating IDSs were centrally correlated. However, it is not guaranteed that all of the scan pattern instances will be reported to the central server C, because the support of a pattern instance can vary slightly between sub-networks: since the scan traffic is randomly distributed across the sub-networks, it may not be significant everywhere, and the support level in some networks will fall below the threshold used for local filtering.

As a result, a proportion of these pattern instances will be filtered out locally. If some attack pattern instances from a local network are not reported, the support contributed by the other networks may be insufficient for the pattern to be identified as globally significant at the central server C. Thus, the attack may not be detected if Sl = Sg. To address this, the central server C distributes the raw alerts to its sub-networks for correlation: raw alerts 1 to 1000 are sent to sub-network 1, raw alerts 1001 to 2000 to sub-network 2, raw alerts 2001 to 3000 to sub-network 3, and so on. The consolidated reports are then sent back to the central server C, which thereby collectively exposes the raw alerts to all of its sub-nodes or networks.
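The block-wise distribution of raw alerts to sub-networks described above could be sketched as follows; the block size of 1000 follows the text, while the wrap-around assignment of blocks to sub-networks is our own assumption:

import java.util.*;

// Illustrative partitioning of the raw-alert stream into fixed blocks of 1000
// alerts, with block k assigned to sub-network (k mod m) + 1 for correlation.
class AlertPartitioner {

    static Map<Integer, List<List<String>>> partition(List<String> rawAlerts, int mSubnets) {
        Map<Integer, List<List<String>>> work = new HashMap<>();
        for (int start = 0, block = 0; start < rawAlerts.size(); start += 1000, block++) {
            int end = Math.min(start + 1000, rawAlerts.size());
            int subnet = (block % mSubnets) + 1;            // sub-network 1..m
            work.computeIfAbsent(subnet, key -> new ArrayList<>())
                .add(rawAlerts.subList(start, end));
        }
        return work;                                        // subnet -> list of alert blocks
    }

    public static void main(String[] args) {
        List<String> alerts = new ArrayList<>();
        for (int i = 1; i <= 2500; i++) alerts.add("alert-" + i);
        partition(alerts, 3).forEach((subnet, blocks) ->
            System.out.println("sub-network " + subnet + " receives " + blocks.size() + " block(s)"));
    }
}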

5.5 Evaluation

In this section, both the two-stage correlate-and-filter algorithm and the single-stage algorithm are evaluated. A study of the feasibility of the proposed two-stage correlate-and-filter algorithm was conducted by comparing it against a fully centralized scheme. The two schemes are compared in terms of their detection accuracy and message exchange rate, using a simulation based on a real-world intrusion data set. Finally, the fully Rationalized Multilayered Red Alert Connection for Collective Intrusion Detection architecture was evaluated by conducting a large-scale experiment on the ICE Lab (ICE PVT LTD) using a real-world data set.

Table 4.2 Weekly summary of the Snort dataset.

  Week                     No. scans       No. Src IPs    No. Target IPs
  01-01-12 to 01-06-12     19,67,80,234    51,73,895      5,13,299
  01-07-11 to 01-12-11     23,69,06,662    60,34,209      10,94,354

The real-world intrusion data set used in the experiments is introduced first. Then the simulation results of the proposed two-stage correlate-and-filter algorithm are reported for non-stealthy attack scenarios using the naive threshold selection scheme and for stealthy attack scenarios using the probabilistic threshold selection scheme, respectively. Finally, the real-world performance of our proposed fully Rationalized Multilayered Red Alert Connection for Collective Intrusion Detection architecture is presented in terms of the computational and communication load in the system.

5.5.1 Intrusion trace data

The first data set, the Snort data set used in our simulation, comprises a large set of firewall and NIDS logs collected from 1472 firewall/NIDS platforms all over the world for the period from 1 to 15 January 2012. There are 201,488,037 records in these logs, and the size of the whole data set is more than 11.1 GB. The date and time fields are standardized to GMT, which allows information to be correlated based on either the provider hash or the target IP field. The source IP field and the number-of-scans field (i.e., the number of scans launched by the source IP) record the potentially suspicious evidence observed by the provider.

5.5.2 Simulation setup and measurements

Our simulation program is written in Java and runs on a blade server with 32 GB of RAM and the Linux 12 operating system. Note that the alerts are not generated by the simulation itself; they were generated by the local IDSs in the Snort providers' original systems and are replayed here from the trace provided by Snort. The Snort data set is stored in a MySQL relational database. We simulate 2n participating IDSs, varying n from 1 to 13. Each simulated IDS is assigned a unique provider-id, which is a field of the alert table in the database. The following metrics were considered in evaluating our two-stage correlate-and-filter algorithm. Detection Rate (DR) represents the percentage of pattern instances detected by the two-stage scheme compared to the single-stage scheme, i.e.,

DR = ( |P_two ∩ P_single| / |P_single| ) × 100%

where P_two represents the alert patterns detected by the two-stage approach, and P_single represents the alert patterns detected by the single-stage approach. The two-stage approach does not generate false positives relative to this baseline, since the locally pre-processed alerts of the two-stage approach are processed again on the central server using the same algorithm with exactly the same parameter settings as the single-stage approach; i.e., any false positives will be filtered out by the second correlate-and-filter stage on the central server. Note that exact matches between patterns are used when calculating the Detection Rate, although a less stringent measure based on matching alerts with the same source IP address could also be used. Message Exchange Rate (MER) denotes the percentage of messages exchanged in the two-stage scheme compared to the single-stage scheme,

i.e., MER = ( |Mesg_two| / |Mesg_c| ) × 100%

where Mesg_two represents the messages exchanged by the two-stage approach, and Mesg_c represents the messages exchanged by the single-stage approach. The Alert Pattern Distribution metric investigates the relationship between the distribution of alert pattern frequencies and the message reduction rate in the simulation.
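Both metrics can be computed directly from the detected pattern sets and the message counts. The following illustrative Java mirrors the definitions above, with exact-match comparison of pattern instances assumed; the 100% convention for an empty P_single follows the discussion in Section 5.5.3:

import java.util.*;

// Computing the two evaluation metrics defined above from the detected
// pattern sets and the exchanged-message counts.
class Metrics {

    // DR = |P_two ∩ P_single| / |P_single| * 100%
    static double detectionRate(Set<String> pTwo, Set<String> pSingle) {
        // Defined as 100% when the single-stage scheme detects nothing (see Section 5.5.3).
        if (pSingle.isEmpty()) return 100.0;
        Set<String> common = new HashSet<>(pTwo);
        common.retainAll(pSingle);
        return 100.0 * common.size() / pSingle.size();
    }

    // MER = |Mesg_two| / |Mesg_c| * 100%
    static double messageExchangeRate(long mesgTwo, long mesgSingle) {
        return 100.0 * mesgTwo / mesgSingle;
    }

    public static void main(String[] args) {
        Set<String> single = Set.of("p1", "p2", "p3", "p4");
        Set<String> two = Set.of("p1", "p2", "p3");
        System.out.printf("DR = %.1f%%, MER = %.1f%%%n",
                detectionRate(two, single), messageExchangeRate(250, 1000));
    }
}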

5.5.3 Results of naive threshold selection

In this simulation, the local support threshold (Sl) is set equal to the global support threshold (Sg). 160 providers were randomly selected from the Snort data set as the source of raw alerts. The Detection Rate and Message Exchange Rate were measured while varying Sg from 1×10^-6 to 0.01.

The performance of the two-stage scheme shows consistent behavior as the number of participants varies. The average Detection Rate of the CIDS with different numbers of participants varies from DR = 93.1% when the support is set to Sg = 10^-5, to DR = 100% when the support is Sg = 0.01. Note that Sg = 0.01 is a comparatively large support in a simulation using a real intrusion data set, and no alert pattern is detected by the centralized approach at that threshold.

The detection rate in this case is defined as 100%, rather than being left undefined by the equation for DR. The Message Exchange Rate experiences a significant reduction as the support varies from Sg = 10^-6 to 0.01: it is only 25% of that of the centralized approach when a low support is set (Sg = 10^-6), decreases linearly as the support is increased from Sg = 10^-6 to 0.01, and drops to an average of MER = 0.0009% when an extremely large support (Sg = 0.01) is set. In summary, note that the false negatives (i.e., the attack pattern instances that are detected by the single-stage correlation but missed by the two-stage correlation scheme) are caused by an incorrect level of generalization.

In this evaluation, exact matching is used when calculating the detection accuracy. Fewer false positives and false negatives are observed if a prefix match is used instead, for example treating two pattern instances as belonging to the same instance if they come from the same source. To analyze the underlying reasons for such a significant reduction in the message exchange rate, the distribution of the frequencies of the raw alert patterns is analyzed.

Each plot shows the distribution of pattern frequencies, with the patterns sorted in rank order on a log-log scale. The distribution follows Zipf's law, i.e., it is heavy-tailed in the sense that a few patterns have a very large frequency, while the vast majority of patterns have a low frequency.

Distribution of alert patterns (note: the vertical line corresponds to a support threshold Sl = 0.01).

On each plot, a vertical line corresponds to a support of Sl = 0.01. All patterns to the right of this line would be filtered locally by our two-stage approach when Sl = 0.01, while all those to the left would be reported to the centralized server for further analysis. For a CIDS with ten participants, only the top 1017 alert patterns out of all 428,523 patterns would be reported to the centralized server. For a CIDS with 20 participants, the top 2036 alert patterns out of all 557,942 patterns would be reported to the centralized server. These top alert patterns account for the majority of all alerts. Thus, our two-stage scheme can achieve significant improvements in communication efficiency without significantly affecting the detection rate.


