FastDetict: A Data Mining Engine for Predicting and Preventing DDoS Attacks
Mais Nijim1, Hisham Albataineh2, Mohammad Khan1, Deepak Rao1
1 Department of Electrical Engineering and Computer Science, Texas A&M University-Kingsville
2 Department of Physics and Geoscience, Texas A&M University-Kingsville
Kingsville, TX, USA
{mais.nijim, hisham.albataineh, mohammad.khan}@tamuk.edu
Abstract: One imperative mission of the Department of Homeland Security (DHS) is to prevent, combat, and mitigate extensive and sophisticated Distributed Denial of Service (DDoS) attacks in their early stages. DDoS attacks can cripple even the largest and most established organizations by generating enough traffic to crash the network. A Denial of Service (DoS) attack is a cyber attack in which the perpetrator seeks to make a single machine or network resource unavailable to its legitimate users. DDoS attacks, by contrast, are launched from multiple connected devices distributed across the Internet. Unlike single-source DoS attacks, DDoS assaults tend to target the network infrastructure in an attempt to saturate the system with huge volumes of traffic. Over time, attackers have shown creativity and ingenuity in the ways they perform DDoS attacks, ranging from attacks using the Rapid Spanning Tree Protocol (RSTP) at layer 2, to attacks using the Internet Protocol (IP) at layer 3 or TCP/UDP at layer 4, to scripting attacks at layer 7 using only basic JavaScript. To prevent DDoS attacks, we propose a data mining engine framework that controls the dynamic priority assignment of communication processes and automatically downgrades or upgrades open connection processes according to their resource usage history and the dynamism of the whole system. All remote communication requests are clustered based on their history of resource usage, such as CPU time, memory size, and network bandwidth.
Keywords: DDoS Attacks; Data Mining Engine; Network Bandwidth
978-1-5090-6356-7/17/$31.00 ©2017 IEEE
I. INTRODUCTION
Cybersecurity is a multidimensional topic that spans the national, international, governmental, and private sectors alike. As Internet usage has grown, the world has witnessed a growing collection of cyberspace threats, and the affected parties have struggled to keep emerging vulnerabilities under control. Governmental and corporate networks are increasingly at risk, malicious attacks are rising sharply, and organized crime and terrorists are improving their cyber capabilities. Individuals as well as countries have initiated attacks against others, in both the public and private sectors. Targets have included government networks, military defenses, companies, and political organizations, depending on whether the attacker was seeking military intelligence, conducting diplomatic or industrial espionage, or intimidating political activists. As noted in a study by the Center for Strategic and International Studies, an educated cybersecurity workforce is urgently needed: “We not only have a shortage of the highly technically skilled people required to operate and support systems already deployed, but also an even more desperate shortage of people who can design secure systems, write safe computer code, and create the ever more sophisticated tools needed to prevent, detect, mitigate and reconstitute from damage due to system failures and malicious acts” [1].
A Denial of Service (DoS) attack is a cyber attack in which the perpetrator seeks to make a single machine or network resource unavailable to its legitimate users. DDoS attacks, by contrast, are launched from multiple connected devices distributed across the Internet. Unlike single-source DoS attacks, DDoS assaults tend to target the network infrastructure in an attempt to saturate the system with huge volumes of traffic. Over time, attackers have shown creativity and ingenuity in the ways they perform DDoS attacks, ranging from attacks using the Rapid Spanning Tree Protocol (RSTP) [1] at layer 2, to attacks using the Internet Protocol (IP) at layer 3 or TCP/UDP at layer 4, to scripting attacks at layer 7 using only basic JavaScript. An effective way to mitigate a DDoS attack is early detection, because most DDoS attacks are designed to eat away at resources slowly without raising an alert in the security system. The most common protection against DDoS attacks is to add a filtering system on top of the general anti-malware or firewall system, with the expectation that it will filter out most malicious requests. In most situations, however, attackers find a flaw or loophole in the security system and bypass it, and as a result the user is hit by DDoS attacks over and over again.
Nowadays, widespread broadband access to the Internet gives everyone, including large industrial companies and other organizations, the opportunity to unify and better coordinate their projects and all kinds of activities. Large commercial companies depend heavily on networking to provide the best service to their customers. They also rely on networking to reduce costs by sharing resources. Despite all the advantages of global networking, large companies and organizations are always at high risk from hackers, international cyber attackers, illegal organizations, and even youngsters trying to prove their capabilities. Their weapon of choice is the Distributed Denial-of-Service (DDoS) attack, a class of computer security threat that aims to plant bugs, disrupt service, and steal different types of sensitive information. The main goal of this paper is to address the growing threat of application-layer DDoS attacks, which use long-lasting slow communication to tie up network and computing resources, thus preventing legitimate communication requests from being processed.
Our main idea is to prevent the binding of resources by potential DDoS-requests that last over a long period of time. To remedy this problem, we propose a design of a data mining engine to control the dynamic priority assignment of communication processes and automatically downgrade or upgrade open connection processes according to their resource usage history and the dynamism in the whole system. All remote communication requests will be clustered based on their history of resource usage such as CPU time, memory size, and network bandwidth. The data mining engine is an important block in the whole system as we collect a huge amount of data and traffic. Classifying and modeling different types of collected data requires developing different algorithms to extract patterns and predict behaviors in order to detect attacks.
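The dynamic priority assignment described above can be illustrated with a minimal sketch. The function name `adjust_priority`, the 0 to 10 priority scale, and the `low`, `high`, and load thresholds are hypothetical choices made for illustration; the paper does not specify them.

```python
def adjust_priority(priority, usage_history, system_load, low=0.2, high=0.8):
    """Return an updated priority in [0, 10]; higher means served sooner.

    usage_history: fractions (0..1) of a resource budget the connection
                   consumed in past intervals (assumed representation).
    system_load:   fraction (0..1) of total capacity currently in use.
    All thresholds are illustrative assumptions, not values from the paper.
    """
    average_usage = sum(usage_history) / len(usage_history)
    if average_usage > high:
        # Heavy long-term consumers are downgraded, more sharply under load.
        priority -= 2 if system_load > 0.5 else 1
    elif average_usage < low:
        # Light consumers are upgraded.
        priority += 1
    return max(0, min(10, priority))


heavy = adjust_priority(5, [0.9, 0.95, 0.85], system_load=0.7)
light = adjust_priority(5, [0.1, 0.05], system_load=0.7)
steady = adjust_priority(5, [0.5], system_load=0.2)
```

In this sketch the whole-system term enters only through `system_load`; a fuller design would presumably also weigh the cluster the request belongs to.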
II. DATA MINING ENGINE FOR DDOS ATTACK
One important mission of the Department of Homeland Security (DHS) is to prevent, combat, and mitigate extensive and sophisticated Distributed Denial of Service (DDoS) attacks in their early stages. DDoS attacks can cripple even the largest and most established organizations by generating enough traffic to crash the network. Traditional protections such as Intrusion Detection Systems (IDS), firewalls, and blackholing are not appropriate for DDoS defense and mitigation. In this paper, we propose a new method to mitigate DDoS attacks using data mining techniques. We will develop a data mining framework that takes into account the previous priority assignments of the data and all requests to the remote server, whether attacks or legitimate. Using this history of CPU, memory, and network bandwidth usage, we will be able to cluster all requests. The data mining engine is an important block in the whole system because we collect a huge amount of data and traffic. Classifying and modeling different types of collected data requires developing different algorithms to extract patterns and predict behaviors in order to detect attacks. We need to analyze and model the collected data to obtain information that will be used to predict attacks and react accordingly. Using statistical modeling and prediction, we are able to identify all legitimate requests and track the time from when a request is made until it is released. This way we are able to identify all the resources communicating with a particular server. Once a DDoS attack is occurring, the data mining engine is able to identify the attack, deprioritize the attack resources, and reprioritize the legitimate resources.
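Tracking the time from when a request is made until it is released can be sketched as follows. The `ReleaseTracker` class, its method names, and the rule of flagging requests held longer than three times the mean past hold time are assumptions introduced for illustration, not the authors' implementation.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class ReleaseTracker:
    """Records how long each request holds the server (hypothetical helper)."""
    history: list = field(default_factory=list)     # completed hold times, seconds
    open_since: dict = field(default_factory=dict)  # request id -> start time

    def start(self, request_id, now):
        self.open_since[request_id] = now

    def release(self, request_id, now):
        held = now - self.open_since.pop(request_id)
        self.history.append(held)
        return held

    def overdue(self, now, factor=3.0):
        """Open requests held longer than factor x the mean past hold time."""
        if not self.history:
            return []  # no baseline yet, so nothing can be flagged
        limit = factor * mean(self.history)
        return sorted(r for r, t0 in self.open_since.items() if now - t0 > limit)


tracker = ReleaseTracker()
for request_id, t_start, t_end in [("a", 0.0, 2.0), ("b", 1.0, 3.0), ("c", 2.0, 4.0)]:
    tracker.start(request_id, t_start)
    tracker.release(request_id, t_end)

tracker.start("slow", 5.0)    # still open at t = 21, held for 16 s
tracker.start("fresh", 20.0)  # still open at t = 21, held for 1 s
flagged = tracker.overdue(now=21.0)
```

Requests returned by `overdue` are candidates for deprioritization or forced release; the factor of three is only a placeholder for whatever threshold the statistical model would supply.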
Additionally, we are able to release the server after deprioritizing the attack, either by overriding it manually or automatically if a request exceeds the average or maximum release time. The data mining engine acts as a data collection system: it analyzes the collected data, sorts it, and performs the deprioritizing and reprioritizing. The system will allow frequent monitoring and early detection of attacks.
A. Run-time Behavior Analysis
A data mining engine will be developed to model resource usage such as network bandwidth, CPU, and memory and to respond to DDoS attacks in real time [3][4][5]. Data mining and knowledge discovery is commonly defined as the nontrivial extraction of implicit, previously unknown, and potentially useful information from raw data; here that information is used to identify attacks [6][7]. Data mining covers the practice of submitting queries and extracting previously unknown patterns from a large repository of data [8][9]. Pattern matching or other reasoning techniques are often used to extract this knowledge. Automatic and intelligent knowledge extraction requires that a variety of collected data be readily available. The data mining engine consolidates the relevant resource data in real time, then analyzes and clusters it to produce actionable information used to reprioritize legitimate resources.
The data mining engine is an important block in the whole system because we collect a huge amount of data and traffic. Classifying and modeling different types of collected data requires developing different algorithms to extract behavior and sources and to predict the release time of a resource. We need to analyze and model the collected data to obtain information that is used to prioritize requests, so that resources held by a DDoS attack are released and legitimate requests are served. Several types of data are captured, based on characteristics such as network bandwidth, CPU, and memory. Collecting a huge amount of different types of data requires methods to collect, cluster, analyze, and classify the data so that it can be used efficiently and in a timely manner. The aim of this research is to derive significant information for DDoS analysis, characterization, and identification of possible attacks based on the collection time and the time it takes to release a remote resource. Creating intelligence is the goal of all the analytical work being performed, and it supports better-informed decision-making.
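As a rough illustration of consolidating resource data and clustering it, the sketch below averages per-source samples into feature vectors and groups them with plain k-means. The paper does not name a clustering algorithm, so k-means is a swapped-in choice; the source names, sample values, and starting centroids are hypothetical.

```python
from collections import defaultdict
from math import dist


def consolidate(samples):
    """Average (cpu, memory, bandwidth) samples per source into one vector."""
    grouped = defaultdict(list)
    for source, cpu, memory, bandwidth in samples:
        grouped[source].append((cpu, memory, bandwidth))
    return {
        source: tuple(sum(column) / len(rows) for column in zip(*rows))
        for source, rows in grouped.items()
    }


def kmeans(points, centroids, iterations=10):
    """Plain k-means with caller-supplied starting centroids (deterministic)."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Move each centroid to the mean of its members; keep it if empty.
        centroids = [
            tuple(sum(column) / len(members) for column in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids


# Usage values are fractions of capacity; all numbers are made up.
samples = [
    ("src-a", 0.10, 0.20, 0.10),
    ("src-a", 0.20, 0.20, 0.10),
    ("src-b", 0.15, 0.25, 0.05),
    ("src-z", 0.90, 0.80, 0.95),
    ("src-z", 0.80, 0.90, 0.85),
]
features = consolidate(samples)
centroids = kmeans(list(features.values()), [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)])
labels = {
    source: min(range(len(centroids)), key=lambda i: dist(vector, centroids[i]))
    for source, vector in features.items()
}
```

Here cluster 0 collects the light consumers and cluster 1 the heavy one; deciding which cluster counts as suspicious is left to the later modeling blocks.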
From an intelligent system perspective, the primary role of the engine is to support and store raw data. As illustrated in Figure 1, the data mining engine consists of the following major blocks:
a) Clustering and consolidating the collected data.
b) Correlating the data based on type and resource and, when applicable, correlating across the clustered data for the different resources.
c) Modeling, prediction, and analysis of the data (collecting the different types of data and clustering them based on response time, location, etc.).
d) Setting priorities, or reprioritizing legitimate requests, in order to halt an attack, and reporting the attack to IT personnel so that it can be tracked down and blocked.
B. Collecting and clustering different types of data
We collect data and transform it into information based on type, location, response time, remote server, and all other related data collected. To capture this information, we will use a context-sensitive decision method that detects and collects information in response to changes in the behavior of the received data (network bandwidth, CPU, and memory). In real time, we then analyze the data to obtain the analytical parameters that allow us to cluster it. From the clustered data, we will create different paths to the data to allow analysis and quicker response times while integrating data from multiple resources.
C. Correlating data based on type and time
Data is passed from the clustering block within the data mining engine for further analysis, as shown in Figure 1. The data mining engine will perform data correlation among clusters and, if necessary, among the collected data based on type of resource, such as network bandwidth. Correlating the data among clusters reduces latency for the other blocks within the data mining engine, which leads to a faster response time. In most cases, clustered data contains much more information than is needed to identify a problem. Transforming the clustered data into a domain other than the time domain will allow us to extract the main components of the clustered data and certain features from the collected data.
D. Feature Extraction and Modeling
Different techniques and algorithms will be developed to extract the behavior of legitimate requests, or patterns, across the clustered data. The behaviors and features extracted from the collected data will be used to model legitimate behavior; when the collected data corresponds to a DDoS attack, it will be flagged as an attack when compared against the legitimate model. The extracted behavioral model will be based on network bandwidth, CPU, response time to a request, and memory. The developed algorithm is needed to extract authentic requests and identify problems or attacks associated with the data. This allows the data mining engine, as illustrated in Figure 1, to predict data behavior, predict an attack, and predict the response time to any request, enabling quicker response to attacks by reprioritizing the authentic requests and ignoring the attacks.
Figure 1: General Framework of the Data Mining Engine
The purpose of the behavior analysis here is to develop an algorithm that extracts features and distinguishes valid behaviors from attacks. We will look into regression methods such as linear least squares and statistical classification for modeling among clusters for the different data on the same request. The project could be extended to use game theory to model the requests. We consider two distinct patterns: urgent patterns, which indicate that an attack is occurring, and non-urgent patterns. Given the time stamps, location, and other characteristics of the captured data, we will reinforce the knowledge of when additional data matching a pattern, or a changing pattern, will occur, and of the changes in data that could flag an urgency requiring a response. Distinguishing between urgent and non-urgent data is based on the discovered patterns and the prediction model specified for future occurrences.
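One way the linear least squares idea could be applied is to fit a line to a request's resource usage over time and label the pattern urgent when the extrapolated usage crosses a limit. This is only a sketch under stated assumptions: the function names, the `horizon` and `limit` parameters, and the sample readings are hypothetical, and the paper does not commit to this decision rule.

```python
def least_squares_line(xs, ys):
    """Ordinary least squares fit y = slope * x + intercept (closed form)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def classify_pattern(timestamps, usage, horizon, limit):
    """Label a usage series by its extrapolated value `horizon` steps ahead.

    `horizon` and `limit` are illustrative parameters, not from the paper.
    """
    slope, intercept = least_squares_line(timestamps, usage)
    predicted = slope * (timestamps[-1] + horizon) + intercept
    return "urgent" if predicted > limit else "non-urgent"


timestamps = [0, 1, 2, 3]
rising = [1, 3, 5, 7]   # made-up bandwidth readings that keep growing
flat = [2, 2, 2, 2]     # made-up readings that stay level

slope, intercept = least_squares_line(timestamps, rising)
rising_label = classify_pattern(timestamps, rising, horizon=5, limit=15)
flat_label = classify_pattern(timestamps, flat, horizon=5, limit=15)
```

A single-feature linear fit is the simplest instance of the regression mentioned above; the statistical classification over several resources would replace the fixed `limit` with a learned decision boundary.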
E. Prioritize Requests
The distributed system will be designed and implemented based on identifying authentic requests and attacks from the different types of resources. The behavioral analysis will be based on the clustered data. Once an attack has been identified, we will reprioritize all authentic requests and deprioritize the attack while performing additional analysis in order to classify it as a confirmed attack. After analyzing the data and deciding that an attack has occurred, we will respond by assigning it a lower priority and, upon further analysis, ignoring it; the analysis is then passed to IT personnel, who can act accordingly if further action is required. Passing this information to a person also allows any attacks from that particular location to be blocked.
III. CONCLUSION
The main goal of this paper is the design and implementation of a data mining engine to predict and prevent DDoS attacks. The proposed data mining engine will allow malicious cyber activity to be detected sooner, and even beforehand. The intention of the data mining engine is to control the dynamic priority assignment of communication processes and automatically downgrade or upgrade open connection processes according to their previous resource usage. All remote communication requests will be clustered based on their history of resource usage, such as CPU time, memory size, and network bandwidth.
IV. REFERENCES
[1] Karen Evans and Franklin Reeder. “Human Capital Crisis in Cybersecurity: Technical Proficiency Matters,” A Report of the CSIS Commission on Cybersecurity for the 44th Presidency, Center for Strategic and International Studies, November 2010.
[2] Guillermo Mario Marro. Attacks at the Data Link Layer.
[3] The largest DDoS attack in history. http://blog.cloudflare.com/the-ddos-that-almost-broke-the-internet.
[4] Guangxue Rui Zhong. DDoS detection system based on data mining. Proceeding of the Second International Symposium on Networking and Network Security, 2010.
[5] Kyung Rhee Bayu Tama. Data mining techniques in DoS/DDoS attack detection: A literature review. Proceeding of the 3rd International Conference on Computer Application and Information Processing Technology, 2015.
[6] Hsun Huang Tsung Yang Chu Lin, Jung Liu. Using adaptive bandwidth allocation approach to defend DDoS attacks. Proceeding of IEEE International Conference on Multimedia and Ubiquitous Engineering, 2008.
[7] Nabajyoti Medh Chaitanya Buragohain. Flowtrapp: An SDN based architecture for DDoS attack detection and mitigation in data centers. Proceeding of IEEE 3rd International Conference on Signal Processing and Integrated Networks, 2016.
[8] M. Nijim, Young Lee. A data mining algorithm for multilevel prefetching in storage systems. Proceeding of Ubiquitous Computing and Communication Journal, 2011.
[9] Mais Nijim, Vamshi Reddy, Remzi Seker. Dm-pas: A data mining prefetching algorithm for storage systems. Proceeding of IEEE International Symposium on Advanced of High Performance Computing and Networking, 2011.