Comparison of Machine Learning Algorithms to Detect RPL-Based IoT Devices Vulnerability

Introduction

Informatics and the internet have reshaped the world order, paving the way for globalization and opening a new era. This change can be traced to Claude Shannon, who showed in his 1937 master’s thesis, written when he was just 21 years old, that Boolean algebra could be applied to electronic circuits. (Chiu, Lin, Mcferron, Petigara, & Seshasai, 2001 ¹) Since that period, when humanity laid the foundations of the modern computer, everything that 1s and 0s can express has profoundly affected human life. All kinds of data have been transformed into information that drives far-reaching decisions.

Scientists introduced the concept of the “Internet of Things” (IoT) when devices were needed to collect and process data obtained from devices, environments, and even living things, to operate other devices, or to transfer the obtained information to another place. We can define the Internet of Things as “devices that work in a network, collect, transmit, process, and analyze data, draw conclusions from it, and start and stop devices with the data they process in order to achieve a certain purpose.” Today, IoT devices can automate many tasks.

IoT devices provide enormous benefits, but because they collect, produce, or transfer data from many points, they need independent energy sources instead of grid power. In addition, most of these devices communicate wirelessly. In an environment with many, perhaps thousands, of these devices, having every device send its data directly to a single point is not energy efficient: the farther a device is from the data transfer point (modem, router), the more power it consumes and the more data packets it loses on the way.

To solve this problem, the devices can reach the transfer point by relaying data through one another, which saves a great deal of energy and minimizes the corruption of data packets. Figure 1.1 shows an example of a topology in which devices communicate through each other toward a single data transfer point. This type of system is called Multipoint to Multipoint.
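The energy argument above can be illustrated with a rough sketch. It assumes a simplified radio model in which the energy needed to send one packet grows with the square of the distance (a free-space path-loss exponent of 2) and ignores receive cost and per-hop processing overhead. The function names, the 100 m distance, and the hop count are illustrative assumptions, not values from the thesis.

```python
def tx_energy(distance_m, path_loss_exponent=2.0):
    """Relative energy (arbitrary units) to send one packet over distance_m.

    Assumption: energy grows as distance ** path_loss_exponent.
    """
    return distance_m ** path_loss_exponent


def relayed_energy(distance_m, hops, path_loss_exponent=2.0):
    """Total relative energy when the same distance is covered in equal hops."""
    hop_length = distance_m / hops
    return hops * tx_energy(hop_length, path_loss_exponent)


# Hypothetical scenario: a node 100 m away from the data transfer point.
direct = tx_energy(100.0)            # one long transmission
relayed = relayed_energy(100.0, 4)   # four 25 m hops through neighbors

print(f"direct:  {direct:.0f} units")
print(f"relayed: {relayed:.0f} units")
```

Under this model the relayed path costs a quarter of the direct transmission. Real radios also spend energy on receiving and forwarding, so the actual saving depends on the hardware and on the number of hops.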

In March 2012, the Internet Engineering Task Force (IETF) published the RPL protocol (IPv6 Routing Protocol for Low-Power and Lossy Networks). (Winter, et al., March 2012 ²) With this protocol, IoT devices transmit data through one another, which ensures low-power operation and reduces data loss.

Designed for the 6LoWPAN protocol, RPL aims to optimize the power consumption of IoT devices. Because the RPL protocol is complex and 6LoWPAN devices are not secure, RPL is vulnerable to attacks from inside or outside the network. Although the RPL protocol includes security controls, failing to implement or enforce them leaves it vulnerable to attacks.

Attacks against RPL, which works at the third layer, increase the system’s power consumption because they force the IoT devices to perform more operations and transmit more data than normal. The batteries run out early, so a system designed for power optimization does not work as desired. If an attack occurs on IoT devices working with RPL, an anomaly appears in the network packets at layer 3. Therefore, this study aims to develop a fast, practical, uncomplicated, and reliable intrusion detection system at the network layer.

Machine learning methods are widely used in cyber security to detect anomalies or attacks. Machine learning is the ability of a computer to learn from existing data by performing mathematical calculations on it, so that it can then make robust predictions on similar data.

To detect attacks on the RPL protocol at the network layer, which is layer 3 of the OSI (Open Systems Interconnection) model, in a fast, practical, uncomplicated, and reliable way, the methodology is as follows:

Collecting the data packets of a simulation run with normal nodes (IoT devices), using software that simulates IoT devices,
Collecting the data packets of the same simulation run with vulnerable nodes performing different attacks,
Processing the data obtained from the simulations and labeling it as coming from benign or malicious nodes,
Training and testing machine learning algorithms on the newly created data set,
Determining the machine learning algorithm that gives the best results and building the intrusion detection system on it.
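The labeling and train/test steps above can be sketched as follows. This is a minimal illustration, not the code used in the thesis: the feature name `packets`, the toy rows, the 25% test ratio, and the seed are all assumptions made for the example.

```python
import random


def label_rows(rows, label):
    """Return copies of the rows with a class label attached (0 = benign, 1 = malicious)."""
    return [dict(row, label=label) for row in rows]


def train_test_split(rows, test_ratio=0.25, seed=42):
    """Shuffle the rows reproducibly and split them into train and test parts."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical summaries taken from a benign run and from an attack run.
benign_rows = [{"packets": 10 + i % 3} for i in range(8)]
malicious_rows = [{"packets": 40 + i % 5} for i in range(8)]

dataset = label_rows(benign_rows, 0) + label_rows(malicious_rows, 1)
train, test = train_test_split(dataset)

print(len(train), "training rows,", len(test), "test rows")
```

Fixing the seed keeps the split identical across runs, so that each algorithm is compared on the same training and test rows.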

There are many different machine learning algorithms, and each has advantages and disadvantages compared with the others.

Purpose of the Thesis

In this master’s thesis, data packets at the network layer, which is OSI layer 3, are analyzed with six machine learning algorithms in order to detect attacks on RPL. We aim to identify the fastest, most effective, least complicated, and most reliable algorithm among them.

Similar Studies (Literature Review)

In his 2018 master’s thesis, Yavuz carried out attacks in the Cooja simulator with different numbers of benign and malicious devices to detect attacks on RPL IoT devices. He built a data set from these simulations and used it to create a deep learning-based attack detection system. (Yavuz, 2018 ³)

Müller et al. developed a machine learning method based on Kernel Density Estimation (KDE) in 2019. With this method, they succeeded in detecting Black Hole, Hello Flood (HF), and Version Number Change attacks with a true positive (TP) rate of 84.91 percent and a false positive (FP) rate of less than 0.5 percent. (Müller, Debus, Kowatsch, & Böttinger, 2019 ⁴)

Neerugatti and Reddy proposed a machine learning-based intrusion detection technique called MLTKNN, which builds on the K-nearest neighbor algorithm. In networks of varying size with up to 30 IoT devices (nodes), they achieved a TP rate of 90%-98% and an FP rate of 0.9%-0.2%. (Neerugatti & Reddy, 2019 ⁵)
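The true positive and false positive rates quoted in these studies come from the confusion counts of a binary classifier. A minimal sketch is given below; the label vectors are made up for the example, with 1 standing for an attack and 0 for benign traffic.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = attack)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def true_positive_rate(tp, fn):
    """Share of real attacks that were flagged: TP / (TP + FN)."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """Share of benign samples that were wrongly flagged: FP / (FP + TN)."""
    return fp / (fp + tn)


# Hypothetical ground truth and predictions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
tpr = true_positive_rate(tp, fn)
fpr = false_positive_rate(fp, tn)
print(f"TP rate: {tpr:.2%}, FP rate: {fpr:.2%}")
```

An intrusion detection system needs both numbers: a high TP rate alone says nothing about how often benign traffic raises a false alarm.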

In 2019, Verma and Ranga designed a Network Intrusion Detection System architecture called ELNIDS (Ensemble Learning-based Network Intrusion Detection System) to detect attacks against RPL. They implemented this design with the Boosted Trees, Bagged Trees, Subspace Discriminant, and RUSBoosted Trees algorithms. The study detected Sinkhole, Blackhole, Sybil, Clone ID, Selective Forwarding, HF, and Local Repair attacks with machine learning methods using 20 features of the RPL-NIDDS17 dataset. The Boosted Trees algorithm had the highest accuracy with 94.5%, while the Subspace Discriminant method had the lowest accuracy with 77.8%. (Verma & Ranga, 2019 ⁶ )

In 2020, Belavagi and Muniyal designed separate RPL-based systems with 10, 40, and 100 nodes. In each system, the share of vulnerable nodes was set to 10%, 20%, and 30%. They observed the behavior of the network through the percentage of inconsistency, the energy consumption, and the TP and FP rates they obtained. In this study, they fed other parameters in addition to network packets to the machine learning algorithms. (Belavagi & Muniyal, 2020 ⁷)

Çakır, Toklu, and Yalçın proposed a deep learning algorithm based on a Gated Recurrent Unit (GRU) network model to predict and prevent RPL flooding attacks in IoT networks. They compared this model with Support Vector Machines (SVM) and Logistic Regression (LR), and they also tested different power states and the total energy consumption of the nodes. The model they presented detected HF attacks with a much lower error rate than earlier studies in the literature. (Çakır, Toklu, & Yalçın, 2020 ⁸)

Shafiq et al. have worked on a model that enables the selection of an effective machine learning algorithm among many machine learning algorithms for the cyber attack detection system to be used in IoT security. The study concluded that the Naive Bayes Machine Learning algorithm effectively detects anomalies and attacks in the IoT network. However, this study did not contain RPL-based IoT devices. (Shafiq, Tian, Sun, Du, & Guizani, 2020 ⁹ )

Unlike other studies, this thesis does not develop a single machine learning method to detect attacks such as Flooding Attacks, Version Number Increase Attacks, and Decreased Rank Attacks in RPL. Instead, the focus is on getting the best results by comparing multiple machine learning methods. This study uses only the data obtained from third-layer network packets to detect the attacks. The reason is simple: receiving, processing, and transmitting parameters such as instantaneous power and energy consumption for each IoT device would require extra processing power and capacity.

For this reason, third-layer network packets, which can be obtained very quickly, are used in this study. Benign and malicious RPL traffic transmitted at the third layer was divided into 1-second frames. A data set was prepared by summarizing the durations, sizes, RPL message types, and rates of the packets in each frame. These data sets were tested with the Decision Tree, Logistic Regression, Random Forest, K Nearest Neighbor, Naive Bayes, and Artificial Neural Network machine learning algorithms. Finally, the study compared these algorithms to find the one that gives the best result.
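The 1-second framing can be sketched roughly as below. The packet fields (`time`, `length`, `type`), the sample values, and the chosen summary columns are assumptions for illustration and do not reproduce the thesis data set; the duration feature is left out for brevity. DIS, DIO, DAO, and DAO-ACK are the RPL control message types defined in RFC 6550.

```python
from collections import Counter, defaultdict

RPL_TYPES = ("DIS", "DIO", "DAO", "DAO-ACK")


def summarize_windows(packets, window_s=1.0):
    """Group packets into fixed-length time frames and summarize each frame."""
    windows = defaultdict(list)
    for pkt in packets:
        windows[int(pkt["time"] // window_s)].append(pkt)

    rows = []
    for index in sorted(windows):
        group = windows[index]
        type_counts = Counter(p["type"] for p in group)
        total_bytes = sum(p["length"] for p in group)
        row = {
            "window": index,
            "packet_count": len(group),
            "total_bytes": total_bytes,
            "mean_bytes": total_bytes / len(group),
            "packets_per_second": len(group) / window_s,
        }
        for rpl_type in RPL_TYPES:
            row[f"{rpl_type}_count"] = type_counts.get(rpl_type, 0)
        rows.append(row)
    return rows


# Hypothetical capture: timestamps in seconds, lengths in bytes.
packets = [
    {"time": 0.10, "length": 76, "type": "DIO"},
    {"time": 0.55, "length": 64, "type": "DIS"},
    {"time": 1.20, "length": 76, "type": "DAO"},
    {"time": 1.90, "length": 76, "type": "DIO"},
    {"time": 1.95, "length": 76, "type": "DIO"},
]

rows = summarize_windows(packets)
for row in rows:
    print(row)
```

Each resulting row is one sample for the classifiers. An attack such as flooding would be expected to show up as frames with unusually high packet counts for one message type.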

The second part of this study explains in detail IoT devices, the other protocols used in these devices, the RPL protocol, and the attacks on the RPL protocol. The third chapter describes in detail the experiments carried out to obtain the attack data set and the methods that were used. The fourth chapter interprets the data obtained from the experiments.

Artificial neural networks are often among the most successful machine learning algorithms and can handle even very complex problems. They are powerful because they can learn non-linear mappings from numerical data and can operate in parallel. However, artificial neural networks are more costly than other algorithms because of their long training times. This study expects artificial neural networks to detect attacks on the RPL protocol with high accuracy. It also assumes that other machine learning algorithms, such as Decision Trees, Logistic Regression, Random Forest, K Nearest Neighbor, and Naive Bayes, will have shorter training times while detecting attacks with accuracy as high as that of artificial neural networks. (Yılmaz, 2015 ¹⁰)
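The trade-off between training time and accuracy described above can be measured with a small harness like the one below. The two classifiers are deliberately tiny, hand-written stand-ins (a 1-nearest-neighbor and a Gaussian Naive Bayes), not the implementations used in the thesis, and the feature vectors (packets and bytes per frame) are invented toy data.

```python
import math
import time


class NearestNeighbor:
    """1-nearest-neighbor: training only stores the samples."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, X):
        return [
            self.y[min(range(len(self.X)), key=lambda i: math.dist(self.X[i], x))]
            for x in X
        ]


class GaussianNaiveBayes:
    """Naive Bayes with one Gaussian per feature and class."""

    def fit(self, X, y):
        self.stats = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            cols = list(zip(*rows))
            means = [sum(col) / len(rows) for col in cols]
            # Small constant avoids division by zero for constant features.
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + 1e-9
                for col, m in zip(cols, means)
            ]
            self.stats[label] = (math.log(len(rows) / len(y)), means, variances)
        return self

    def _log_posterior(self, x, label):
        prior, means, variances = self.stats[label]
        return prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )

    def predict(self, X):
        return [max(self.stats, key=lambda lb: self._log_posterior(x, lb)) for x in X]


def evaluate(model, X_train, y_train, X_test, y_test):
    """Return test accuracy and the wall-clock time spent in fit()."""
    start = time.perf_counter()
    model.fit(X_train, y_train)
    train_seconds = time.perf_counter() - start
    predictions = model.predict(X_test)
    accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
    return {"accuracy": accuracy, "train_seconds": train_seconds}


# Toy frames: [packets per frame, bytes per frame]; 0 = benign, 1 = malicious.
X_train = [[10, 700], [12, 760], [11, 720], [40, 3000], [45, 3300], [42, 3100]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[11, 740], [43, 3200]]
y_test = [0, 1]

models = {"1-NN": NearestNeighbor(), "Naive Bayes": GaussianNaiveBayes()}
results = {
    name: evaluate(model, X_train, y_train, X_test, y_test)
    for name, model in models.items()
}
for name, result in results.items():
    print(name, result)
```

The same `evaluate` function could wrap any model that offers `fit` and `predict`, which is what makes a side-by-side comparison of several algorithms straightforward. Timings on such a small toy set are not meaningful in themselves; they only show where the measurement would be taken.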

References

1. Chiu, E., Lin, J., Mcferron, B., Petigara, N., & Seshasai, S. (2001). Mathematical Theory of Claude Shannon. The Structure of Engineering Revolutions.

2. Winter, T., Thubert, P., Brandt, A., Hui, J., Kelsey, R., Levis, P., . . . Alexander, R. (March 2012). RPL: IPv6 Routing Protocol for Low-Power and Lossy Networks. Internet Engineering Task Force. https://www.hjp.at/doc/rfc/rfc6550.html

3. Yavuz, F. Y. (2018). Deep Learning in Cyber Security for Internet of Things.

4. Müller, N., Debus, P., Kowatsch, D., & Böttinger, K. (2019). Distributed Anomaly Detection of Single Mote Attacks in RPL Networks. Proceedings of the 16th International Joint Conference on e-Business and Telecommunications (ICETE 2019). doi:10.5220/0007836003780385

5. Neerugatti, V., & Reddy, A. M. (2019). Machine Learning Based Technique for Detection of Rank Attack in RPL based Internet of Things Networks. International Journal of Innovative Technology and Exploring Engineering.

6. Verma, A., & Ranga, V. (2019). ELNIDS: Ensemble Learning based Network Intrusion Detection System for RPL based Internet of Things. 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU). doi:10.1109/IoT-SIU.2019.8777504

7. Belavagi, M. C., & Muniyal, B. (2020). Multiple intrusion detection in RPL based networks. International Journal of Electrical and Computer Engineering (IJECE). doi:10.11591/ijece.v10i1.pp467-476

8. Çakır, S., Toklu, S., & Yalçın, N. (2020). RPL Attack Detection and Prevention in the Internet of Things Networks Using a GRU Based Deep Learning. IEEE Access. doi:10.1109/ACCESS.2020.3029191

9. Shafiq, M., Tian, Z., Sun, Y., Du, X., & Guizani, M. (2020). Selection of effective machine learning algorithm and Bot-IoT attacks traffic identification for internet of things in smart city. Future Generation Computer Systems 107 (2020), 433-442.

10. Yılmaz, A. (2015). Sinirsel Bulanık Mantık Modeliyle Kanser Risk Analizi. Sakarya: Sakarya Üniversitesi Fen Bilimleri Enstitüsü.

Blog summary

This section covers the introduction of the thesis, the aim of the thesis, and the literature review.

Keywords: RPL, Machine Learning, Flooding Attacks, Version Number Increase Attacks, Decreased Rank Attacks

About the Author

Murat Ugur KIRAZ

Self-disciplined, systematic, productive, hardworking, adaptable, and focused web programmer who has gained all these abilities while working as a military electronics communication officer for 12 years in real operations and multinational areas, offering 10 years of software development experience and more than three years of experience providing high-impact web development solutions with React, Angular, JavaScript, UX, UI, Django, Flask, .NET Core, and .NET. Skilled in designing, developing, and testing web-based applications.
