Comparison of Machine Learning Algorithms to Detect RPL-Based IoT Devices Vulnerability

Simulation and Raw Data

In the previous article, I explained how to obtain the nodes built for Flooding Attacks, Decreased Rank Attacks, and Version Number Increase Attacks from the RPL Attacks Framework developed by D’Hondt et al. You can find this article here.

In this article, I will simulate these nodes using Cooja and obtain network data. The article on how Contiki and Cooja are set up can also be found here.

For machine learning, we will need two labeled data sets. One is generated from a simulation that contains only normal IoT nodes and no vulnerable node. The other is generated from a simulation in which normal IoT nodes run together with a vulnerable node. We will then use classification algorithms to tell these two data sets apart and detect the anomaly.
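As a rough illustration of what "two labeled data sets" means in practice, the sketch below tags every row of a capture with a class label and joins the two captures into one data set. The labels 0 (normal simulation) and 1 (simulation with the vulnerable node) are an assumption made for this example, and the miniature rows are hypothetical stand-ins for the real exported packets.

```python
from typing import Dict, List


def label_rows(rows: List[Dict[str, str]], label: int) -> List[Dict[str, object]]:
    """Return copies of the rows with a 'label' field added."""
    return [{**row, "label": label} for row in rows]


# Hypothetical miniature captures; real rows come from the exported CSV files.
normal_rows = [{"Time": "0", "Length": "64"}]
malicious_rows = [{"Time": "0", "Length": "64"}, {"Time": "3270", "Length": "64"}]

# Assumed labeling: 0 = normal simulation, 1 = simulation with the vulnerable node.
dataset = label_rows(normal_rows, 0) + label_rows(malicious_rows, 1)

print([row["label"] for row in dataset])
```

A classifier would later be trained on the combined rows, with the `label` column as the target.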

A video on simulating and obtaining the raw data is available here.

On the computer where the simulation is run, we have a folder named “Experiments” on the desktop. Detailed information on how this folder was created can be found here.

Experiment Folder

Inside the “Experiments” folder there is one folder per attack. These folders contain the nodes, which have a “.z1” extension.

Attacks Folder
Motes

To start the Cooja simulator, we open a new terminal and change to the folder from which Cooja is run.

Running Cooja Simulator

				
    cd contiki-ng/tools/cooja

Cooja is run with the following command.

				
    ant run_bigmem

We use the big-mem option here so that the simulator runs with more memory available in RAM.

Adding the New Simulation

When the Cooja simulator opens, we create a new simulation from the menu:

File->New Simulation…

We give the simulation a name. Here it is named HF-1R10M.

We used the following naming convention.

HF: Hello Flood

DR: Decreased Rank

VI: Version Number Increase

R: Root

N: Normal

M: Malicious

Numbers: Node Counts

For example, HF-1R10N1M denotes a Flooding Attack simulation with 1 root mote, 10 normal motes and 1 vulnerable mote.
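The naming convention is regular enough to be decoded by a small helper. The following is only a sketch: the function name `parse_simulation_name` and the dictionary layout are made up for illustration and are not part of Cooja or the RPL Attacks Framework.

```python
import re
from typing import Dict

# Prefixes and role letters taken from the naming convention above.
ATTACKS = {"HF": "Hello Flood", "DR": "Decreased Rank", "VI": "Version Number Increase"}
ROLES = {"R": "root", "N": "normal", "M": "malicious"}


def parse_simulation_name(name: str) -> Dict[str, object]:
    """Decode a name such as 'HF-1R10N1M' into attack type and node counts."""
    prefix, _, counts = name.partition("-")
    if prefix not in ATTACKS:
        raise ValueError(f"unknown attack prefix: {prefix!r}")
    # The part after the dash must be one or more <number><role letter> pairs.
    if not re.fullmatch(r"(\d+[RNM])+", counts):
        raise ValueError(f"cannot parse node counts: {counts!r}")
    result: Dict[str, object] = {
        "attack": ATTACKS[prefix], "root": 0, "normal": 0, "malicious": 0,
    }
    for number, role in re.findall(r"(\d+)([RNM])", counts):
        result[ROLES[role]] += int(number)
    return result


print(parse_simulation_name("HF-1R10N1M"))
```

Decoding the names this way makes it easy to check that a capture file belongs to the simulation it claims to describe.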

 

In the first simulation, we will use completely normal nodes. Thus, we will be able to generate data for the normal class.

New Simulation
Simulation Name

Adding Motes

Adding the Root Mote

Next, we add the nodes to the simulation in order. To do this, we select:

Motes->AddMotes->create new mote type->Z1 mote…

In the “Descriptions” section we write “root”, select root.z1 in the Experiments->hello-flood->with-malicious->motes folder, and press “Create”. The window that opens asks how many of these nodes to place and where. Since we have decided on 1 root node, we leave the “Number of motes” section as “1” and select “Random positioning”. When we press the “Add motes” button, 1 root node is placed in the simulation.

Adding Motes
Adding Root

Adding the Normal Motes

The same method is used to add the regular nodes. We select:

Motes->AddMotes->create new mote type->Z1 mote…

In the “Descriptions” section we write “normal”, select “sensor.z1” in the Experiments->hello-flood->with-malicious->motes folder, and press “Create”. The window that opens asks how many of these nodes to place and where. Here we enter the number 10 and press the “create” button.

We add the vulnerable node to the simulation in the same way, but in the add mote window we press the “Do not add motes” button. We will use this node later to extract data from the simulation with the vulnerable node.

Note: If we save the simulation with normal nodes separately and then try to add the vulnerable node, the program gives an error. To keep the locations of the nodes unchanged and run both simulations under the same conditions, we added the vulnerable node to the simulation in advance but did not use it in the first run.

To tell the nodes apart, we select “mote type” in the “view” menu of the window where the nodes are placed. This way, nodes of different types are shown in different colors.

Adding Normal Motes

Recording the Network Packets

To obtain the raw data from the simulation with normal nodes, we only need to save the network data. To capture the network data, we select:

Tools -> Radio Messages…

Radio Messages

In the new window that opens, we select the option “6LoWPAN Analyzer with PCAP”. Thus, the PCAP file from which we will obtain the raw data will be saved. In addition, we will be able to see the network packets here during the simulation.

6LoWPAN Analyzer with PCAP

Adding Timeout Script to Simulation

We want to run the simulation for 5 minutes. For this we can use the script editor.

Tools -> Simulation Script Editor…

In the window that appears, we only need to write TIMEOUT(300000).

(5 x 60 = 300 sec. = 300000 milliseconds)

The script will not run unless the Run->Activate option is selected from the menu in this window.
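Because the timeout is given in milliseconds, it is easy to slip a zero when converting by hand. The small helper below is a sketch for checking the conversion; `timeout_script` is a hypothetical name used only in this example, and the function simply produces the line to type into the script editor.

```python
def timeout_script(minutes: float) -> str:
    """Build the TIMEOUT line for a simulation of the given length in minutes."""
    # minutes -> seconds -> milliseconds
    milliseconds = int(minutes * 60 * 1000)
    return f"TIMEOUT({milliseconds})"


print(timeout_script(5))  # TIMEOUT(300000)
```

The same helper can be reused if a longer or shorter simulation is needed, for example `timeout_script(10)` for a 10-minute run.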

Simulation Script Editor

Starting the Simulation

We start the simulation by pressing the Start button.

Starting Simulation

The simulation will run for 5 minutes. The data we will use for machine learning is saved in PCAP format in the following folder:

Home/contiki-ng/tools/cooja/build

Network Packets

We rename the resulting PCAP file to the name of the simulation.

Converting PCAP File to CSV File

To analyze the packets, we need to convert them from PCAP format to CSV (Comma-Separated Values) format. For this we use Wireshark.

After opening Wireshark, we select

File->Open…

and choose our PCAP file in the folder

Home/contiki-ng/tools/cooja/build

Then we select

File->Export Packet Dissections->As CSV…

In the window that opens, we enter a name for the CSV file and save it.

In this way, we have converted the PCAP file to a CSV file.

Converting PCAP File to CSV File

Reloading the Simulation and Simulating with Malicious Mote

After simulating with normal nodes and obtaining the network packet data, we do the same with the vulnerable node and obtain its network packets.

For this, we need to reload the simulation, which we do by pressing the “Reload” button in the simulation.

Reloading the Simulation

We delete the most recently added of the normal nodes and place the “malicious” node, which we added to the simulation earlier, at the position where the deleted node was located.

Adding Malicious Mote

We convert the PCAP file obtained from the simulation with the vulnerable node to a CSV file using the method described above.

When we compare the data obtained in the Hello Flood attack with the normal data, we see that the attack capture contains more rows.

Making Simulations for Other Attacks

We repeat the same experiment for the Decreased Rank Attack and the Version Number Increase Attack and record the data obtained from the experiments with normal and malicious nodes.
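A quick way to compare a normal capture with an attack capture is to count the rows in the two exported CSV files. The sketch below assumes one packet per CSV row plus a header line; the file names `normal.csv` and `attack.csv` and the row counts are invented for the demonstration and are not results from the experiments.

```python
import csv
import os
import tempfile


def count_rows(csv_path: str) -> int:
    """Count data rows (packets) in a CSV file, excluding the header line."""
    with open(csv_path, newline="") as handle:
        return sum(1 for _ in csv.DictReader(handle))


def write_demo_csv(path: str, n_rows: int) -> None:
    """Write a tiny stand-in file so the example runs without real captures."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["No", "Time", "Length"])
        for i in range(n_rows):
            writer.writerow([i + 1, i * 10, 64])


with tempfile.TemporaryDirectory() as folder:
    normal_path = os.path.join(folder, "normal.csv")   # hypothetical file name
    attack_path = os.path.join(folder, "attack.csv")   # hypothetical file name
    write_demo_csv(normal_path, 3)   # made-up row counts, not measured values
    write_demo_csv(attack_path, 8)
    normal_count = count_rows(normal_path)
    attack_count = count_rows(attack_path)

print(normal_count, attack_count, attack_count > normal_count)
```

With real exports, `count_rows` would be pointed at the CSV files saved from Wireshark for each simulation.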

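The vulnerable, root, and normal node counts of the simulations are given in Table 3.1.

| Attack Type | Normal simulation: Root | Normal simulation: Normal | Malicious simulation: Root | Malicious simulation: Normal | Malicious simulation: Malicious |
| --- | --- | --- | --- | --- | --- |
| Flooding Attack | 1 | 10 | 1 | 9 | 1 |
| Decreased Rank Attack | 1 | 10 | 1 | 9 | 1 |
| Version Number Increase Attack | 1 | 10 | 1 | 9 | 1 |

Table 3.1: Types and Quantities of Simulation Nodes.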

The format of the generated raw data is shown in Table 3.2.

| No | Time | Source | Destination | Protocol | Length | Info |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 0 | fe80::c30c:0:0:c | ff02::1a | ICMPv6 | 64 | RPL Control (DODAG Information Solicitation) |
| 2 | 3270 | fe80::c30c:0:0:c | ff02::1a | ICMPv6 | 64 | RPL Control (DODAG Information Solicitation) |
| 3 | 6565 | fe80::c30c:0:0:c | ff02::1a | ICMPv6 | 64 | RPL Control (DODAG Information Solicitation) |
| … | … | … | … | … | … | … |

Table 3.2: Raw data generated as a result of simulation.
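To give an idea of how this raw data can be read programmatically, the sketch below parses the three rows of Table 3.2 from a CSV string with Python's standard csv module. The column names follow the table; the header of an actual Wireshark export may differ slightly, so treat them as an assumption. `load_packets` is a hypothetical helper name.

```python
import csv
import io
from collections import Counter

# The three sample rows of Table 3.2, written as CSV text.
SAMPLE_CSV = '''"No","Time","Source","Destination","Protocol","Length","Info"
"1","0","fe80::c30c:0:0:c","ff02::1a","ICMPv6","64","RPL Control (DODAG Information Solicitation)"
"2","3270","fe80::c30c:0:0:c","ff02::1a","ICMPv6","64","RPL Control (DODAG Information Solicitation)"
"3","6565","fe80::c30c:0:0:c","ff02::1a","ICMPv6","64","RPL Control (DODAG Information Solicitation)"
'''


def load_packets(csv_text: str) -> list:
    """Parse CSV text into a list of packet dictionaries with typed fields."""
    packets = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        packets.append({
            "no": int(row["No"]),
            "time": float(row["Time"]),
            "source": row["Source"],
            "destination": row["Destination"],
            "protocol": row["Protocol"],
            "length": int(row["Length"]),
            "info": row["Info"],
        })
    return packets


packets = load_packets(SAMPLE_CSV)
info_counts = Counter(packet["info"] for packet in packets)  # packets per message type
total_length = sum(packet["length"] for packet in packets)   # sum of the Length column

print(len(packets), total_length)
print(info_counts.most_common(1))
```

Per-message-type counts and total lengths of this kind are the sort of values that can later be computed from the full captures.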

You can also download the generated raw data here.


About the Author

Murat Ugur KIRAZ
