Segmented Federated Learning

This is an implementation of the 2020 IJCNN paper "Intrusion Detection with Segmented Federated Learning for Large-Scale Multiple LANs".

General information

Predominant network intrusion detection systems (NIDS) aim to identify malicious traffic patterns based on a handcrafted dataset of rules. Recently, the application of machine learning in NIDS has helped alleviate the enormous effort of human observation. Federated learning (FL) is a collaborative learning scheme for distributed data. Instead of sharing raw data, it allows a participant to share only a trained local model. Despite the success of existing FL solutions, in NIDS a network’s traffic data distribution does not always fit the single global model of FL: some networks are similar to each other, while others are not. We propose Segmented-Federated Learning (Segmented-FL), which employs periodic local model evaluation and network segmentation to bring similar network environments into the same group.
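To make the grouping idea concrete, here is a minimal sketch of aggregating local models separately within each group instead of into one global model. It is an illustration only, not the code in this repository: the function name `average_by_segment`, the flat weight lists, the node name n007, and the assumption that each participant's segment is already known are all hypothetical, and the paper's evaluation and segmentation procedure is not reproduced here.

```python
from collections import defaultdict


def average_by_segment(local_models, segment_of):
    """Average flat weight vectors separately within each segment.

    local_models: dict of participant name -> list of floats (hypothetical flat weights)
    segment_of:   dict of participant name -> segment id (assumed to be given)
    """
    members = defaultdict(list)
    for name, weights in local_models.items():
        members[segment_of[name]].append(weights)

    segment_models = {}
    for segment, vectors in members.items():
        count = len(vectors)
        # Element-wise mean across all members of this segment.
        segment_models[segment] = [sum(column) / count for column in zip(*vectors)]
    return segment_models


# Toy example: n005 and n006 share a segment, n007 (hypothetical) is on its own.
local_models = {
    "n005": [1.0, 2.0],
    "n006": [3.0, 4.0],
    "n007": [10.0, 20.0],
}
segment_of = {"n005": 0, "n006": 0, "n007": 1}
segment_models = average_by_segment(local_models, segment_of)
```

In this toy example each segment ends up with its own aggregated model, so a participant whose traffic differs from the rest does not pull the shared model of the other group.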

Setup instructions

This is a quick guide to get started with the sources.

Dependencies

You will need Python 3 to run the systems.

To upgrade pip to the latest version, use:

sudo python3 -m pip install --upgrade pip

Forking or cloning

Consider forking the project if you want to make changes to the sources. If you only want to run it locally, you can clone it directly.

Forking

If you decide to fork, follow the instructions given by GitHub. After that, you can clone your own copy of the sources with:

git clone https://github.com/YOUR_USER_NAME/homogeneous-learning.git

Make sure you change YOUR_USER_NAME to your user name.

Running the systems

First, download the Segmented Intrusion Detection Dataset (SIDD). Place the unzipped files in the root directory so that they form the following tree structure:

|-- Segmented-FL
    |-- main.py
    |-- cnn
    |-- n005
    |  |-- pcap
    |  |-- local.npy
    |-- n006
    |  |-- pcap
    |  |-- local.npy
    |-- ...

Then, the algorithm can be run by simply typing:

python main.py
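Before running, it can help to confirm that the dataset landed in the expected place. The following is a minimal sketch, not part of the repository: the helper name `find_ready_nodes` and the assumption that every node folder is named `n` followed by digits (as in n005 and n006 above) are hypothetical.

```python
import re
from pathlib import Path

# Assumption: node folders are named like n005, n006 (taken from the tree above).
NODE_DIR_PATTERN = re.compile(r"n\d+")


def find_ready_nodes(root):
    """Return the node folders under root that contain both pcap and local.npy."""
    ready = []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir() or not NODE_DIR_PATTERN.fullmatch(entry.name):
            continue  # skips main.py, cnn, and anything else that is not a node folder
        if (entry / "pcap").exists() and (entry / "local.npy").is_file():
            ready.append(entry.name)
    return ready
```

Calling `find_ready_nodes(".")` from the root directory returns the sorted names of the node folders that look complete. A node that is missing from the result is worth checking by hand.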

Segmented Intrusion Detection Dataset (SIDD)

SIDD is the first image-based network intrusion detection dataset. This large-scale dataset includes the feature maps (images) of network traffic data from 15 observation locations in different countries in Asia. It is used to identify two different types of anomalies from benign network traffic. Each image has a size of 48 × 48 and contains multi-protocol communications within 128 seconds. The SIDD dataset can be applied to a broad range of tasks such as machine learning-based network intrusion detection, non-IID federated learning, and so forth.

The folders in the dataset are named as node name_collection date_device id_anomaly type, with the four fields joined by underscores. For example, n005_20191001_000001_1 holds the traffic data, including Type A anomalies, collected on October 1, 2019 by device id 1 at node 005.
In each folder, the traffic data images are separated into benign and anomaly.
Currently, we provide images of two types of anomaly:
- Type A: Server Message Block (SMB) attack (folder names like xx_xx_xx_1)
- Type B: TCP SYN flood attack (folder names like xx_xx_xx_3)
Refer to the PyTorch tutorial "Writing Custom Datasets, DataLoaders and Transforms" for preprocessing the dataset. We will release the preprocessed dataset and the code shortly.
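The folder naming convention described above can be decoded with a few lines of standard-library Python. This is a minimal sketch rather than the forthcoming preprocessing code: the names `SiddFolder`, `parse_folder_name`, and `ANOMALY_TYPES` are hypothetical, and only the two anomaly codes listed above (1 and 3) are assumed to exist.

```python
from dataclasses import dataclass
from datetime import date, datetime

# Anomaly codes taken from the list above; any other code is rejected.
ANOMALY_TYPES = {
    "1": "Type A (SMB attack)",
    "3": "Type B (TCP SYN flood attack)",
}


@dataclass(frozen=True)
class SiddFolder:
    node: str
    collected_on: date
    device_id: int
    anomaly: str


def parse_folder_name(name):
    """Split a folder name such as n005_20191001_000001_1 into its four fields."""
    node, day, device, code = name.split("_")  # raises ValueError on a wrong field count
    if code not in ANOMALY_TYPES:
        raise ValueError(f"unknown anomaly code {code!r} in {name!r}")
    return SiddFolder(
        node=node,
        collected_on=datetime.strptime(day, "%Y%m%d").date(),
        device_id=int(device),
        anomaly=ANOMALY_TYPES[code],
    )


info = parse_folder_name("n005_20191001_000001_1")
```

Here `info` describes node n005, collection date 2019-10-01, device id 1, and a Type A anomaly, matching the example in the text. A custom PyTorch dataset could use such a helper to attach labels to images, but that wiring is left out here.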

Citation

If this repository is helpful for your research, or if you want to refer to the results provided in this work, you can cite it using the following BibTeX entries:

@inproceedings{sun2020segmented,
  author    = {Yuwei Sun and
               Hideya Ochiai and
               Hiroshi Esaki},
  title     = {Intrusion Detection with Segmented Federated Learning for Large-scale
               Multiple LANs},
  booktitle = {International Joint Conference on Neural Networks (IJCNN)},
  year      = {2020}
}

@article{sun2020sfl,
  author    = {Yuwei Sun and
               Hiroshi Esaki and
               Hideya Ochiai},
  title     = {Adaptive Intrusion Detection in the Networking of Large-scale LANs
               with Segmented Federated Learning},
  journal   = {IEEE Open Journal of the Communications Society},
  volume    = {2},
  pages     = {102--112},
  year      = {2021}
}
