This code was implemented as part of the NDSS FlowPrint [1] paper. It implements the Single Large Random Forest Classifier of AppScanner [2]. We ask people to cite both works when using the software for academic research papers.
The easiest way to install appscanner is using pip:
pip install appscanner
If you would like to install appscanner manually, please make sure you have installed the required dependencies.
This code is written in Python3 and depends on the following libraries:
- Numpy
- Pandas
- Scikit-learn
- Scapy
To install these, use the following command:
pip install -U scapy numpy pandas scikit-learn
The AppScanner implementation can be tested with the main.py script. This script allows you to specify .pcap files to load. After loading, the script splits the data into training and testing data and evaluates the performance. See main.py --help for more information.
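The split step of this workflow can be pictured with the sketch below. How main.py actually splits the data is not described here, so the shuffled hold-out split, the `test_fraction` and `seed` parameters and the function name `holdout_split` are assumptions made purely for illustration.

```python
import random


def holdout_split(X, y, test_fraction=0.33, seed=42):
    """Shuffle the samples and split them into training and testing parts.

    Hypothetical helper, not part of appscanner.
    """
    if len(X) != len(y):
        raise ValueError('X and y must have the same length')
    # Shuffle indices with a fixed seed so the split is reproducible
    indices = list(range(len(X)))
    random.Random(seed).shuffle(indices)
    # The first n_test shuffled indices form the test set
    n_test = int(round(len(indices) * test_fraction))
    test, train = indices[:n_test], indices[n_test:]
    X_train = [X[i] for i in train]
    y_train = [y[i] for i in train]
    X_test = [X[i] for i in test]
    y_test = [y[i] for i in test]
    return X_train, X_test, y_train, y_test
```

Since scikit-learn is already a dependency, its `train_test_split` function is a ready-made alternative that also supports stratified splits.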
It is also possible to directly use the AppScanner code as an API. There are two main classes which need to be understood:
- appscanner.preprocessor.Preprocessor for extracting features from .pcap files.
- appscanner.appscanner.AppScanner for applying the AppScanner detection.
The Preprocessor object is used to extract data from .pcap files and label them. To this end, it uses the process function, which requires a list of files and a list of labels. The list of files must be pathnames to pcap files. The list of labels must be labels corresponding to each file. The example below shows how the Preprocessor can be used.
from appscanner.preprocessor import Preprocessor
# Create object
preprocessor = Preprocessor()
# Load from files
X, y = preprocessor.process(['<path_file_1>', ..., '<path_file_n>'],
                            ['<label_1>', ..., '<label_n>'])
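Writing the two lists by hand becomes tedious with many captures. The sketch below assumes a hypothetical directory layout in which every application has its own subdirectory of .pcap files, so that the subdirectory name can serve as the label. Both the layout and the function name `collect_pcaps` are illustrative and not part of appscanner.

```python
from pathlib import Path


def collect_pcaps(root):
    """Return (files, labels) for every .pcap file one level below root.

    Assumes a hypothetical layout root/<label>/<capture>.pcap,
    where the parent directory name is used as the label.
    """
    files, labels = [], []
    # Sort so that the ordering is reproducible across runs
    for path in sorted(Path(root).glob('*/*.pcap')):
        files.append(str(path))
        labels.append(path.parent.name)
    return files, labels
```

The two returned lists have the same length and ordering, so they can be passed on as `preprocessor.process(files, labels)`.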
The AppScanner object is used to find known applications in network traffic. AppScanner takes a confidence threshold (default=0.9). With this threshold, AppScanner returns a label only when it is sufficiently confident in the prediction and returns -1 otherwise; a threshold of 0 gives labels for every predicted sample. It can be fit with X_train and y_train arrays obtained by the Preprocessor. After it has been fit, the AppScanner is able to predict unknown samples X_test. The example below shows how AppScanner can be used.
from appscanner.appscanner import AppScanner
# Create object
scanner = AppScanner(threshold=0.9)
# Fit scanner
scanner.fit(X_train, y_train)
# Predict labels of test data
y_pred = scanner.predict(X_test)
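To make the threshold semantics concrete, the sketch below shows one way such a rule can be expressed: keep the most probable label when its probability reaches the threshold, and return -1 otherwise. It works on plain Python lists of per-class probabilities and only illustrates the behaviour described above. The helper names `apply_threshold` and `summarise` are hypothetical, the `>=` comparison is an assumption, and none of this is AppScanner's internal code.

```python
def apply_threshold(probabilities, classes, threshold=0.9):
    """Pick the most probable class per sample, or -1 below the threshold.

    probabilities: one list of class probabilities per sample
    classes:       class labels, in the same order as each probability list
    """
    labels = []
    for row in probabilities:
        # Index of the highest probability in this row
        best = max(range(len(row)), key=row.__getitem__)
        # Assumption: a probability equal to the threshold is accepted
        labels.append(classes[best] if row[best] >= threshold else -1)
    return labels


def summarise(y_true, y_pred):
    """Return (coverage, accuracy on the samples that received a label)."""
    # Samples predicted as -1 were rejected and are left out of the accuracy
    labelled = [(t, p) for t, p in zip(y_true, y_pred) if p != -1]
    coverage = len(labelled) / len(y_true) if len(y_true) else 0.0
    correct = sum(t == p for t, p in labelled)
    accuracy = correct / len(labelled) if labelled else 0.0
    return coverage, accuracy
```

Because rejected samples are reported as -1, it helps to look at coverage (the fraction of samples that received a label) alongside accuracy: a higher threshold typically trades coverage for fewer wrong labels.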
[1] van Ede, T., Bortolameotti, R., Continella, A., Ren, J., Dubois, D. J., Lindorfer, M., Choffnes, D., van Steen, M. & Peter, A. (2020, February). FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic. In 2020 NDSS. The Internet Society.
[2] Taylor, V. F., Spolaor, R., Conti, M., & Martinovic, I. (2016, March). Appscanner: Automatic fingerprinting of smartphone apps from encrypted network traffic. In 2016 IEEE European Symposium on Security and Privacy (EuroS&P) (pp. 439-454). IEEE.
@inproceedings{vanede2020flowprint,
title={{FlowPrint: Semi-Supervised Mobile-App Fingerprinting on Encrypted Network Traffic}},
author={van Ede, Thijs and Bortolameotti, Riccardo and Continella, Andrea and Ren, Jingjing and Dubois, Daniel J. and Lindorfer, Martina and Choffnes, David and van Steen, Maarten and Peter, Andreas},
booktitle={NDSS},
year={2020},
organization={The Internet Society}
}
@inproceedings{taylor2016appscanner,
title={Appscanner: Automatic fingerprinting of smartphone apps from encrypted network traffic},
author={Taylor, Vincent F and Spolaor, Riccardo and Conti, Mauro and Martinovic, Ivan},
booktitle={2016 IEEE European Symposium on Security and Privacy (EuroS\&P)},
pages={439--454},
year={2016},
organization={IEEE}
}