Optimizing Self-Organizing Maps for Bacterial Genome Identification on Parallel Ultra-Low-Power Platforms
The kernel design includes a Python model and a C program. The Python model generates the input dataset, computes the kernel output as a golden reference, and assesses the accuracy using a customizable error metric.
The golden model is built on top of PyTorch data types.
This implementation has been tested and verified with Python 3.10.
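The error metric is customizable; as an illustration, here is a minimal sketch of a relative-error check against the golden reference (the function name, tolerance, and tensor sizes are ours, not the script's actual API):

```python
import torch

def max_relative_error(ref: torch.Tensor, out: torch.Tensor) -> float:
    # Worst-case relative error of a reduced-precision result vs. the FP32 reference.
    denom = ref.abs().clamp_min(1e-12)  # guard against division by zero
    return ((ref - out).abs() / denom).max().item()

# Compare an FP16-emulated result against the FP32 golden reference.
ref = torch.randn(40000)                       # golden reference in FP32
out = ref.to(torch.float16).to(torch.float32)  # simulated reduced-precision output
print(max_relative_error(ref, out))            # acceptable tolerance is application-specific
```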
If you are not going to use float8, install the following package:
pip install torch
To enable float8 support, use our e5m2 implementation: follow the instructions for installing PyTorch from source provided in this directory.
We tested our implementation on the CPU. To replicate our setup, disable CUDA support by exporting the environment variable USE_CUDA=0:
export USE_CUDA=0
Please use
git clone [email protected]:ahmad-mirsalari/PyTorch_E5M2.git
instead of
git clone --recursive https://github.com/pytorch/pytorch
in the "Get the PyTorch Source" step.
If you encounter the error "multiple definition of `gloo::rendezvous::Store::kDefaultTimeout'", refer to the solution outlined in this GitHub issue. Note that this issue is unrelated to our implementation.
Once Torch is installed, navigate to the root directory of the modified PyTorch codebase in the terminal or command prompt. Run the following command to install PyTorch in editable mode:
pip install -e .
The . at the end indicates that the current directory should be installed in editable mode.
Once the installation is complete, you can import the modified version of PyTorch in your Python code just like you would with the regular PyTorch library:
import torch
This will import the modified version of PyTorch that you installed in editable mode.
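A quick way to sanity-check the build is to cast a tensor to the new dtype. Note that `torch.float8_e5m2` below is our assumption about how the e5m2 type is exposed; adjust it to whatever name the modified build provides:

```python
import torch

print(torch.__version__)  # should report the locally built, editable version

# Round-trip a tensor through float8. The dtype name torch.float8_e5m2 is an
# assumption about the modified build's API, not a confirmed identifier.
x = torch.randn(4)
x8 = x.to(torch.float8_e5m2)
print(x8.to(torch.float32))
```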
These tests require the PULP-SDK. Once you have cloned the PULP-SDK repository and installed the RISC-V GNU Compiler Toolchain, you need to compile GVSOC. Please refer to the links below to set up your working environment correctly.
Here is my suggestion:
1- First, install and compile the RISC-V GNU Compiler Toolchain, following the steps in the RISC-V GNU Compiler Toolchain repository.
2- Install and compile the PULP-SDK, following the setup steps in the PULP-SDK repository.
3- Finally, test the installation according to Test execution.
Don't forget to source the file corresponding to the desired configuration whenever you want to use the project again:
cd pulp-sdk
source configs/pulp-open.sh
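After sourcing the configuration, building GVSOC typically amounts to a single make invocation from the SDK root (a sketch based on the PULP-SDK instructions; consult that repository for the authoritative commands):

make all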
Once the platform and the SDK are set up, you can run the tests.
If you want to generate a golden model, you can use the data_generator.py script with the following command:
./data_generator.py --I=Train_sequence --T=Test_sequence --R=Number_of_Bacteria --S=slice_length --N=Neurons --float_type= --MAC_flag=false --vec_flag=false
- `--float_type` specifies the floating-point format for data; by default, it is set to `FP32`, but you can also choose the `FP16`, `FP16ALT`, and `FP8` formats. You can also run the mixed-precision golden model by using `--float_type=FP_INP,FP_Weight,FP_OUT` (input, SOM weights, output).
- `MAC_flag` is used to emulate the multiply-and-accumulate operator available on most DSP instruction sets for embedded devices. It can be true or false; to emulate `FP16`, `FP8`, and `FP16ALT` behavior on PULP, set this flag to true.
- `vec_flag` is used to emulate SIMD vector instructions. It can be true or false; to emulate vectorized `FP16`, `FP8`, and `FP16ALT` behavior on PULP, set this flag to true.
- `I` is the number of training samples (e.g., 40000).
- `T` is the number of test samples (e.g., 1000).
- `R` is the number of bacteria/SOMs (e.g., 2).
- `S` is the slice length (e.g., 8).
- `N` is the number of neurons per network (e.g., 40000).
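For example, to generate an FP16 golden model with MAC and SIMD emulation enabled, using the example sizes above, you could run (an illustrative invocation assembled from the flags documented here):

./data_generator.py --I=40000 --T=1000 --R=2 --S=8 --N=40000 --float_type=FP16 --MAC_flag=true --vec_flag=true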
The script will generate floating-point data and a reference output of format `fmt` (FP32/FP16/FP8/FP16ALT):
make clean all run stats=1 check=1 w_block=128 i_block=32 vec=1 cores=1 fmt=FP16 verbose=1 IN_ORDER=1
make clean all run stats=1 check=1 w_block=128 i_block=32 cores=1 fmt=FP16 verbose=1 IN_ORDER=1

There are several flags that activate additional functionalities:
- `cores=N_CORES` sets the number of cores used for the execution to `N_CORES`; by default, `cores=1`. You can also run on the fabric controller by using `FABRIC=1` instead of `cores=N_CORES`.
- `fmt=FP_FMT` specifies the floating-point format for data; by default, it is set to `FP32`, but you can also choose the `FP16`, `FP8`, or `FP16ALT` formats.
- `vec=1` activates the vectorial format, available only for half-precision floating point (`FP16` and `FP16ALT`) or `FP8`.
- `check=1` activates the result check.
- `verbose=1` prints the wrong results.
- `stats=1` activates the performance measurement.
- `PRINT_RESULTS=1` prints the outputs of the C code.
- `w_block` sets the tile size of the SOM network (the number of neurons must be divisible by this number).
- `i_block` sets the tile size of the input (the number of inputs must be divisible by this number).
- `IN_ORDER=1` selects the Vertical Mapping approach. Note that the number of cores must be greater than 1 in Horizontal Mapping mode.
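For instance, a multi-core, vectorized FP8 run with Vertical Mapping could look like this (an illustrative combination of the flags above; choose block sizes that satisfy the divisibility constraints):

make clean all run stats=1 check=1 w_block=100 i_block=50 vec=1 cores=8 fmt=FP8 verbose=1 IN_ORDER=1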
- Extend support to additional FP data types (e.g., different flavors of 8-bit FP types)
This project is released under Apache 2.0, see the LICENSE file in the root of this repository for details.
This work was supported by the APROPOS project (g.a. no. 956090), funded by the European Union’s Horizon 2020 research and innovation program.
- Seyed Ahmad Mirsalari, University of Bologna, E-mail