Implementations of a multiclass version of the SEFR fast linear-time classifier (TinyML)


SEFR Multiclass Classifier

A simple and fast linear-time TinyML algorithm for low-powered microcontrollers


This is based on SEFR: A Fast Linear-Time Classifier for Ultra-Low Power Devices and its implementation sefr-classifier/sefr, which was originally a binary classifier. I use the one-vs-rest (OvR) strategy to expand it into a working multi-class version.

The idea is to quickly calculate weighted averages and find hyperplanes between classes. This needs far less computing time and memory than most classifiers, so training can actually run on-board on low-end microcontrollers, including 16 MHz AVR processors that have as little as 2 KB of RAM. It is also possible to re-train models directly on-device whenever new data is available.
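
To make the idea concrete, here is a minimal NumPy sketch of binary SEFR training as I understand it from the paper, not the code shipped in this repository. The function names `sefr_fit` and `sefr_predict` and the small `eps` constant are assumptions made for illustration, and the features are assumed to be non-negative.

```python
import numpy as np

def sefr_fit(X, y):
    """Train a binary SEFR-style model; y holds 0 (negative) and 1 (positive)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    pos, neg = X[y == 1], X[y == 0]

    # Per-feature class averages; the weight is their normalized difference.
    avg_pos = pos.mean(axis=0)
    avg_neg = neg.mean(axis=0)
    eps = 1e-7  # assumption: small constant to avoid division by zero
    weights = (avg_pos - avg_neg) / (avg_pos + avg_neg + eps)

    # Average weighted score of each class.
    pos_score = (pos @ weights).mean()
    neg_score = (neg @ weights).mean()

    # Bias: a threshold between the two average scores, each weighted
    # by the size of the opposite class.
    bias = -(len(neg) * pos_score + len(pos) * neg_score) / (len(pos) + len(neg))
    return weights, bias

def sefr_predict(X, weights, bias):
    """Return 1 where the weighted score plus bias is non-negative, else 0."""
    return (np.asarray(X, dtype=float) @ weights + bias >= 0).astype(int)
```

Training is a single pass of averages and one dot product per sample, which is why the cost grows linearly with the number of samples and features.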

Note that I made a minor change from the authors' paper: the weights and biases are calculated based on "not 0", "not 1", "not 2"... instead of 0, 1, 2. In other words, I treat "not N" as the positive label and "N" as the negative label. For prediction, the model finds the label whose "not N" score is lowest (so the sample is most likely to be N). I've found that this generates more accurate results on many datasets.
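
The inverted one-vs-rest scheme described above can be sketched as follows. This is an illustrative NumPy version, not the repository's class: the names `fit_ovr_inverted` and `predict_ovr_inverted` are hypothetical, the per-class training follows the binary SEFR formulas as I understand them, and features are assumed to be non-negative.

```python
import numpy as np

def fit_ovr_inverted(X, y):
    """Fit one binary SEFR-style model per class, with "not N" as positive."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    labels = np.unique(y)
    weights, biases = [], []
    for label in labels:
        # Inverted labels: samples of other classes are the positive group.
        pos, neg = X[y != label], X[y == label]
        avg_pos, avg_neg = pos.mean(axis=0), neg.mean(axis=0)
        w = (avg_pos - avg_neg) / (avg_pos + avg_neg + 1e-7)
        pos_score, neg_score = (pos @ w).mean(), (neg @ w).mean()
        b = -(len(neg) * pos_score + len(pos) * neg_score) / len(X)
        weights.append(w)
        biases.append(b)
    return labels, np.array(weights), np.array(biases)

def predict_ovr_inverted(X, labels, weights, biases):
    """Pick the class whose "not N" score is lowest."""
    scores = np.asarray(X, dtype=float) @ weights.T + biases  # (n_samples, n_classes)
    return labels[np.argmin(scores, axis=1)]
```

Taking the arg-min over the "not N" scores plays the same role as taking the arg-max in a conventional one-vs-rest setup.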

On the other hand, SEFR is not very accurate when the differences between classes are not clear-cut, so it is more suitable for structured data or simple image patterns.

File / Usage

The original binary SEFR classifier with demonstration results (the graphs you see above)
A scikit-learn-like Python classifier class
sefr.ino: Arduino C++ implementation (can be run on AVRs)
sefr.go: Golang/TinyGo implementation (can be run on any 32-bit board supported by TinyGo)
MicroPython implementation (ESP8266, ESP32 & Raspberry Pi Pico)
CircuitPython 7.0.0 implementation (for CP 7.0.0+ firmware that has the ulab module)

SEFR Example (on computer)

Iris dataset

This is an example using the famous Iris dataset (3 classes, 4 features x 150 instances):

from sefr import SEFR  # import from
import pandas as pd
from sklearn.model_selection import train_test_split, cross_val_predict
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, classification_report

# load Iris dataset
# source:
df = pd.read_csv('',
                     header=None, names=('sepal length', 'sepal width', 'petal length', 'petal width', 'class'))
# extract data and target and convert to ndarray
X = df.drop(['class'], axis=1).to_numpy()
y = df['class'].to_numpy()
# encode labels to integers
le = LabelEncoder()
y = le.fit_transform(y)
class_names = le.classes_  # save class names

# prepare training and test dataset
X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.2, random_state=0)

# train model and predict labels
clf = SEFR()
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
cv_predicted = cross_val_predict(clf, X_train, y_train, cv=5)

# view prediction results
print('Training time:', clf.training_time, 'ns')
print('Training CV score:', accuracy_score(y_train, cv_predicted).round(3))
print('Test accuracy:', accuracy_score(y_test, predicted).round(3))
print('Test classification report:')
print(classification_report(y_test, predicted, target_names=class_names))

Which generates the result below:

Training time: 0 ns
Training CV score: 0.942
Test accuracy: 0.967

Test classification report:
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00        11
Iris-versicolor       1.00      0.92      0.96        13
 Iris-virginica       0.86      1.00      0.92         6

       accuracy                           0.97        30
      macro avg       0.95      0.97      0.96        30
   weighted avg       0.97      0.97      0.97        30

MNIST dataset

Here is another example of using the MNIST dataset (10 classes, 28x28 images, 70,000 instances):

from sefr import SEFR  # import from
from tensorflow.keras.datasets import mnist
from sklearn.model_selection import train_test_split, cross_val_predict
from sklearn.metrics import accuracy_score, classification_report
# load mnist dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()
# flatten images to one-dimensional
img_size = X_train.shape[1]
X_train = X_train.reshape(-1, img_size ** 2)
X_test = X_test.reshape(-1, img_size ** 2)

# train model and predict labels
clf = SEFR()
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)
cv_predicted = cross_val_predict(clf, X_train, y_train, cv=5)

# view prediction results
print('Training time:', clf.training_time, 'ns')
print('Training CV score:', accuracy_score(y_train, cv_predicted).round(3))
print('Test accuracy:', accuracy_score(y_test, predicted).round(3))
print('Test classification report:')
print(classification_report(y_test, predicted))

Which gets you:

Training time: 1141000000 ns
Training CV score: 0.797
Test accuracy: 0.809

Test classification report:
              precision    recall  f1-score   support

           0       0.96      0.76      0.85       980
           1       0.94      0.88      0.91      1135
           2       0.91      0.78      0.84      1032
           3       0.86      0.77      0.81      1010
           4       0.81      0.77      0.79       982
           5       0.66      0.83      0.74       892
           6       0.91      0.87      0.89       958
           7       0.97      0.72      0.82      1028
           8       0.60      0.86      0.70       974
           9       0.69      0.85      0.76      1009

    accuracy                           0.81     10000
   macro avg       0.83      0.81      0.81     10000
weighted avg       0.83      0.81      0.81     10000

It takes only 1.141 seconds (on my machine) to train an image recognition model with about 80% accuracy.

On Microcontrollers

All the microcontroller versions have a built-in Iris dataset (some are quantized into integers to speed up calculation). They perform training (using the whole dataset) on startup, then predict existing samples with 0~30% random noise added. Below is some serial output from an Arduino Uno:

Test data: 6.10 2.70 5.20 3.00 
Predicted label: 2 / actual label: 2 / (SEFR training time: 68 ms)

Test data: 6.20 3.30 1.40 0.30 
Predicted label: 0 / actual label: 0 / (SEFR training time: 68 ms)

Test data: 8.30 3.30 4.00 2.30 
Predicted label: 2 / actual label: 2 / (SEFR training time: 68 ms)

Test data: 3.50 2.40 2.40 1.30 
Predicted label: 1 / actual label: 1 / (SEFR training time: 68 ms)

Test data: 3.60 3.80 1.60 0.20 
Predicted label: 0 / actual label: 0 / (SEFR training time: 68 ms)

For the Iris dataset, it only takes 0.068 seconds to train on a 16 MHz AVR microcontroller.
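
Noisy test samples like those in the serial output above could be produced along these lines. This is only a sketch in plain Python: the helper name `add_noise` is hypothetical, and it assumes the noise is multiplicative and can go in either direction, which may differ from what each microcontroller version actually does.

```python
import random

def add_noise(sample, max_ratio=0.3, rng=random):
    """Return a copy of sample with each feature scaled by up to +/- max_ratio.

    Assumption: noise is applied per feature as a random percentage of its value.
    """
    return [x * (1 + rng.uniform(-max_ratio, max_ratio)) for x in sample]

# Example: perturb one Iris sample before passing it to the classifier.
noisy_sample = add_noise([6.1, 2.7, 5.2, 3.0])
```

Passing a seeded `random.Random` instance as `rng` makes the perturbation repeatable when comparing runs.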

The MicroPython version, with minor modifications, can also be used as a pure Python 3.4 implementation.

Experiment Project

I've demonstrated SEFR in a color recognition experiment project, which uses only an Arduino Nano and some cheap sensors to train and run the classifier. See my page for details.