
# Learn the American Manual Alphabet (AMA)

## Introduction

This repository contains scripts to identify the letters of the American Manual Alphabet (AMA).

You can either train and run it locally or head directly to this repository's GitHub Page to see a demonstration using your webcam.

## Requirements

This project was developed with Python 3.8. The following packages were used:

## Dataset acquisition

### Dataset sources

Two datasets published on Kaggle by the SigNN Team were used.

The first one contains only images of the alphabet, excluding J and Z. The second dataset contains video files of the letters J and Z, because these signs involve movement.

### Extraction

To extract the landmarks, the MediaPipe Hands solution is used. Passing an image to MediaPipe yields a list of hand landmarks.

*(Figure: the 21 hand landmarks)*

The figure above shows the resulting hand landmarks (MediaPipe Hands).
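
The snippet below is a minimal sketch of this step, not the repository's actual extraction script; the image path is a placeholder and the hyperparameters are assumptions.

```python
import cv2
import mediapipe as mp

mp_hands = mp.solutions.hands

# Placeholder path; any image from the image dataset works here.
image = cv2.imread("dataset/A/A_0001.jpg")

with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
    # MediaPipe expects RGB input, OpenCV loads images as BGR.
    results = hands.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.multi_hand_landmarks:
    landmarks = results.multi_hand_landmarks[0].landmark  # 21 landmarks
    row = [coord for lm in landmarks for coord in (lm.x, lm.y, lm.z)]
    print(len(row))  # 63 values -> one row of a per-letter CSV file
```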

This project includes two scripts to extract landmarks from either image or video files. You can set the number of workers to accelerate the extraction. Each worker processes one letter of the dataset and yields a CSV file.

If the extraction encounters an image or video with a left hand, it mirrors the x-axis of the landmarks so that they behave like those of a right hand.
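
A minimal sketch of that mirroring idea, assuming the normalized [0, 1] coordinates MediaPipe returns (the actual scripts may implement it differently; the handedness of a detection is available via `results.multi_handedness`):

```python
def mirror_to_right_hand(landmarks):
    """Mirror left-hand landmarks along the x-axis so they resemble a right hand.

    landmarks: list of (x, y, z) tuples with x normalized to [0, 1].
    """
    return [(1.0 - x, y, z) for (x, y, z) in landmarks]
```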

The resulting 26 files (A.csv, B.csv, ..., Z.csv) can then be merged into a single CSV file and used to train a model.
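
Merging the per-letter files is a simple concatenation; the sketch below assumes the CSVs live in a `landmarks/` folder and share the same columns.

```python
import glob

import pandas as pd

# A.csv, B.csv, ..., Z.csv -> one training dataset
parts = [pd.read_csv(path) for path in sorted(glob.glob("landmarks/*.csv"))]
pd.concat(parts, ignore_index=True).to_csv("dataset.csv", index=False)
```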

## Training

This project includes Jupyter Notebooks to train two different models. Both notebooks take the same extracted dataset CSV file as input.

The CatBoostClassifier converges quickly and yields high accuracy. However, while developing this project, the idea came up to embed a model in a single webpage, ideally without a Python backend. So I also trained a multilayer perceptron with TensorFlow. The trained model can then be converted for the TensorFlow.js library and used directly in JavaScript, without the need for a Python backend server.
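
The following sketch outlines both training routes. It is not the notebooks' exact code: the column layout (a `label` column plus 63 landmark columns), the hyperparameters, and the output paths are assumptions. The export for the browser uses the `tensorflowjs` Python package.

```python
import pandas as pd
import tensorflow as tf
import tensorflowjs as tfjs
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split

df = pd.read_csv("dataset.csv")
X = df.drop(columns=["label"])
y = df["label"]                            # letter labels "A" ... "Z"
y_codes = y.astype("category").cat.codes   # integer labels for the MLP

X_train, X_test, y_train, y_test, c_train, c_test = train_test_split(
    X, y, y_codes, test_size=0.2, stratify=y, random_state=42
)

# Route 1: gradient-boosted trees
cat_model = CatBoostClassifier(iterations=500, verbose=False)
cat_model.fit(X_train, y_train)
print("CatBoost accuracy:", cat_model.score(X_test, y_test))

# Route 2: multilayer perceptron, convertible to TensorFlow.js
num_classes = y.nunique()
mlp = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(X.shape[1],)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])
mlp.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])
mlp.fit(X_train.to_numpy(), c_train.to_numpy(), epochs=30, validation_split=0.1)
print("MLP accuracy:", mlp.evaluate(X_test.to_numpy(), c_test.to_numpy())[1])

# Export the Keras model so TensorFlow.js can load it in the browser.
tfjs.converters.save_keras_model(mlp, "web_model")
```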

## Local inference

You can run your trained models by running either run_asl_catboost.py or run_asl_neuralnetwork.py.
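
As a rough idea of what such an inference loop looks like, here is a hedged sketch for the CatBoost variant; it is not the repository's script, and the model path and drawing details are assumptions.

```python
import cv2
import mediapipe as mp
from catboost import CatBoostClassifier

model = CatBoostClassifier()
model.load_model("catboost_asl.cbm")          # assumed path of the trained model

cap = cv2.VideoCapture(0)
with mp.solutions.hands.Hands(max_num_hands=1) as hands:
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if results.multi_hand_landmarks:
            lms = results.multi_hand_landmarks[0].landmark
            row = [c for lm in lms for c in (lm.x, lm.y, lm.z)]
            letter = str(model.predict([row]).flatten()[0])
            cv2.putText(frame, letter, (10, 40),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
        cv2.imshow("AMA demo", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
cap.release()
cv2.destroyAllWindows()
```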

## Web Demo

To demonstrate and play with the trained model, you can head to this repository's GitHub Page.

It loads the trained model and uses the JavaScript capabilities of MediaPipe. The landmarks extracted from your webcam are passed to the multilayer perceptron, and the prediction is displayed on the screen.

### Dependencies

The following dependencies are used for the web demo:

The modules are bundled with webpack; the source files can be found here.