In this project, we create our own dataset, Dataset-A, by combining the following datasets from Kaggle.com:
We want to test whether our model is general and robust, so we build this hybrid dataset. Download links:
Both the training data and the test data contain 26 class folders (alphabets A to Z) of right-hand image instances.
- All data are a subset of dataset1
- We removed the images in dataset1 that cannot pass our image pipeline
- The image count for each alphabet is approximately 2220
Here is the chart of the image count distribution in the Dataset-A training set:
- For each alphabet, we select 555 image instances (2220 × 0.25 = 555) from dataset2, dataset3, and dataset4
Here is the chart of the image count distribution in the Dataset-A testing set:
The demo images are in the folder pipeline-demo; the image file name prefix indicates the pipeline function type.
This kind of pipeline can be used at any preprocessing stage; a minimal sketch follows the workflow list below.
// workflow
1. roi normalization (by mediapipe)
2. background normalization (by rembg)
3. skin normalization
4. channel normalization
5. resolution normalization
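As a reference, here is a minimal per-image sketch of this workflow, assuming rembg's remove() and the MediaPipe Hands API; skin_normalize and channel_normalize are illustrative stand-ins for the repo's own helpers:

import cv2
import numpy as np
import mediapipe as mp
from rembg import remove  # rembg's background-removal entry point

mp_hands = mp.solutions.hands.Hands(static_image_mode=True, max_num_hands=1)

def roi_normalize(img):
    # Crop to the hand bounding box detected by MediaPipe Hands.
    rgb = cv2.cvtColor(np.ascontiguousarray(img[..., :3]), cv2.COLOR_BGR2RGB)
    result = mp_hands.process(rgb)
    if not result.multi_hand_landmarks:
        return None  # no hand found: the image fails the pipeline
    h, w = img.shape[:2]
    xs = [lm.x for lm in result.multi_hand_landmarks[0].landmark]
    ys = [lm.y for lm in result.multi_hand_landmarks[0].landmark]
    return img[max(int(min(ys) * h), 0):int(max(ys) * h),
               max(int(min(xs) * w), 0):int(max(xs) * w)]

def skin_normalize(img):
    return img  # placeholder for the repo's skin normalization

def channel_normalize(img):
    # Placeholder: collapse to a single channel.
    if img.shape[-1] == 4:  # rembg output carries an alpha channel
        img = cv2.cvtColor(img, cv2.COLOR_BGRA2BGR)
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

def pipeline_v1(img):
    img = roi_normalize(img)          # 1. roi normalization (mediapipe)
    if img is None:
        return None
    img = remove(img)                 # 2. background normalization (rembg)
    img = skin_normalize(img)         # 3. skin normalization
    img = channel_normalize(img)      # 4. channel normalization
    return cv2.resize(img, (28, 28))  # 5. resolution normalization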
This kind of pipeline can only be used in the training preprocessing stage; a variant of the sketch above follows the list below.
// workflow
1. background normalization (by rembg)
2. roi normalization (by mediapipe)
3. skin normalization
4. channel normalization
5. resolution normalization
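Using the same step functions from the sketch above, this training-only variant just swaps the first two steps:

def pipeline_v2(img):
    img = remove(img)                 # 1. background normalization (rembg)
    img = roi_normalize(img)          # 2. roi normalization (mediapipe)
    if img is None:
        return None
    img = skin_normalize(img)         # 3. skin normalization
    img = channel_normalize(img)      # 4. channel normalization
    return cv2.resize(img, (28, 28))  # 5. resolution normalization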
- Implemented with keras.ImageDataGenerator (usage sketched below):
zoom_range=0.1,
width_shift_range=0.1,
height_shift_range=0.1,
shear_range=0.1
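A sketch of how this generator can feed training (x_train, y_train, and model are assumed names from the surrounding code):

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(zoom_range=0.1,
                             width_shift_range=0.1,
                             height_shift_range=0.1,
                             shear_range=0.1)

# x_train: (N, 28, 28, 1) float array; y_train: one-hot labels (assumed shapes)
model.fit(datagen.flow(x_train, y_train, batch_size=128), epochs=10)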
- Implemented with keras.model.Engine; we create our own Spatial Transformer Layer, stn() (a generic reference sketch follows).
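The project's stn() lives in asl_model/stl; as a reference only, here is a minimal sketch of one common way to build a Spatial Transformer layer in Keras. A localization network predicts a 2x3 affine transform, initialized to the identity, which is applied with a bilinear sampler; all layer sizes here are assumptions, not the project's exact architecture:

import tensorflow as tf
from tensorflow.keras import layers

def bilinear_sample(img, theta):
    # Warp img (B, H, W, C) by the per-sample 2x3 affine matrices theta.
    h, w = img.shape[1], img.shape[2]  # static spatial dims
    b = tf.shape(img)[0]
    xs, ys = tf.meshgrid(tf.linspace(-1.0, 1.0, w), tf.linspace(-1.0, 1.0, h))
    grid = tf.stack([tf.reshape(xs, [-1]), tf.reshape(ys, [-1]), tf.ones([h * w])])
    src = tf.matmul(theta, tf.tile(grid[None], [b, 1, 1]))  # (B, 2, H*W)
    x = (src[:, 0] + 1.0) * (w - 1) / 2.0  # back to pixel coordinates
    y = (src[:, 1] + 1.0) * (h - 1) / 2.0
    x0, y0 = tf.floor(x), tf.floor(y)

    def gather(xi, yi):
        xi = tf.clip_by_value(tf.cast(xi, tf.int32), 0, w - 1)
        yi = tf.clip_by_value(tf.cast(yi, tf.int32), 0, h - 1)
        return tf.gather_nd(img, tf.stack([yi, xi], -1), batch_dims=1)

    # Bilinear interpolation of the four neighbouring pixels.
    wx, wy = x - x0, y - y0
    out = ((1 - wx) * (1 - wy))[..., None] * gather(x0, y0) \
        + ((1 - wx) * wy)[..., None] * gather(x0, y0 + 1) \
        + (wx * (1 - wy))[..., None] * gather(x0 + 1, y0) \
        + (wx * wy)[..., None] * gather(x0 + 1, y0 + 1)
    return tf.reshape(out, tf.shape(img))

def stn_sketch(x):
    # Localization network -> 6 affine parameters, starting at the identity.
    loc = layers.Conv2D(16, 5, activation='relu')(x)
    loc = layers.MaxPool2D(2)(loc)
    loc = layers.Conv2D(32, 5, activation='relu')(loc)
    loc = layers.Flatten()(loc)
    loc = layers.Dense(64, activation='relu')(loc)
    theta = layers.Dense(6, kernel_initializer='zeros', bias_initializer='zeros')(loc)
    theta = layers.Lambda(lambda t: t + tf.constant([1., 0., 0., 0., 1., 0.]))(theta)
    theta = layers.Reshape((2, 3))(theta)
    return layers.Lambda(lambda t: bilinear_sample(t[0], t[1]))([x, theta])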
You can download our models here: saved_models_v1
The implementation code is in asl_model/models.py:
- get_model_1()
from tensorflow.keras import layers, models

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), strides=(1, 1), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.BatchNormalization())
model.add(layers.MaxPool2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), strides=(1, 1), activation='relu'))
model.add(layers.Dropout(0.2))
model.add(layers.BatchNormalization())
model.add(layers.MaxPool2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), strides=(1, 1), activation='relu'))
model.add(layers.BatchNormalization())
model.add(layers.MaxPool2D((2, 2)))
# finish feature extraction
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dropout(0.25))
model.add(layers.Dense(26, activation='softmax'))
The implementation code is in asl_model/stl/struct_a/model.py:
- get_stn_a_model_8()
import tensorflow as tf
from tensorflow.keras import layers, initializers

size = 28      # assumed: matches the 28x28 grayscale inputs above
STN_SEED = 42  # assumed value; any fixed seed gives reproducible initialization

input_layers = layers.Input((size, size, 1))
x = stn(input_layers)  # stn() is the custom Spatial Transformer layer in this module
x = layers.Conv2D(32, (3, 3), strides=(1, 1), activation='relu',
kernel_initializer=initializers.glorot_uniform(seed=STN_SEED))(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(64, (3, 3), strides=(1, 1), activation='relu',
kernel_initializer=initializers.glorot_uniform(seed=STN_SEED))(x)
x = layers.BatchNormalization()(x)
x = layers.Conv2D(128, (3, 3), strides=(1, 1), activation='relu',
kernel_initializer=initializers.glorot_uniform(seed=STN_SEED))(x)
x = layers.BatchNormalization()(x)
x = layers.Flatten()(x)
x = layers.Dense(512, activation='relu')(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(256, activation='relu')(x)
x = layers.Dropout(0.25)(x)
output_layers = layers.Dense(26, activation="softmax")(x)
model = tf.keras.Model(input_layers, output_layers)
- 57717 training images; 20% become the validation data
- 14430 test images; 555 test images for each alphabet
Select the best structure for the normal-model and the stl-model, using the following hyperparameters:
lr = 0.001
epoch = 10
batch_size = 128
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
BATCH = 128
EPOCH = 100 # max epoch
# callback functions
es_callback = tf.keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True)
reduce_lr_callback = tf.keras.callbacks.ReduceLROnPlateau()
loss="categorical_crossentropy"
optimizer="adam"
metrics=["accuracy"]
There are more evaluation charts in the folder charts
conda create --name aslt python=3.8 -y
(Optional) If you want to use Jupyter, run these commands:
// activate the venv first
conda activate aslt
pip install ipykernel
python -m ipykernel install --user --name aslt --display-name "ASLT"
Install PyTorch for the package rembg
- Get the PyTorch install instructions on pytorch.org. For example:
conda install pytorch torchvision torchaudio cudatoolkit=11.3 -c pytorch