Dataset
dataset.zip: this .zip file contains the Tashkeela dataset split into train, val, and test.
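To check what the archive contains before extracting it, a short standard-library sketch like the following may help. The internal layout of dataset.zip is not described here, so the sketch assumes only that member paths mention the split name; the function name `group_members_by_split` is hypothetical, and anything that does not match lands under "other".

```python
import zipfile
from collections import defaultdict

# Split names taken from the description of dataset.zip.
SPLITS = ("train", "val", "test")


def group_members_by_split(zip_path):
    """Map each split name to the archive members whose path mentions it.

    Assumption: member paths contain the split name. The real layout of
    dataset.zip may differ, in which case members are grouped under "other".
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            lowered = name.lower()
            split = next((s for s in SPLITS if s in lowered), "other")
            groups[split].append(name)
    return dict(groups)


if __name__ == "__main__":
    import os

    if os.path.exists("dataset.zip"):
        for split, members in group_members_by_split("dataset.zip").items():
            print(split, len(members))
```

Listing the members first makes it easy to confirm that all three splits are present before running any training or evaluation script.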
Pretrained Char-based BERT Checkpoint
char_bert_model_pretrained.pt: a pretrained char-based BERT model with the following specs: d_model = 512, # of heads = 16, and # of layers = 6. It was trained for 6 epochs.
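The specs above can be captured in a small configuration object. This is only an illustrative sketch: the class name `CharBertConfig` and its field names are hypothetical, and the per-head dimension assumes the model follows the standard multi-head attention convention of splitting d_model evenly across heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharBertConfig:
    """Hypothetical container for the specs listed for the pretrained model."""

    d_model: int = 512
    num_heads: int = 16
    num_layers: int = 6

    @property
    def head_dim(self) -> int:
        # Standard multi-head attention requires an even split across heads.
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads


config = CharBertConfig()
print(config.head_dim)  # 512 // 16 = 32
```

A model built to load this checkpoint would need matching values, since a mismatch in any of the three would change the parameter shapes.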
Encoder-Decoder Checkpoints
Long training
best_ed_epoch_175.pt: trained with randomly initialized weights for 175 epochs.
best_ed_mlm_epoch_175.pt: initialized from pretrained MLM weights and trained for 175 epochs.
best_ed_mlm_ns_epoch_178.pt: best checkpoint after noisy-student training resumed from best_ed_mlm_epoch_175.pt.
Short training
ed_epoch_5.pt: trained with randomly initialized weights for 5 epochs.
ed_mlm_epoch_5.pt: initialized from pretrained MLM weights and trained for 5 epochs.
Encoder-Only Checkpoints
Long training
best_eo_epoch_192.pt: trained with randomly initialized weights for 192 epochs.
best_eo_mlm_epoch_192.pt: initialized from pretrained MLM weights and trained for 192 epochs.
best_eo_mlm_ns_epoch_193.pt: best checkpoint after noisy-student training resumed from best_eo_mlm_epoch_192.pt.
Short training
eo_epoch_5.pt: trained with randomly initialized weights for 5 epochs.
eo_mlm_epoch_5.pt: initialized from pretrained MLM weights and trained for 5 epochs.
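The encoder-decoder and encoder-only checkpoint names listed above follow a regular pattern: an optional best_ prefix, ed or eo for the architecture, an optional _mlm for MLM-initialized weights, an optional _ns for noisy-student training, and the epoch number. The sketch below decodes that pattern. The helper `parse_checkpoint_name` and the `CheckpointInfo` type are hypothetical, and the reading of each name component is inferred from the descriptions in this list rather than from any documented convention.

```python
import re
from typing import NamedTuple


class CheckpointInfo(NamedTuple):
    architecture: str    # "encoder-decoder" or "encoder-only"
    mlm_init: bool       # initialized from pretrained MLM weights
    noisy_student: bool  # produced by noisy-student training
    epoch: int           # epoch number embedded in the file name
    best: bool           # file name carries the "best_" prefix


# Pattern inferred from the file names in this list (an assumption).
_PATTERN = re.compile(
    r"(?P<best>best_)?(?P<arch>ed|eo)(?P<mlm>_mlm)?(?P<ns>_ns)?"
    r"_epoch_(?P<epoch>\d+)\.pt"
)
_ARCHITECTURES = {"ed": "encoder-decoder", "eo": "encoder-only"}


def parse_checkpoint_name(filename: str) -> CheckpointInfo:
    """Decode an encoder-decoder or encoder-only checkpoint file name."""
    match = _PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"unrecognized checkpoint name: {filename!r}")
    return CheckpointInfo(
        architecture=_ARCHITECTURES[match["arch"]],
        mlm_init=match["mlm"] is not None,
        noisy_student=match["ns"] is not None,
        epoch=int(match["epoch"]),
        best=match["best"] is not None,
    )


print(parse_checkpoint_name("best_ed_mlm_ns_epoch_178.pt"))
print(parse_checkpoint_name("eo_epoch_5.pt"))
```

The pretrained BERT checkpoint, char_bert_model_pretrained.pt, does not follow this pattern, so the helper raises ValueError for it.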