CATT Models

@farisalasmary released this 04 Jul 17:37 · 6 commits to main since this release

Dataset

dataset.zip : the Tashkeela dataset, split into train, val, and test sets.
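
To check what the archive contains before extracting it, a minimal sketch along these lines may help. The internal layout of dataset.zip is not described here, so the code only assumes that each member path mentions its split name (train, val, or test) somewhere; the helper name `group_members_by_split` is hypothetical.

```python
import zipfile
from collections import defaultdict

# Split names taken from the release notes; where they appear inside the
# archive paths is an assumption.
SPLITS = ("train", "val", "test")


def group_members_by_split(zip_source):
    """Group archive member names by the first split name found in their path.

    `zip_source` may be a filesystem path or a binary file-like object.
    Members whose path mentions no split name are left out of the result.
    """
    groups = defaultdict(list)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):  # skip directory entries
                continue
            lowered = name.lower()
            for split in SPLITS:
                if split in lowered:
                    groups[split].append(name)
                    break
    return dict(groups)
```

Calling `group_members_by_split("dataset.zip")` returns a dictionary keyed by split name. The substring match is deliberately loose, so treat the result as a quick overview rather than a definitive listing.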

Pretrained Char-based BERT Checkpoint

char_bert_model_pretrained.pt: a pretrained char-based BERT model with the following specs: d_model = 512, # of heads = 16 and # of layers = 6. It was trained for 6 epochs.
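
The specs above can be written down as a small configuration object. This is only a sketch: the class and field names are hypothetical and not taken from the CATT code, and the per-head size assumes standard multi-head attention, where d_model is split evenly across the heads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CharBertSpec:
    """Hypothetical container for the specs listed in the release notes."""

    d_model: int = 512
    num_heads: int = 16
    num_layers: int = 6

    @property
    def head_dim(self) -> int:
        # Assumes standard multi-head attention: d_model is divided
        # evenly among the heads.
        if self.d_model % self.num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        return self.d_model // self.num_heads


spec = CharBertSpec()
print(spec.head_dim)  # 512 // 16 = 32
```

Under that assumption, each of the 16 heads works on a 32-dimensional slice of the 512-dimensional model.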

Encoder-Decoder Checkpoints

Long training

best_ed_epoch_175.pt : trained with randomly initialized weights for 175 epochs.
best_ed_mlm_epoch_175.pt : initialized from pretrained MLM weights and trained for 175 epochs.
best_ed_mlm_ns_epoch_178.pt : best checkpoint after noisy-student training resumed from best_ed_mlm_epoch_175.pt.

Short training

ed_epoch_5.pt : trained with randomly initialized weights for 5 epochs.
ed_mlm_epoch_5.pt : initialized from pretrained MLM weights and trained for 5 epochs.

Encoder-Only Checkpoints

Long training

best_eo_epoch_192.pt : trained with randomly initialized weights for 192 epochs.
best_eo_mlm_epoch_192.pt : initialized from pretrained MLM weights and trained for 192 epochs.
best_eo_mlm_ns_epoch_193.pt : best checkpoint after noisy-student training resumed from best_eo_mlm_epoch_192.pt.

Short training

eo_epoch_5.pt : trained with randomly initialized weights for 5 epochs.
eo_mlm_epoch_5.pt : initialized from pretrained MLM weights and trained for 5 epochs.
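
The checkpoint file names listed above follow a regular pattern: an optional `best_` prefix, `ed` or `eo` for the architecture, optional `_mlm` and `_ns` markers, and the epoch number. A minimal sketch of a parser for that pattern is shown below. The pattern is inferred from the names in this release only, and `CheckpointInfo` and `parse_checkpoint_name` are hypothetical helpers, not part of the CATT code.

```python
import re
from typing import NamedTuple, Optional


class CheckpointInfo(NamedTuple):
    best: bool            # name starts with "best_"
    architecture: str     # "encoder-decoder" or "encoder-only"
    mlm_init: bool        # initialized from pretrained MLM weights
    noisy_student: bool   # result of noisy-student training
    epoch: int


# Pattern inferred from the file names in this release (an assumption).
_PATTERN = re.compile(
    r"^(?P<best>best_)?(?P<arch>ed|eo)(?P<mlm>_mlm)?(?P<ns>_ns)?"
    r"_epoch_(?P<epoch>\d+)\.pt$"
)

_ARCHITECTURES = {"ed": "encoder-decoder", "eo": "encoder-only"}


def parse_checkpoint_name(filename: str) -> Optional[CheckpointInfo]:
    """Return the fields encoded in a checkpoint name, or None if it does not match."""
    match = _PATTERN.match(filename)
    if match is None:
        return None
    return CheckpointInfo(
        best=match.group("best") is not None,
        architecture=_ARCHITECTURES[match.group("arch")],
        mlm_init=match.group("mlm") is not None,
        noisy_student=match.group("ns") is not None,
        epoch=int(match.group("epoch")),
    )
```

For example, `parse_checkpoint_name("best_ed_mlm_ns_epoch_178.pt")` yields an encoder-decoder entry with `mlm_init`, `noisy_student`, and `best` all set and `epoch` equal to 178, while `char_bert_model_pretrained.pt` does not match the pattern and returns `None`.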