I'm trying to adapt TransformerSum to a non-English custom dataset and am currently very confused about this code in extractive.py:

TransformerSum/src/extractive.py, lines 1093 to 1107 in 15bd11d

Why separate the words with spaces, when the resulting string is then tokenized using the tokenizer from the transformers library? I assume those tokenizers are not usually trained on pre-tokenized text, and neither are the pretrained models?

Why remove the space before "." characters, but not anywhere else?

Thanks for any explanations.
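For reference, the behavior being asked about amounts to something like the sketch below. This is a minimal reconstruction from the question's description, not the actual code at those lines; the function name `detokenize` is hypothetical, and the real logic in extractive.py may handle more cases:

```python
def detokenize(words):
    """Rejoin a pre-tokenized word list into a plain string.

    As described in the question: words are glued together with single
    spaces, and the space before each "." is then removed. Spaces before
    other punctuation (",", "?", ...) are left as-is.
    """
    return " ".join(words).replace(" .", ".")

# Space before "." is removed...
print(detokenize(["Only", "periods", "are", "handled", "."]))
# -> Only periods are handled.

# ...but the space before other punctuation is kept.
print(detokenize(["Why", "remove", "the", "space", "?"]))
# -> Why remove the space ?
```

The resulting string would then be passed to a `transformers` tokenizer, which re-tokenizes it into subwords from scratch, which is what makes the space handling here worth questioning.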