TrOCR decoder_start_token should be `eos` instead of `cls`. #362

thariq-nugrohotomo · 2023-10-30T09:36:29Z

Using the pretrained model, when I pass `cls` or `bos` as the initial decoder token, the first decoded token is rarely correct. Once I use `eos` instead, the output is correct, or at least similar to the output returned by `model.generate()`.

The official code from Microsoft falls back to `eos` if the start token is not specified: https://github.com/microsoft/unilm/blob/6f60612e7cc86a2a1ae85c47231507a587ab4e01/trocr/generator.py#L84
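To make the fallback concrete, here is a minimal pure-Python sketch of the behavior described above. It is not copied from the linked file: the function names, the argument names, and the token ids used in the example are hypothetical and only illustrate "use the given start token, otherwise `eos`".

```python
from typing import List, Optional


def resolve_start_token(eos: int, bos_token: Optional[int] = None) -> int:
    """Return the decoder start token: the explicit one if given, else eos."""
    return eos if bos_token is None else bos_token


def init_decoder_tokens(
    batch_size: int,
    max_len: int,
    pad: int,
    eos: int,
    bos_token: Optional[int] = None,
) -> List[List[int]]:
    """Build the initial token buffer: start token first, padding after it."""
    start = resolve_start_token(eos, bos_token)
    return [[start] + [pad] * (max_len - 1) for _ in range(batch_size)]


# Illustrative ids only (assumed for this example): pad=1, eos=2.
print(init_decoder_tokens(batch_size=2, max_len=4, pad=1, eos=2))
```

With no explicit start token, every row of the buffer begins with `eos`, which matches the observation that `eos` is the start token the pretrained decoder expects.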

Code excerpt to manually see the first decoded token:

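import torch

# Start token under test: eos. Swap in processor.tokenizer.bos_token_id to compare.
decoder_start_token_id = processor.tokenizer.eos_token_id
decoder_input_ids = torch.tensor([[decoder_start_token_id]])
# Single forward pass; the logits at the only position predict the first output token.
logits = model(pixel_values=pixel_values, decoder_input_ids=decoder_input_ids).logits
first_token_ids = torch.argmax(logits, dim=-1)
print(processor.tokenizer.batch_decode(first_token_ids))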

Switch `eos_token_id` to `bos_token_id`, then observe the different output.
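The excerpt above checks only the first token. As a rough sketch of why the start token matters for the whole sequence, the greedy loop below seeds decoding with a chosen start token and feeds each prediction back in. Everything here is an assumption for illustration: `greedy_decode` and `toy_step` are hypothetical names, and `toy_step` is a toy stand-in that was simply written to behave well only when the prefix starts with id 2. It is not TrOCR and says nothing about the real model's outputs.

```python
from typing import Callable, List, Sequence


def greedy_decode(
    step: Callable[[Sequence[int]], Sequence[float]],
    start_token_id: int,
    eos_token_id: int,
    max_new_tokens: int = 8,
) -> List[int]:
    """Greedy decoding: seed with the start token, append the argmax each step."""
    tokens = [start_token_id]
    for _ in range(max_new_tokens):
        scores = step(tokens)
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_token_id:
            break
    return tokens[1:]  # drop the start token itself


def toy_step(prefix: Sequence[int]) -> List[float]:
    """Toy scorer (hypothetical): 'trained' with id 2 as its start token."""
    scores = [0.0] * 6
    if prefix[0] == 2:
        target = [4, 5, 2]  # two content tokens, then eos
        next_id = target[min(len(prefix) - 1, len(target) - 1)]
    else:
        next_id = 3  # an unhelpful token for any other start token
    scores[next_id] = 1.0
    return scores


print(greedy_decode(toy_step, start_token_id=2, eos_token_id=2))
print(greedy_decode(toy_step, start_token_id=0, eos_token_id=2, max_new_tokens=3))
```

In the toy, only the start token the scorer was built around yields a sequence that terminates with `eos`; any other seed derails decoding from the first step, which is the same kind of mismatch the first-token check is meant to expose.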

review-notebook-app · 2023-10-30T09:36:34Z

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter Notebooks.

thariq-nugrohotomo marked this pull request as ready for review October 30, 2023 09:46
