Hello,
I have noticed a potential inconsistency in the LLAVA-OV implementation regarding input embedding truncation.
During training, the code truncates input_embeds based on the tokenizer's maximum length (https://github.com/LLaVA-VL/LLaVA-NeXT/blob/main/llava/model/llava_arch.py#L499). However, in the sglang inference code, input_embeds is not truncated (https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/models/llava.py#L378); instead, sglang only checks the context length in the tokenizer manager (https://github.com/sgl-project/sglang/blob/main/python/sglang/srt/managers/tokenizer_manager.py#L226).
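To make the concern concrete, here is a minimal sketch of the training-side behavior. The function name `truncate_embeds` and the constant `MODEL_MAX_LENGTH` are hypothetical stand-ins for the actual logic in llava_arch.py and `tokenizer.model_max_length`; the point is only that the embedding sequence is clipped after multimodal expansion during training, whereas a length check on the raw token count before expansion would not catch it:

```python
MODEL_MAX_LENGTH = 8  # hypothetical stand-in for tokenizer.model_max_length

def truncate_embeds(input_embeds, labels, max_len):
    # Sketch of the training-side step: clip each sequence of input
    # embeddings (and its labels) to the tokenizer's maximum length
    # before padding and batching.
    return ([seq[:max_len] for seq in input_embeds],
            [seq[:max_len] for seq in labels])

# A 12-step sequence (e.g. after an image placeholder is expanded into
# many patch embeddings) is silently clipped during training...
embeds = [[[0.0] * 4 for _ in range(12)]]
labels = [list(range(12))]
t_embeds, t_labels = truncate_embeds(embeds, labels, MODEL_MAX_LENGTH)
print(len(t_embeds[0]))  # 8

# ...while an inference path that only validates the pre-expansion
# token count would forward all 12 embeddings to the model.
```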
Could this lead to discrepancies between training and inference?
Thank you for your attention to this matter.
Best regards