
[Feature] Support LLaVA #196

Merged
merged 218 commits into InternLM:main on Dec 26, 2023
Conversation

@LZHgrla (Collaborator) commented Nov 2, 2023

Docs: https://github.com/LZHgrla/xtuner/blob/lzh/llava/xtuner/configs/llava/README.md

Features

  • Support fine-tuning the LLaVA model, with LoRA/QLoRA/Full/Freeze modes for the LLM and LoRA/Full/Freeze modes for the visual encoder
  • Chat with the fine-tuned LLaVA model
  • Evaluate the fine-tuned LLaVA model on MMBench
  • Include the official LLaVA pretraining and fine-tuning datasets
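The LLM and the visual encoder can each be tuned in a different mode. A toy sketch (pure Python; the function and group names are hypothetical, not xtuner's actual API) of how such a scheme decides which parameter groups receive gradients:

```python
def trainable_groups(llm_mode, vit_mode):
    """Toy illustration of the LoRA/QLoRA/Full/Freeze combinations.

    Real code would toggle ``requires_grad`` on model parameters; here we
    just return the names of the groups that would stay trainable.
    """
    def groups(prefix, mode):
        if mode == "full":
            return {f"{prefix}.weights"}        # train all weights
        if mode in ("lora", "qlora"):
            return {f"{prefix}.lora_adapters"}  # only low-rank adapters
        if mode == "freeze":
            return set()                        # nothing trainable
        raise ValueError(f"unknown mode: {mode}")

    # Assumption: the projector bridging vision and language features is
    # always trained, as in standard LLaVA fine-tuning.
    return groups("llm", llm_mode) | groups("visual_encoder", vit_mode) | {"projector"}
```

For example, the common QLoRA-LLM / frozen-ViT setup trains only the LLM adapters and the projector.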

TODO

@LZHgrla LZHgrla marked this pull request as draft November 2, 2023 06:19
@pppppM pppppM merged commit 6b962e6 into InternLM:main Dec 26, 2023
1 check passed
llkn-2 pushed a commit to llkn-2/xtuner that referenced this pull request Jul 31, 2024
* v1

* add load_image

* update cfg image url

* del fig

* update

* temp

* update convert

* update chat_mm

* add exclude_frozen_parameters for deepspeed

* update chat

* update xtuner help msg

* fix bugs

* revert bf16 deepspeed

* fix bugs

* add visual_select_layer for chat

* improve pth_to_hf

* rename projecter_pth to pretrained_pth

* temp

* update requirements

* add cfgs

* update

* fix pre-commit

* optim chat

* optim chat

* Delete xtuner/model/unused.py

* move dispatch to a deeper folder

* add projector

* update

* del model/projector

* fix bugs

* add docs

* update

* update

* update

* update

* enhance resume for map_fn

* update import

* add llava_internlm_chat_7b_clip_vit_large_p14

* update dispatch

* update dispatch

* add link

* update max_length

* update max_length

* update hyp

* align

* move yi flash attn

* fix pre-commit

* update deepspeed requirements

* add mmbench script

* install openpyxl

* add entry_point for mmbench

* save args

* update mmbench

* update max_length

* add llama2 qlora

* update mmbench

* fix mmbench bugs

* use osp instead of os.path

* refactor pth_to_hf

* update chat and mmbench to support --llava

* align to chat

* update entry_point

* add vicuna template

* add vicuna_7b_v15

* fix pre-commit

* add vicuna_7b_v1.5 qlora

* skip_special_tokens for decode text

* remove do_sample

* add warmup

* fix pre-commit

* Update dataset_prepare.md

* Update dataset_prepare.md

* Add KEEP_SYSTEM for template

* remove

* fix vicuna template

* clean cfgs

* add cfgs

* fix pre-commit

* add --language for mmbench

* fix bugs

* fix pretrain bug

* support visual_encoder lora

* fix bugs

* add paramwise_cfg

* remove print_peft_model_trainable_parameters

* fix bugs

* add paramwise_cfg for DeepSpeedOptimWrapper

* fix engine deepspeed paramwise_cfg bug

* fix encode_fn bug

* fix

* fix pad_image_to_square bugs

* Add space for system to avoid mismatch of 'USER' token

* revert to adding bos_token at each conv

* revert for paramwise_cfg

* better cfgs?

* fix import bug

* fix import bug

* pretrain align

* update prepare_inputs_labels_for_multimodal

* 1792

* support length_grouped_samplers

* 1792

* remove KEEP_SYSTEM

* remove system in cfg

* update 336 cfg

* add torch_dtype for mmbench and chat

* group 50

* quant for pretrain

* update cfgs

* refactor cfgs

* add length for concat dataset

* update requirements

* fix typo

* add template for internlm pretrain

* no zh

* remove 20b cfgs

* fix pre-commit

* revert invalid input

* rename

* Update README.md

* Update README_zh-CN.md

* fix pre-commit

* remove llava_zh from docs

* qlora 512

* rename llava map_fn

* update cfgs

* update model urls

* add docs link

* add llava docs

* update docs

* update urls

* add citation

* fix README

* move

* update

* vicuna pretrain with prompt

* rename

* add results

* fix pre-commit

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update README.md

* Update README_zh-CN.md

* Update README_zh.md

* Update README_zh.md

* Update README.md

* Update README_zh.md

* Update README.md

* Update README.md

* fix typo

* fix

* Update README.md

* Update README_zh-CN.md

* rename

* auto cn_string

* fix pre-commit

* rename

* remove language

* add VLMEvalKit

* rename VLLM to VLM

* add the download links of MMBench

* update

* update readme

* update

* update

* update merge

* fix cfg bug

* Update README.md

* Update README_zh.md

* update

* fix

* update requirements

* Update runtime.txt

* Update runtime.txt

* Update runtime.txt

* Update README.md

* Update README.md

* Update README_zh.md

* fix pre-commit

* fix

* update mmbench prompt

* fix bugs

* fix bugs

* update docs

* update

* update

* Update README.md
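Among the fixes above is `pad_image_to_square`, which pads a non-square input image to a square before it reaches the visual encoder. A minimal sketch of the idea on a plain 2D pixel grid (pure Python; the real helper operates on a PIL image, and this stand-in only illustrates the centering logic):

```python
def pad_to_square(image, fill=0):
    """Pad a 2D pixel grid (list of rows) to a square, centering the content.

    ``fill`` is the padding value; real code would use the image's mean
    background color instead of a constant.
    """
    h, w = len(image), len(image[0])
    size = max(h, w)
    # Pad each row to the target width, centering the original pixels.
    left = (size - w) // 2
    rows = [[fill] * left + row + [fill] * (size - w - left) for row in image]
    # Add blank rows above and below to reach the target height.
    top = (size - h) // 2
    blank_above = [[fill] * size for _ in range(top)]
    blank_below = [[fill] * size for _ in range(size - h - top)]
    return blank_above + rows + blank_below
```

Padding to a square (rather than stretching) preserves the aspect ratio, so objects are not distorted before the CLIP encoder resizes the image.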
Development

Successfully merging this pull request may close these issues.

  • will to add qwen-vl ? Thanks
  • will add multimodal model like minigpt,llava,mplug-owl ?
2 participants