I'm trying to get fine-tuning working through the 3_sft.sh script but am encountering an error:
Traceback (most recent call last):
File "/root/VILA/llava/train/train_mem.py", line 36, in <module>
train()
File "/root/VILA/llava/train/train.py", line 436, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1537, in train
return inner_training_loop(
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1854, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2738, in training_step
loss = self.compute_loss(model, inputs)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2761, in compute_loss
outputs = model(**inputs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/deepspeed/utils/nvtx.py", line 15, in wrapped_fn
ret_val = func(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/engine.py", line 1735, in forward
loss = self.module(*inputs, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
File "/root/VILA/llava/model/language_model/llava_llama.py", line 133, in forward
outputs = self.llm.forward(
TypeError: LlamaForCausalLM.forward() got an unexpected keyword argument 'seqlens_in_batch'
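The TypeError says that the `forward` being called does not declare a `seqlens_in_batch` parameter. A minimal sketch of how one might check which keyword arguments a given `forward` accepts before calling it is below. The helper `accepted_kwargs` and the stand-in `forward_without_seqlens` are hypothetical names for illustration, not part of VILA or transformers, and the example values are made up.

```python
import inspect


def accepted_kwargs(func, kwargs):
    """Return only the entries of kwargs that func's signature can accept by keyword."""
    params = inspect.signature(func).parameters
    # If func declares **kwargs, every keyword is accepted as-is.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    return {
        name: value
        for name, value in kwargs.items()
        if name in params and params[name].kind in keyword_kinds
    }


# Hypothetical stand-in for a forward() that lacks seqlens_in_batch.
def forward_without_seqlens(input_ids=None, attention_mask=None, labels=None):
    return "ok"


# Made-up call arguments, shaped like the failing call in the traceback.
call_kwargs = {
    "input_ids": [1, 2, 3],
    "attention_mask": [1, 1, 1],
    "seqlens_in_batch": [3],
}

filtered = accepted_kwargs(forward_without_seqlens, call_kwargs)
dropped = sorted(set(call_kwargs) - set(filtered))
print("dropped keyword arguments:", dropped)

# The filtered call no longer raises TypeError.
result = forward_without_seqlens(**filtered)
```

Applied to the real model, the same check would be run against `self.llm.forward`. Dropping the argument this way is only useful for diagnosis: if the model relies on `seqlens_in_batch`, removing it may change behaviour.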
I tried commenting out the seqlens_in_batch argument where self.llm.forward() is called, and the script then runs. However, when I try to get validation scores by setting --evaluation_strategy to something other than "no", I get errors related to the dataloader and the dataset 'inputs':
Traceback (most recent call last):
File "/root/VILA/llava/train/train_mem.py", line 36, in <module>
train()
File "/root/VILA/llava/train/train.py", line 436, in train
trainer.train(resume_from_checkpoint=resume_from_checkpoint)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1537, in train
return inner_training_loop(
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1929, in _inner_training_loop
self._maybe_log_save_evaluate(tr_loss, model, trial, epoch, ignore_keys_for_eval)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2262, in _maybe_log_save_evaluate
dataset_metrics = self.evaluate(
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3022, in evaluate
output = eval_loop(
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3212, in evaluation_loop
loss, logits, labels = self.prediction_step(model, inputs, prediction_loss_only, ignore_keys=ignore_keys)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 3429, in prediction_step
loss, outputs = self.compute_loss(model, inputs, return_outputs=True)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 2761, in compute_loss
outputs = model(**inputs)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1538, in _call_impl
result = forward_call(*args, **kwargs)
File "/root/VILA/llava/model/language_model/llava_llama.py", line 102, in forward
) = self.prepare_inputs_labels_for_multimodal(
File "/root/VILA/llava/model/llava_arch.py", line 261, in prepare_inputs_labels_for_multimodal
if vision_tower is None or images is None or input_ids.shape[1] == 1:
IndexError: tuple index out of range
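Since the failing expression is `input_ids.shape[1]`, the message suggests that `input_ids` has fewer than two dimensions when it reaches the model during evaluation. A minimal sketch for inspecting the shapes in a batch follows. `describe_batch`, `has_batch_and_seq_dims`, and `FakeTensor` are hypothetical names, and the example batch is invented for illustration rather than taken from my run.

```python
def describe_batch(inputs):
    """Map each key of a batch dict to its value's shape as a tuple, or None."""
    report = {}
    for key, value in inputs.items():
        shape = getattr(value, "shape", None)
        report[key] = tuple(shape) if shape is not None else None
    return report


def has_batch_and_seq_dims(shape):
    """True if the shape has at least a batch and a sequence dimension."""
    return shape is not None and len(shape) >= 2


class FakeTensor:
    """Hypothetical stand-in for a tensor; only exposes .shape."""

    def __init__(self, shape):
        self.shape = shape


# Invented example: input_ids is missing its batch dimension.
eval_batch = {
    "input_ids": FakeTensor((17,)),
    "labels": FakeTensor((1, 17)),
}

report = describe_batch(eval_batch)
suspect = [key for key, shape in report.items() if not has_batch_and_seq_dims(shape)]
print("shapes:", report)
print("keys with fewer than two dimensions:", suspect)
```

Calling something like `describe_batch(inputs)` just before `model(**inputs)` in the evaluation path would show whether the eval dataloader yields batches shaped differently from the training ones.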
Any suggestions?