Fine-tuning with SQLcoder-7b #150

Open
bhrt95 opened this issue Dec 13, 2023 · 0 comments

Comments

bhrt95 commented Dec 13, 2023

I'm new to this area of language models. For my use case I want to fine-tune the SQLCoder model on the Spider dataset using this code base, since this repo already works for me: following the instructions in the README, I'm able to start training the StarCoder model on the ArmelR/stack-exchange-instruction dataset.

I replaced the model path and the dataset name in the python command:
!python finetune/finetune.py --model_path="defog/sqlcoder-7b" --dataset_name="spider" --subset="data/finetune" --split="train" --size_valid_set 1000 --streaming --seq_length 1024 --max_steps 1000 --batch_size 1 --input_column_name="question" --output_column_name="query" --gradient_accumulation_steps 16 --learning_rate 1e-4 --lr_scheduler_type="cosine" --num_warmup_steps 100 --weight_decay 0.05 --output_dir="./checkpoints"
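
For reference, this is how I checked that Spider exposes the question and query columns I'm passing to the script (a minimal sketch, assuming the Hub id spider still resolves the same way it does in the command above):

```python
from datasets import load_dataset

# Stream one training example from the public "spider" dataset to confirm the
# column names passed via --input_column_name / --output_column_name above.
ds = load_dataset("spider", split="train", streaming=True)
example = next(iter(ds))

print(example["question"])  # natural-language question
print(example["query"])     # gold SQL query
```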

I'm facing an issue with the attention mask shape as soon as training starts. I know that just changing the model path isn't enough to start training a different model, so please give me some suggestions on getting the training running. I'm providing a link to my Kaggle notebook here to get started:
https://www.kaggle.com/code/bhrt16/notebookb5fd138c63
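
Since the script was written around StarCoder, one sanity check I can share is what the sqlcoder tokenizer actually exposes (a minimal sketch; only the model id from the command above is assumed, and the prints are just for inspection):

```python
from transformers import AutoTokenizer

# The log below shows a SentencePiece tokenizer.model being downloaded, so this
# is a LLaMA/Mistral-style tokenizer rather than StarCoder's BPE tokenizer.
tok = AutoTokenizer.from_pretrained("defog/sqlcoder-7b")

print(type(tok).__name__)
print("eos token:", tok.eos_token)
print("pad token:", tok.pad_token)  # may be None, which matters for the script's padding/packing assumptions
```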

This is the error log:

/opt/conda/lib/python3.10/site-packages/scipy/__init__.py:146: UserWarning: A NumPy version >=1.16.5 and <1.23.0 is required for this version of SciPy (detected version 1.24.3
warnings.warn(f"A NumPy version >={np_minversion} and <{np_maxversion}"
/opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:691: FutureWarning: The use_auth_token argument is deprecated and will be removed in v5 of Transformers. Please use token instead.
warnings.warn(
tokenizer_config.json: 100%|███████████████████| 915/915 [00:00<00:00, 4.98MB/s]
tokenizer.model: 100%|███████████████████████| 493k/493k [00:00<00:00, 1.11MB/s]
tokenizer.json: 100%|██████████████████████| 1.80M/1.80M [00:00<00:00, 51.6MB/s]
special_tokens_map.json: 100%|████████████████| 72.0/72.0 [00:00<00:00, 448kB/s]
/opt/conda/lib/python3.10/site-packages/datasets/load.py:2088: FutureWarning: 'use_auth_token' was deprecated in favor of 'token' in version 2.14.0 and will be removed in 3.0.0.
You can remove this warning by passing 'token=<use_auth_token>' instead.
warnings.warn(
Loading the dataset in streaming mode
100%|████████████████████████████████████████| 400/400 [00:03<00:00, 110.05it/s]
The character to token ratio of the dataset is: 3.16
Loading the model
/opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:472: FutureWarning: The use_auth_token argument is deprecated and will be removed in v5 of Transformers. Please use token instead.
warnings.warn(
Loading checkpoint shards: 100%|██████████████████| 2/2 [01:07<00:00, 33.92s/it]
/opt/conda/lib/python3.10/site-packages/peft/utils/other.py:141: FutureWarning: prepare_model_for_int8_training is deprecated and will be removed in a future version. Use prepare_model_for_kbit_training instead.
warnings.warn(
trainable params: 41943040 || all params: 3794014208 || trainable%: 1.1055056122762943
Starting main loop
Training...
wandb: Currently logged in as: bhrt95. Use wandb login --relogin to force relogin
wandb: Tracking run with wandb version 0.16.1
wandb: Run data is saved locally in /kaggle/working/starcoder/wandb/run-20231213_114310-6pzqbs68
wandb: Run wandb offline to turn off syncing.
wandb: Syncing run StarCoder-finetuned
wandb: ⭐️ View project at https://wandb.ai/bhrt95/huggingface
wandb: 🚀 View run at https://wandb.ai/bhrt95/huggingface/runs/6pzqbs68
/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
Traceback (most recent call last):
File "/kaggle/working/starcoder/finetune/finetune.py", line 326, in
main(args)
File "/kaggle/working/starcoder/finetune/finetune.py", line 315, in main
run_training(args, train_dataset, eval_dataset)
File "/kaggle/working/starcoder/finetune/finetune.py", line 306, in run_training
trainer.train()
File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1540, in train
return inner_training_loop(
File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 1857, in _inner_training_loop
tr_loss_step = self.training_step(model, inputs)
File "/opt/conda/lib/python3.10/site-packages/transformers/trainer.py", line 2735, in training_step
self.accelerator.backward(loss)
File "/opt/conda/lib/python3.10/site-packages/accelerate/accelerator.py", line 1905, in backward
loss.backward(**kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/_tensor.py", line 492, in backward
torch.autograd.backward(
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/init.py", line 251, in backward
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
File "/opt/conda/lib/python3.10/site-packages/torch/autograd/function.py", line 288, in apply
return user_fn(self, *args)
File "/opt/conda/lib/python3.10/site-packages/torch/utils/checkpoint.py", line 271, in backward
outputs = ctx.run_function(*detached_inputs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = module._old_forward(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 654, in forward
hidden_states, self_attn_weights, present_key_value = self.self_attn(
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/accelerate/hooks.py", line 165, in new_forward
output = module._old_forward(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py", line 293, in forward
raise ValueError(
ValueError: Attention mask should be of size (1, 1, 1024, 2048), but is torch.Size([1, 1, 1024, 1024])
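
From the traceback, the forward pass goes through transformers' modeling_mistral.py, so sqlcoder-7b is being loaded as a Mistral-architecture model, and the last mask dimension the model expects (2048) is exactly twice the --seq_length I passed (1024). A minimal way to confirm the architecture and its attention-related settings (only the model id from the command is assumed; I'm not claiming this is the fix):

```python
from transformers import AutoConfig

# Inspect the config that AutoModelForCausalLM dispatches on for this checkpoint.
cfg = AutoConfig.from_pretrained("defog/sqlcoder-7b")

print(cfg.model_type)                                 # "mistral", per the traceback above
print(getattr(cfg, "sliding_window", None))           # Mistral's sliding-window attention size, if set
print(getattr(cfg, "max_position_embeddings", None))  # context length the checkpoint advertises
```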
