
How to save and load a tasknet model? #3

Open

thirsima opened this issue Aug 29, 2023 · 8 comments

@thirsima

Hi! I tried the basic 3-task example from the README file, and the training worked fine. Then I tried to save and load the model:

Saving the model worked ok:

trainer.save_model("tasknet-model")

But loading the model gives an error:

loaded = tn.Model.from_pretrained('./tasknet-model')
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[16], line 1
----> 1 loaded = tn.Model.from_pretrained('./tasknet-model')

File ~/projects/keha/Tekoaly/trials/skillrecommendation-language-model/venv/lib/python3.10/site-packages/transformers/modeling_utils.py:2175, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2173 if not isinstance(config, PretrainedConfig):
   2174     config_path = config if config is not None else pretrained_model_name_or_path
-> 2175     config, model_kwargs = cls.config_class.from_pretrained(
   2176         config_path,
   2177         cache_dir=cache_dir,
   2178         return_unused_kwargs=True,
   2179         force_download=force_download,
   2180         resume_download=resume_download,
   2181         proxies=proxies,
   2182         local_files_only=local_files_only,
   2183         use_auth_token=use_auth_token,
   2184         revision=revision,
   2185         subfolder=subfolder,
   2186         _from_auto=from_auto_class,
   2187         _from_pipeline=from_pipeline,
   2188         **kwargs,
   2189     )
   2190 else:
   2191     model_kwargs = kwargs

AttributeError: 'NoneType' object has no attribute 'from_pretrained'

I wonder what is the correct way to save and load the model?

@sileod
Owner

sileod commented Aug 29, 2023

Hi!
The library uses a shared encoder + "adapters" (task embeddings + task heads, e.g. classifiers).
It saves the shared encoder and the adapters.

Currently, if you want to start again, you should load the saved encoder and fill in the adapter weights one by one with a loop.

Training is multi-task, but the model is typically used for a single task. What is your use case?
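That "load the encoder once, then fill in per-task weights with a loop" pattern can be pictured with a minimal stand-in sketch. The `Encoder`, `TaskHead`, and `load_multitask` names below are hypothetical illustrations, not the tasknet or transformers API:

```python
# Sketch: one shared encoder plus per-task adapter weights,
# restored one task at a time in a loop. All class and function
# names here are hypothetical stand-ins, not the tasknet API.

class Encoder:
    def __init__(self, weights):
        self.weights = weights

class TaskHead:
    def __init__(self, encoder, head_weights):
        self.encoder = encoder            # reference to the shared encoder
        self.head_weights = head_weights  # task-specific weights

def load_multitask(encoder_weights, per_task_head_weights):
    encoder = Encoder(encoder_weights)    # loaded once
    return {name: TaskHead(encoder, w)    # loop fills in each adapter
            for name, w in per_task_head_weights.items()}

models = load_multitask(
    encoder_weights={"layer.0": [0.1, 0.2]},
    per_task_head_weights={"nli": [1.0], "sts": [2.0]},
)
# Every task model points at the same encoder object:
assert models["nli"].encoder is models["sts"].encoder
```

The point of the sketch is that the encoder is constructed once and only referenced, never copied, by each task head.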

@thirsima
Author

Thanks! I will try loading the encoder and adapters separately.

Eventually my use case will be to train a model that can do both sentence similarity and token classification, but at the moment I am just trying to find a multi-task training module that works without problems. So far tasknet looks the most promising.

I guess tasknet does not support sentence similarity at the moment, but looking at the currently supported task implementations, it should not be too hard to add.

@thirsima
Author

To clarify the use case: I eventually want to implement a microservice that loads the trained encoder and trained adapters from local files, so that the encoder is shared between the 2 tasks.

@sileod
Owner

sileod commented Aug 29, 2023

Sentence similarity is already supported: just use the tn.Classification template where y is a float, so it should work off the shelf.
This code shows how to specialize the encoder for one of the training tasks:

def load_pipeline(model_name, task_name, adapt_task_embedding=True,multilingual=False):
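For intuition, "Classification where y is a float" usually amounts to a single output unit trained with a squared-error loss rather than cross-entropy. A standalone sketch of that idea (no tasknet imports; the data and function names are illustrative):

```python
# Sketch: sentence similarity as "classification with float y".
# With float labels, a Classification-style head typically reduces
# to one regression output trained with mean squared error.

def mse_loss(predictions, targets):
    # Mean of squared differences between predictions and targets.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# Float similarity labels in [0, 5], in the style of STS-B data:
y_true = [0.0, 2.5, 5.0]
y_pred = [0.5, 2.0, 4.0]

loss = mse_loss(y_pred, y_true)
print(round(loss, 4))  # -> 0.5
```

This is only the loss-side intuition; in tasknet the template handles the head and loss selection for you.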

@thirsima
Author

Currently, if I call trainer.save_model(task_index) for 4 tasks, 4 different copies of the encoder are saved to disk and the files seem to have differences. And if I use load_pipeline() for all 4 tasks, I have 4 copies of the encoder in memory.

Is it possible to load the 4 tasks so that the encoder would be shared again? My aim is to avoid excessive memory consumption when I have multiple tasks that could use a shared encoder.

tasknet.Model.__init__() seems to have a warm_start parameter. Would it be feasible to load the encoder from one task first, and then warm-start tasknet.Model with that encoder?

@sileod
Owner

sileod commented Aug 29, 2023

Currently, when the model is saved, it saves a single encoder + a set of adapters.
The adapter class is actually a collection of adapters. I'll try to clarify this, thanks.

Then, you can load the single encoder and the set of adapters, and use
model = adapter.adapt_model_to_task(model, task_name)
So you should save once, then call adapt_model_to_task once per task.
If you do:
model_t1 = adapter.adapt_model_to_task(model, task_name1)
model_t2 = adapter.adapt_model_to_task(model, task_name2)
you will have different model objects, but they will share the same weights.
You can have hundreds of models; if they use the same weights, they will not use much more memory than one (that's how I trained deberta-tasksource-nli on a single GPU with tasknet).
The main concern is how to address task embeddings properly.
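The adapt-once-per-task pattern described above can be sketched without the library: each call returns a new task-specific wrapper, but the encoder is shared by reference, so N task models cost little more memory than one. The class and function names here are illustrative stand-ins, not the tasknet API:

```python
# Sketch: distinct task models that share one set of encoder weights.
# Names are hypothetical; in tasknet the equivalent role is played by
# adapter.adapt_model_to_task(model, task_name).

class TaskModel:
    def __init__(self, encoder, head):
        self.encoder = encoder  # reference to shared weights, not a copy
        self.head = head        # small task-specific adapter

def adapt_model_to_task(encoder, adapters, task_name):
    return TaskModel(encoder, adapters[task_name])

encoder = {"embedding": [0.1] * 4}  # loaded once from disk
adapters = {"task1": {"bias": 0.0}, "task2": {"bias": 1.0}}

model_t1 = adapt_model_to_task(encoder, adapters, "task1")
model_t2 = adapt_model_to_task(encoder, adapters, "task2")

assert model_t1 is not model_t2              # different model objects
assert model_t1.encoder is model_t2.encoder  # same underlying weights
```

Because only the small per-task heads differ, memory grows with the number of adapters, not with the number of full models.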

@cuongnguyengit

Hi, is there a way to combine all tasks into one model at the inference step?
Can task_index be made a variable of the model?

@sileod
Owner

sileod commented Mar 4, 2024

You should use task_model_list: https://github.com/sileod/tasknet/blob/main/src/tasknet/models.py#L188
It's a torch module, so it can take variables as input and it's differentiable.
If you are talking about actual inference outside of training, it's cleaner to use the adapter as mentioned above.
