This repository has been archived by the owner on Sep 19, 2023. It is now read-only.

Support for serving multiple models with a single instance #108

Open
alexkillen opened this issue Sep 20, 2019 · 3 comments
Labels
enhancement New feature or request

Comments

@alexkillen

This would provide the ability to serve multiple models, and multiple versions of each model, with a single serving instance.

Details can be seen here: https://www.tensorflow.org/tfx/serving/serving_config#model_server_configuration

In practice, this could be achieved by providing a --model_config option that could be used instead of the --model argument, for example:

nvidia-docker run nmtwizard/opennmt-tf \
    --storage_config storages.json \
    --model_storage s3_model: \
    --model_config /path/to/models.config \
    --gpuid 1 \
    serve --host 0.0.0.0 --port 5000

Then, when calling tensorflow_model_server in the opennmt-tf Docker image entrypoint.py, the argument --model_config_file could be used instead of --model_name and --model_base_path.
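For reference, the models.config file referred to above would use TensorFlow Serving's model_config_list text protobuf format, roughly like this (the model names and paths below are made up for illustration):

model_config_list {
  config {
    name: "ende_transformer"
    base_path: "/models/ende_transformer"
    model_platform: "tensorflow"
  }
  config {
    name: "enfr_transformer"
    base_path: "/models/enfr_transformer"
    model_platform: "tensorflow"
  }
}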

@guillaumekln guillaumekln added the enhancement New feature or request label Sep 23, 2019
@guillaumekln
Contributor

Thanks for the request!

I agree it would be nice to serve multiple models with a single instance.

However, this project is not designed around TensorFlow Serving specifically; it is merely an implementation detail used to serve OpenNMT-tf models. So we should come up with a more general design and API to support specifying multiple models. Let us think about that.

@alexkillen
Author

Thanks for the prompt reply; figuring out how to handle this in a more general way makes sense.

Just on the topic of TensorFlow Serving, I've played around with it a little and it seems quite straightforward using the --model_config_file and --model_config_file_poll_wait_seconds arguments. One note is that the latter is currently only available in the "nightly" tags of the tensorflow/serving Docker images. With those arguments, it is simply a case of running the container and then editing the config file, which tensorflow_model_server polls, whenever you want to add or edit a model.
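As a rough sketch of that setup (the mount paths, port, and poll interval here are placeholders, not taken from this issue):

docker run -p 8501:8501 \
    -v /path/to/models:/models \
    tensorflow/serving:nightly \
    --model_config_file=/models/models.config \
    --model_config_file_poll_wait_seconds=60

Adding or updating an entry in /path/to/models/models.config is then picked up on the next poll without restarting the container.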

There is a noticeable overhead when using the same instance to serve multiple models, as you would expect, but it's not huge, and I'd imagine tweaking the batching parameters would reduce it further.
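For context, the batching I am referring to is TensorFlow Serving's --enable_batching flag together with --batching_parameters_file, which points to a text protobuf along these lines (the values below are illustrative, not tuned):

max_batch_size { value: 32 }
batch_timeout_micros { value: 1000 }
max_enqueued_batches { value: 100 }
num_batch_threads { value: 4 }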

Looking forward to seeing how you handle this; your REST API is a lot nicer to use than the one exposed by TensorFlow Serving.

@alexkillen alexkillen changed the title Support for model config files when serving OpenNMT-tf models Support for serving multiple models with a single instance Feb 3, 2020
@alexkillen
Author

I just updated the title so it's not specific to TensorFlow Serving, which it looks like is no longer used for serving OpenNMT-tf models anyway.
