This repository has been archived by the owner on Sep 19, 2023. It is now read-only.

Support for serving multiple models with a single instance #108

Open
alexkillen opened this issue Sep 20, 2019 · 3 comments
Labels
enhancement New feature or request

Comments

@alexkillen

This would provide the ability to serve multiple models, and multiple versions of each model, with a single serving instance.

Details can be seen here: https://www.tensorflow.org/tfx/serving/serving_config#model_server_configuration

In practice, this could be achieved by providing a --model_config option that could be used instead of the --model argument, for example:

nvidia-docker run nmtwizard/opennmt-tf \
    --storage_config storages.json \
    --model_storage s3_model: \
    --model_config /path/to/models.config \
    --gpuid 1 \
    serve --host 0.0.0.0 --port 5000

Then, when calling tensorflow_model_server in the opennmt-tf Docker image entrypoint.py, the argument --model_config_file could be used instead of --model_name and --model_base_path.
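For reference, the models.config file referred to above would use TensorFlow Serving's model_config_list text protobuf format, roughly like this (the model names and paths below are made up for illustration):

model_config_list {
  config {
    name: "ende_transformer"
    base_path: "/models/ende_transformer"
    model_platform: "tensorflow"
  }
  config {
    name: "enfr_transformer"
    base_path: "/models/enfr_transformer"
    model_platform: "tensorflow"
  }
}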

@guillaumekln guillaumekln added the enhancement New feature or request label Sep 23, 2019
@guillaumekln
Contributor

Thanks for the request!

I agree it would be nice to serve multiple models with a single instance.

However, this project is not designed around TensorFlow Serving specifically; it is merely an implementation detail used to serve OpenNMT-tf models. So we should come up with a more general design and API to support specifying multiple models. Let us think about that.

@alexkillen
Author

Thanks for the prompt reply; figuring out how to handle this in a more general way makes sense.

Just on the topic of TensorFlow Serving, I've played around with it a little and it seems quite straightforward using the --model_config_file and --model_config_file_poll_wait_seconds arguments. One note is that the latter is currently only available in the "nightly" tags of the tensorflow/serving Docker images. With those arguments, it is simply a case of running the container and then editing the config file, which tensorflow_model_server polls, whenever you want to add or edit a model.
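As a rough sketch of that setup (the mount paths, port, and poll interval here are placeholders, not taken from this issue):

docker run -p 8501:8501 \
    -v /path/to/models:/models \
    tensorflow/serving:nightly \
    --model_config_file=/models/models.config \
    --model_config_file_poll_wait_seconds=60

Adding or updating an entry in /path/to/models/models.config is then picked up on the next poll without restarting the container.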

There is a noticeable overhead when using the same instance to serve multiple models, as you would expect, but it's not huge, and I'd imagine tweaking the batching parameters would reduce it further.
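For context, the batching I am referring to is TensorFlow Serving's --enable_batching flag together with --batching_parameters_file, which points to a text protobuf along these lines (the values below are illustrative, not tuned):

max_batch_size { value: 32 }
batch_timeout_micros { value: 1000 }
max_enqueued_batches { value: 100 }
num_batch_threads { value: 4 }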

Looking forward to seeing how you handle this; your REST API is a lot nicer to use than the one exposed by TensorFlow Serving.

@alexkillen alexkillen changed the title Support for model config files when serving OpenNMT-tf models Support for serving multiple models with a single instance Feb 3, 2020
@alexkillen
Author

I just updated the title so it's not specific to TensorFlow Serving, which it looks like is no longer used for serving OpenNMT-tf models anyway.
