RTG Serve

An RTG model can be served over HTTP using a Flask server.

Flask Installation

$ pip install rtg[serve]

Flask has its own set of dependencies that are unrelated to the core functionality, so they are not installed by default when installing rtg.


$ python -m rtg.serve -h  # rtg-serve
usage: rtg.serve [-h] [-d] [-p PORT] [-ho HOST] [-b BASE] [-msl MAX_SRC_LEN] exp_dir

Deploy an RTG model to a RESTful server

positional arguments:
  exp_dir               Experiment directory

optional arguments:
  -h, --help            show this help message and exit
  -d, --debug           Run Flask server in debug mode (default: False)
  -p PORT, --port PORT  port to run server on (default: 6060)
  -ho HOST, --host HOST
                        Host address to bind. (default:
  -b BASE, --base BASE  Base prefix path for all the URLs (default: None)
  -msl MAX_SRC_LEN, --max-src-len MAX_SRC_LEN
                        max source len; longer seqs will be truncated
                        (default: 250)

To launch a service for the runs/001-tfm experiment, run:

$ python -m rtg.serve -d runs/001-tfm

To use a base path of /v1:

$ python -m rtg.serve -d runs/001-tfm -b /v1

It prints: * Running on (Press CTRL+C to quit)
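If you want to start the server from a Python script rather than a shell, the command line above can be assembled programmatically. The sketch below only builds the argument list from the flags shown in the help text; the helper names `serve_command` and `translate_url` are hypothetical, and the idea that the base prefix is simply prepended to /translate is an assumption drawn from the "Base prefix path for all the URLs" description.

```python
import sys


def serve_command(exp_dir, port=6060, host=None, base=None, debug=False):
    """Build the argv list for `python -m rtg.serve` (hypothetical helper)."""
    cmd = [sys.executable, "-m", "rtg.serve", exp_dir]
    if debug:
        cmd.append("-d")            # -d / --debug
    cmd += ["-p", str(port)]        # -p / --port
    if host is not None:
        cmd += ["-ho", host]        # -ho / --host
    if base is not None:
        cmd += ["-b", base]         # -b / --base
    return cmd


def translate_url(host="localhost", port=6060, base=None):
    """Guess the /translate URL; assumes the base prefix is prepended as-is."""
    prefix = (base or "").rstrip("/")
    return f"http://{host}:{port}{prefix}/translate"


print(serve_command("runs/001-tfm", base="/v1", debug=True))
print(translate_url(base="/v1"))
```

The returned list can be handed to `subprocess.Popen` to run the server as a child process.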

Currently, only the /translate API is supported. It accepts both GET requests with query parameters and POST requests with form parameters.

Batch decoding is not yet supported; the current decoder decodes only one sentence at a time.

An example POST request:

$ curl --data "source=Comment allez-vous?" --data "source=Bonne journée" http://localhost:6060/translate
{
  "source": [
    "Comment allez-vous?",
    "Bonne journée"
  ],
  "translation": [
    "How are you?",
    "Have a nice day"
  ],
  "dec_args": {
    "beam_size": 4,
    "lp_alpha": 0.6,
    "max_len": 50,
    "num_hyp": 1
  },
  "score": [
    ...
  ],
  "time": 4.5281,
  "time_unit": "s"
}

You can also send a GET request, such as http://localhost:6060/translate?source=text1&source=text2, after properly URL-encoding text1 and text2. This should only be used for quick testing in your web browser.
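The same requests can be issued from Python using only the standard library. This is a minimal sketch, assuming the server runs at http://localhost:6060 as in the curl example; the helper names (`build_form_body`, `build_get_url`, `translate`) are hypothetical and not part of RTG.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://localhost:6060/translate"  # assumed default host and port


def build_form_body(sources):
    """Encode sentences as repeated `source` form fields for a POST request."""
    return urlencode([("source", s) for s in sources]).encode("utf-8")


def build_get_url(sources, endpoint=ENDPOINT):
    """Build a GET URL with properly URL-encoded `source` query parameters."""
    return f"{endpoint}?{urlencode([('source', s) for s in sources])}"


def translate(sources, endpoint=ENDPOINT, timeout=60):
    """POST the sentences and return the decoded JSON response as a dict."""
    request = Request(endpoint, data=build_form_body(sources), method="POST")
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Shows the encoding only; no request is sent here.
print(build_get_url(["Comment allez-vous?", "Bonne journée"]))
```

With a server running, `translate(["Comment allez-vous?", "Bonne journée"])["translation"]` would return the list of translations.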

Advanced Decoder Args

You may pass the following optional arguments to the API:

  • beam_size - Number of beams to use for decoding

  • num_hyp - Number of hypotheses to return in response

  • max_len - Maximum length (relative to source length) to wait for end-of-seq token

  • lp_alpha - Length penalty

All of these arguments take their default values from conf.yml, but you may also set them at runtime via arguments to the REST API. Example:

$ curl --data "source=Comment allez-vous?" --data "source=Bonne journée" "http://localhost:6060/translate?beam_size=6&num_hyp=4&lp_alpha=0.0"
{
  "dec_args": {
    "beam_size": 6,
    "lp_alpha": 0,
    "max_len": 50,
    "num_hyp": 4
  },
  "source": [
    "Comment allez @-@ vous ?",
    "Bonne journée"
  ],
  "time": 6.4446,
  "time_unit": "s",
  "translation": [
    [
      "How do you do, sir?.",
      "- How are you? - Fine.",
      "How do you do?",
      "How do you do?"
    ],
    [
      "Have a nice day.",
      "Have a good day",
      "Good day",
      "Have a good day.."
    ]
  ],
  "score": [
    ...
  ]
}

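A client may want to attach decoder arguments to the URL and then line up each source sentence with its hypotheses. The sketch below assumes that, when num_hyp is greater than 1, each entry of "translation" is a list of hypotheses for the corresponding source (as the example response suggests), and a plain string otherwise. The helper names are hypothetical.

```python
from urllib.parse import urlencode

# Argument names taken from the list of optional decoder arguments.
DECODER_ARGS = ("beam_size", "num_hyp", "max_len", "lp_alpha")


def url_with_decoder_args(endpoint, **dec_args):
    """Append decoder arguments to the endpoint as query parameters."""
    unknown = set(dec_args) - set(DECODER_ARGS)
    if unknown:
        raise ValueError(f"unsupported decoder args: {sorted(unknown)}")
    if not dec_args:
        return endpoint
    return f"{endpoint}?{urlencode(dec_args)}"


def pair_hypotheses(response):
    """Return (source, [hypotheses]) pairs from a /translate response dict."""
    pairs = []
    for source, hyps in zip(response["source"], response["translation"]):
        if isinstance(hyps, str):   # single-hypothesis case: wrap in a list
            hyps = [hyps]
        pairs.append((source, list(hyps)))
    return pairs


print(url_with_decoder_args("http://localhost:6060/translate",
                            beam_size=6, num_hyp=4, lp_alpha=0.0))
```

Rejecting unknown argument names on the client side is a design choice of this sketch; how the server treats unrecognized parameters is not stated here.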
Google Analytics Integration

Google Analytics is supported on the web pages but is disabled by default. To enable it, set the GA_TAG environment variable before starting the rtg.serve process.

export GA_TAG="G-xxxxx"

Production Deployment

Please use uWSGI for production deployment. If you don't already have uWSGI, you may install it via conda by running conda install -c conda-forge uwsgi.

uwsgi --http --module --pyargv "<path-to-exp-dir>"

# or using a .ini file
uwsgi --ini examples/uwsgi.ini

Where the uwsgi.ini has the following info:

http =
module =
pyargv = /full/path/<path-to-exp-dir> -b /v1
master = true
processes = 1
stats =

Note that <path-to-exp-dir> is expected to be a valid path to an experiment directory; it may be obtained using the rtg-export tool.

Pre-process and post-process

The source text given to the API must be pre-processed with the same settings that were used for preprocessing during training. So, we offer configurations to match the preprocessing:

  • src_pre_proc: List of transformations to be used on source text before giving to model (e.g. tokenizer, lowercase)

  • tgt_pre_proc: List of transformations to be used on target text before giving to model (e.g. tokenizer, lowercase)

  • tgt_post_proc: List of transformations to be used on target text produced by model (e.g. detokenizer, removal of unk)

The following transformations are built into RTG, so you may simply use their name:

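# Assumed imports: the Moses* classes are taken to come from the sacremoses package
import html
from functools import partial
from sacremoses import (MosesTokenizer, MosesDetokenizer, MosesTruecaser,
                        MosesPunctNormalizer)

# Maps a transform name to a callable that takes text and returns text
transformers = {
    'no_op': lambda x: x,
    'space_tok': lambda x: ' '.join(x.strip().split()),  # removes extra white spaces
    'space_detok': lambda toks: ' '.join(toks),
    'moses_tok': partial(MosesTokenizer().tokenize, escape=False, return_str=True),
    'moses_detok': partial(MosesDetokenizer().detokenize, return_str=True, unescape=True),
    'moses_truecase': partial(MosesTruecaser().truecase, return_str=True),
    'lowercase': lambda x: x.lower(),
    'drop_unk': lambda x: x.replace('<unk>', ''),
    'html_unescape': html.unescape,
    'punct_norm': MosesPunctNormalizer().normalize
}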

When {src_pre,tgt_pre,tgt_post}_proc are missing, we use sensible defaults (same as the ones used in

  - html_unescape
  - punct_norm
  - moses_tok
  - moses_detok
  - drop_unk

You may also use shell commands, including Unix pipes, by prefixing your command with "#!". In addition, you may mix shell commands with the known (Pythonic) transforms. Example:

    - "#!/path/to/normalizer.perl | /path/to/ --lang deu"
    - lowercase
    - drop_unk
    - moses_detok
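To illustrate how such a list could be applied, here is a minimal sketch that chains named transforms and "#!" shell commands in order. It is not RTG's actual implementation: the `TRANSFORMS` table contains only the dependency-free entries from the built-in list, and `make_shell_transform` and `build_pipeline` are hypothetical helpers. It assumes a shell command reads text on stdin and writes the result to stdout.

```python
import html
import subprocess

# Subset of the built-in transforms that need no third-party packages.
TRANSFORMS = {
    "no_op": lambda x: x,
    "space_tok": lambda x: " ".join(x.strip().split()),
    "lowercase": lambda x: x.lower(),
    "drop_unk": lambda x: x.replace("<unk>", ""),
    "html_unescape": html.unescape,
}


def make_shell_transform(command):
    """Wrap a shell command (pipes allowed) as a text -> text function."""
    def run(text):
        proc = subprocess.run(command, shell=True, input=text,
                              capture_output=True, text=True, check=True)
        return proc.stdout.rstrip("\n")
    return run


def build_pipeline(names):
    """Turn a list of transform names or '#!' commands into one callable."""
    steps = []
    for name in names:
        if name.startswith("#!"):
            steps.append(make_shell_transform(name[2:]))
        elif name in TRANSFORMS:
            steps.append(TRANSFORMS[name])
        else:
            raise ValueError(f"unknown transform: {name}")

    def apply(text):
        for step in steps:          # apply the transforms in list order
            text = step(text)
        return text
    return apply


pre_proc = build_pipeline(["html_unescape", "space_tok", "lowercase"])
print(pre_proc("  Tom &amp;  Jerry "))
```

Starting a subprocess for every sentence is slow; a long-running service would more likely keep one process open and stream lines through it.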

Disabling pre- and post-processing

You may permanently disable pre-processing and post-processing using:

    - no_op
    - no_op

NOTE: {src,tgt}_pre_proc and tgt_post_proc are only used by the REST API as of now. rtg.decode and rtg.prep do not yet use pre- and post-processing text transformers.