An RTG model can be served over HTTP using a Flask server.
$ pip install rtg[serve]
Flask has its own set of dependencies that are unrelated to the core functionality, hence they are not installed with a plain pip install rtg.
$ python -m rtg.serve -h  # or: rtg-serve -h
usage: rtg.serve [-h] [-d] [-p PORT] [-ho HOST] [-b BASE] [-msl MAX_SRC_LEN] exp_dir

Deploy an RTG model to a RESTful server

positional arguments:
  exp_dir               Experiment directory

optional arguments:
  -h, --help            show this help message and exit
  -d, --debug           Run Flask server in debug mode (default: False)
  -p PORT, --port PORT  port to run server on (default: 6060)
  -ho HOST, --host HOST
                        Host address to bind. (default: 0.0.0.0)
  -b BASE, --base BASE  Base prefix path for all the URLs (default: None)
  -msl MAX_SRC_LEN, --max-src-len MAX_SRC_LEN
                        max source len; longer seqs will be truncated
                        (default: 250)
To launch a service for the runs/001-tfm experiment, run:
$ python -m rtg.serve -d runs/001-tfm
To use a base path of /v1:
$ python -m rtg.serve -d runs/001-tfm -b /v1
It prints:
 * Running on http://0.0.0.0:6060/ (Press CTRL+C to quit)
Currently, only the /translate API is supported. It accepts both GET requests with query parameters and POST requests with form parameters.
NOTE: Batch decoding is yet to be supported; the current decoder decodes only one sentence at a time.
An example POST request:
curl --data "source=Comment allez-vous?" --data "source=Bonne journée" http://localhost:6060/translate
{
  "source": [
    "Comment allez-vous?",
    "Bonne journée"
  ],
  "translation": [
    "How are you?",
    "Have a nice day"
  ],
  "dec_args": {
    "beam_size": 4,
    "lp_alpha": 0.6,
    "max_len": 50,
    "num_hyp": 1
  },
  "score": [
    -6,
    -3
  ],
  "time": 4.5281,
  "time_unit": "s"
}
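The same request can be made programmatically; here is a minimal sketch using Python's requests library, assuming the server from above is running on localhost:6060:

import requests

# send two sentences as repeated "source" form fields, as in the curl example
resp = requests.post("http://localhost:6060/translate",
                     data=[("source", "Comment allez-vous?"), ("source", "Bonne journée")])
resp.raise_for_status()
result = resp.json()
for src, hyp in zip(result["source"], result["translation"]):
    print(f"{src} -> {hyp}")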
You can also use the GET method, as in http://localhost:6060/translate?source=text1&source=text2, after properly URL-encoding text1 and text2. This should only be used for quick testing in your web browser.
You may pass the following optional arguments to the API:
- beam_size - Number of beams to use for decoding
- num_hyp - Number of hypotheses to return in the response
- max_len - Maximum length (relative to source length) to wait for the end-of-sequence token
- lp_alpha - Length penalty
All these arguments take default values from conf.yml, but you may also set them at runtime via arguments to the REST API. Example:
curl --data "source=Comment allez-vous?" --data "source=Bonne journée" "http://localhost:6060/translate?beam_size=6&num_hyp=4&lp_alpha=0.0"
{
  "dec_args": {
    "beam_size": 6,
    "lp_alpha": 0,
    "max_len": 50,
    "num_hyp": 4
  },
  "source": [
    "Comment allez @-@ vous ?",
    "Bonne journée"
  ],
  "time": 6.4446,
  "time_unit": "s",
  "translation": [
    [
      "How do you do, sir?.",
      "- How are you? - Fine.",
      "How do you do?",
      "How do you do?"
    ],
    [
      "Have a nice day.",
      "Have a good day",
      "Good day",
      "Have a good day.."
    ]
  ],
  "score": [
    [
      -8.3406,
      -8.3871,
      -9.1363,
      -9.1478
    ],
    [
      -3.7928,
      -3.8259,
      -3.8653,
      -3.8789
    ]
  ]
}
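With num_hyp > 1, "translation" and "score" become lists of lists, one inner list per source sentence; in the example above the hypotheses appear sorted best-first by score. Continuing the requests sketch from earlier, picking the top hypothesis per sentence is just:

# 'result' is the parsed JSON from a /translate call made with num_hyp > 1
for src, hyps, scores in zip(result["source"], result["translation"], result["score"]):
    print(f"{src}\t{hyps[0]}\t({scores[0]:.4f})")  # first entry is the best-scoring one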
Google Analytics is supported on the web pages, but it is disabled by default. To enable it, set the GA_TAG environment variable before starting the rtg.serve process:
export GA_TAG="G-xxxxx"
Production Deployment
Please use uWSGI for production deployment. If you don't already have uWSGI, you may install it via conda by running:
$ conda install -c conda-forge uwsgi
uwsgi --http 127.0.0.1:5000 --module rtg.serve.app:app --pyargv "<path-to-exp-dir>"
# or using a .ini file
uwsgi --ini examples/uwsgi.ini
where the uwsgi.ini file has the following contents:
[uwsgi]
http = 0.0.0.0:6060
module = rtg.serve.app:app
pyargv = /full/path/<path-to-exp-dir> -b /v1
master = true
processes = 1
stats = 127.0.0.1:9191
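With the -b /v1 base prefix given via pyargv above, all API routes move under that prefix; for example, assuming the host and port from the ini file:

curl --data "source=Bonne journée" http://localhost:6060/v1/translate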
Note that <path-to-exp-dir> is expected to be a valid path to an experiment directory; it may be obtained using the rtg-export tool.
The source text given to the API must be preprocessed with the same settings that were used during the training phase, so we offer configuration options to match that preprocessing:
- src_pre_proc: List of transformations applied to source text before it is given to the model (e.g. tokenizer, lowercase)
- tgt_pre_proc: List of transformations applied to target text before it is given to the model (e.g. tokenizer, lowercase)
- tgt_post_proc: List of transformations applied to target text produced by the model (e.g. detokenizer, removal of <unk>)
The following transformations are built into RTG, so you may simply refer to them by name:
from functools import partial
import html
from sacremoses import (MosesTokenizer, MosesDetokenizer, MosesTruecaser,
                        MosesPunctNormalizer)

transformers = {
    'no_op': lambda x: x,
    'space_tok': lambda x: ' '.join(x.strip().split()),  # removes extra white spaces
    'space_detok': lambda toks: ' '.join(toks),
    'moses_tok': partial(MosesTokenizer().tokenize, escape=False, return_str=True,
                         aggressive_dash_splits=True,
                         protected_patterns=MosesTokenizer.WEB_PROTECTED_PATTERNS),
    'moses_detok': partial(MosesDetokenizer().detokenize, return_str=True, unescape=True),
    'moses_truecase': partial(MosesTruecaser().truecase, return_str=True),
    'lowercase': lambda x: x.lower(),
    'drop_unk': lambda x: x.replace('<unk>', ''),
    'html_unescape': html.unescape,
    'punct_norm': MosesPunctNormalizer().normalize
}
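Since these transforms are plain Python callables, a pipeline is simply applied left to right. A small illustration using the transformers table above (the sample text and printed output are illustrative; the exact output may vary with the sacremoses version):

text = "Comment  allez-vous&nbsp;?"
for name in ('html_unescape', 'punct_norm', 'moses_tok'):  # the default src_pre_proc chain
    text = transformers[name](text)
print(text)  # e.g. 'Comment allez @-@ vous ?'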
When {src_pre,tgt_pre,tgt_post}_proc are missing, we use sensible defaults (the same ones used in https://aclanthology.org/2021.acl-demo.37/):
src_pre_proc:
  - html_unescape
  - punct_norm
  - moses_tok
tgt_post_proc:
  - moses_detok
  - drop_unk
You may also use a shell command line, including Unix pipes, by prefixing your command with "#!". In addition, you may mix shell commands with the known (pythonic) transforms. Example:
prep:
  src_pre_proc:
    - "#!/path/to/normalizer.perl | /path/to/tokenizer.py --lang deu"
    - lowercase
  tgt_post_proc:
    - drop_unk
    - moses_detok
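For reference, a "#!" transform boils down to piping each input line through the given shell pipeline. A minimal sketch of such a wrapper (an illustration only, not RTG's exact implementation):

import subprocess

def shell_transform(cmd: str):
    """Wrap a shell pipeline (the part after '#!') as a line-in, line-out transform."""
    def transform(line: str) -> str:
        proc = subprocess.run(cmd, shell=True, input=line,
                              capture_output=True, text=True, check=True)
        return proc.stdout.strip()
    return transform

lowercase = shell_transform("tr '[:upper:]' '[:lower:]'")
print(lowercase("Bonne Journée"))  # -> 'bonne journée'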
- You may permanently disable preprocessing and postprocessing using:
prep:
  src_pre_proc:
    - no_op
  tgt_post_proc:
    - no_op
- Or, temporarily, add the prep=false argument to the API call, e.g. http://localhost:6060/translate?prep=false
NOTE: {src,tgt}_pre_proc and tgt_post_proc are only used by the REST API as of now; rtg.decode and rtg.prep do not yet use the pre- and post-processing text transformers.