Producing multiple drafts

The `--multiple-translations` parameter, available on both the experiment tool and the translate tool, instructs the model to produce multiple drafts instead of the default single draft. The default number of drafts is 3; this can be changed with the `num_drafts` parameter in `config.yml`.

No other command line parameters need to change when producing multiple translations: input and output files are specified in the same manner as for a single draft. The drafts are differentiated by a numeric suffix added to the output file name. For example, if you specify an output file called `translation_output.txt`, a set of files called `translation_output.1.txt`, `translation_output.2.txt`, etc. will be created.
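The naming pattern in the example above can be sketched in a few lines of Python. This is only an illustration for locating the draft files afterwards, not the tools' own code: the helper name `draft_paths` is hypothetical, and the sketch assumes the suffix is a 1-based draft number inserted just before the file extension, as the `translation_output.txt` example suggests.

```python
from pathlib import Path


def draft_paths(output_file, num_drafts=3):
    """Return the expected per-draft paths for a given output file.

    Assumption: drafts are numbered from 1 and the number is placed
    before the extension, e.g. out.txt -> out.1.txt, out.2.txt, ...
    """
    path = Path(output_file)
    return [
        path.with_name(f"{path.stem}.{i}{path.suffix}")
        for i in range(1, num_drafts + 1)
    ]


if __name__ == "__main__":
    for draft in draft_paths("translation_output.txt"):
        print(draft.name)
```

A helper like this can be handy in a post-processing script that needs to collect all drafts of one translation run.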

Several methods are available for producing the multiple drafts. The method is selected with the `multiple_translations_method` parameter in `config.yml`. The following table describes the possible values for this parameter:

| Method | Description |
| --- | --- |
| `hybrid` | This is the default value for `multiple_translations_method`. This method is a hybrid between beam search and sampling. It uses the top hypothesis from beam search to produce the first draft and sampling to produce the remaining drafts. See the sections below on beam search and sampling for more information on parameters that control the generation for these two methods. |
| `beam_search` | This method uses ordinary beam search and the multiple hypotheses in the beam to populate the multiple drafts. That is, the first draft will be composed of the top-ranked hypothesis for each verse, the second draft of the second-ranked hypothesis, etc. The number of beams used by beam search can be controlled with the `num_beams` parameter in `config.yml`. The default value of `num_beams` is 2, but a value of 5 or more is recommended. When producing multiple translations, you must ensure that `num_beams >= num_drafts`. |
| `diverse_beam_search` | Diverse beam search is a variant of beam search that addresses a weakness of ordinary beam search, specifically that the hypotheses in the beam tend to be highly similar to each other. It adds a penalty term for pairs of similar hypotheses, resulting in a more diverse set of hypotheses. As with ordinary beam search, the number of beams can be controlled with `num_beams`. The strength of the penalty term can be controlled with a parameter called `diversity_penalty`. This parameter has a default value of `1.0`, and it is recommended not to decrease its value. A higher value of `diversity_penalty` will lead to more diversity between the different drafts (perhaps at the expense of accuracy). |
| `sampling` | This method uses random multinomial sampling to generate text. The different drafts are produced by sampling each verse translation multiple times. Sampling can be controlled via the `temperature` parameter in `config.yml`. Its default value is `0.75` and it is recommended to keep its value below `1.0`. Higher values of `temperature` will lead to more diversity, while lower values lead to higher accuracy (at the expense of diversity). |
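The constraints and recommendations in the table can be collected into a small pre-flight check. The sketch below is an assumption-laden illustration, not part of the tools: the function name `check_settings` is hypothetical, the settings are passed as a plain dictionary rather than read from `config.yml` (whose layout is not described here), and applying the `num_beams >= num_drafts` rule to `diverse_beam_search` as well as `beam_search` is an assumption.

```python
VALID_METHODS = {"hybrid", "beam_search", "diverse_beam_search", "sampling"}

# Default values as stated in the text above.
DEFAULTS = {
    "multiple_translations_method": "hybrid",
    "num_drafts": 3,
    "num_beams": 2,
    "diversity_penalty": 1.0,
    "temperature": 0.75,
}


def check_settings(overrides=None):
    """Return a list of problems found; an empty list means none."""
    settings = {**DEFAULTS, **(overrides or {})}
    method = settings["multiple_translations_method"]
    problems = []

    if method not in VALID_METHODS:
        return [f"unknown method: {method}"]

    # Each draft takes one hypothesis from the beam, so the beam must be
    # at least as wide as the number of drafts. Applying this to
    # diverse_beam_search is an assumption of this sketch.
    if method in ("beam_search", "diverse_beam_search"):
        if settings["num_beams"] < settings["num_drafts"]:
            problems.append(
                f"num_beams ({settings['num_beams']}) must be >= "
                f"num_drafts ({settings['num_drafts']})"
            )

    # The text recommends not lowering diversity_penalty below its default.
    if method == "diverse_beam_search" and settings["diversity_penalty"] < 1.0:
        problems.append("diversity_penalty below 1.0 is not recommended")

    # The text recommends keeping temperature below 1.0.
    if method in ("hybrid", "sampling") and settings["temperature"] >= 1.0:
        problems.append("temperature of 1.0 or more is not recommended")

    return problems


if __name__ == "__main__":
    print(check_settings({"multiple_translations_method": "beam_search"}))
```

Note that the stated defaults (`num_beams` of 2, `num_drafts` of 3) pass under `hybrid` but fail the beam constraint as soon as the method is switched to `beam_search`, which is why raising `num_beams` is worth doing at the same time.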
