Explicit n-grams, single finalfrontier command
- The most user-visible change is that `ff-train-deps` and `ff-train-skipgram` have been merged into one command, `finalfrontier`. Dependency and skipgram embeddings can be trained with `finalfrontier deps` and `finalfrontier skipgram`, respectively.
- Support for training explicit subwords has been added. Thus far, finalfrontier has followed the same subword approach as fastText: each subword (n-gram) is mapped to an embedding using the FNV-1 hash function. This approach reduces the number of embeddings when the corpus contains a large number of possible n-grams, at the cost of hash collisions. With the `--subwords ngrams` option, finalfrontier uses an explicit n-gram vocabulary instead.
- The `hogwild` and `finalfrontier-utils` crates have been merged into the `finalfrontier` crate. Consequently, finalfrontier now consists of a single crate.
- When the number of threads is not specified, finalfrontier has traditionally used half the logical CPUs. This has been refined to use half the number of logical CPUs, capped at 20 threads. Using more than 20 threads can slow convergence drastically on typical corpora.
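
To illustrate the difference between hashed and explicit subwords, here is a minimal Rust sketch. It is not finalfrontier's implementation: the function names, the `<`/`>` boundary markers (borrowed from the fastText convention), the n-gram length range, and the bucket count are all assumptions made for illustration, and `fnv1_64` is the textbook 64-bit FNV-1, which may differ in detail from what the crate uses.

```rust
use std::collections::HashMap;

/// Textbook 64-bit FNV-1: multiply by the prime, then XOR in each byte.
fn fnv1_64(bytes: &[u8]) -> u64 {
    const OFFSET_BASIS: u64 = 0xcbf29ce484222325;
    const PRIME: u64 = 0x100000001b3;
    let mut hash = OFFSET_BASIS;
    for &byte in bytes {
        hash = hash.wrapping_mul(PRIME);
        hash ^= u64::from(byte);
    }
    hash
}

/// Character n-grams of `word`, wrapped in `<` and `>` boundary markers,
/// for every length in `min_n..=max_n`.
fn ngrams(word: &str, min_n: usize, max_n: usize) -> Vec<String> {
    let chars: Vec<char> = format!("<{}>", word).chars().collect();
    let mut out = Vec::new();
    for n in min_n..=max_n {
        // `windows(0)` panics, and lengths longer than the word yield nothing.
        if n == 0 || n > chars.len() {
            continue;
        }
        for window in chars.windows(n) {
            out.push(window.iter().collect());
        }
    }
    out
}

/// Hashed lookup: every n-gram gets an index in `0..n_buckets`,
/// but distinct n-grams may share one (a collision).
fn bucket_index(ngram: &str, n_buckets: u64) -> u64 {
    fnv1_64(ngram.as_bytes()) % n_buckets
}

/// Explicit lookup: each n-gram seen in `words` gets its own index;
/// n-grams outside the vocabulary have no index at all.
fn build_ngram_vocab(words: &[&str], min_n: usize, max_n: usize) -> HashMap<String, usize> {
    let mut vocab = HashMap::new();
    for word in words {
        for ngram in ngrams(word, min_n, max_n) {
            let next = vocab.len();
            vocab.entry(ngram).or_insert(next);
        }
    }
    vocab
}
```

The trade-off shows up in the return types: `bucket_index` always returns an index, so the embedding matrix has a fixed size but unrelated n-grams can end up sharing a row, whereas a lookup in the map from `build_ngram_vocab` is collision-free and grows with the number of distinct n-grams in the corpus.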
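
The default thread count described in the last item can be sketched as follows. This is an illustration under stated assumptions, not the crate's code: the function names are hypothetical, the lower bound of one thread is an assumption for machines with a single logical CPU, and `std::thread::available_parallelism` is used as a stand-in for the logical CPU count (it may report fewer on systems with CPU quotas).

```rust
use std::thread;

/// Upper bound on the default thread count, as stated in the release notes.
const MAX_DEFAULT_THREADS: usize = 20;

/// Half of the logical CPUs, capped at 20.
/// The lower bound of one thread is an assumption, not taken from the notes.
fn default_threads(logical_cpus: usize) -> usize {
    (logical_cpus / 2).clamp(1, MAX_DEFAULT_THREADS)
}

/// Default for the current machine. Falling back to one logical CPU when
/// the count cannot be determined is also an assumption.
fn default_threads_for_host() -> usize {
    let logical_cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    default_threads(logical_cpus)
}
```

Under this sketch, an 8-CPU machine defaults to 4 threads, while a 64-CPU machine defaults to 20 rather than 32.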