This guide is about extending the training tools to support new optimization problems. It is assumed the necessary LLVM changes have been made - i.e. instrumenting the optimization pass with a way to carry out decision making via a trained model, training log collection - see the lib/Analysis/MLInlineAdvisor.cpp and lib/CodeGen/MLRegallocEvictAdvisor.cpp for examples.
Refer to compiler_opt/rl/inlining
or compiler_opt/rl/regalloc
.
-
create a directory peer to
inlining
andregalloc
. This placement is not necessary, but sufficient for illustration. -
define the implementation of
compiler_opt.rl.compilation_runner.CompilationRunner
that's specific to your problem. Refer to the examples. Note how we always start processes via thecompiler_opt.rl.start_cancellable_process()
utility. -
define the ML interface - see the
config.py
file in each of the examples. -
extend
compiler_opt.rl.problem_configuration.ProblemConfiguration
. Make the new class gin-configurable. By convention, define this in the__init__.py
. -
place specific gin configs in the subdirectory, as well as vocab (these are optional, but likely necessary). A convention here is to make sure your gin files make the configurable
config_registry.get_configuration.implementation
point to your implementation ofProblemConfiguration
. See thecommon.gin
files in our examples. This allows any tool to just pick up your problem when pointing it (via--gin_files
) to your problem.
You can have multiple gin files for different algorithm configurations, and
reuse common settings (like the above) via gin's import
mechanism. See our
examples where we have different configs for PPO or behavioral cloning.
- add your module to the list in
compiler_opt.rl.registry.py
, under the "Register implementations" comment.
'compilation problem' is an optimization problem with a specific way of invoking clang and specific features and tensorflow topologies. The component model requires all these be exported in a class implementing ProblemConfiguration below, however, to avoid cycle dependencies in Bazel environments, do not explicitly inherit from it.
Internally, all the module's implementation parameters are expected to be gin-initialized.
Existing tools (e.g. train_locally.py
) will just transparently use your new
component if you point the tool to one of your gin files. This assumes your gin
file binds config_registry.get_configuration.implementation
as described:
--gin_bindings=config_registry.get_configuration.implementation=@configs.InliningConfig
To use in a new tool:
-
just get a ProblemConfiguration object in your python:
config = problem_configuration.get_configuration()
-
make sure your tool also exposes
--gin_files
and--gin_bindings
and bootstraps gin.
-
to avoid long binding names, use the
runners
module name for theCompilationRunner
implementation, and use theconfigs
module name for the implementation ofProblemConfiguration
. -
the
CompilationRunner
gin initialization should initialize to None, and use, theclang_path
andlauncher_path
macros (https://github.com/google/gin-config#syntax-quick-reference):
clang_path = None
launcher_path = None
runners.MyCompilationRunner.clang_path = %clang_path
runners.MyCompilationRunner.launcher_path = %launcher_path
Use a similar pattern for problem-specific additional flags (see inlining's
llvm_size_path
for example). When running tools, this allows the user pass
common flags transparently wrt the underlying runner - i.e. if swapping 2
runners, the clang flag stays the same:
--gin_bindings=clang_path="'/foo/bar/clang'"