OptunAPI is a simple API designed for Machine Learning applications that distributes an automatic hyperparameters optimization over different machines through HTTP requests. Each set of hyperparameters can be studied independently, since the search for minima doesn't require any gradient computation but is instead performed through a Bayesian optimization based on Optuna. The machine running Optuna, the so-called "Optuna-server", centrally manages the optimization studies: it provides sets of hyperparameters and assesses them using the scores evaluated and sent back by each computing instance, named "Trainer-client". The HTTP requests underlying this client-server system are powered by FastAPI.
OptunAPI inherits most of the modern functionalities of Optuna and FastAPI:
- Lightweight and versatile
  - OptunAPI is entirely written in Python and has few dependencies.
- Easy to configure
  - For hyperparameters sampling, OptunAPI relies on configuration files that are easy to set up.
- Easy to integrate
  - The hyperparameters values can be easily recovered by decoding the HTTP response content from the server.
- Easy parallelization
  - Different machines can run the hyperparameters study in parallel, centrally coordinated by the server.
- Efficient optimization algorithms
  - The optimization task is headed by Optuna and its state-of-the-art algorithms.
- Quick visualization for study analysis
  - TODO - OptunAPI provides a set of reports to monitor the status of the hyperparameters study.
To understand how OptunAPI works, we need to spend a few words on its components:
- Study and Trial objects from Optuna
- Optuna's Ask-and-Tell interface
- HTTP requests to map the hyperparameters space
A study corresponds to an optimization task, i.e., a set of trials. This object provides interfaces to run a new Trial and access the trials' history. OptunAPI is designed so that, when the first machine asks for a set of hyperparameters, it starts a new study (create_study()) identified according to the HTTP request submitted. Any other machine referring to the same optimization session doesn't initialize a new study but recovers the existing one (load_study()), contributing to the mapping of the hyperparameters space.
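The create-or-load behavior described above can be pictured with a small in-memory stand-in. This is only a sketch: `StudyRegistry` and its method are hypothetical names, not part of OptunAPI or Optuna. In Optuna itself, `optuna.create_study()` accepts a `load_if_exists` flag that gives similar semantics against a shared storage.

```python
class StudyRegistry:
    """Hypothetical stand-in for the server-side storage of studies."""

    def __init__(self):
        self._studies = {}

    def get_or_create(self, name):
        """Return (study, created) for the optimization session `name`."""
        created = name not in self._studies
        if created:
            # First request for this session: start a new study.
            self._studies[name] = {"name": name, "trials": []}
        # Later requests fall through and reuse the stored study.
        return self._studies[name], created


registry = StudyRegistry()
first, created_first = registry.get_or_create("optunapi-test")
second, created_second = registry.get_or_create("optunapi-test")
print(created_first, created_second, first is second)
```

Both calls end up holding the same study object, which is the property that lets several machines contribute trials to one optimization session.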
A trial prepares a particular set of hyperparameters and evaluates how well it optimizes an objective function, which is not necessarily available in an explicit form, as in the case of very complex Machine Learning algorithms. This object provides the following interfaces to get parameter suggestions:
- optuna.trial.Trial.suggest_categorical() for categorical parameters
- optuna.trial.Trial.suggest_int() for integer parameters
- optuna.trial.Trial.suggest_float() for floating point parameters
With the optional arguments step and log, we can discretize integer and floating point parameters or take their logarithm.
The following code block is taken from the Optuna tutorial and shows a standard use of these features:
import optuna

def objective(trial):
    # Categorical parameter
    optimizer = trial.suggest_categorical('optimizer', ['RMSprop', 'Adam'])
    # Integer parameter
    num_layers = trial.suggest_int('num_layers', 1, 3)
    # Integer parameter (log)
    num_channels = trial.suggest_int('num_channels', 32, 512, log=True)
    # Integer parameter (discretized)
    num_units = trial.suggest_int('num_units', 10, 100, step=5)
    # Floating point parameter
    dropout_rate = trial.suggest_float('dropout_rate', 0.0, 1.0)
    # Floating point parameter (log)
    learning_rate = trial.suggest_float('learning_rate', 1e-5, 1e-2, log=True)
    # Floating point parameter (discretized)
    drop_path_rate = trial.suggest_float('drop_path_rate', 0.0, 1.0, step=0.1)
OptunAPI uses these methods internally and requires only a configuration file correctly filled to run the studies.
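To give a feeling for what the step and log options do to a floating point parameter, here is a minimal sketch using only the standard library. The function `sample_float` is a hypothetical illustration and is not how Optuna's samplers are implemented: it draws uniformly at random, whereas Optuna chooses values with its own algorithms.

```python
import math
import random


def sample_float(low, high, step=None, log=False, rng=random):
    """Illustrative sampler; not Optuna's implementation."""
    if step is not None and log:
        # suggest_float() also refuses to combine the two options.
        raise ValueError("step and log cannot be used at the same time")
    if log:
        # Uniform in log space, so every decade is equally likely.
        return math.exp(rng.uniform(math.log(low), math.log(high)))
    if step is not None:
        # Pick one of the grid points low, low + step, ..., up to high.
        n_steps = int(round((high - low) / step))
        return low + step * rng.randint(0, n_steps)
    return rng.uniform(low, high)


print(sample_float(1e-5, 1e-2, log=True))   # e.g. a learning rate
print(sample_float(0.0, 1.0, step=0.1))     # e.g. a drop path rate
```

With log enabled, values between 1e-5 and 1e-4 are as likely as values between 1e-3 and 1e-2, which is usually what one wants for quantities such as a learning rate.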
Optuna's Ask-and-Tell interface provides a more flexible interface for hyperparameter optimization, based on the following two methods:
- optuna.study.Study.ask() creates a trial that can sample hyperparameters
- optuna.study.Study.tell() finishes the trial by passing trial and an objective value
OptunAPI uses these methods at two different moments. When a machine asks for a set of hyperparameters, that set belongs to a trial created by an ask call. Then, once the objective function has been evaluated with that particular set of hyperparameters, the machine sends a new request encoding the objective value, which allows the server to close the corresponding trial with a tell call.
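The two-step protocol can be sketched with a toy study that mimics the shape of ask and tell. `ToyStudy` is a hypothetical stand-in written for illustration: it samples uniformly at random, while Optuna's samplers use the history of finished trials. The point is the bookkeeping: a trial stays open between the ask and the tell, and several trials can be open at once.

```python
import random


class ToyStudy:
    """Hypothetical stand-in mimicking the shape of ask/tell."""

    def __init__(self, bounds, seed=None):
        self.bounds = bounds      # {"x": (low, high), ...}
        self.open_trials = {}     # trial_id -> params awaiting a score
        self.finished = []        # (trial_id, params, score)
        self._rng = random.Random(seed)

    def ask(self):
        # Every trial ever asked is either open or finished.
        trial_id = len(self.open_trials) + len(self.finished)
        params = {
            name: self._rng.uniform(low, high)
            for name, (low, high) in self.bounds.items()
        }
        self.open_trials[trial_id] = params
        return trial_id, params

    def tell(self, trial_id, score):
        # Raises KeyError if the trial is unknown or already closed.
        params = self.open_trials.pop(trial_id)
        self.finished.append((trial_id, params, score))

    def best(self):
        return min(self.finished, key=lambda item: item[2])


study = ToyStudy({"x": (-5.0, 5.0), "y": (-5.0, 5.0)}, seed=0)
# Two clients ask before either of them reports back.
id_a, params_a = study.ask()
id_b, params_b = study.ask()
study.tell(id_b, params_b["x"] ** 2 + params_b["y"] ** 2)
study.tell(id_a, params_a["x"] ** 2 + params_a["y"] ** 2)
print(study.best())
```

Because each score is matched to its trial through the identifier, clients may report back in any order.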
OptunAPI provides a simple Python module to run a server able to centrally manage the optimization studies: optunapi/optunapi/server.py. It is equipped with a set of path operation functions relying on the FastAPI ecosystem:
- ping_server
  - the path is /optunapi/ping
  - the operation is GET
  - the function verifies that the server is running
- read_hparams
  - the path is /optunapi/hparams/{model_name} (model_name is a path parameter)
  - the operation is GET
  - the function starts (or loads) an Optuna study and sends sets of hyperparameters
- send_score
  - the path is /optunapi/score/{model_name}?trial_id=TRIAL_ID&score=SCORE (with query parameters)
  - the operation is GET
  - the function finishes the trial identified by trial_id with the score value
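On the client side, the three path operations boil down to three URLs. The helpers below are a sketch with hypothetical names (`ping_url`, `hparams_url`, `score_url`); they assume the `/optunapi/...` prefix used in the client examples of this document and Uvicorn's default address.

```python
from urllib.parse import quote, urlencode

HOST = "http://127.0.0.1:8000"  # Uvicorn's default host and port


def ping_url(host=HOST):
    return f"{host}/optunapi/ping"


def hparams_url(model_name, host=HOST):
    # model_name is a path parameter, so it is percent-encoded.
    return f"{host}/optunapi/hparams/{quote(model_name, safe='')}"


def score_url(model_name, trial_id, score, host=HOST):
    # trial_id and score travel as query parameters.
    query = urlencode({"trial_id": trial_id, "score": score})
    return f"{host}/optunapi/score/{quote(model_name, safe='')}?{query}"


print(ping_url())
print(hparams_url("optunapi-test"))
print(score_url("optunapi-test", 3, 0.25))
```

Building the query string with `urlencode` avoids hand-written placeholders and keeps special characters in the values properly escaped.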
Python 3.6+
OptunAPI is based on two modern and highly performant frameworks: Optuna for the optimization studies and FastAPI for the HTTP requests.
OptunAPI is a public repository on GitHub.
$ git clone https://github.com/mbarbetti/optunapi.git
To run and use OptunAPI it's preferable to create a virtual environment with Python 3.6+ and install Optuna and FastAPI within it.
$ pip install optuna fastapi
Standing on the shoulders of FastAPI, OptunAPI needs an ASGI server, such as Uvicorn or Hypercorn, to run the so-called Optuna-server.
$ pip install uvicorn[standard]
The high-level functions provided by Optuna to suggest values for the hyperparameters are replaced with an appropriate configuration file in OptunAPI. Referring to the example reported in the Optuna tutorial, what follows is the corresponding YAML configuration file:
# Categorical parameter
optimizer:
  name    : optimizer
  type    : categorical
  choices :
    - RMSprop
    - Adam
# Integer parameter
num_layers:
  name : num_layers
  type : int
  low  : 1
  high : 3
# Integer parameter (log)
num_channels:
  name : num_channels
  type : int
  low  : 32
  high : 512
  log  : True
# Integer parameter (discretized)
num_units:
  name : num_units
  type : int
  low  : 10
  high : 100
  step : 5
# Floating point parameter
dropout_rate:
  name : dropout_rate
  type : float
  low  : 0.0
  high : 1.0
# Floating point parameter (log)
learning_rate:
  name : learning_rate
  type : float
  low  : 1e-5
  high : 1e-2
  log  : True
# Floating point parameter (discretized)
drop_path_rate:
  name : drop_path_rate
  type : float
  low  : 0.0
  high : 1.0
  step : 0.1
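A configuration of this shape maps naturally onto the suggest methods listed earlier. The sketch below shows one possible translation, assuming the file has already been parsed into a dictionary of dictionaries. `suggest_from_config` and `RecordingTrial` are hypothetical names: the first is not OptunAPI's actual code, and the second is a stand-in for optuna.trial.Trial so that the example runs without Optuna installed.

```python
def suggest_from_config(trial, config):
    """Turn a parsed configuration (dict of dicts) into sampled values."""
    params = {}
    for entry in config.values():
        name, kind = entry["name"], entry["type"]
        if kind == "categorical":
            params[name] = trial.suggest_categorical(name, entry["choices"])
        elif kind in ("int", "float"):
            # Forward only the optional keys that appear in the file.
            extra = {key: entry[key] for key in ("step", "log") if key in entry}
            suggest = trial.suggest_int if kind == "int" else trial.suggest_float
            params[name] = suggest(name, entry["low"], entry["high"], **extra)
        else:
            raise ValueError(f"unknown parameter type: {kind!r}")
    return params


class RecordingTrial:
    """Stand-in for optuna.trial.Trial: logs calls, returns the lower bound."""

    def __init__(self):
        self.calls = []

    def suggest_categorical(self, name, choices):
        self.calls.append(("categorical", name))
        return choices[0]

    def suggest_int(self, name, low, high, step=1, log=False):
        self.calls.append(("int", name, step, log))
        return low

    def suggest_float(self, name, low, high, step=None, log=False):
        self.calls.append(("float", name, step, log))
        return low


config = {
    "optimizer": {"name": "optimizer", "type": "categorical",
                  "choices": ["RMSprop", "Adam"]},
    "num_units": {"name": "num_units", "type": "int",
                  "low": 10, "high": 100, "step": 5},
    "learning_rate": {"name": "learning_rate", "type": "float",
                      "low": 1e-5, "high": 1e-2, "log": True},
}
trial = RecordingTrial()
params = suggest_from_config(trial, config)
print(params)
```

Passing a real Optuna trial instead of `RecordingTrial` should work the same way, since only the three suggest methods are used.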
Once the configuration file for the optimization session has been prepared and saved into optunapi/optunapi/config, we are ready to run the Optuna-server.
$ uvicorn server:optunapi
INFO: Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
INFO: Started reloader process [28720]
INFO: Started server process [28722]
INFO: Waiting for application startup.
INFO: Application startup complete.
What does the command uvicorn server:optunapi mean?
The command uvicorn server:optunapi refers to:
- server: the file server.py (the Python "module") in optunapi/optunapi.
- optunapi: the object created inside of server.py with the line optunapi = FastAPI().
Note that Uvicorn sets 127.0.0.1 and 8000 as default values for the server IP and port. To change the defaults, it is enough to launch the previous command with the arguments --host and --port followed by the chosen values.
The optimization session is managed by an Optuna study, initialized with the first client HTTP request, or loaded and expanded by any other connecting machines. To refer to a particular optimization session a client has to encode the name of the corresponding configuration file within its HTTP request.
Consider the simple use-case provided by OptunAPI, where we want to find the minimum of a 2D-paraboloid: optunapi/tests/simple_client.py. Since the provided configuration file is named optunapi-test.yaml, the GET request submitted by the client to receive the hyperparameters set has to contain the string 'optunapi-test':
import requests
HOST = 'http://127.0.0.1:8000'
read_hparams = requests.get (HOST + '/optunapi/hparams/optunapi-test')
hp_req = read_hparams.json()
TRIAL_ID = hp_req ['trial_id']
PARAMS = hp_req [ 'params' ]
What happens behind the scenes is that the above HTTP request triggers an ask call on the Optuna study, which is stored in optunapi/optunapi/db once created and named optunapi-test.db. As already said, an ask call returns a trial equipped with a set of hyperparameters, and the client can recover those values by decoding the corresponding HTTP response. In the example above, hp_req is a dictionary containing, among other things, the identifier number of the current trial (TRIAL_ID) and a dictionary of the hyperparameters values (PARAMS).
Having obtained the hyperparameters values, we can run whatever learning algorithm we prefer and evaluate the associated training score, which will be used as the objective value to finish the trial. This is done with a new GET request referring to the same optimization session (again, 'optunapi-test' in the path) and passing TRIAL_ID and SCORE as query parameters:
import requests
HOST = 'http://127.0.0.1:8000'
# TRIAL_ID comes from the read_hparams response; SCORE is the objective value computed by the client
send_score = requests.get(HOST + '/optunapi/score/optunapi-test', params={'trial_id': TRIAL_ID, 'score': SCORE})
score_req = send_score.json()
BEST_TRIAL_ID = score_req['best_score_id']
BEST_PARAMS = score_req['best_params']
Each running client helps refine the search for minima performed by the Optuna algorithms, which focus on smaller and smaller portions of the hyperparameters space and improve its mapping.
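Putting the two requests together, a Trainer-client is essentially a loop: ask for hyperparameters, evaluate them, report the score. The sketch below uses only the standard library. The names `run_client`, `http_get_json` and `paraboloid` are hypothetical, as are the parameter names "x" and "y"; the response keys 'trial_id' and 'params' are the ones used in the example above.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def http_get_json(url):
    """Send a GET request and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def paraboloid(params):
    # Hypothetical objective; the names "x" and "y" are assumptions.
    return params["x"] ** 2 + params["y"] ** 2


def run_client(host, model_name, n_trials,
               objective=paraboloid, get_json=http_get_json):
    """Ask for hyperparameters, evaluate them, report the score; repeat."""
    last_reply = None
    for _ in range(n_trials):
        hp = get_json(f"{host}/optunapi/hparams/{model_name}")
        score = objective(hp["params"])
        query = urlencode({"trial_id": hp["trial_id"], "score": score})
        last_reply = get_json(f"{host}/optunapi/score/{model_name}?{query}")
    return last_reply
```

With the Optuna-server running, a call such as `run_client('http://127.0.0.1:8000', 'optunapi-test', n_trials=20)` could be launched on as many machines as are available. The `get_json` argument is there so that the loop can be exercised without a server by passing a fake function.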
OptunAPI is designed to be used within a VPN that is not directly open to the public Internet. On the other hand, opening the Optuna-server to the Internet makes it easy to exploit a wide variety of computing resources, from on-premises machines to instances from different cloud computing services (AWS, Azure, GCP, etc.). Such a setup raises a security issue, since anyone can submit a request to the server or intercept its response, exposing the system to cyberattacks.
A possible solution to this issue relies on the SSH protocol. The idea is to set up the Optuna-server as a private server (from the perspective of REMOTE SERVER) that is not directly visible from the outside (LOCAL CLIENT's perspective). This configuration, schematically represented in the sketch below, still allows a local client to access the private server by passing through the remote server and authenticating with SSH credentials.
----------------------------------------------------------------------
                            |
-------------+              |    +----------+               +---------
    LOCAL    |              |    |  REMOTE  |               | PRIVATE
    CLIENT   | <== SSH ========> |  SERVER  | <== local ==> | SERVER
-------------+              |    +----------+               +---------
                            |
                         FIREWALL (only port 22 is open)
----------------------------------------------------------------------
OptunAPI provides a very simple implementation of this scheme: optunapi/tests/secured_client.py. It is based on sshtunnel and allows us to submit an HTTP request to the private server after specifying our SSH credentials (ssh_username, ssh_pkey).
import sshtunnel
import requests

with sshtunnel.open_tunnel(
    (REMOTE_SERVER_IP, 22),
    ssh_username = 'mbarbetti',
    ssh_pkey = '/home/mbarbetti/.ssh/id_rsa',
    remote_bind_address = (PRIVATE_SERVER_IP, PRIVATE_SERVER_PORT),
    local_bind_address = ('127.0.0.1', 10022)
) as tunnel:
    ping_server = requests.get('http://localhost:10022/optunapi/ping')
    ping_msg = ping_server.json()
    print(ping_msg)
How to run the server in this case?
In this configuration the Optuna-server acts as the private server, so its IP and port are the ones declared within the with statement:
$ uvicorn server:optunapi --host PRIVATE_SERVER_IP --port PRIVATE_SERVER_PORT
This project is licensed under the terms of the MIT license.