UCT Planning
These instructions assume that you're using Ubuntu version >= 14.04.
Install all necessary dependencies:
sudo apt-get install libboost-all-dev libyaml-cpp-dev libpoco-dev libconsole-bridge-dev
Boost should typically be installed on all departmental machines. Please ask gripe to install the other dependencies as necessary.
Next, we have to install a dependency (class_loader) from source.
cd ~/utexas_planning
git clone https://github.com/ros/class_loader.git
mkdir class_loader/build
cd class_loader/build
cmake ../ -DBUILD_SHARED_LIBS=true -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../../install
make -j8 install
Next, set the environment variable PKG_CONFIG_PATH so that class_loader can be found. I recommend placing the following in your ~/.bashrc file:
export PKG_CONFIG_PATH=~/utexas_planning/install/lib/pkgconfig:${PKG_CONFIG_PATH}
After you've ensured that the above environment variable is set correctly in your current terminal, download and build the utexas_planning code.
cd ~/utexas_planning
git clone https://github.com/piyushk/utexas_planning.git
mkdir utexas_planning/build
cd utexas_planning/build
cmake ../ -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../../install
make -j8 install
Next, set the following environment variables so that all the planners and models can be found. Remember to change /home/piyushk to whatever is appropriate for your machine. Again, I recommend putting the following commands in your ~/.bashrc file. You can separate multiple items using colons.
export UTEXAS_PLANNING_LIBRARIES=/home/piyushk/utexas_planning/install/lib/libutexas_planning_models.so:/home/piyushk/utexas_planning/install/lib/libutexas_planning_planners.so
export RDDL_DOMAIN_DIRECTORIES=/home/piyushk/utexas_planning/utexas_planning/benchmarks/rddl_prefix
export LD_LIBRARY_PATH=/home/piyushk/utexas_planning/install/lib:${LD_LIBRARY_PATH}
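To illustrate the colon-separated format used by these variables, here is a minimal C++ sketch that splits such a value into individual entries. The function name splitColonList is hypothetical, and how utexas_planning itself parses UTEXAS_PLANNING_LIBRARIES is not shown in this README, so treat this only as an illustration of the format.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of utexas_planning): splits a
// colon-separated value such as UTEXAS_PLANNING_LIBRARIES into entries.
inline std::vector<std::string> splitColonList(const std::string& value) {
  std::vector<std::string> items;
  std::istringstream stream(value);
  std::string item;
  while (std::getline(stream, item, ':')) {
    // Skip empty entries produced by leading, trailing, or doubled colons.
    if (!item.empty()) {
      items.push_back(item);
    }
  }
  return items;
}
```

For the value exported above, this would yield two entries, one per shared library path.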
Finally, you can run some simple commands to test some code:
~/utexas_planning/install/bin/vi_example
~/utexas_planning/install/bin/evaluator --experiment-file /home/piyushk/utexas_planning/utexas_planning/src/examples/evaluator/example.yaml --verbose
To change the action selection portion of MCTS, you'll have to make the following updates:
- Add a string constant with the name of the action selection approach in include/utexas_planning/planners/mcts/mcts.h. For instance, the string constant for UCT looks like:
const std::string UCT = "uct";
- If the new action selection strategy needs parameters, add them in the PARAMS struct. For instance, the reward bound parameter for UCT is included in the PARAMS struct as:
_(float,uct_reward_bound,uct_reward_bound,10000) \
- Update the source file src/planners/mcts/mcts.cpp, specifically changing the getPlanningAction function to include the new action selection strategy. To implement the new strategy, you may need more information in the StateNode and StateActionNode data structures than is currently available. If you update these data structures, make sure you also update the updateState and getNewStateNode functions so that the data structures are initialized and updated correctly as the MCTS search is performed.
Note: Some additional work may be required if the action selection strategy depends on the current depth of the search tree, such as the case where a random action is selected at the root node.
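As a rough guide to what an action selection strategy computes, here is a minimal, self-contained sketch of a UCB1-style (UCT) rule. It is not the code in mcts.cpp: the ActionStats struct and selectUctAction function are hypothetical stand-ins for whatever StateNode and StateActionNode actually store, and using the reward bound as the exploration weight is an assumption about how uct_reward_bound is applied.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical per-action statistics; the real StateActionNode may differ.
struct ActionStats {
  double mean_value;  // running mean of the returns seen for this action
  int visits;         // number of times this action was tried from the state
};

// Returns the index of the action with the highest UCB1 score.
// Untried actions are selected first. reward_bound scales the exploration
// bonus (an assumption about the role of uct_reward_bound).
inline std::size_t selectUctAction(const std::vector<ActionStats>& actions,
                                   int state_visits,
                                   double reward_bound) {
  assert(!actions.empty());
  std::size_t best = 0;
  double best_score = -std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < actions.size(); ++i) {
    if (actions[i].visits == 0) {
      return i;  // always try an unvisited action before comparing scores
    }
    const double bonus = reward_bound *
        std::sqrt(std::log(static_cast<double>(state_visits)) / actions[i].visits);
    const double score = actions[i].mean_value + bonus;
    if (score > best_score) {
      best_score = score;
      best = i;
    }
  }
  return best;
}
```

A new strategy would replace the scoring inside the loop. Any extra statistics it needs would have to be added to the node data structures and maintained during the search, as described above.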
Changing the backup strategy is fairly similar to changing the planning action selection.
- Add a string constant with the name of the backup strategy in include/utexas_planning/planners/mcts/mcts.h. For instance, the string constant for the eligibility trace approach looks like:
const std::string ELIGIBILITY_TRACE = "eligibility";
- If the new backup strategy needs parameters, add them in the PARAMS struct. For instance, the eligibility parameter for the eligibility trace approach is included in the PARAMS struct as:
_(float,eligibility_lambda,eligibility_lambda,0.0) \
- Update the source file src/planners/mcts/mcts.cpp, specifically changing the updateState function to include the new backup strategy. To implement the new backup strategy, you may need more information in the StateNode and StateActionNode data structures than is currently available. If you update these data structures, make sure you also update the getNewStateNode function so that the data structures are initialized correctly as the MCTS search is performed.
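To make the idea of a backup strategy concrete, here is a minimal sketch of one common way to back up a single trajectory using a lambda-return. It is not the implementation in updateState: the Step struct, the backupLambdaReturn function, and the discount factor gamma are hypothetical, and the exact rule behind eligibility_lambda in this code base may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one step of a rollout; the real
// StateActionNode may store different fields.
struct Step {
  double reward;  // immediate reward received after taking the action
  double q;       // running mean value estimate for this state-action pair
  int visits;     // number of backups applied to this pair so far
};

// Walks the trajectory from the last step to the first, updating each
// running mean. lambda = 1 passes the full sampled return upwards
// (Monte Carlo backup); lambda = 0 passes only the current estimate.
inline void backupLambdaReturn(std::vector<Step>& trajectory,
                               double lambda,
                               double gamma) {
  assert(lambda >= 0.0 && lambda <= 1.0);
  double target = 0.0;  // value passed up from the step after this one
  for (std::size_t i = trajectory.size(); i-- > 0;) {
    Step& step = trajectory[i];
    const double sampled_return = step.reward + gamma * target;
    step.visits += 1;
    step.q += (sampled_return - step.q) / step.visits;
    // Blend the sampled return with the updated estimate before moving up.
    target = lambda * sampled_return + (1.0 - lambda) * step.q;
  }
}
```

Under this sketch, the eligibility_lambda: 1.0 setting would correspond to the plain Monte Carlo backup, which makes it a reasonable baseline to compare a new backup strategy against.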
It's a bit difficult to introspect MCTS code. To make testing easier, create the following YAML file and provide it to the evaluator. Make sure you change parameters as necessary:
models:
  - name: utexas_planning::GridModel
    start_x: 2
    start_y: 4
planners:
  - name: utexas_planning::MCTS
    max_playouts: 10
    max_depth: 25
    action_selection_strategy: uct
    uct_reward_bound: 100
    backup_strategy: eligibility
    eligibility_lambda: 1.0
Save this file somewhere. I'll assume you're calling it test.yaml.
Next, you can optionally enable debug output in the MCTS code. Keep in mind that this will generate a ton of verbose output about each rollout, so don't set max_playouts in the YAML file very high. To enable it, uncomment the following line in CMakeLists.txt:
#add_definitions(-DMCTS_DEBUG)
After changing this optional value, you'll have to rerun cmake once:
cd ~/utexas_planning/utexas_planning/build/
cmake ../ -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=../../install
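For context, a compile-time definition such as -DMCTS_DEBUG typically switches debug output on through the preprocessor. The sketch below shows that general pattern only. The MCTS_TRACE macro and describeRollout function are hypothetical and are not taken from mcts.cpp.

```cpp
#include <sstream>
#include <string>

// Hypothetical trace macro: expands to a stream write only when the
// code is compiled with -DMCTS_DEBUG, and to nothing otherwise.
#ifdef MCTS_DEBUG
#define MCTS_TRACE(stream, msg) do { (stream) << msg << '\n'; } while (0)
#else
#define MCTS_TRACE(stream, msg) do { } while (0)
#endif

// Returns the debug text for one playout. The result is empty unless
// MCTS_DEBUG was defined at compile time.
inline std::string describeRollout(int playout, double value) {
  (void)playout;  // silence unused-parameter warnings in non-debug builds
  (void)value;
  std::ostringstream out;
  MCTS_TRACE(out, "playout " << playout << " value " << value);
  return out.str();
}
```

Because the switch is a compile-time definition, toggling it requires rerunning cmake and rebuilding, as in the commands above.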
To test changes, rebuild the code, and run the following command:
cd ~/utexas_planning/utexas_planning/build/
make -j8 install
~/utexas_planning/install/bin/evaluator --exp
--verbose --max-trial-depth 1
Setting --max-trial-depth to 1 limits the evaluator to taking at most one action per trial, rather than running until a terminal state is reached. This is useful because the UCT search is currently restarted from scratch on every iteration.