Intrinsically Motivated Reinforcement Learning

Setup and Run

  • Install dependencies: pip install . installs all packages defined in setup.py
  • Run unit tests: py.test -vv -s
  • Run an experiment: ./imrl/utils/cli.py --episodes 6

Common Experiments

  • Discrete, tabular, deterministic 3x3 gridworld: ./imrl/utils/cli.py --environment=gridworld --gridworld_size=3 --num_vi=5 --episodes=200
    • This learns the policy through value iteration for 100 episodes and then executes that policy
  • Discrete, tabular, stochastic 3x3 gridworld: ./imrl/utils/cli.py --environment=gridworld --gridworld_size=3 --num_vi=5 --failure_rate=0.1 --episodes=200
  • Discrete, tabular, stochastic 5x5 gridworld testing theta convergence on a single value iteration: ./imrl/utils/cli.py --environment=gridworld --gridworld_size=5 --num_vi=9 --vi_interval=400 --failure_rate=0.1 --episodes=600
  • Continuous, tabular, deterministic gridworld: ./imrl/utils/cli.py --environment=gridworld_continuous --gridworld_size=3 --num_vi=5 --episodes=200 --results_interval=45
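
For readers unfamiliar with what the gridworld experiments above compute, here is a minimal, standalone sketch of tabular value iteration on a square gridworld. It is not the code in this repository. The goal location (bottom-right corner), the step cost of -1, the discount gamma, the reading of theta as a convergence threshold, and the reading of failure_rate as "the move fails and the agent stays in place" are all assumptions made for illustration.

```python
import itertools

# Hypothetical action set: (row delta, column delta).
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action, size):
    """Deterministic move; walking into a wall leaves the state unchanged."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if 0 <= new_row < size and 0 <= new_col < size:
        return (new_row, new_col)
    return state


def value_iteration(size=3, failure_rate=0.0, gamma=0.9, theta=1e-6, max_sweeps=1000):
    """Return (values, policy) for a size x size gridworld.

    Assumptions: the goal is the bottom-right cell, every step costs -1,
    and with probability failure_rate the agent stays where it is.
    """
    goal = (size - 1, size - 1)
    states = list(itertools.product(range(size), repeat=2))
    values = {state: 0.0 for state in states}

    def q_value(state, action):
        state_prime = step(state, action, size)
        expected_next = (1 - failure_rate) * values[state_prime] + failure_rate * values[state]
        return -1.0 + gamma * expected_next

    for _ in range(max_sweeps):
        delta = 0.0
        for state in states:
            if state == goal:
                continue  # terminal state keeps value 0
            best = max(q_value(state, action) for action in ACTIONS)
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < theta:  # stop once the largest update is below theta
            break

    # Greedy policy with respect to the converged values.
    policy = {
        state: max(ACTIONS, key=lambda action: q_value(state, action))
        for state in states
        if state != goal
    }
    return values, policy
```

Calling value_iteration(size=3) corresponds loosely to the deterministic 3x3 case, and value_iteration(size=3, failure_rate=0.1) to the stochastic one. Under these assumptions, the stochastic values are lower because failed moves delay reaching the goal.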

Conventions

  • Naming convention: When variables represent the same quantity at different time steps, such as a state s and next state s', the first is named state and the second state_prime
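
As an illustration of this convention, the sketch below names the two time steps of a transition state and state_prime. The Transition record and the td_error helper are hypothetical names chosen for this example and are not taken from the repository.

```python
from typing import Dict, NamedTuple, Tuple

GridState = Tuple[int, int]


class Transition(NamedTuple):
    """One environment step, named following the state / state_prime convention."""
    state: GridState        # s, the state before acting
    action: str
    reward: float
    state_prime: GridState  # s', the state after acting


def td_error(values: Dict[GridState, float], transition: Transition, gamma: float = 0.9) -> float:
    """One-step temporal-difference error: r + gamma * V(s') - V(s)."""
    state, _, reward, state_prime = transition
    return reward + gamma * values[state_prime] - values[state]
```

The same pattern would extend to other paired quantities, for example action and action_prime, though the README only states it for states.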