better_rl

Simple infrastructure for deep RL experimentation, with a focus on interpretability and reproducibility (seed).

Philosophy: We hope that researchers do not overlook this tool as another package that helps them know if their algorithm works, but rather for complex tasks/algorithms and be able to peek (to some point) where an agent can leverage the information it gathers.

The kind of questions, that researchers typically have, that we aim to resolve in order:

How does the agent state-visitation distribution change as training progresses and across multiple trajectories (steps separated)?
- Visualize the replay buffer (state): When you run an agent with a replay buffer, you can see how filled the buffer is as well as what the t-SNE looks like for the buffer. This will help allow researchers to investigate what type of information the agent currently stores. For advanced buffers like PER, it can help investigate what information the agent prioritizes, which may be essential for debugging planning algorithms.
  - What effect do noteworthy, influential states (group state-actions based on TD-error/any other method) have on the policy?
- Are there repetitive patterns across space and time that result in the observed agent: When you run an agent, you can look at the state-action temporal interaction distribution, projected using t-SNE in real-time. This is typically useful for model-based RL algorithms.
  - Saliency maps on image inputs.
How does the agent expect rewards for the state-action distribution?
How can we visualize/qualitatively-categorize high-level policies (skills)?

We anticipate that the best features yet to be built will emerge through iterative feed- back, deployment, and usage in the broader reinforcement learning and interpretability research communities.

To this end, we do not focus on just MBRL/online RL, but transfer from simple_rl for classic MDP based as well as methods specifically targetting continual reinforcement learning. We wish to package this hopefully valuable tool as a python package and subsequently a jupyter notebook widget with real-time updates.

Policy Visualizer

A Python package for visualizing policy performance and t-SNE of state-action pairs.

Installation

You can install the package using pip:

pip install policy_visualizer

Usage

from policy_visualizer import PolicyVisualizer

# Example usage
visualizer = PolicyVisualizer(checkpoint_path='path/to/checkpoints', 
                               env_name='CartPole-v1', 
                               agent_class=YourAgentClass)
visualizer.plot_all()

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
experiments		experiments
tests		tests
visualizer		visualizer
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

better_rl

Policy Visualizer

Installation

Usage

About

Releases

Packages

Languages

License

sheeerio/better_rl

Folders and files

Latest commit

History

Repository files navigation

better_rl

Policy Visualizer

Installation

Usage

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages