Tune hyperparameters for kernel density estimation tutorial #774

Merged 8 commits on Sep 7, 2023

michalzajac-ml
Contributor

Description

This PR tunes the hyperparameters for the tutorial 7_train_density. With the flag FAST = False, the new version reaches better results in significantly fewer PPO steps:

          score               PPO steps
old       -403.2 +/- 164.4    10M
new       -197.7 +/- 30.0     1M
expert    ~-180

(results are mean +/- std from 10 random seeds)
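
For readers who want to report results in the same "mean +/- std" format, here is a minimal sketch. It assumes one final score per seed is already collected in a list; the function name `summarize` and the three placeholder values are hypothetical and are not results from this PR. Whether the PR used the sample or the population standard deviation is not stated, so the sample version is an assumption.

```python
import statistics


def summarize(scores_per_seed):
    """Format per-seed scores as 'mean +/- std' with one decimal place."""
    mean = statistics.mean(scores_per_seed)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    # Use statistics.pstdev instead for the population version.
    std = statistics.stdev(scores_per_seed)
    return f"{mean:.1f} +/- {std:.1f}"


# Placeholder values for illustration only, NOT measurements from this PR.
placeholder_scores = [-200.0, -190.0, -210.0]
print(summarize(placeholder_scores))  # -200.0 +/- 10.0
```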

To arrive at the hyperparameters, I performed a grid search, checking the following values:

"density_type": ["s", "sa", "ss"],
"kernel_bandwidth": [0.1, 0.2, 0.4],
"lr": [3e-5, 1e-4, 3e-4, 1e-3],
"gamma": [0.95, 0.99],
"ent_coef": [0.0, 1e-3, 1e-4],
"n_epochs": [1, 10],
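
To give a sense of the size of this search space, the grid above can be enumerated as a Cartesian product. This is only a sketch: it assumes every combination was tried, and the helper `iter_configs` is a hypothetical name, not part of the imitation library or of the scripts used for this PR.

```python
import itertools

# Search space copied from the PR description.
GRID = {
    "density_type": ["s", "sa", "ss"],
    "kernel_bandwidth": [0.1, 0.2, 0.4],
    "lr": [3e-5, 1e-4, 3e-4, 1e-3],
    "gamma": [0.95, 0.99],
    "ent_coef": [0.0, 1e-3, 1e-4],
    "n_epochs": [1, 10],
}


def iter_configs(grid):
    """Yield one dict per point of the Cartesian product of the grid."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


configs = list(iter_configs(GRID))
print(len(configs))  # 3 * 3 * 4 * 2 * 3 * 2 = 432
```

Under that assumption, the grid has 432 configurations, each of which would still need to be evaluated over several seeds.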

Testing

Verified the performance of the hyperparameters in a separate experiment and ran the notebook itself.

@michalzajac-ml michalzajac-ml added the docs Documentation missing, incorrect or unclear label Sep 5, 2023
@michalzajac-ml michalzajac-ml linked an issue Sep 5, 2023 that may be closed by this pull request
Collaborator

@ernestum ernestum left a comment

LGTM. As already mentioned here, the pipeline could be fixed by specifying the seals version more precisely in setup.py.

Member

@AdamGleave AdamGleave left a comment

LGTM, 1 minor suggestion

docs/algorithms/density.rst (outdated, resolved)
@michalzajac-ml michalzajac-ml changed the base branch from master to dependency_fixes September 6, 2023 07:56
Base automatically changed from dependency_fixes to master September 7, 2023 22:56
@AdamGleave AdamGleave merged commit f09aeea into master Sep 7, 2023
1 of 2 checks passed
@AdamGleave AdamGleave deleted the 763-tune-density-estimation branch September 7, 2023 23:33
lukasberglund pushed a commit to lukasberglund/imitation that referenced this pull request Sep 12, 2023
…patibleAI#774)

* Pin huggingface_sb3 version.

* Properly specify the compatible seals version so it does not auto-upgrade to 0.2.

* Make random_mdp test deterministic by seeding the environment.

* Tune hyperparameters for kernel density estimation tutorial

* Modify .rst docs for density estimation to match tutorials

* Update docs/algorithms/density.rst (nit)

Co-authored-by: Adam Gleave <[email protected]>

* Fix formatting in 7_train_density.ipynb and density.rst

---------

Co-authored-by: Maximilian Ernestus <[email protected]>
Co-authored-by: Adam Gleave <[email protected]>
Labels
docs Documentation missing, incorrect or unclear
Linked issue
Ensure all tutorials work as expected
3 participants