add ddpg in tensorflow #51
Conversation
This is needed for using more than one MjViewer at a time
chainer setup is broken (on Ubuntu 16.04, at least), and the chainer devs direct users to newer versions because v1.18.0 is unsupported. See cupy/cupy#886. I can find no actual dependencies on chainer in the main rllab repo. If someone has downstream code with a hard chainer dependency, they're welcome to submit a (working) PR to add it back in.
StatefulPool doesn't support batching over class methods, because it always passes `G` as the first argument to the worker function. If one of the `run_` methods in StatefulPool is called with a class method, it can lead to a silent lock-up of the pool, which is very difficult to debug. Note: this bug does not appear unless `n_parallel > 1`. A minimal sketch of the failure mode follows.
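This sketch is illustrative, not rllab's actual pool code: a serial stand-in for a `run_map`-style call shows how a bound method's `self` slot collides with the `G` argument.

```python
class GlobalState:  # stand-in for rllab's shared global state G
    pass

def run_map(worker, args_list, g):
    # serial stand-in for a StatefulPool run_ method: always prepends g
    return [worker(g, *args) for args in args_list]

def free_function(g, x):   # the shape the pool expects
    return x * 2

class Sampler:
    def collect(self, x):  # bound method: no slot for G
        return x * 2

g = GlobalState()
print(run_map(free_function, [(1,), (2,)], g))  # OK: [2, 4]

# run_map(Sampler().collect, [(1,), (2,)], g) raises
# "collect() takes 2 positional arguments but 3 were given": G is forced
# into the method's argument list. In the real pool with n_parallel > 1
# this error occurs inside a child process, is never propagated, and the
# parent blocks forever on the result queue -- a silent lock-up.
```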
Adds TensorBoard support for basic key-value pairs. Anything logged via `logger.record_tabular()` is also available via TensorBoard.
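For illustration, a minimal sketch of how such a bridge can work, assuming TF1-style summary writers (the PR's actual wiring may differ):

```python
import tensorflow as tf

class TabularToTensorBoard:
    """Mirror logger.record_tabular-style key/value pairs to TensorBoard."""

    def __init__(self, log_dir):
        self.writer = tf.summary.FileWriter(log_dir)

    def record_tabular(self, key, value, step):
        # wrap the scalar in a Summary proto and write it at the given step
        summary = tf.Summary(
            value=[tf.Summary.Value(tag=key, simple_value=float(value))])
        self.writer.add_summary(summary, step)
        self.writer.flush()

# usage:
# tb = TabularToTensorBoard("/tmp/rllab_tb")
# tb.record_tabular("AverageReturn", 123.4, step=epoch)
```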
I can do a more detailed review later, but one overall comment: the purpose of adding more algorithms is also to add more building blocks to rllab. I need you to break the building blocks (e.g., the neural networks and the replay buffer) out into their own libraries in rllab, so they can be reused by future algorithms.
sandbox/rocky/tf/algos/ddpg/ddpg.py (outdated)

```python
import os

class DDPG(object):
```
Should inherit from `rllab.algos.base.RLAlgorithm`.
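For reference, a sketch of the suggested declaration (the body is a placeholder; only the base class comes from the comment above):

```python
from rllab.algos.base import RLAlgorithm

class DDPG(RLAlgorithm):
    def train(self):
        # RLAlgorithm subclasses provide train() as their entry point
        ...
```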
```python
action_range=(-1, 1),
actor_lr=1e-4,
critic_lr=1e-3,
reward_scale=1,
```
Can you leave detailed comments on what each of these parameters does? Even better if you can reference them back to the constants in the equations in a paper.
Done
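For illustration, the requested style of documentation might look like this (parameter values come from the diff above; the cited defaults match Lillicrap et al. 2015, "Continuous control with deep reinforcement learning"):

```python
def __init__(
        self,
        action_range=(-1, 1),  # (low, high) bounds the scaled actor output
        actor_lr=1e-4,         # actor (policy) learning rate; 1e-4 in the paper
        critic_lr=1e-3,        # critic (Q network) learning rate; 1e-3 in the paper
        reward_scale=1,        # multiplier on env rewards before TD targets
        **kwargs):
    ...
```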
Just to reiterate: this needs to be split into building blocks which are reusable in rllab. The issue also requests a regression test against the openai/baselines version of the algorithm.
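A regression test along these lines might be sketched as follows (the environment, tolerance, and helper functions are hypothetical, not existing code):

```python
from statistics import mean

def test_ddpg_matches_baselines():
    # train_ddpg and load_reference_returns are hypothetical helpers
    rllab_returns = train_ddpg(env_name="Pendulum-v0", n_epochs=50, seed=0)
    reference = load_reference_returns("baselines_ddpg_pendulum.json")
    # compare the tail of the learning curve within a tolerance, since the
    # two implementations differ in initialization and exploration noise
    assert abs(mean(rllab_returns[-5:]) - mean(reference[-5:])) < 50.0
```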
```python
import random

class ReplayBuffer:
```
I think you could just use the replay buffer from the DDPG in the Theano tree.
The replay buffer from the DDPG in the Theano tree is combined with the DDPG algorithm in the same file, so I can't reuse it without pulling in the whole DDPG file unless I separate them. Also, I think the sandbox tree should not use functions from the Theano tree, since they target different platforms (Theano vs. TensorFlow).
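For context, a minimal standalone replay buffer of the kind being discussed might look like this (a sketch, not the Theano tree's implementation):

```python
import random
from collections import deque

class ReplayBuffer:
    """Fixed-capacity FIFO buffer of (s, a, r, s', done) transitions."""

    def __init__(self, max_size=int(1e6)):
        self.buffer = deque(maxlen=max_size)

    def add(self, obs, action, reward, next_obs, terminal):
        self.buffer.append((obs, action, reward, next_obs, terminal))

    def sample(self, batch_size):
        # uniform random minibatch, without replacement within the batch
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```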
```python
return tc.layers.layer_norm(x, center=True, scale=True)

class Model(object):
```
For actor-critic, I think we could implement Q functions and policies directly in TensorFlow so that they are reusable.
I could not find Q function and policy implementations in TensorFlow. If they exist, could you point me to them?
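To illustrate what such a building block could look like, here is a sketch of a reusable Q function in TF1-style code (the class name and interface are assumptions, not existing rllab classes):

```python
import tensorflow as tf

class ContinuousMLPQFunction:
    """Q(s, a) as an MLP over the concatenated observation and action."""

    def __init__(self, obs_dim, action_dim, hidden_sizes=(64, 64), name="qf"):
        with tf.variable_scope(name):
            self.obs_ph = tf.placeholder(tf.float32, (None, obs_dim), "obs")
            self.action_ph = tf.placeholder(tf.float32, (None, action_dim), "action")
            h = tf.concat([self.obs_ph, self.action_ph], axis=1)
            for size in hidden_sizes:
                h = tf.layers.dense(h, size, activation=tf.nn.relu)
            self.q_val = tf.layers.dense(h, 1)  # scalar Q-value per (s, a) pair
```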
Uhh, deleting files in a…
Oops, ignore. Deleting files in a commit is one way to deal with a polluted tree, but it's more productive to use…
@cjcchen where can I find the latest code for this PR?
I created a new PR based on the master branch: #62
But it seems to fail the CI tests and I could not find the reason.
Can you please close either this PR or #62? I don't know which one to review.
Add a DDPG model based on TensorFlow to rllab.