Added asynchronous plotting to TensorFlow #126
base: integration
Conversation
sandbox/rocky/tf/plotter/plotter.py
Outdated
def init_plot(env, policy, session):
    global process, queue, sess
    sess = session
Why not use tf.get_default_session() instead?
sandbox/rocky/tf/plotter/plotter.py
Outdated
def init_worker():
    global process, queue
    queue = Queue()
    process = Thread(target=_worker_start)
If it's an instance of Thread, then the object should be named thread.
If I interrupt the training with the plotter on, will the program stop as well, or will it wait for the plotter? If the program still waits for the plot thread, you may need to set daemon=True when creating the thread.
I just tested. The program still waits when you don't set daemon=True. Setting daemon=True solves this issue.
Thanks for letting me know!
sandbox/rocky/tf/plotter/plotter.py
Outdated
# Only fetch the last message of each type
while True:
    try:
        msg = queue.get_nowait()
This generates busy waiting if there's nothing in the queue.
This should be fine for now because something will be placed into the queue for each iteration, and upon completion the thread will close.
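The pattern under discussion, in isolation (a minimal sketch, not the PR's code): get_nowait() only spins until the queue is drained, at which point Empty is raised and the inner loop exits.

```python
import queue

q = queue.Queue()
q.put(("update", "env-1", "policy-1"))
q.put(("update", "env-2", "policy-2"))

# Drain the queue, keeping only the last message of each type.
msgs = {}
while True:
    try:
        msg = q.get_nowait()
        msgs[msg[0]] = msg[1:]
    except queue.Empty:
        break  # queue drained; stop rather than spin forever
```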
sandbox/rocky/tf/plotter/plotter.py
Outdated
process = None
queue = None
sess = None
If you pass sess as a parameter to _worker_start using the Thread constructor, there's no need to have this as a global variable.
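A minimal sketch of that suggestion, using a stand-in object for the TensorFlow session:

```python
import threading

received = []

def _worker_start(sess):
    # The worker receives the session as an argument; no global needed.
    received.append(sess)

session = object()  # stand-in for a tf.Session
thread = threading.Thread(target=_worker_start, args=(session,))
thread.start()
thread.join()
```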
sandbox/rocky/tf/plotter/plotter.py
Outdated
__all__ = ['init_worker', 'init_plot', 'update_plot']

process = None
Encapsulate the variables and the methods in this file within a class to avoid having global variables.
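A rough sketch of the suggested refactor (class and attribute names are assumptions; the actual PR may differ): the module-level process, queue, and sess become instance attributes.

```python
import threading
from queue import Queue

class Plotter:
    def __init__(self, sess=None):
        self.sess = sess      # was: global sess
        self.queue = Queue()  # was: global queue
        self.worker_thread = threading.Thread(target=self._start_worker)
        self.worker_thread.daemon = True

    def _start_worker(self):
        pass  # plotting loop elided in this sketch

    def start(self):
        self.worker_thread.start()

plotter = Plotter()
plotter.start()
```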
sandbox/rocky/tf/plotter/plotter.py
Outdated
import atexit
from multiprocessing import Process
import numpy as np
import pickle
Pickle is not needed here. Double-check these imports.
@@ -119,7 +122,8 @@ def train(self, sess=None):
                logger.log("Optimizing policy...")
                self.optimize_policy(itr, samples_data)
                logger.log("Saving snapshot...")
-               params = self.get_itr_snapshot(itr, samples_data)  # , **kwargs)
+               params = self.get_itr_snapshot(itr,
+                                              samples_data)  # , **kwargs)
Please remove vestigial comment.
sandbox/rocky/tf/plotter/__init__.py
Outdated
@@ -0,0 +1 @@
+from sandbox.rocky.tf.plotter import *
Please don't use any * imports. Every imported symbol should be specified individually. Not all symbols are part of the public API (so those should not be imported).
Note that PEP8 says that the public API for a module is the set of symbols listed in the __all__ list defined in the module. We don't enforce this yet, but you can see that the plotter authors actually took advantage of it, so you should only import APIs from the __all__ list.
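The point can be demonstrated with a throwaway module built at runtime (fake_plotter and its symbols are hypothetical stand-ins for the real plotter module):

```python
import sys
import types

# Build a stand-in module that declares its public API via __all__,
# the way the plotter module does.
mod = types.ModuleType("fake_plotter")
mod.__all__ = ["init_worker", "init_plot", "update_plot"]
for name in mod.__all__:
    setattr(mod, name, lambda *args, **kwargs: None)
mod._private_helper = lambda: None  # not in __all__, so not public API
sys.modules["fake_plotter"] = mod

# Import each public symbol explicitly instead of using a * import:
from fake_plotter import init_plot, init_worker, update_plot
```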
sandbox/rocky/tf/plotter/plotter.py
Outdated
        self.queue.join()
        self.thread.join()

    def shutdown(self):
Check that the worker thread dies correctly at the end of the simulation and when user interrupts the simulation with keyboard interruption.
sandbox/rocky/tf/plotter/plotter.py
Outdated
while True:
    msgs = {}
    # Only fetch the last message of each type
    while True:
It seems the GIL is now putting both this worker thread and the main thread on the same processor, so this busy waiting will impact performance. A suggestion on how to avoid busy waiting can be found in my attempt to solve this problem here. Feel free to use the code there.
I tested it and performance is much better now. Also, the process exits correctly after normal and interrupted execution.
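The adopted approach can be sketched as follows (function and variable names are assumptions): block on the first get() so the idle worker sleeps, then drain the rest without blocking, keeping only the last message of each type.

```python
import queue

def fetch_latest(q):
    msgs = {}
    msg = q.get()            # blocks while the queue is empty: no busy waiting
    msgs[msg[0]] = msg[1:]
    while not q.empty():     # drain anything that accumulated meanwhile
        msg = q.get()
        msgs[msg[0]] = msg[1:]
    return msgs

q = queue.Queue()
q.put(("update", 1))
q.put(("update", 2))
latest = fetch_latest(q)
```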
@@ -158,3 +165,6 @@ def get_itr_snapshot(self, itr, samples_data):
    def optimize_policy(self, itr, samples_data):
        raise NotImplementedError

+   def update_plot(self):
+       if self.plot:
This guard is redundant since the same check is already made on line 136.
I think we should take the opportunity to clean this up a little bit. We can punt on improving the parallelism. Please leave a TODO/GitHub issue to figure out how to do this cross-platform and use multiprocessing.
sandbox/rocky/tf/plotter/plotter.py
Outdated
from multiprocessing import Process
import numpy as np
import platform
from queue import Empty, Queue
PEP8: import grouping
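For reference, a PEP8-style grouping of the imports that appear in this file's snippets: standard-library imports first, then third-party, with the groups separated by a blank line.

```python
# Standard library
import atexit
import platform
from multiprocessing import Process
from queue import Empty, Queue
from threading import Thread

# Third party
import numpy as np
```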
sandbox/rocky/tf/plotter/plotter.py
Outdated
    msg = self.queue.get_nowait()
    msgs[msg[0]] = msg[1:]

if 'stop' in msgs:
Let's replace the strings with enums.
sandbox/rocky/tf/plotter/plotter.py
Outdated
# Only fetch the last message of each type
while not self.queue.empty():
    msg = self.queue.get()
    msgs[msg[0]] = msg[1:]
msg[0] and msg[1:] is very difficult to read. What if we used a namedtuple instead?
from collections import namedtuple
import enum
from enum import Enum

class Op(Enum):
    STOP = enum.auto()
    UPDATE = enum.auto()
    DEMO = enum.auto()

Message = namedtuple("Message", ["op", "args", "kwargs"])

class Plotter:
    def _start_worker(self):
        while True:
            if initial_rollout:
                msg = self.queue.get()
            msgs[msg.op] = msg
            if Op.STOP in msgs:
                break
            elif Op.DEMO in msgs:
                env, policy = msgs[Op.DEMO].args

    def update_plot(self, policy, max_length=np.inf):
        if self.worker_thread.is_alive():
            self.queue.put(
                Message(op=Op.DEMO,
                        args=(policy.get_param_values(), max_length),
                        kwargs={}))
            self.queue.task_done()
sandbox/rocky/tf/plotter/plotter.py
Outdated
if 'stop' in msgs:
    break
elif 'update' in msgs:
Shouldn't this be if and not elif?
sandbox/rocky/tf/plotter/plotter.py
Outdated
    break
elif 'update' in msgs:
    env, policy = msgs['update']
elif 'demo' in msgs:
Shouldn't this be if and not elif?
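The reviewers' point, sketched in isolation (handler names are illustrative): with elif, at most one message type is handled per pass, so a batch containing both an update and a demo message would silently drop one of them. Independent if statements handle every type present, with 'stop' still short-circuiting.

```python
def handle(msgs, handled):
    if 'stop' in msgs:
        return False          # stop wins; skip everything else
    if 'update' in msgs:      # independent if: both branches can run
        handled.append('update')
    if 'demo' in msgs:
        handled.append('demo')
    return True

handled = []
keep_going = handle({'update': ('env', 'policy'), 'demo': ('policy', 100)},
                    handled)
```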
max_length = None
initial_rollout = True
try:
    with self.sess.as_default(), self.sess.graph.as_default():
Can you add a comment here explaining that the worker processes all messages in the queue per loop, not one message per loop?
Please reopen this PR against https://github.com/rlworkgroup/garage
Unfortunately, async plotting for TensorFlow requires the use of threading.Thread instead of multiprocessing.Process. Thus, on Linux machines, there will be a slight delay when you run the algorithm during which it seems like nothing is happening, but once the window loads, everything works perfectly and asynchronously.