v2.0 #21

Open
16 of 31 tasks
Armandpl opened this issue Nov 22, 2022 · 2 comments

Armandpl commented Nov 22, 2022

This is a tracking issue for the second iteration of this project.

CAD/Mechanical Assembly:

Software:

  • Use MCAP and Foxglove.dev for data logging and debug #25
  • Control the robot using an arduino, control the arduino over serial #24
  • use hydra to configure pendulum
  • add tests? robot self test like 3D printer?
  • separate package for gym wrapper and actual pendulum?
  • clean up wandb
    • delete all models except for the working one (v204); started by deleting replay buffers first
    • log robot config to training runs
    • delete failed and killed runs?
  • Make it so that the same gym env + sb3 code can run either on a jetson nano or a PC #23
  • better env reset. previously we used hardcoded commands to reset the robot to its starting position. using a pid should be easier, cleaner and should transfer more easily should we have different robot hardware configurations. this should be part of the gym wrapper. the robot class/API should only be about the robot and should be usable outside of the gym context. alternatively, just wait for the pendulum to be below an angle threshold for a number of steps, then reset both encoders. this way we don't have to move the motor back. drawback is the pendulum isn't facing the same way every episode, but this time we're not making a video so that's alright
  • control frequency wrapper: catch up delays
  • update pre-commit setup
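
The "wait until the pendulum is at rest, then reset both encoders" idea from the env reset item could be sketched roughly as below. This is only an illustration: `RestDetector`, its default values, and the convention that an angle of 0 means "hanging straight down" are assumptions, not part of the existing code base.

```python
import math


class RestDetector:
    """Signals when the pendulum has stayed near the bottom for long enough."""

    def __init__(self, angle_threshold: float = 0.05, required_steps: int = 20):
        self.angle_threshold = angle_threshold  # radians away from hanging down
        self.required_steps = required_steps    # consecutive steps needed
        self._count = 0

    def update(self, pendulum_angle: float) -> bool:
        # Wrap to [-pi, pi) so full turns do not matter.
        # Assumed convention: 0 rad means the pendulum hangs straight down.
        wrapped = (pendulum_angle + math.pi) % (2 * math.pi) - math.pi
        if abs(wrapped) < self.angle_threshold:
            self._count += 1
        else:
            self._count = 0  # any excursion restarts the count
        return self._count >= self.required_steps
```

In the gym wrapper's `reset()`, one would poll the pendulum angle at the control frequency, call `update()` on each reading, and zero both encoders once it returns `True`. The motor never has to be driven back to a fixed position.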

Electronics:

  • add a current sensor
  • use more precise encoders for the motor + pendulum and evaluate their impact
  • use a motor that has helical gear vs. spur gear. see if it reduces play and allows for better/smoother control

RL:

  • faster training. previous iteration took 4-5h to train. can we go faster? can we train in under 10 min? can we do it without giving more info/constraints?
  • upgrade gym to make the later migration to gymnasium easier
  • offline RL?
    • datasets could be from an energy based swing up + pid?
    • use MCAP files? or replay buffers from SAC? MCAP feels like the good choice here as the replay buffer contains the transforms applied by the wrappers such as the HistoryWrapper
  • measure how much of the time is spent running the policy and how much of the time is spent doing matrix multiplication while the pendulum is idle. is it possible to parallelize?? outsource matrix multiplication to a cloud machine?
  • try using torque as the output of the policy (torque control on arduino??)
  • try training with/without current sensor data in the observation
  • remove the two different control frequencies and use a SkipFrame wrapper instead.
  • System id from data logs then sim2real #26
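
For the SkipFrame item, a minimal sketch of such a wrapper is shown below. It is written without dependencies and assumes the gymnasium-style `step()` return of `(obs, reward, terminated, truncated, info)`; with the older gym API the tuple has four elements, and a real version would probably subclass `gym.Wrapper`. The class name and the default `skip` value are illustrative.

```python
class SkipFrame:
    """Repeat each action `skip` times and sum the rewards.

    Assumes the wrapped env follows the gymnasium step API:
    step(action) -> (obs, reward, terminated, truncated, info).
    """

    def __init__(self, env, skip: int = 4):
        if skip < 1:
            raise ValueError("skip must be >= 1")
        self.env = env
        self.skip = skip

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.skip):
            obs, reward, terminated, truncated, info = self.env.step(action)
            total_reward += reward
            if terminated or truncated:
                break  # do not step past the end of an episode
        # Return the last observation together with the accumulated reward.
        return obs, total_reward, terminated, truncated, info
```

With this, the env can run at a single (fast) control frequency while the policy only picks a new action every `skip` steps.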

Control Theory:

  • pid? mpc? system id?

Documentation:

  • an assembly video tutorial would be nice
  • doc should show the angle origins and directions
  • changelog?
  • atomic/concise wandb reports for RL experiments would be nice

Validate reproducibility:

  • Can we source the parts 1 year later?

Resources:
Offline RL:

RL:

newnew:


Armandpl commented Jan 7, 2024

Goals as of 7 jan:
Pierre:

  • broadly compare different approaches for upright control: PID, MPC, RL
  • code should allow easily swapping controllers and comparing them
  • code up MPC
  • first pass: tune the MPC + PID on the physical robot
  • second pass: make the sim closer to reality to tune the controller and transfer
  • RL: sim2real
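
One possible way to make controllers easy to swap and compare is a small shared interface, sketched below. The names `Controller`, `PID`, and `run_episode`, and the `act(angle_error, dt)` signature, are assumptions made for illustration; an MPC or RL policy would likely need the full state rather than a single error term.

```python
from typing import Protocol


class Controller(Protocol):
    """Anything with reset() and act() can be plugged into run_episode()."""

    def reset(self) -> None: ...
    def act(self, angle_error: float, dt: float) -> float: ...


class PID:
    def __init__(self, kp: float, ki: float = 0.0, kd: float = 0.0):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.reset()

    def reset(self) -> None:
        self._integral = 0.0
        self._prev_error = None

    def act(self, angle_error: float, dt: float) -> float:
        self._integral += angle_error * dt
        # No derivative term on the first call, since there is no previous error.
        if self._prev_error is None:
            derivative = 0.0
        else:
            derivative = (angle_error - self._prev_error) / dt
        self._prev_error = angle_error
        return self.kp * angle_error + self.ki * self._integral + self.kd * derivative


def run_episode(controller: Controller, errors, dt: float = 0.01):
    """Feed a recorded sequence of errors to a controller and collect its commands."""
    controller.reset()
    return [controller.act(e, dt) for e in errors]
```

Replaying the same logged error sequence through several controllers with `run_episode()` gives a first, offline comparison before anything is tuned on the physical robot.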

Armand:

  • broadly make training faster. 5h is too slow given deepmind trains robot dogs in 4 min
  • faster = wall time
  • parallelize simulation
    • easy first pass: use vec env, parallelize on cpu with the current sim
    • try sim2real, maybe fine tune
  • try other algos
    • now that we can use off-board compute maybe we can use on-policy algos, e.g. PPO
  • try and remove as much code as possible
  • depending on sim2real transfer, make sim more accurate
    • simulate sensor noise and resolution
    • simulate motor
    • system id from logs?
  • Depending on sim speed on cpu + sim2real success
    • sim in isaac gym
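
As a rough illustration of the "simulate sensor noise and resolution" item, the sketch below adds Gaussian noise to a true angle and then quantizes it to the encoder's resolution. The function name, the default `counts_per_rev`, and the Gaussian noise model are assumptions; the real encoder specs and noise characteristics would have to come from the hardware or from logs.

```python
import math
import random
from typing import Optional


def simulate_encoder(true_angle: float, counts_per_rev: int = 2048,
                     noise_std: float = 0.0,
                     rng: Optional[random.Random] = None) -> float:
    """Return the angle (rad) that a quantizing, noisy encoder would report."""
    rng = rng if rng is not None else random.Random()
    noisy = true_angle
    if noise_std > 0:
        noisy += rng.gauss(0.0, noise_std)  # assumed Gaussian measurement noise
    step = 2 * math.pi / counts_per_rev     # angular size of one encoder count
    return round(noisy / step) * step       # snap to the nearest count
```

Applying this to the simulated pendulum and motor angles before they reach the observation would let the policy train on readings that look more like what the real encoders produce.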


Armandpl commented Jan 7, 2024

New list of todos:

@pierfabre pierfabre pinned this issue Jan 21, 2024