v2.0 #21

Armandpl · 2022-11-22T16:45:05Z

This is a tracking issue for the second iteration of this project.

CAD/Mechanical Assembly:

Software:

Electronics:

add a current sensor
use more precise encoders for the motor + pendulum and evaluate their impact
use a motor with helical gears instead of spur gears. see if it reduces play and allows for better/smoother control

RL:

faster training. the previous iteration took 4-5h to train. can we go faster? can we train in under 10 min? can we do it without giving the agent more info/constraints?
upgrade gym first to make the later upgrade to gymnasium easier
offline RL?
- datasets could come from an energy-based swing-up + PID?
- use MCAP files? or replay buffers from SAC? MCAP feels like the better choice here, as the replay buffer contains the transforms applied by wrappers such as the HistoryWrapper
measure how much of the time is spent running the policy and how much is spent doing matrix multiplication while the pendulum is idle. is it possible to parallelize? outsource matrix multiplication to a cloud machine?
~~try using torque as the output of the policy (torque control on arduino??)~~
try training with/without current sensor data in the observation
remove the two different control frequencies and use a SkipFrame wrapper instead.
System id from data logs then sim2real #26
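
A minimal sketch of the SkipFrame idea mentioned above: the environment runs at a single frequency and the wrapper repeats each policy action for `skip` inner steps, summing the rewards. This is only an illustration, not the project's code. It assumes the gymnasium-style 5-tuple `step` return (older gym versions return 4 values), and `CountingEnv` is a hypothetical stand-in for the real pendulum env. With gymnasium installed, the same logic would normally live in a `gymnasium.Wrapper` subclass.

```python
class SkipFrame:
    """Repeat each action for `skip` inner steps and sum the rewards."""

    def __init__(self, env, skip):
        if skip < 1:
            raise ValueError("skip must be >= 1")
        self.env = env
        self.skip = skip

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        total_reward = 0.0
        for _ in range(self.skip):
            obs, reward, terminated, truncated, info = self.env.step(action)
            total_reward += reward
            # Stop repeating as soon as the episode ends.
            if terminated or truncated:
                break
        return obs, total_reward, terminated, truncated, info


class CountingEnv:
    """Hypothetical stand-in env: the observation is the inner step count."""

    def __init__(self, horizon=10):
        self.horizon = horizon
        self.t = 0

    def reset(self, **kwargs):
        self.t = 0
        return self.t, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon
        return self.t, 1.0, False, truncated, {}
```

With `skip=4`, the policy acts at a quarter of the inner loop rate, so one wrapper parameter would replace the two separately configured control frequencies.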

Control Theory:

Documentation:

Validate reproducibility:

Resources:
Offline RL:

RL:

newnew:

Armandpl · 2024-01-07T09:32:32Z

Goals as of 7 Jan:
Pierre:

Armand:

broadly make training faster. 5h is too slow given deepmind trains robot dogs in 4 min
faster = less wall-clock time
parallelize simulation
- easy first pass: use vec env, parallelize on cpu with the current sim
- try sim2real, maybe fine tune
try other algos
- now that we can use off-board compute, maybe we can use on-policy algos, e.g. PPO
try and remove as much code as possible
- try and remove the velocity filter by using PPO-LSTM and having the agent figure out the filter. see PPO vs RecurrentPPO (aka PPO LSTM) on environments with masked velocity (SB3 Contrib)
depending on sim2real transfer, make sim more accurate
- simulate sensor noise and resolution
- simulate motor
- system id from logs?
Depending on sim speed on cpu + sim2real success
- sim in isaac gym
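
A rough sketch of what "simulate sensor noise and resolution" could look like for an angle encoder: add Gaussian noise to the true angle, then snap it to the nearest encoder count. The function name, the noise model, and the example counts-per-revolution values are assumptions for illustration, not measurements from the real hardware.

```python
import math
import random


def simulate_encoder(true_angle, counts_per_rev, noise_std=0.0, rng=random):
    """Return the angle (rad) a quantizing encoder might report.

    true_angle     -- simulated angle in radians
    counts_per_rev -- encoder resolution (hypothetical value, set per sensor)
    noise_std      -- std dev of additive Gaussian noise in radians (assumed model)
    rng            -- object with a gauss(mu, sigma) method, e.g. random.Random(seed)
    """
    step = 2.0 * math.pi / counts_per_rev         # angle covered by one count
    noisy = true_angle + rng.gauss(0.0, noise_std)
    return round(noisy / step) * step             # snap to the nearest count


# Example: compare a coarse and a fine encoder on the same true angle.
if __name__ == "__main__":
    rng = random.Random(0)
    for cpr in (48, 2400):
        print(cpr, simulate_encoder(1.0, cpr, noise_std=0.001, rng=rng))
```

Applying this to the simulator's observations before they reach the policy would be one way to check how much encoder resolution matters for sim2real transfer.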

Armandpl · 2024-01-07T09:40:36Z

pierfabre pinned this issue Jan 21, 2024
