write a script to bench the max control frequency when using PPO, SAC, etc. #55

Armandpl · 2024-01-17T16:56:15Z

write a wrapper that keeps track of the last time step() was called
- and that also sets the action to zero, as we don't actually want to actuate the robot
configure the script with hydra
- we want to be able to select the algorithm
- I think we would use the files under algos; this way we can do things like bench SAC updating every step vs SAC updating every episode
setup and run a sweep on my machine
- then run it on my mac
- on Pierre's computer
- on the rig
- find a way to display all that in a wandb report
bench the velocity filter
- run the bench with/without the velocity filter. I don't know whether it takes a lot of time/compute. If it is slow, maybe we need a faster one, or maybe we need to use recurrent algos that don't need the speed as an observation
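
A rough sketch of what the timing wrapper from the first item could look like. Everything here is an assumption for illustration: the class name `StepTimer`, the duck-typed `reset()`/`step(action)` interface (rather than a specific Gym API), and the idea that actions are flat sequences of floats. The clock is injectable so the wrapper can be tested without real delays.

```python
import time
from statistics import median


class StepTimer:
    """Hypothetical wrapper around any env-like object with reset() and step(action)."""

    def __init__(self, env, zero_action=True, clock=time.perf_counter):
        self.env = env
        self.zero_action = zero_action
        self.clock = clock
        self.last_step_time = None  # time of the previous step() call
        self.intervals = []         # seconds between consecutive step() calls

    def reset(self, *args, **kwargs):
        # Forget the previous step so the pause between episodes
        # is not counted as a control interval.
        self.last_step_time = None
        return self.env.reset(*args, **kwargs)

    def step(self, action):
        now = self.clock()
        if self.last_step_time is not None:
            self.intervals.append(now - self.last_step_time)
        self.last_step_time = now
        if self.zero_action:
            # Assumption: the action is a flat sequence of floats.
            # Replace it with zeros so the robot is never actuated.
            action = [0.0 for _ in action]
        return self.env.step(action)

    def median_frequency_hz(self):
        # Median is less sensitive than the mean to occasional slow steps.
        return 1.0 / median(self.intervals)
```

Note that resetting the timer in `reset()` hides any work done between episodes, so for episodic updates that time would need to be logged separately.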

Armandpl · 2024-01-18T23:47:30Z

should we try benching SBX too? The UTD (update-to-data) ratio seems important and I suspect my laptop will be too weak for that, so JAX on MPS might be better?
look at SAC when doing episodic training
look at SAC when doing one update per step, two updates per step etc
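
A minimal sketch of how the "N updates per step" comparison could be timed, independent of any RL library. `bench_updates_per_step` and `update_fn` are hypothetical names; `update_fn` stands in for one gradient update of whichever algorithm is being benched. With Stable-Baselines3 the same knobs would presumably be SAC's `train_freq` and `gradient_steps` arguments, but that mapping is an assumption about the setup.

```python
import time


def bench_updates_per_step(update_fn, updates_per_step, n_steps=100,
                           clock=time.perf_counter):
    """Return the achievable loop frequency (Hz) when running
    `updates_per_step` calls to `update_fn` after every control step."""
    start = clock()
    for _ in range(n_steps):
        # A real loop would also query the policy and step the env here.
        for _ in range(updates_per_step):
            update_fn()
    elapsed = clock() - start
    return n_steps / elapsed


if __name__ == "__main__":
    # Dummy workload standing in for a gradient update.
    def dummy_update():
        sum(i * i for i in range(10_000))

    for n in (1, 2, 4):
        hz = bench_updates_per_step(dummy_update, updates_per_step=n)
        print(f"{n} update(s) per step: {hz:.1f} Hz")
```

Episodic training would move the inner loop out to the end of the episode, so the per-step frequency is then bounded only by inference and env time.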

This was referenced Jan 17, 2024

setup RL experiments #51

Open

evaluate recurrent RL algos #53

Open
