Sanity check by training with SAC or TQC #64
Comments
The robot arm broke so I can't secure it to the motor shaft anymore.
Try SAC w/ gSDE in sbx?
OK, so now we could benchmark TQC against SAC to gauge which one we should use on the real robot. But if we go this route, we should probably also tune the hyperparameters. Will the tuning transfer, though, since it is still unclear how close the sim is to the actual robot? Maybe just go straight to hyperparameter tuning for TQC, since 'we know' it is better?
In simulation, I can get the agent to converge in ~40k timesteps. 40k timesteps at 50 Hz is about 13 min in real life (40,000 / 50 = 800 s). But when training on the real robot it takes hours. It is slow in part because waiting for the pendulum to reset is slow. Maybe we shouldn't reset the robot and just let it learn for a long time?
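A quick way to see where the time goes is to split the wall-clock budget into stepping time and reset time. The sketch below is only an estimate: the 40k timesteps and 50 Hz come from the comment above, while `episode_steps` and `reset_seconds` are made-up placeholder values, not measurements from the robot.

```python
import math


def wall_clock_minutes(timesteps, control_hz, episode_steps=0, reset_seconds=0.0):
    """Estimate training wall-clock time in minutes.

    timesteps / control_hz is the pure stepping time. If episode_steps is
    given, one reset of reset_seconds is charged per episode.
    """
    stepping_s = timesteps / control_hz
    reset_s = 0.0
    if episode_steps > 0:
        n_episodes = math.ceil(timesteps / episode_steps)
        reset_s = n_episodes * reset_seconds
    return (stepping_s + reset_s) / 60.0


# Pure stepping time for the numbers quoted above.
print(round(wall_clock_minutes(40_000, 50), 1))  # 13.3

# Hypothetical: 200-step episodes with a 10 s wait per reset.
print(round(wall_clock_minutes(40_000, 50, episode_steps=200, reset_seconds=10.0), 1))  # 46.7
```

With those placeholder numbers the resets cost more than twice the stepping time, which is the kind of overhead that would make longer episodes or skipping resets worth trying.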
Need to update the way we reset the episode. I set up a PID controller, but it is badly tuned and I think it may have damaged the motor.
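One way to make a badly tuned PID less dangerous for the motor is to clamp its output and stop the integral term from winding up while saturated. The following is a minimal sketch, not the controller used in this repo: the `PID` class, the gains, and `out_limit` are all hypothetical, and it assumes the reset drives the arm toward a target angle using an error signal sampled every `dt` seconds.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    kp: float
    ki: float
    kd: float
    out_limit: float  # hypothetical max |command| the motor should ever see
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float, dt: float) -> float:
        # No derivative term on the first call, to avoid a kick at start-up.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error

        candidate_integral = self.integral + error * dt
        raw = self.kp * error + self.ki * candidate_integral + self.kd * derivative
        clamped = max(-self.out_limit, min(self.out_limit, raw))

        # Anti-windup: only accumulate the integral while the output is not saturated.
        if raw == clamped:
            self.integral = candidate_integral
        return clamped


pid = PID(kp=2.0, ki=0.5, kd=0.1, out_limit=1.0)  # placeholder gains
command = pid.step(error=10.0, dt=0.02)
print(command)  # 1.0
```

Even with a large error the command stays within `out_limit`, so tuning mistakes show up as a slow reset rather than a hard slam on the motor.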
I want to train the robot in very few steps and very quickly in terms of wall-clock time, but I haven't completed a training run on the robot yet. I should do that first as a sanity check, to make sure there is nothing wrong with the robot, the laptop/robot comms, or the env code.
repro the training run from last time: