Insights into passing previous actions as observation #20

Open
VineetTambe opened this issue Oct 9, 2024 · 1 comment

Comments

VineetTambe commented Oct 9, 2024

Hey Authors,

I was playing around with some of CALM's parameters and was wondering whether you have any insight into passing previous actions as observations to the policy.
Other work similar to CALM claims that passing previous actions and states as observations to the policy reduces vibrations and other higher-order behaviors in the policy.
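
For concreteness, here is a minimal sketch of what "passing previous actions as observations" could look like. It is not taken from the CALM codebase: the `ActionHistoryObs` helper, its method names, and the choice of `k` are all hypothetical, and it assumes flat NumPy observation and action vectors.

```python
from collections import deque

import numpy as np


class ActionHistoryObs:
    """Appends the last k actions to each observation (hypothetical helper, not part of CALM)."""

    def __init__(self, action_dim, k=2):
        self.action_dim = action_dim
        self.k = k
        self.reset()

    def reset(self):
        # Start each episode with a zero-filled action history.
        self.buf = deque([np.zeros(self.action_dim) for _ in range(self.k)], maxlen=self.k)

    def record(self, action):
        # Store the action just applied; the oldest entry is dropped automatically.
        self.buf.append(np.asarray(action, dtype=float))

    def augment(self, obs):
        # Policy input = [obs, oldest action, ..., newest action].
        return np.concatenate([np.asarray(obs, dtype=float), *self.buf])
```

The policy input dimension grows by `k * action_dim`, so the network's first layer would need to be resized accordingly.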

Could it be that the 64D latent representation of the reference motion carries enough signal that, when coupled with the obs history, it is sufficient for the network to learn the same things it would have learned from the previous action history?

What are your thoughts on this one?

Thanks!

Collaborator

tesslerc commented Oct 9, 2024

Good question.
I'm not sure and it's certainly worth testing!

From an optimization standpoint, the question is why the model learns jittery behavior in the first place. My guess is that it could be connected to representation/solution sampling issues.

My take on this: if you aren't able to properly sample from the set of possible solutions, your model might collapse to the mean. So instead of sampling one of the smooth solutions, it outputs their mean, which results in jittery behavior.
Methods such as CALM/ASE/PULSE/MaskedMimic use a latent representation to represent "which solution to focus on".
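
A toy illustration of the "collapse to the mean" idea, under assumptions that are mine rather than anything from CALM: a single state with two equally valid actions (around -1 and +1), and a discrete variable `z` standing in for the latent that says which solution to focus on. The least-squares prediction without `z` is the sample mean, which sits near 0 and matches neither solution; conditioning on `z` recovers each one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: for one state, two equally valid actions exist.
modes = np.array([-1.0, 1.0])
z = rng.integers(0, 2, size=10_000)  # which solution each sample used
actions = modes[z] + 0.05 * rng.standard_normal(z.shape)

# Without z, the best constant prediction under squared error is the mean.
unconditional = actions.mean()

# With z as an input, the best prediction is one mean per latent value.
conditional = np.array([actions[z == k].mean() for k in (0, 1)])

print(f"no latent:   {unconditional:+.3f}")
print(f"with latent: {conditional[0]:+.3f}, {conditional[1]:+.3f}")
```

This only shows the averaging effect on a one-dimensional action; whether it explains jitter in a trained policy is the open question raised above.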

2 participants