- 12.1. Deep Deterministic Policy Gradient
- 12.1.1. An Overview of DDPG
- 12.2. Components of DDPG
- 12.2.1. Critic network
- 12.2.2. Actor Network
- 12.3. Putting it all Together
- 12.4. Algorithm - DDPG
- 12.5. Swinging Up the Pendulum using DDPG
- 12.6. Twin Delayed DDPG
- 12.7. Components of TD3
- 12.7.1. Key Features of TD3
- 12.7.2. Clipped Double Q Learning
- 12.7.3. Delayed Policy Updates
- 12.7.4. Target Policy Smoothing
- 12.8. Putting it all Together
- 12.9. Algorithm - TD3
- 12.10. Soft Actor Critic
- 12.11. Components of SAC
- 12.11.1. Understanding Soft Actor Critic
- 12.11.2. V and Q Function with the Entropy Term
- 12.11.3. Critic Network
- 12.11.3.1. Value Network
- 12.11.3.2. Q Network
- 12.11.3.3. Actor Network
- 12.12. Putting it all Together
- 12.13. Algorithm - SAC
12. Learning DDPG, TD3 and SAC