
03.RL
Reinforcement Learning
apche CN
Archit
Articles in this column
RL Policy-Based: Actor-Critic, A3C, DPG, DDPG, TRPO, PPO
RL: Policy-Based (Policy Gradient) Actor-Critic Ref:
Original · 2021-03-31 17:56:52 · 379 views · 0 comments
RL Summary
RL: Mario Martin mindmap. Mario Martin's Reinforcement Learning: https://www.cs.upc.edu/~mmartin/url-RL.html Lilian Weng Blog: A (Long) Peek into Reinforcement Learning https://
Original · 2021-03-30 00:01:45 · 251 views · 0 comments
RL Value-Based: off-policy DQN (Deep Q-Learning), on-policy
Deep RL: Q-Learning -> Approximate Q-Learning -> Deep Q-Learning. Deep Q-Learning: Deep Q-Learning was introduced in 2013. Since then, a lot of improvements have been made. So, today we'll see four strategies that dramatically improve the train
Original · 2021-03-29 01:07:26 · 302 views · 0 comments