Search resource list
DQN
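- An AI algorithm published by Google DeepMind in February 2015. Evaluated on 49 Atari 2600 games, it reaches performance comparable to a professional human player (Human-level control through deep reinforcement learning)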
pytorch-a2c-ppo-acktr-master
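- This code implements ACKTR, an algorithm described as offering large improvements in running speed and computational cost over conventional TRPO and DQN (Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation)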
Proximal_Policy_Optimization
- Reinforcement learning methods can be divided into value-based and policy-based approaches, according to how the policy is learned. In deep reinforcement learning, combining deep learning with the value-based Q-Learning algorithm produced DQN, which uses an experience replay buffer and a target network to bring deep learning into reinforcement learning successfully.
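
The last description names two mechanisms behind DQN: an experience replay buffer and a target network. The sketch below is one rough way to picture them, not code from any of the listed packages. It assumes a small tabular setting in which Q-values are a list of lists indexed by integer state and action, standing in for a neural network. The names ReplayBuffer, td_target and sync_target are hypothetical and chosen only for illustration.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions, sampled uniformly at random."""

    def __init__(self, capacity):
        # A bounded deque silently drops the oldest transition when full.
        self.transitions = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.transitions.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the correlation between consecutive steps.
        return random.sample(list(self.transitions), batch_size)

    def __len__(self):
        return len(self.transitions)


def td_target(reward, next_state, done, target_q, gamma=0.99):
    """Return r + gamma * max_a Q_target(s', a), or just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(target_q[next_state])


def sync_target(online_q):
    """Copy the online Q-table into a separate, frozen target table."""
    return [row[:] for row in online_q]
```

In a training loop under these assumptions, transitions would be appended with add, mini-batches drawn with sample, each target computed with td_target against the frozen table, and sync_target called only every so often so that the targets do not shift on every update.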