Journal of Guangxi Normal University (Natural Science Edition), 2010, Vol. 28, Issue 3: 99-103.


Improved Reinforcement Learning Algorithm and Its Application in RoboCup

CHENG Xian-yi1,2, ZHU Qian2   

  1. College of Computer Science and Technology, Nantong University, Nantong, Jiangsu 226019, China;
  2. College of Computer Science and Telecommunications Engineering, Jiangsu University, Zhenjiang, Jiangsu 212013, China
  • Received:2010-05-13 Online:2010-09-20 Published:2023-02-06

Abstract: An improved algorithm based on CMAC (cerebellar model articulation controller), named DCMAC-AL, is proposed. It uses advantage(λ) learning to compute the state-action function, which emphasizes the differences among action values and avoids action oscillation. It creates novel features based on the Bellman error to improve the adaptability of CMAC. In addition, a mathematical model of takeaway in RoboCup Soccer Simulation is provided, and experiments with DCMAC-AL are carried out on it. The results demonstrate that DCMAC-AL outperforms advantage(λ) learning with regard to learning effort.
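
The two ingredients named in the abstract, a CMAC approximator and an advantage-learning target, can be illustrated with a minimal sketch. This is not the authors' DCMAC-AL: it assumes a one-dimensional state, uses a one-step update without the eligibility traces that advantage(λ) adds, and leaves out the Bellman-error feature construction. The names `CMAC` and `advantage_target` and the scaling constant `k` are hypothetical choices made for illustration. With `k = 1` the target reduces to the ordinary Q-learning target, and with `k < 1` the gaps between action values are stretched.

```python
class CMAC:
    """Tile-coding (CMAC) approximator over a 1-D state, one weight set per action.

    Illustrative only: the layout and parameter names are assumptions,
    not taken from the paper.
    """

    def __init__(self, n_actions, n_tilings=4, n_tiles=8, lo=0.0, hi=1.0):
        self.n_tilings = n_tilings
        self.n_tiles = n_tiles
        self.lo = lo
        self.width = (hi - lo) / n_tiles
        # One extra tile per tiling, because shifted tilings reach past `hi`.
        self.w = [[[0.0] * (n_tiles + 1) for _ in range(n_tilings)]
                  for _ in range(n_actions)]

    def active_tiles(self, s):
        """Return the index of the single active tile in each shifted tiling."""
        tiles = []
        for t in range(self.n_tilings):
            offset = t * self.width / self.n_tilings
            idx = int((s - self.lo + offset) / self.width)
            tiles.append(min(max(idx, 0), self.n_tiles))
        return tiles

    def value(self, s, a):
        """Approximate value of action `a` in state `s`: sum of active weights."""
        return sum(self.w[a][t][i] for t, i in enumerate(self.active_tiles(s)))

    def max_value(self, s):
        return max(self.value(s, a) for a in range(len(self.w)))

    def update(self, s, a, target, alpha):
        """Move the value of (s, a) toward `target`; the step is split over tilings."""
        error = target - self.value(s, a)
        step = alpha * error / self.n_tilings
        for t, i in enumerate(self.active_tiles(s)):
            self.w[a][t][i] += step
        return error


def advantage_target(cmac, s, r, s_next, gamma=0.9, k=0.5, terminal=False):
    """One-step advantage-learning target (no eligibility traces).

    target = V(s) + (r + gamma * V(s') - V(s)) / k, with V(x) = max_a A(x, a).
    """
    v = cmac.max_value(s)
    v_next = 0.0 if terminal else cmac.max_value(s_next)
    return v + (r + gamma * v_next - v) / k
```

A single learning step would then be `cmac.update(s, a, advantage_target(cmac, s, r, s_next), alpha)`, applied to the action actually taken.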

Key words: reinforcement learning, agent, RoboCup, CMAC

CLC Number: TP181