Journal of Tsinghua University (Science and Technology), 2024, Vol. 64, Issue 2: 358-369. DOI: 10.16511/j.cnki.qhdxxb.2023.27.001
Section: Aerospace and Aeronautical Engineering
Intelligent obstacle avoidance control method for unmanned aerial vehicle formations in unknown environments
(Original Chinese title: 未知环境下无人机编队智能避障控制方法)
HUANG Hao (1), MA Wenhui (2), LI Jiacheng (1), FANG Yangwang (1)
1. Unmanned System Research Institute, Northwestern Polytechnical University, Xi'an 710072, China;
2. School of Automation, Northwestern Polytechnical University, Xi'an 710072, China
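Abstract (Chinese edition): To ensure the safe flight of fixed-wing unmanned aerial vehicle (UAV) formations in environments with unknown obstacles, this paper studies flight control methods for fixed-wing UAV formations. Building on the deep deterministic policy gradient (DDPG), greedy selection is introduced to construct the Greedy-DDPG algorithm, which is used to train a leader model for obstacle avoidance control. An obstacle and collision avoidance control strategy for the follower group is then designed by combining the artificial potential field method with leader-follower consensus, so that the followers avoid obstacles and follow the leader in carrying out the flight mission. Numerical simulation results show that the training time of the Greedy-DDPG algorithm is 5.9% shorter than that of the DDPG algorithm and that its obstacle avoidance generalization is improved. Monte Carlo simulation results show that the method has good robustness. The method enables cooperative flight of UAV formations and is of great significance for ensuring the flight safety of UAV formations in unknown environments.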
Abstract:
[Objective] Formations of fixed-wing unmanned aerial vehicles (UAVs), which are commonly used in military, rescue, and other missions, generally cannot hover and have a large turning radius. When operating in an unknown environment, such formations can easily collide with obstacles, which gravely affects flight safety if not guarded against. Unknown environmental obstacles are difficult to avoid using traditional modeling methods. However, artificial potential field methods can address deadlock problems such as target infeasibility and cluster congestion.
[Methods] To achieve collision-free cooperation of UAV formations, this study proposes a centralized UAV formation control method based on the deep deterministic policy gradient (DDPG), which combines a centralized communication architecture, reinforcement learning, and the artificial potential field method. First, a greedy-DDPG flight control method is developed for the leader UAV to improve collision avoidance effectiveness. The reward functions, action spaces, and state spaces are improved to account for maneuver constraints. Additionally, to shorten the training duration, the exploration strategy of DDPG is improved using a greedy scheme: the critic network evaluates the value of a group of random actions, and greedy selection then favors the higher-valued actions, which speeds up the update of the critic network and, in turn, of the overall network. On this basis, a collision-free control method that incorporates the artificial potential field method and leader-follower consensus is designed for the followers, ensuring that they follow the leader cooperatively without collisions.
[Results] Numerical simulations show that the improved DDPG algorithm has a 5.9% shorter training time than the original algorithm. In the same scenario, the proposed method perceives the same number of obstacles as the artificial potential field method, but its heading angle fluctuates relatively little, whereas that of the artificial potential field method fluctuates significantly. The DDPG algorithm yields a smoother heading angle because it perceives fewer obstacles; however, its minimum distance from the obstacles is only 9.1 m, whereas the proposed method stays more than 17 m from the obstacles. Furthermore, Monte Carlo experiments with the leader UAV under different scenarios show that the proposed method has improved obstacle avoidance generalization. The proposed formation control method was also tested experimentally. Under the same scenario and control parameters, the formation control method based on the proposed architecture has lower formation errors during flight, with a maximum error of no more than 10 m, whereas the artificial potential field-based formation control method has a maximum formation error of over 25 m. When encountering narrow gaps, the proposed method passes through quickly without congestion, while the artificial potential field-based method tends to hover in front of obstacles, which is not conducive to flight safety. Throughout the flight, the proposed method keeps a greater distance from obstacles and offers higher safety.
[Conclusions] Compared with the original DDPG algorithm, the improved DDPG algorithm trains faster and achieves a better training effect. The formation control method can realize formation flight of UAVs in environments with unknown obstacles. Compared with the artificial potential field-based formation control method, the proposed method avoids hovering in place in front of obstacles, which is of great significance to the flight safety of UAV formations.
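The greedy exploration step described under [Methods] can be illustrated with a minimal sketch. It assumes that the "random action group" is formed by adding Gaussian noise to the actor's output and that the candidate with the highest critic value is selected; the abstract does not specify the sampling scheme, so the function name `greedy_select_action`, the parameters `n_candidates` and `noise_std`, and the callable `actor`/`critic` interfaces are all hypothetical and not taken from the paper.

```python
import numpy as np

def greedy_select_action(state, actor, critic, rng, n_candidates=16,
                         noise_std=0.2, action_low=-1.0, action_high=1.0):
    """Return the candidate action (and its value) that the critic rates highest.

    Assumption: candidates are the actor's action plus Gaussian noise.
    `actor(state)` returns a 1-D action; `critic(state, action)` returns a scalar.
    """
    base = np.asarray(actor(state), dtype=float)
    # Random action group: perturbations around the deterministic actor output.
    noise = rng.normal(0.0, noise_std, size=(n_candidates, base.size))
    # Keep the unperturbed action in the group so the greedy pick is never
    # rated worse than it by the critic; clip everything to the action bounds.
    candidates = np.clip(np.vstack([base[None, :], base + noise]),
                         action_low, action_high)
    # Evaluate every candidate with the critic and pick the best one greedily.
    q_values = np.array([float(critic(state, a)) for a in candidates])
    best = int(np.argmax(q_values))
    return candidates[best], q_values[best]
```

In a DDPG training loop, the selected action would replace the usual noise-perturbed action before being executed and stored in the replay buffer; whether the paper does exactly this is not stated in the abstract.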
Key words: formation control; avoiding obstacles and collisions; reinforcement learning; centralized collaboration
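The follower strategy named in the abstract, which combines the artificial potential field method with leader-follower consensus, can be sketched roughly as follows. This is a 2-D point-mass illustration under stated assumptions, not the paper's controller: it uses the classic repulsive potential 0.5 * gain * (1/d - 1/d0)^2, a simple proportional formation-tracking term, and ignores fixed-wing turning constraints. All function names, gains, and radii are hypothetical.

```python
import numpy as np

def repulsive_velocity(pos, points, influence_radius=30.0, gain=200.0):
    """Sum of potential-field repulsion terms from points within the radius."""
    pos = np.asarray(pos, dtype=float)
    v = np.zeros(2)
    for p in points:
        diff = pos - np.asarray(p, dtype=float)
        dist = np.linalg.norm(diff)
        if 1e-9 < dist < influence_radius:
            # Negative gradient of 0.5*gain*(1/d - 1/d0)^2; points away from p.
            v += gain * (1.0 / dist - 1.0 / influence_radius) * diff / dist**3
    return v

def follower_velocity(pos, leader_pos, leader_vel, offset, obstacles,
                      neighbors, k_form=0.5, v_max=40.0):
    """Velocity command for one follower (assumed point-mass kinematics)."""
    pos = np.asarray(pos, dtype=float)
    # Consensus-style term: track the slot defined relative to the leader.
    slot = np.asarray(leader_pos, dtype=float) + np.asarray(offset, dtype=float)
    v = np.asarray(leader_vel, dtype=float) + k_form * (slot - pos)
    # Repulsion from obstacles (avoidance) and other followers (collisions).
    v = v + repulsive_velocity(pos, obstacles)
    v = v + repulsive_velocity(pos, neighbors, influence_radius=15.0)
    # Saturate the commanded speed.
    speed = np.linalg.norm(v)
    if speed > v_max:
        v = v * (v_max / speed)
    return v
```

With no obstacles or neighbors in range and the follower already in its slot, the command reduces to the leader's velocity, which is the behavior expected of the formation-keeping term.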
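Received: 2023-05-25      Published: 2023-12-28
CLC number (ZTFLH): V249.1
Funding: National Natural Science Foundation of China, General Program (61973253)
Corresponding author: FANG Yangwang, Professor, E-mail: ywfang@nwpu.edu.cn
About the author: HUANG Hao (1998-), male, master's student.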
Cite this article:
HUANG Hao, MA Wenhui, LI Jiacheng, FANG Yangwang. Intelligent obstacle avoidance control method for unmanned aerial vehicle formations in unknown environments [J]. Journal of Tsinghua University (Science and Technology), 2024, 64(2): 358-369.
Link to this article:
http://jst.tsinghuajournals.com/CN/10.16511/j.cnki.qhdxxb.2023.27.001  or  http://jst.tsinghuajournals.com/CN/Y2024/V64/I2/358
Copyright © Editorial Office of Journal of Tsinghua University (Science and Technology)