| Downloads | Citations | Reads |
| 849 | 3 | 157 |
Abstract: To address the low accuracy and frequent missed detections that arise when unmanned aerial vehicles (UAVs) are used for personnel search and rescue in wilderness areas, nine candidate improved models based on YOLOv8-s are proposed, and the IC-A-S model is selected as the best of them. The model introduces the Inner-CIoU loss function to improve the localization of small targets; integrates the AFPN4 multi-scale feature fusion structure, whose feature fusion is extended to four layers, to strengthen the extraction of small-target features; and, because dynamic snake convolution can adaptively focus on thin, small targets, replaces the cascaded residual module in AFPN4 with a purpose-designed C2f-SnakeConv module, enhancing AFPN4's feature representation and learning ability for small targets. On the HERIDAL test set, the proposed model improves AP50, AP75, and mAP over the baseline YOLOv8-s by 4.5%, 4.1%, and 4.2%, respectively; on the HERIDAL generalization test set, the corresponding improvements are 4.8%, 5.9%, and 4.6%. Both sets of results indicate that the IC-A-S model improves detection accuracy for people in distress in wilderness areas. (Electronic supplementary material is available on this article's CNKI page.)
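The abstract names the Inner-CIoU loss but does not spell it out. The following is a minimal sketch of how such a loss is commonly formed, assuming the published Inner-IoU formulation (both boxes are rescaled about their own centers by a `ratio` factor, and the CIoU loss is adjusted by the gap between plain IoU and the IoU of the rescaled boxes). The function names, the `(x1, y1, x2, y2)` box layout, and the default `ratio=0.75` are illustrative assumptions, not values taken from the paper.

```python
import math

EPS = 1e-9  # guards against division by zero for degenerate or identical boxes


def iou(a, b):
    """Plain IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def scale_box(box, ratio):
    """Auxiliary box: same center as `box`, width and height multiplied by `ratio`."""
    cx, cy = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    hw, hh = (box[2] - box[0]) * ratio / 2, (box[3] - box[1]) * ratio / 2
    return (cx - hw, cy - hh, cx + hw, cy + hh)


def inner_iou(pred, gt, ratio=0.75):
    """IoU computed on the rescaled auxiliary boxes (ratio is an assumed value)."""
    return iou(scale_box(pred, ratio), scale_box(gt, ratio))


def ciou_loss(pred, gt):
    """CIoU loss: 1 - IoU + center-distance penalty + aspect-ratio penalty."""
    i = iou(pred, gt)
    # Squared distance between the two box centers.
    rho2 = ((pred[0] + pred[2] - gt[0] - gt[2]) / 2) ** 2 \
         + ((pred[1] + pred[3] - gt[1] - gt[3]) / 2) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    cw = max(pred[2], gt[2]) - min(pred[0], gt[0])
    ch = max(pred[3], gt[3]) - min(pred[1], gt[1])
    c2 = cw ** 2 + ch ** 2 + EPS
    # Aspect-ratio consistency term and its trade-off weight.
    wp, hp = pred[2] - pred[0], pred[3] - pred[1]
    wg, hg = gt[2] - gt[0], gt[3] - gt[1]
    v = (4 / math.pi ** 2) * (math.atan(wg / (hg + EPS)) - math.atan(wp / (hp + EPS))) ** 2
    alpha = v / (1 - i + v + EPS)
    return 1 - i + rho2 / c2 + alpha * v


def inner_ciou_loss(pred, gt, ratio=0.75):
    """Inner-CIoU sketch: CIoU loss shifted by (IoU - inner IoU)."""
    return ciou_loss(pred, gt) + iou(pred, gt) - inner_iou(pred, gt, ratio)


if __name__ == "__main__":
    a, b = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print(iou(a, b), inner_iou(a, b, ratio=0.5), inner_ciou_loss(a, b, ratio=0.5))
```

With `ratio` below 1 the auxiliary boxes shrink, so a partially overlapping prediction is penalized more heavily than under plain CIoU; with `ratio` above 1 they grow and the penalty is softened. Which setting suits small targets in aerial imagery is an empirical question that this sketch does not answer.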
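Basic information:
DOI: 10.19886/j.cnki.dhdz.2024.0430
CLC number: V19; X4; TP391.41
Citation:
[1] SHI K, ZHAO S G. UAV personnel search and rescue detection algorithm with multi-scale feature variable focusing[J]. Journal of Donghua University (Natural Science), 2025, 51(06): 10-18. DOI: 10.19886/j.cnki.dhdz.2024.0430.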