The Impact of Multi-Connection Blocks Based on Deep Learning for Object Detection

李政霖; Li, Cheng-Lin

dc.contributor	蘇崇彥	zh_TW
dc.contributor	Su, Chung-Yen	en_US
dc.contributor.author	李政霖	zh_TW
dc.contributor.author	Li, Cheng-Lin	en_US
dc.date.accessioned	2023-12-08T07:47:15Z
dc.date.available	2022-07-18
dc.date.available	2023-12-08T07:47:15Z
dc.date.issued	2022
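dc.description.abstract	In this thesis, we propose a method of deepening the network model that differs from that of YOLOv5, and we design three types of Multi-Connection blocks suited to specific datasets. The main purpose of the Multi-Connection block is to reuse features and retain the input features for passing downstream. We validate our method on eight public datasets. We improve the residual block in YOLOv5. Experimental results show that, compared with YOLOv5s6, adding Multi-Connection block type I to YOLOv5s6 improves the mean average precision (mAP) by 1.6% on the Global Wheat Head Dataset 2020; adding Multi-Connection block type II improves the mAP by 2.9% on the PlantDoc dataset; and adding Multi-Connection block type III improves the mAP by 2.9% on the PASCAL Visual Object Classes (VOC) dataset. We also compare against the conventional way of deepening a model. In general, deepening a network model improves its learning ability, but we believe that adopting different strategies for different datasets can achieve higher accuracy. In addition, we design Multi-Connection block type IV and apply it to traffic sign detection. Type IV-1 stacks residual blocks to increase network depth and strengthen the network's learning ability, adds a Squeeze-and-Excitation (SE) block to enhance feature map information, and encourages feature reuse through an additional skip connection. Type IV-2 mainly halves the channels of type IV-1 to reduce the model's computation and parameter count. Type IV-3 adds one more 3-by-3 convolution to type IV-2 to improve the model's learning ability. We train the model on the TT100K dataset and also collect Taiwan traffic signs as a customized dataset to validate our method. Because the goal is an efficient, high-performance block, we designed type IV-3. On the TT100K dataset, type IV-3 achieves the best performance: compared with YOLOv5s6, computation increases by only 11% while the mAP improves by 3.2%, trading a small amount of computation for a noticeable gain in accuracy. We also validate our method on other public datasets, where type IV-3 is likewise very effective.	zh_TW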
dc.description.abstract	In this thesis, we propose a method of deepening the network model that differs from that of YOLOv5s6, and we design three types of Multi-Connection (MC) blocks suited to specific datasets. The main purpose of the MC block is to reuse features and retain the input features for passing downstream. We verify the method on eight public datasets and one customized dataset. We improve the residual block in YOLOv5. The experimental results show that, compared with YOLOv5s6, YOLOv5s6 with MC block type I improves the mean average precision (mAP) by 1.6% on the Global Wheat Head Dataset 2020, YOLOv5s6 with MC block type II improves the mAP by 2.9% on the PlantDoc dataset, and YOLOv5s6 with MC block type III improves the mAP by 2.9% on the PASCAL Visual Object Classes (VOC) dataset. In each of these cases, the MC block outperforms the YOLOv5s6 baseline. We also compare against the conventional method of deepening the model (doubling the residual blocks). In general, deepening the network model improves its learning ability, but we believe that adopting different strategies for different datasets can yield higher accuracy. In addition, we improve the MC block and design MC block type IV, which is applied to traffic sign detection. MC block type IV-1 stacks residual blocks to increase the network depth and enhance the network's learning ability. We add Squeeze-and-Excitation (SE) blocks to enhance feature map information and encourage feature reuse through an additional skip connection. MC block type IV-2 mainly halves the channels of MC block type IV-1 to reduce the model's computation and parameter count. MC block type IV-3 adds a 3-by-3 convolution to MC block type IV-2 to improve the learning ability of the model.
We train the model on the traffic sign dataset TT100K, and we also collect Taiwan traffic signs as a customized dataset to validate our method. Because the goal is a high-performance block, we designed MC block type IV-3, which achieves the best performance on the TT100K dataset: compared with YOLOv5s6, it increases computation by only 11% while improving the mAP by 3.2%, trading a small amount of computation for a significant gain in accuracy. We also verify our method on other public datasets, where MC block type IV-3 is likewise very effective.	en_US
dc.description.sponsorship	Department of Electrical Engineering	zh_TW
dc.identifier	60975040H-41508
dc.identifier.uri	https://etds.lib.ntnu.edu.tw/thesis/detail/fef0735c5c2a9c0e06ade6b0c2233855/
dc.identifier.uri	http://rportal.lib.ntnu.edu.tw/handle/20.500.12235/120318
dc.language	Chinese
dc.subject	Deep learning	zh_TW
dc.subject	Object detection	zh_TW
dc.subject	YOLOv5	zh_TW
dc.subject	Multi-Connection block	zh_TW
dc.subject	Residual block	zh_TW
dc.subject	Deep learning	en_US
dc.subject	Object detection	en_US
dc.subject	YOLOv5	en_US
dc.subject	Multi-Connection block	en_US
dc.subject	Residual block	en_US
dc.title	The Impact of Multi-Connection Blocks Based on Deep Learning for Object Detection	zh_TW
dc.title	The Impact of Multi-Connection Blocks Based on Deep Learning for Object Detection	en_US
dc.type	etd

Files

Original bundle

Name: 202200041508-103788.pdf
Size: 44.19 MB
Format: Adobe Portable Document Format
Description: etd

Collections

Theses and Dissertations