Two-Head Prediction Network for Fine-Grained Action Recognition

Date

2021

Abstract

Deep learning has advanced rapidly in recent years, and attention has expanded from 2D image recognition to 3D action recognition. Since the introduction of 3D CNNs, action recognition research has achieved good results on many datasets. Most action recognition networks, however, still have room for improvement on fine-grained actions. A fine-grained action differs little from an ordinary action overall, and the difference may occur only within a short span of time, which makes it hard for current 3D models to recognize. This situation is common in basketball games, where physical contact between players is frequent but does not necessarily constitute a foul; identifying fouls therefore requires stronger detection of fine-grained actions. Because no suitable dataset exists for this task, we collected our own data and built a basketball foul dataset. In this paper, we propose a two-head prediction network that can be applied to existing networks, including 3D-Resnet50 [1], (2+1)D-Resnet50 [2], and I3D-50 [3], to improve fine-grained action recognition. Experimental results show that adding the proposed network improves the accuracy of each of these models by 3% to 7%.

Keywords

deep learning, image recognition, action recognition
