An Efficient 3D Convolutional Neural Network with Motion Features for Fine-Grained Soccer Action Recognition
Date
2025
Abstract
This study proposes an efficient 3D convolutional neural network architecture that incorporates motion features for fine-grained soccer action recognition. Fine-grained action recognition has become an important research direction in recent years, particularly in sports, where accurately recognizing detailed actions such as passing and shooting is highly valuable for match analysis. Fine-grained actions, however, are visually very similar: kicking and jogging, for example, look almost the same and are easily confused by a model. Real-time use adds a second challenge, because the system must process a large volume of frames as they arrive and still classify them correctly, which places high demands on inference speed. Maintaining recognition accuracy while keeping computation efficient is therefore the core problem in fine-grained soccer action recognition.
To address these problems, a motion feature module is embedded into the X3D backbone to enhance the model's sensitivity to subtle motion changes. Experimental results show that the proposed method achieves an overall accuracy of 91.80%, a 3.77% improvement over the baseline model. In the "Kick" action category, recall reaches 86.20%, outperforming the baseline by 12.07%. The added module introduces only a small increase in computational cost, with FLOPs just 1.04 times those of the original X3D. These results indicate that the proposed approach achieves a favorable balance between recognition accuracy and efficiency and shows potential for real-time applications such as sports broadcasting and match analysis.
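The abstract reports only the proposed model's figures and its gains over the baseline. As a rough reading aid, the short sketch below works out the baseline values those numbers would imply. It assumes the reported gains are absolute percentage-point differences, which the abstract does not state explicitly; the variable names are illustrative and not taken from the thesis.

```python
# Figures reported in the abstract, in percent (FLOPs as a ratio).
proposed_accuracy = 91.80      # overall accuracy of the proposed model
accuracy_gain = 3.77           # reported improvement over the baseline
proposed_kick_recall = 86.20   # recall on the "Kick" class
kick_recall_gain = 12.07       # reported improvement over the baseline
flops_ratio = 1.04             # FLOPs relative to the original X3D

# ASSUMPTION: the gains are absolute percentage points, not relative
# improvements. Under a relative reading the results would differ.
baseline_accuracy = round(proposed_accuracy - accuracy_gain, 2)
baseline_kick_recall = round(proposed_kick_recall - kick_recall_gain, 2)

# Extra computation implied by the FLOPs ratio, as a percentage.
extra_flops_percent = round((flops_ratio - 1.0) * 100, 2)

print(f"Implied baseline accuracy:    {baseline_accuracy:.2f}%")
print(f"Implied baseline Kick recall: {baseline_kick_recall:.2f}%")
print(f"Extra FLOPs over X3D:         {extra_flops_percent:.2f}%")
```

Under that assumption, the baseline would sit at roughly 88.03% overall accuracy and 74.13% recall on "Kick", with the motion feature module costing about 4% more FLOPs.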
Keywords
fine-grained soccer action recognition, motion features, real-time applications