多重樂器自動採譜之探討 (An Investigation of Multi-Instrument Automatic Music Transcription)

dc.contributor: 陳柏琳 (zh_TW)
dc.contributor: 蘇黎 (zh_TW)
dc.contributor: Chen, Berlin (en_US)
dc.contributor: Su, Li (en_US)
dc.contributor.author: 吳宥德 (zh_TW)
dc.contributor.author: Wu, Yu-Te (en_US)
dc.date.accessioned: 2020-12-14T09:07:43Z
dc.date.available: 2020-08-11
dc.date.available: 2020-12-14T09:07:43Z
dc.date.issued: 2020
dc.description.abstract: 自動音樂採譜 (Automatic Music Transcription, AMT)是音樂資訊檢索 (Music Information Retrieval, MIR)中最重要的任務之一,由於其訊號的複雜性,它已被視為訊號處理中最具挑戰性的領域之一。在許多 AMT 任務中,多樂器採譜任務是通用採譜系統的關鍵步驟之一,但相關領域的研究卻很少。模型必須在一首樂曲當中,同時辨識多種樂器和其相應音高,而其中包括了不同樂器的各種音色和豐富的諧波(Harmonics),可能導致訊號彼此相互干擾,造成更為複雜的情況,因此與傳統的單樂器採譜研究相比,多樂器採譜成為了一個更進階且複雜的問題。除了存在技術本質上的困難,統整與協調不同層次的採譜問題、處理複雜的交互影響,也需要更加清晰與明確的問題定義,並針對最後的結果發展一套有效的評估方法。 在這項研究中,我們提出了一個多樂器自動採譜的方法。藉由發展一套從訊號層級的特徵工程、到最終評估結果的端到端流程,整合了多項技術以更好的處理此複雜的問題。當中結合了能夠清楚顯現音高特徵的訊號處理技術、新穎的深度學習模型,以及從多目標識別(Multi-object Recognition),實例分割(Instance Segmentation)、計算機視覺中,圖到圖轉換所激發出來的概念,進一步整合新發展的後處理演算法,提出來的系統對於多樂器採譜中的所有子任務,呈現出通用彈性且十分有效率的表現。在針對不同子任務進行綜合評估後,於各項指標上皆表現出了至今為止最優的結果,其中包括了過去從未被研究的多樂器音符層級採譜任務(Note-level Transcription)。 (zh_TW)
dc.description.abstract: Automatic music transcription (AMT), one of the most important tasks in music information retrieval (MIR), has been regarded as one of the most challenging fields in signal processing because of the inherent complexity of its signals. Among the many AMT tasks, multi-instrument transcription is a critical step toward a general transcription system, yet it remains a less investigated field. It requires identifying multiple instruments and their corresponding pitches in a music performance, which contains various timbres and rich harmonic information that can interfere with each other, making it a more advanced problem than conventional single-instrument AMT. Beyond the technical difficulties, orchestrating the different levels of this complex problem also calls for a clear definition of the problem scenarios and effective evaluation approaches. In this research, we propose a multi-instrument AMT approach with a complete end-to-end flow from signal-level feature engineering to the final evaluation. The approach combines signal processing techniques capable of specifying pitch saliency, novel deep learning methods, and concepts inspired by multi-object recognition, instance segmentation, and image-to-image translation in computer vision, and integrates them with a newly developed post-processing algorithm. The proposed system is flexible and efficient for all the sub-tasks in multi-instrument AMT. Comprehensive evaluations on different sub-tasks show state-of-the-art performance, including on multi-instrument note tracking, a task that has not been investigated before. (en_US)
dc.description.sponsorship: 資訊工程學系 (zh_TW)
dc.identifier: G060747007S
dc.identifier.uri: http://etds.lib.ntnu.edu.tw/cgi-bin/gs32/gsweb.cgi?o=dstdcdr&s=id=%22G060747007S%22.&
dc.identifier.uri: http://rportal.lib.ntnu.edu.tw:80/handle/20.500.12235/111724
dc.language: English
dc.subject: 自動音樂採譜 (zh_TW)
dc.subject: 多音預測 (zh_TW)
dc.subject: 深度學習 (zh_TW)
dc.subject: 自注意力機制 (zh_TW)
dc.subject: 多音多樂器預測 (zh_TW)
dc.subject: automatic music transcription (en_US)
dc.subject: multi-pitch estimation (en_US)
dc.subject: multi-pitch streaming (en_US)
dc.subject: deep learning (en_US)
dc.subject: self-attention (en_US)
dc.title: 多重樂器自動採譜之探討 (zh_TW)
dc.title: An Investigation of Multi-Instrument Automatic Music Transcription (en_US)

Files

Original bundle

Name: 060747007s01.pdf
Size: 2.77 MB
Format: Adobe Portable Document Format
