Automatic Music Transcription using Audio-Visual Fusion for Violin Practice in Home Environment
dc.contributor.author | ZHANG, Bingjun | en_US |
dc.contributor.author | WANG, Ye | en_US |
dc.date.accessioned | 2009-07-03T09:02:40Z | en_US |
dc.date.accessioned | 2017-01-23T07:00:11Z | |
dc.date.available | 2009-07-03T09:02:40Z | en_US |
dc.date.available | 2017-01-23T07:00:11Z | |
dc.date.issued | 2009-07-03T09:02:40Z | en_US |
dc.description.abstract | Violin practice in a home environment, where a teacher is often unavailable, can benefit from automatic music transcription that provides feedback to the student. This paper describes a high-performance violin transcription system with three main contributions. First, because onset detection is an important but challenging task in the automatic transcription of pitched non-percussive music such as violin music, we propose an effective audio-only onset detection approach based on supervised learning. The proposed approach substantially outperforms state-of-the-art methods. Second, we introduce the visual modality, i.e., the bowing and fingering of the violinist, to infer onsets and thereby enhance audio-only onset detection. We devise automatic, real-time video processing algorithms to extract features indicative of onsets from bowing and fingering videos. Third, we evaluate state-of-the-art multimodal fusion techniques for fusing the audio and visual modalities and show that fusion significantly improves onset detection and transcription performance. The resulting audio-visual transcription system provides more accurate transcriptions as learning feedback, even in acoustically inferior environments. With efficient and fully automatic audio-visual analysis components, the system can be easily deployed in a home environment. | en_US |
dc.format.extent | 929564 bytes | en_US |
dc.format.mimetype | application/pdf | en_US |
dc.identifier.uri | https://dl.comp.nus.edu.sg/xmlui/handle/1900.100/3056 | en_US |
dc.language.iso | en | en_US |
dc.relation.ispartofseries | TRA7/09 | en_US |
dc.title | Automatic Music Transcription using Audio-Visual Fusion for Violin Practice in Home Environment | en_US |
dc.type | Technical Report | en_US |