Please use this identifier to cite or link to this item:
http://hdl.handle.net/1900.100/3056
| Title: | Automatic Music Transcription using Audio-Visual Fusion for Violin Practice in Home Environment |
| Authors: | ZHANG, Bingjun; WANG, Ye |
| Issue Date: | 3-Jul-2009 |
| Series/Report no.: | TRA7/09 |
| Abstract: | Violin practice in a home environment, where a teacher is often unavailable, can benefit from automatic music transcription that provides feedback to the student. This paper describes a high-performance violin transcription system with three main contributions. First, because onset detection is an important but challenging task in the automatic transcription of pitched non-percussive music, such as violin music, we propose an effective audio-only onset detection approach based on supervised learning. The proposed approach substantially outperforms state-of-the-art methods. Second, we introduce the visual modality, i.e., the bowing and fingering of violin playing, to infer onsets and thereby enhance audio-only onset detection. We devise automatic, real-time video processing algorithms to extract features indicative of onsets from bowing and fingering videos. Third, we evaluate state-of-the-art multimodal fusion techniques to fuse audio and visual modalities and show this improves on... |
| URI: | http://hdl.handle.net/1900.100/3056 |
| Appears in Collections: | Technical Reports |
Files in This Item:
| File | Size | Format |
| TRA7-09.pdf | 907 kB | Adobe PDF |
All items in DSpace are protected by copyright, with all rights reserved.