
Browsing by Author "ZHANG, Bingjun"

    Automatic Music Transcription using Audio-Visual Fusion for Violin Practice in Home Environment
    (2009-07-03T09:02:40Z) ZHANG, Bingjun; WANG, Ye
    Violin practice in a home environment, where there is often no teacher available, can benefit from automatic music transcription to provide feedback to the student. This paper describes a high-performance violin transcription system with three main contributions. First, as onset detection is an important but challenging task for automatic transcription of pitched non-percussive music, such as violin music, we propose an effective audio-only onset detection approach based on supervised learning. The proposed approach substantially outperforms state-of-the-art methods. Second, we introduce the visual modality, i.e., the bowing and fingering of violin playing, to infer onsets, thus enhancing the audio-only onset detection. We devise automatic, real-time video processing algorithms to extract features indicative of onsets from bowing and fingering videos. Third, we evaluate state-of-the-art multimodal fusion techniques for fusing the audio and visual modalities and show that fusion significantly improves onset detection and transcription performance. The transcription system based on audio-visual fusion provides more accurate transcriptions as learning feedback, even in acoustically inferior environments. With efficient and fully automatic audio-visual analysis components, the system can be easily deployed in a home environment.
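
    The idea of fusing audio and visual evidence for onset detection can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say which fusion techniques were evaluated, so the example assumes a generic weighted-sum late fusion of per-frame onset scores followed by simple peak picking. The function names, the weight, and the threshold are all hypothetical, and the scores are made-up illustrative values, not results from the paper.

```python
from typing import List


def fuse_onset_scores(audio: List[float], visual: List[float],
                      audio_weight: float = 0.7) -> List[float]:
    """Weighted-sum late fusion of per-frame onset scores (assumed scheme).

    Both inputs are hypothetical per-frame onset scores in [0, 1], aligned
    to the same frame rate. audio_weight is an illustrative parameter.
    """
    if len(audio) != len(visual):
        raise ValueError("audio and visual score sequences must be aligned")
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    visual_weight = 1.0 - audio_weight
    return [audio_weight * a + visual_weight * v for a, v in zip(audio, visual)]


def pick_onsets(scores: List[float], threshold: float = 0.5) -> List[int]:
    """Return indices of frames that are local maxima at or above threshold."""
    onsets = []
    for i, score in enumerate(scores):
        left = scores[i - 1] if i > 0 else float("-inf")
        right = scores[i + 1] if i + 1 < len(scores) else float("-inf")
        if score >= threshold and score > left and score >= right:
            onsets.append(i)
    return onsets


if __name__ == "__main__":
    # Illustrative values only: frame 1 is a soft onset that is weak in the
    # audio scores but clearly visible as a bowing or fingering change.
    audio_scores = [0.1, 0.4, 0.2, 0.1, 0.9, 0.3]
    visual_scores = [0.0, 0.9, 0.1, 0.0, 0.8, 0.2]

    fused = fuse_onset_scores(audio_scores, visual_scores, audio_weight=0.5)
    print("audio only:", pick_onsets(audio_scores))
    print("fused:     ", pick_onsets(fused))
```

    In this toy input, the audio-only scores miss the soft onset at frame 1, while the fused scores recover it. A learned fusion model could replace the fixed weight; the weighted sum is used here only because it is the simplest scheme to show.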

DSpace software copyright © 2002-2025 LYRASIS
