Aligning Popular Music with Mono MIDI for Singing Pitch Extraction

Bibliographic Details
Main Authors: Cai, Chang-You, 蔡昌祐
Other Authors: Chen, Sin-Horng
Format: Others
Language: zh-TW
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/97053385589858320339
Description
Summary: Master's thesis === National Chiao Tung University === Institute of Communications Engineering === 101 === Pitch is the fundamental frequency of the voice and an important feature in Query by Singing or Humming (QBSH) systems. Matching on pitch features is currently a popular way to find the best-matching song in QBSH, so the accuracy of pitch detection is a critical issue. Although humans can recognize the singing pitch in a song with musical accompaniment, it is difficult for a computer to detect the singing pitch automatically because of interference from the background music and from harmonics. In this thesis, we first use an existing method to extract the melody line of a popular song. That method first suppresses the background music to enhance the singing voice, and then strengthens the pitch signal by summing harmonics. Next, it estimates the range of human singing pitch and uses that range to eliminate harmonics. Lastly, it finds the melody line by dynamic programming. The method still has drawbacks, including inaccurate pitch tracking at the beginning of the singing signal and a melody line being reported in non-singing parts. We therefore propose an improvement in this study: a monophonic MIDI signal, aligned with the song being processed, is used to assist pitch detection. The proposed method first computes the MIDI-scale spectra of the two signals and builds a similarity matrix for their alignment. A post-processing step then segments the song and detects unnatural notes. Lastly, the aligned MIDI is used to determine the vocal (singing) segments of the song and to recalculate the melody line. Experimental results confirmed the effectiveness of the proposed approach.
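
The alignment step described in the summary (MIDI-scale spectra, a similarity matrix, then an alignment path) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the 128-bin MIDI layout, the cosine similarity measure, and the use of plain dynamic time warping (DTW) to read a path off the similarity matrix are all assumptions made for illustration, since the abstract does not specify these details.

```python
import numpy as np


def midi_scale_spectrum(mag, sr):
    """Fold one rfft magnitude spectrum into 128 MIDI-note bins (hypothetical helper).

    Assumes `mag` comes from np.fft.rfft of an even-length frame sampled at `sr` Hz.
    """
    n_bins = len(mag)
    freqs = np.arange(n_bins) * sr / (2.0 * (n_bins - 1))
    out = np.zeros(128)
    valid = freqs > 0  # skip the DC bin, whose log-frequency is undefined
    notes = np.round(69 + 12 * np.log2(freqs[valid] / 440.0)).astype(int)
    keep = (notes >= 0) & (notes < 128)
    np.add.at(out, notes[keep], mag[valid][keep])
    return out


def cosine_similarity_matrix(a, b):
    """Rows of `a` and `b` are frames; returns a (len(a), len(b)) similarity matrix."""
    a = a / (np.linalg.norm(a, axis=1, keepdims=True) + 1e-12)
    b = b / (np.linalg.norm(b, axis=1, keepdims=True) + 1e-12)
    return a @ b.T


def dtw_path(sim):
    """Lowest-cost monotonic path through cost = 1 - similarity (standard DTW).

    Assumption: the abstract only says a similarity matrix is used for alignment;
    DTW is swapped in here as one common way to do that.
    """
    cost = 1.0 - sim
    n, m = cost.shape
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(
                acc[i - 1, j - 1], acc[i - 1, j], acc[i, j - 1]
            )
    # Backtrack from the last cell to the first, following the cheapest predecessor.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        _, i, j = min(
            (acc[i - 1, j - 1], i - 1, j - 1),
            (acc[i - 1, j], i - 1, j),
            (acc[i, j - 1], i, j - 1),
        )
        path.append((i - 1, j - 1))
    return path[::-1]


# Toy example: one-hot "spectra" for six song frames and three MIDI notes.
song_notes = [60, 60, 62, 62, 62, 64]
midi_notes = [60, 62, 64]
song_frames = np.eye(128)[song_notes]
midi_frames = np.eye(128)[midi_notes]
alignment = dtw_path(cosine_similarity_matrix(song_frames, midi_frames))
print(alignment)  # [(0, 0), (1, 0), (2, 1), (3, 1), (4, 1), (5, 2)]
```

Each pair in the path maps a song frame index to a MIDI frame index. In a pipeline like the one the summary describes, such a mapping could then indicate which song frames coincide with MIDI notes (candidate singing segments) and which pitch to expect there.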