The implementation of the speaker verification system first
addresses the problem of finding the endpoints of speech in a waveform. The
code that executes the algorithm is in the file locatespeech.m. The
algorithm finds the start and end of speech in a given waveform, allowing
the speech to be extracted and analyzed. Our implementation uses this
algorithm for the short-time magnitude analysis of the speech, but not for
cutting the unvoiced regions out of the pitch track. It is important to
note that this algorithm returns the entire region where speech exists in an
input signal. This speech can include voiced regions as well as unvoiced
regions. Voiced speech includes hard consonants such as "ggg" and "ddd",
while unvoiced speech includes fricatives such as "fff" and "sss". The
short-time magnitude of a speech signal needs all of the speech located by
this algorithm. Short-time pitch, however, is concerned only with the voiced
regions of speech. For pitch, therefore, the endpoint detection algorithm is
not used; instead, we use the energy in the signal to find the voiced and
unvoiced regions of the pitch track. This is developed further in the
Short-Time Frequency section.
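
As a rough illustration of the energy-based idea mentioned above, the following Python sketch labels frames as voiced or unvoiced by comparing short-time energy against a threshold. It is not the project's MATLAB code: the function names, the frame length, the hop size, and the threshold ratio (a fraction of the peak frame energy) are all assumptions made for the example.

```python
def short_time_energy(signal, frame_len, hop):
    """Return the sum of squared samples for each full frame of the signal."""
    energies = []
    # Step through the signal one hop at a time, keeping only complete frames.
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energies.append(sum(x * x for x in frame))
    return energies


def label_voiced(energies, ratio=0.1):
    """Mark a frame as voiced when its energy exceeds ratio * peak energy.

    The ratio is a hypothetical tuning parameter, not a value from the project.
    """
    if not energies:
        return []
    threshold = ratio * max(energies)
    return [e > threshold for e in energies]


# Toy input: 200 quiet samples followed by 200 loud samples.
signal = [0.01] * 200 + [1.0] * 200
energies = short_time_energy(signal, frame_len=100, hop=100)
voiced = label_voiced(energies)
```

With this toy input, the two quiet frames fall below the threshold and the two loud frames rise above it. A real pitch track would need a threshold tuned to the recording conditions.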
The endpoint detection algorithm functions as follows:
Sara MacAlpine, JP Slavinsky, Nipul Bharani, Aamir Virani