Our project still has lots of room for growth and improvement. It has led us to countless more pathways to follow:
Extraction! A whole project on its own would be to write a program that could extract vowels from speech signals for our formant analysis program. We still haven't given up on the possibility of applying wavelet analysis to locate periodic blocks in a speech signal.
The next step in making this vowel recognition more professional would be to calculate formants and pitch continuously, using some sort of windowing process. This would allow improved vowel recognition and lead us to consonant recognition: the transient activity of formants just before and just after a consonant gives clues about the consonant itself. We could use that fabulous frequency tracking program (see Their Finest Hour's function trackplot.m) to track the formant frequencies.
We have developed, but not implemented, formant estimation algorithms for the cases in which two formants are so close together that our formant finding program cannot discern two distinct peaks. Failing to locate even one of the three formants completely throws off our vowel recognition.
More in-depth speaker recognition can be built into the system with more quantitative analysis of harmonics, rather than simple pitch determination, along the lines of the approach taken by the 1995 group P-Squared.
Speed up peaks.m, currently the slowest component of our program. P-Squared used a MEX-file; other, better methods surely exist.
Incorporate the SGI recording function (also developed by none other than P-Squared) to record sound data directly from MATLAB. Currently it is easier to do the extraction (by hand) from the SGI soundeditor and soundfiler programs.
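To make the windowing idea from the list above a little more concrete, here is a minimal sketch of continuous pitch calculation over short overlapping frames. It is not our implementation: it is written in plain Python rather than MATLAB, it uses a simple autocorrelation pitch estimate as a stand-in for our own pitch routine, and every name and default in it (`frames`, `pitch_autocorr`, `track_pitch`, the 40 ms frame, the 10 ms hop, the 60 to 400 Hz search range) is an assumption made for illustration. Formants would be handled the same way, by running a formant finder on each frame.

```python
import math


def frames(signal, frame_len, hop):
    """Split a signal into overlapping frames of frame_len samples,
    advancing hop samples each time (the 'windowing process')."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def pitch_autocorr(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch of one frame in Hz.

    Assumed stand-in method: pick the lag with the largest (unnormalized)
    autocorrelation inside the plausible pitch range [fmin, fmax].
    """
    n = len(frame)
    lag_min = int(fs / fmax)               # shortest period considered
    lag_max = min(int(fs / fmin), n - 1)   # longest period considered
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(n - lag))
        if r > best_val:
            best_lag, best_val = lag, r
    return fs / best_lag


def track_pitch(signal, fs, frame_ms=40.0, hop_ms=10.0):
    """Return one pitch estimate (Hz) per frame of the signal."""
    frame_len = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    return [pitch_autocorr(f, fs) for f in frames(signal, frame_len, hop)]


if __name__ == "__main__":
    fs = 8000
    # Synthetic test tone: 0.2 s of a 200 Hz sine, not real speech.
    tone = [math.sin(2 * math.pi * 200 * i / fs) for i in range(1600)]
    print(track_pitch(tone, fs))
```

The per-frame output is a pitch track over time, which is the kind of sequence a frequency tracking routine could then follow. On real speech the estimate would need a voiced/unvoiced decision and some smoothing, neither of which is attempted here.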