ELEC 301 Final Project: Text Independent Speaker Recognition


Results

Our system identified the correct speaker roughly 70% of the time (8 of the 12 trials listed below). As the tables show, the pitch similarity values usually narrow the selection down to one or two speakers, so for our six speakers mean pitch alone serves as a good first guess for speaker identification. However, on the trials where our system failed, the pitch information is more similar to another speaker, as is all the cepstral data. Thus, in our trials, pitch information did not play a major role. It should be noted that in each successful trial, the cepstral data alone would have determined the correct speaker.
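The identification rate can be tallied directly from the true and identified speakers reported for each trial in this section. The following is a minimal sketch of that tally; the function name `identification_rate` is a hypothetical helper introduced for illustration, not part of the project's code.

```python
# (true speaker, identified speaker) for Trials 1-12, copied from this section.
trials = [
    ("Male 1", "Male 1"),
    ("Male 1", "Male 1"),
    ("Male 2", "Male 2"),
    ("Male 2", "Male 2"),
    ("Female 1", "Male 3"),
    ("Female 1", "Female 2"),
    ("Female 2", "Female 2"),
    ("Male 3", "Male 3"),
    ("Male 3", "Male 3"),
    ("Male 4", "Male 4"),
    ("Male 4", "Male 2"),
    ("Male 4", "Male 2"),
]


def identification_rate(pairs):
    """Return (number correct, number of trials, fraction correct)."""
    correct = sum(1 for true, guess in pairs if true == guess)
    return correct, len(pairs), correct / len(pairs)


correct, total, rate = identification_rate(trials)
print(f"{correct}/{total} correct ({rate:.1%})")  # prints "8/12 correct (66.7%)"
```

With only twelve trials, a single additional success or failure moves the rate by about eight percentage points, so the figure is best read as a rough estimate.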

We should also mention how we arrived at our final "mean similarity". We used a slight variation of the RMS method: we summed the squares but did not actually average them, i.e. mean_similarity = (sim_mfcc^2 + sim_dmfcc^2 + ... )^0.5. All the same, the pitch similarity measure in some cases clearly validates the cepstral results, with the pitch similarity of the identified speaker sometimes 10 times greater than that of the other speakers. Detailed results are shown in the tables below:

Trial 1:
True Speaker: Male 1.
Identified Speaker: Male 1.
Test sample length: 21.0s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1703    0.1549    0.1582    0.1724    0.1812    0.1629
DMFCC similarity:   0.1854    0.1613    0.1539    0.1755    0.1623    0.1615
DDMFCC similarity:  0.1787    0.1700    0.1437    0.1818    0.1655    0.1603
Pitch similarity:   0.5306    0.0748    0.0736    0.0718    0.1851    0.0641
Mean similarity:    0.3087    0.2809    0.2634    0.3059    0.2943    0.2799

Trial 2:
True Speaker: Male 1.
Identified Speaker: Male 1.
Test sample length: 10.25s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1607    0.1716    0.1528    0.1666    0.1784    0.1698
DMFCC similarity:   0.1897    0.1622    0.1459    0.1643    0.1655    0.1723
DDMFCC similarity:  0.1691    0.1812    0.1510    0.1716    0.1605    0.1665
Pitch similarity:   0.6102    0.0351    0.0528    0.0512    0.2198    0.0309
Mean similarity:    0.3007    0.2977    0.2597    0.2902    0.2915    0.2937

Trial 3:
True Speaker: Male 2.
Identified Speaker: Male 2.
Test sample length: 19.25s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1426    0.1952    0.1469    0.1573    0.1713    0.1868
DMFCC similarity:   0.1554    0.1896    0.1475    0.1555    0.1696    0.1825
DDMFCC similarity:  0.1500    0.1985    0.1427    0.1652    0.1633    0.1803
Pitch similarity:   0.1273    0.3998    0.0633    0.0624    0.0976    0.2495
Mean similarity:    0.2588    0.3368    0.2524    0.2761    0.2911    0.3173

Trial 4:
True Speaker: Male 2.
Identified Speaker: Male 2.
Test sample length: 13.75s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1368    0.2027    0.1535    0.1593    0.1602    0.1876
DMFCC similarity:   0.1566    0.1878    0.1505    0.1658    0.1598    0.1795
DDMFCC similarity:  0.1471    0.2048    0.1457    0.1644    0.1586    0.1794
Pitch similarity:   0.0756    0.5379    0.0403    0.0398    0.0598    0.2466
Mean similarity:    0.2547    0.3439    0.2596    0.2827    0.2763    0.3156

Trial 5:
True Speaker: Female 1.
Identified Speaker: Male 3.
Test sample length: 10.75s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1390    0.1658    0.1770    0.1621    0.2050    0.1512
DMFCC similarity:   0.1655    0.1735    0.1527    0.1716    0.1766    0.1601
DDMFCC similarity:  0.1530    0.1892    0.1490    0.1721    0.1695    0.1673
Pitch similarity:   0.0844    0.0425    0.3454    0.3788    0.1092    0.0396
Mean similarity:    0.2648    0.3055    0.2772    0.2921    0.3193    0.2765

Trial 6:
True Speaker: Female 1.
Identified Speaker: Female 2.
Test sample length: 20.5s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1446    0.1631    0.1713    0.1768    0.1857    0.1585
DMFCC similarity:   0.1615    0.1723    0.1639    0.1684    0.1686    0.1653
DDMFCC similarity:  0.1594    0.1848    0.1519    0.1760    0.1659    0.1620
Pitch similarity:   0.1015    0.0536    0.3227    0.3440    0.1279    0.0502
Mean similarity:    0.2691    0.3007    0.2816    0.3010    0.3007    0.2805

Trial 7:
True Speaker: Female 2.
Identified Speaker: Female 2.
Test sample length: 20.0s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1313    0.1693    0.1710    0.1805    0.1991    0.1488
DMFCC similarity:   0.1576    0.1822    0.1589    0.1793    0.1657    0.1563
DDMFCC similarity:  0.1533    0.1845    0.1542    0.1793    0.1615    0.1673
Pitch similarity:   0.0619    0.0297    0.3688    0.4294    0.0826    0.0276
Mean similarity:    0.2561    0.3047    0.2797    0.3112    0.3052    0.2731

Trial 8:
True Speaker: Male 3.
Identified Speaker: Male 3.
Test sample length: 18.0s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1410    0.1599    0.1525    0.1768    0.2024    0.1673
DMFCC similarity:   0.1662    0.1701    0.1554    0.1713    0.1728    0.1643
DDMFCC similarity:  0.1626    0.1769    0.1516    0.1796    0.1673    0.1621
Pitch similarity:   0.1195    0.0260    0.0680    0.0651    0.6980    0.0234
Mean similarity:    0.2719    0.2929    0.2653    0.3047    0.3143    0.2851

Trial 9:
True Speaker: Male 3.
Identified Speaker: Male 3.
Test sample length: 21.5s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1421    0.1586    0.1606    0.1765    0.1977    0.1645
DMFCC similarity:   0.1688    0.1661    0.1581    0.1654    0.1730    0.1686
DDMFCC similarity:  0.1709    0.1782    0.1553    0.1689    0.1630    0.1638
Pitch similarity:   0.1481    0.0348    0.0991    0.0946    0.5922    0.0313
Mean similarity:    0.2791    0.2907    0.2737    0.2950    0.3092    0.2869

Trial 10:
True Speaker: Male 4.
Identified Speaker: Male 4.
Test sample length: 14.75s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1342    0.1811    0.1485    0.1611    0.1557    0.2194
DMFCC similarity:   0.1553    0.1759    0.1526    0.1599    0.1580    0.1983
DDMFCC similarity:  0.1599    0.1843    0.1511    0.1747    0.1527    0.1774
Pitch similarity:   0.1081    0.2823    0.0734    0.0727    0.0946    0.3688
Mean similarity:    0.2602    0.3126    0.2611    0.2864    0.2693    0.3448

Trial 11:
True Speaker: Male 4.
Identified Speaker: Male 2.
Test sample length: 21.6s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1386    0.1743    0.1825    0.1501    0.1758    0.1788
DMFCC similarity:   0.1528    0.1830    0.1602    0.1556    0.1663    0.1821
DDMFCC similarity:  0.1563    0.1926    0.1557    0.1676    0.1601    0.1677
Pitch similarity:   0.0217    0.7423    0.0124    0.0123    0.0177    0.1936
Mean similarity:    0.2588    0.3177    0.2884    0.2735    0.2902    0.3053

Trial 12:
True Speaker: Male 4.
Identified Speaker: Male 2.
Test sample length: 10.5s
Speaker:            Male 1    Male 2    Female 1  Female 2  Male 3    Male 4
MFCC similarity:    0.1347    0.1727    0.1556    0.1762    0.1885    0.1722
DMFCC similarity:   0.1531    0.1796    0.1575    0.1740    0.1645    0.1712
DDMFCC similarity:  0.1535    0.1908    0.1561    0.1652    0.1646    0.1697
Pitch similarity:   0.0271    0.7752    0.0151    0.0149    0.0219    0.1459
Mean similarity:    0.2553    0.3138    0.2710    0.2977    0.2995    0.2963
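As a check on the combination rule described before the tables (the root of the sum of squares, without averaging), the sketch below recomputes the "Mean similarity" row for Trial 1 and picks the highest-scoring speaker. One observation, not a statement about the project's actual code: the tabulated Trial 1 values are matched, to within rounding of the four-decimal inputs, when only the three cepstral similarities (MFCC, DMFCC, DDMFCC) are combined. The names `combine` and `identify` are hypothetical helpers introduced here for illustration.

```python
import math

SPEAKERS = ["Male 1", "Male 2", "Female 1", "Female 2", "Male 3", "Male 4"]

# Trial 1 cepstral similarity scores, copied from the table above.
trial_1 = {
    "mfcc":   [0.1703, 0.1549, 0.1582, 0.1724, 0.1812, 0.1629],
    "dmfcc":  [0.1854, 0.1613, 0.1539, 0.1755, 0.1623, 0.1615],
    "ddmfcc": [0.1787, 0.1700, 0.1437, 0.1818, 0.1655, 0.1603],
}

# "Mean similarity" row listed for Trial 1, used only for comparison.
listed_mean = [0.3087, 0.2809, 0.2634, 0.3059, 0.2943, 0.2799]


def combine(scores):
    """Root of the sum of squares per speaker (no division by feature count)."""
    return [math.sqrt(sum(v * v for v in column))
            for column in zip(*scores.values())]


def identify(scores):
    """Return the speaker with the largest combined score, plus all scores."""
    combined = combine(scores)
    best = max(range(len(combined)), key=combined.__getitem__)
    return SPEAKERS[best], combined


speaker, combined = identify(trial_1)
max_diff = max(abs(c - m) for c, m in zip(combined, listed_mean))
print(speaker, [round(c, 4) for c in combined], f"max difference {max_diff:.1e}")
```

Because the squares are summed but not averaged, the combined score differs from a true RMS only by a constant factor, so the ranking of speakers is the same either way.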
