Our system identified the correct speaker in 8 of the 12 trials, or roughly 70% of the time. As the
tables below show, the pitch similarity values usually narrow the selection down to one or two speakers,
so for our six speakers, mean pitch alone serves as a reasonable first guess at the speaker's identity.
However, on the trials where our system failed, the pitch information was also more similar to that of
another speaker, as was the cepstral data. Thus, in our trials, pitch information did not play a major role.
It should be noted that in each successful trial, the cepstral data alone would have determined the correct
speaker.
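To make the role of pitch concrete, the following is a minimal sketch that picks, for each trial, the speaker with the largest pitch similarity and compares that guess with the true speaker. The pitch rows are copied from the trial tables; the names `PITCH_TRIALS`, `best_match` and `pitch_only_results` are hypothetical and are not part of our system.

```python
# Column order used in every trial table.
SPEAKERS = ["Male 1", "Male 2", "Female 1", "Female 2", "Male 3", "Male 4"]

# (true speaker, pitch similarity row) for Trials 1-12, copied from the tables.
PITCH_TRIALS = [
    ("Male 1",   [0.5306, 0.0748, 0.0736, 0.0718, 0.1851, 0.0641]),
    ("Male 1",   [0.6102, 0.0351, 0.0528, 0.0512, 0.2198, 0.0309]),
    ("Male 2",   [0.1273, 0.3998, 0.0633, 0.0624, 0.0976, 0.2495]),
    ("Male 2",   [0.0756, 0.5379, 0.0403, 0.0398, 0.0598, 0.2466]),
    ("Female 1", [0.0844, 0.0425, 0.3454, 0.3788, 0.1092, 0.0396]),
    ("Female 1", [0.1015, 0.0536, 0.3227, 0.3440, 0.1279, 0.0502]),
    ("Female 2", [0.0619, 0.0297, 0.3688, 0.4294, 0.0826, 0.0276]),
    ("Male 3",   [0.1195, 0.0260, 0.0680, 0.0651, 0.6980, 0.0234]),
    ("Male 3",   [0.1481, 0.0348, 0.0991, 0.0946, 0.5922, 0.0313]),
    ("Male 4",   [0.1081, 0.2823, 0.0734, 0.0727, 0.0946, 0.3688]),
    ("Male 4",   [0.0217, 0.7423, 0.0124, 0.0123, 0.0177, 0.1936]),
    ("Male 4",   [0.0271, 0.7752, 0.0151, 0.0149, 0.0219, 0.1459]),
]


def best_match(scores):
    """Return the name of the speaker with the largest score."""
    return SPEAKERS[scores.index(max(scores))]


def pitch_only_results(trials):
    """Return (trial number, true speaker, pitch-only guess) for each trial."""
    return [(n, true, best_match(scores))
            for n, (true, scores) in enumerate(trials, start=1)]


results = pitch_only_results(PITCH_TRIALS)
misses = [n for n, true, guess in results if guess != true]
print(f"pitch-only correct: {len(results) - len(misses)} of {len(results)}")
print("pitch-only misses: trials", misses)
```

On the tabulated data, the pitch-only guess is wrong on Trials 5, 6, 11 and 12, which are the same trials on which the full system failed.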
We should also mention how we arrived at our final "mean similarity". We used a slight variation of the
RMS method: we summed the squares and took the square root, but did not actually average the squares.
That is, mean_similarity = (sim_mfcc^2 + sim_dmfcc^2 + ... )^0.5.
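As an illustration, here is a minimal sketch of this root-sum-of-squares combination applied to the Trial 1 rows. Which terms enter the sum is an assumption: the formula above is abbreviated, and the sketch uses only the three cepstral similarities (MFCC, DMFCC, DDMFCC), since that choice reproduces the tabulated "Mean similarity" row to within rounding. The function name `root_sum_squares` is hypothetical.

```python
import math

# Column order used in every trial table.
SPEAKERS = ["Male 1", "Male 2", "Female 1", "Female 2", "Male 3", "Male 4"]

# Trial 1 rows, copied from the table.
MFCC = [0.1703, 0.1549, 0.1582, 0.1724, 0.1812, 0.1629]
DMFCC = [0.1854, 0.1613, 0.1539, 0.1755, 0.1623, 0.1615]
DDMFCC = [0.1787, 0.1700, 0.1437, 0.1818, 0.1655, 0.1603]
TABULATED_MEAN = [0.3087, 0.2809, 0.2634, 0.3059, 0.2943, 0.2799]


def root_sum_squares(*sims):
    """Square root of the sum of squares, without dividing by the count."""
    return math.sqrt(sum(s * s for s in sims))


# Assumption: only the three cepstral similarities are combined.
combined = [root_sum_squares(m, d, dd) for m, d, dd in zip(MFCC, DMFCC, DDMFCC)]

# The identified speaker is the one with the largest combined similarity.
identified = SPEAKERS[combined.index(max(combined))]

for name, value, table in zip(SPEAKERS, combined, TABULATED_MEAN):
    print(f"{name:9s} computed {value:.4f}  tabulated {table:.4f}")
print("identified:", identified)
```

The computed values differ from the table by at most about 0.0001, which is consistent with the inputs having been rounded to four decimal places.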
All the same, in some cases the pitch similarity measure clearly validates the cepstral results: the pitch
similarity for the identified speaker is sometimes 10 times greater than that of the other speakers.
Detailed results are shown in the tables below:
Trial 1: True Speaker: Male 1. Identified Speaker: Male 1. Test sample length: 21.0s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1703 | 0.1549 | 0.1582   | 0.1724   | 0.1812 | 0.1629 |
| DMFCC similarity:  | 0.1854 | 0.1613 | 0.1539   | 0.1755   | 0.1623 | 0.1615 |
| DDMFCC similarity: | 0.1787 | 0.1700 | 0.1437   | 0.1818   | 0.1655 | 0.1603 |
| Pitch similarity:  | 0.5306 | 0.0748 | 0.0736   | 0.0718   | 0.1851 | 0.0641 |
| Mean similarity:   | 0.3087 | 0.2809 | 0.2634   | 0.3059   | 0.2943 | 0.2799 |

Trial 2: True Speaker: Male 1. Identified Speaker: Male 1. Test sample length: 10.25s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1607 | 0.1716 | 0.1528   | 0.1666   | 0.1784 | 0.1698 |
| DMFCC similarity:  | 0.1897 | 0.1622 | 0.1459   | 0.1643   | 0.1655 | 0.1723 |
| DDMFCC similarity: | 0.1691 | 0.1812 | 0.1510   | 0.1716   | 0.1605 | 0.1665 |
| Pitch similarity:  | 0.6102 | 0.0351 | 0.0528   | 0.0512   | 0.2198 | 0.0309 |
| Mean similarity:   | 0.3007 | 0.2977 | 0.2597   | 0.2902   | 0.2915 | 0.2937 |

Trial 3: True Speaker: Male 2. Identified Speaker: Male 2. Test sample length: 19.25s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1426 | 0.1952 | 0.1469   | 0.1573   | 0.1713 | 0.1868 |
| DMFCC similarity:  | 0.1554 | 0.1896 | 0.1475   | 0.1555   | 0.1696 | 0.1825 |
| DDMFCC similarity: | 0.1500 | 0.1985 | 0.1427   | 0.1652   | 0.1633 | 0.1803 |
| Pitch similarity:  | 0.1273 | 0.3998 | 0.0633   | 0.0624   | 0.0976 | 0.2495 |
| Mean similarity:   | 0.2588 | 0.3368 | 0.2524   | 0.2761   | 0.2911 | 0.3173 |

Trial 4: True Speaker: Male 2. Identified Speaker: Male 2. Test sample length: 13.75s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1368 | 0.2027 | 0.1535   | 0.1593   | 0.1602 | 0.1876 |
| DMFCC similarity:  | 0.1566 | 0.1878 | 0.1505   | 0.1658   | 0.1598 | 0.1795 |
| DDMFCC similarity: | 0.1471 | 0.2048 | 0.1457   | 0.1644   | 0.1586 | 0.1794 |
| Pitch similarity:  | 0.0756 | 0.5379 | 0.0403   | 0.0398   | 0.0598 | 0.2466 |
| Mean similarity:   | 0.2547 | 0.3439 | 0.2596   | 0.2827   | 0.2763 | 0.3156 |

Trial 5: True Speaker: Female 1. Identified Speaker: Male 3. Test sample length: 10.75s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1390 | 0.1658 | 0.1770   | 0.1621   | 0.2050 | 0.1512 |
| DMFCC similarity:  | 0.1655 | 0.1735 | 0.1527   | 0.1716   | 0.1766 | 0.1601 |
| DDMFCC similarity: | 0.1530 | 0.1892 | 0.1490   | 0.1721   | 0.1695 | 0.1673 |
| Pitch similarity:  | 0.0844 | 0.0425 | 0.3454   | 0.3788   | 0.1092 | 0.0396 |
| Mean similarity:   | 0.2648 | 0.3055 | 0.2772   | 0.2921   | 0.3193 | 0.2765 |

Trial 6: True Speaker: Female 1. Identified Speaker: Female 2. Test sample length: 20.5s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1446 | 0.1631 | 0.1713   | 0.1768   | 0.1857 | 0.1585 |
| DMFCC similarity:  | 0.1615 | 0.1723 | 0.1639   | 0.1684   | 0.1686 | 0.1653 |
| DDMFCC similarity: | 0.1594 | 0.1848 | 0.1519   | 0.1760   | 0.1659 | 0.1620 |
| Pitch similarity:  | 0.1015 | 0.0536 | 0.3227   | 0.3440   | 0.1279 | 0.0502 |
| Mean similarity:   | 0.2691 | 0.3007 | 0.2816   | 0.3010   | 0.3007 | 0.2805 |

Trial 7: True Speaker: Female 2. Identified Speaker: Female 2. Test sample length: 20.0s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1313 | 0.1693 | 0.1710   | 0.1805   | 0.1991 | 0.1488 |
| DMFCC similarity:  | 0.1576 | 0.1822 | 0.1589   | 0.1793   | 0.1657 | 0.1563 |
| DDMFCC similarity: | 0.1533 | 0.1845 | 0.1542   | 0.1793   | 0.1615 | 0.1673 |
| Pitch similarity:  | 0.0619 | 0.0297 | 0.3688   | 0.4294   | 0.0826 | 0.0276 |
| Mean similarity:   | 0.2561 | 0.3047 | 0.2797   | 0.3112   | 0.3052 | 0.2731 |

Trial 8: True Speaker: Male 3. Identified Speaker: Male 3. Test sample length: 18.0s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1410 | 0.1599 | 0.1525   | 0.1768   | 0.2024 | 0.1673 |
| DMFCC similarity:  | 0.1662 | 0.1701 | 0.1554   | 0.1713   | 0.1728 | 0.1643 |
| DDMFCC similarity: | 0.1626 | 0.1769 | 0.1516   | 0.1796   | 0.1673 | 0.1621 |
| Pitch similarity:  | 0.1195 | 0.0260 | 0.0680   | 0.0651   | 0.6980 | 0.0234 |
| Mean similarity:   | 0.2719 | 0.2929 | 0.2653   | 0.3047   | 0.3143 | 0.2851 |

Trial 9: True Speaker: Male 3. Identified Speaker: Male 3. Test sample length: 21.5s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1421 | 0.1586 | 0.1606   | 0.1765   | 0.1977 | 0.1645 |
| DMFCC similarity:  | 0.1688 | 0.1661 | 0.1581   | 0.1654   | 0.1730 | 0.1686 |
| DDMFCC similarity: | 0.1709 | 0.1782 | 0.1553   | 0.1689   | 0.1630 | 0.1638 |
| Pitch similarity:  | 0.1481 | 0.0348 | 0.0991   | 0.0946   | 0.5922 | 0.0313 |
| Mean similarity:   | 0.2791 | 0.2907 | 0.2737   | 0.2950   | 0.3092 | 0.2869 |

Trial 10: True Speaker: Male 4. Identified Speaker: Male 4. Test sample length: 14.75s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1342 | 0.1811 | 0.1485   | 0.1611   | 0.1557 | 0.2194 |
| DMFCC similarity:  | 0.1553 | 0.1759 | 0.1526   | 0.1599   | 0.1580 | 0.1983 |
| DDMFCC similarity: | 0.1599 | 0.1843 | 0.1511   | 0.1747   | 0.1527 | 0.1774 |
| Pitch similarity:  | 0.1081 | 0.2823 | 0.0734   | 0.0727   | 0.0946 | 0.3688 |
| Mean similarity:   | 0.2602 | 0.3126 | 0.2611   | 0.2864   | 0.2693 | 0.3448 |

Trial 11: True Speaker: Male 4. Identified Speaker: Male 2. Test sample length: 21.6s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1386 | 0.1743 | 0.1825   | 0.1501   | 0.1758 | 0.1788 |
| DMFCC similarity:  | 0.1528 | 0.1830 | 0.1602   | 0.1556   | 0.1663 | 0.1821 |
| DDMFCC similarity: | 0.1563 | 0.1926 | 0.1557   | 0.1676   | 0.1601 | 0.1677 |
| Pitch similarity:  | 0.0217 | 0.7423 | 0.0124   | 0.0123   | 0.0177 | 0.1936 |
| Mean similarity:   | 0.2588 | 0.3177 | 0.2884   | 0.2735   | 0.2902 | 0.3053 |

Trial 12: True Speaker: Male 4. Identified Speaker: Male 2. Test sample length: 10.5s
| Speaker:           | Male 1 | Male 2 | Female 1 | Female 2 | Male 3 | Male 4 |
| MFCC similarity:   | 0.1347 | 0.1727 | 0.1556   | 0.1762   | 0.1885 | 0.1722 |
| DMFCC similarity:  | 0.1531 | 0.1796 | 0.1575   | 0.1740   | 0.1645 | 0.1712 |
| DDMFCC similarity: | 0.1535 | 0.1908 | 0.1561   | 0.1652   | 0.1646 | 0.1697 |
| Pitch similarity:  | 0.0271 | 0.7752 | 0.0151   | 0.0149   | 0.0219 | 0.1459 |
| Mean similarity:   | 0.2553 | 0.3138 | 0.2710   | 0.2977   | 0.2995 | 0.2963 |
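
The outcomes listed in the trial headers above can be tallied with a short sketch like the one below. The (true speaker, identified speaker) pairs are copied from the headers of Trials 1 to 12; the names `OUTCOMES`, `accuracy` and `confusions` are hypothetical and are not part of our system.

```python
from collections import Counter

# (true speaker, identified speaker) for Trials 1-12, from the trial headers.
OUTCOMES = [
    ("Male 1", "Male 1"),
    ("Male 1", "Male 1"),
    ("Male 2", "Male 2"),
    ("Male 2", "Male 2"),
    ("Female 1", "Male 3"),
    ("Female 1", "Female 2"),
    ("Female 2", "Female 2"),
    ("Male 3", "Male 3"),
    ("Male 3", "Male 3"),
    ("Male 4", "Male 4"),
    ("Male 4", "Male 2"),
    ("Male 4", "Male 2"),
]


def accuracy(outcomes):
    """Fraction of trials in which the identified speaker is the true one."""
    correct = sum(1 for true, found in outcomes if true == found)
    return correct / len(outcomes)


def confusions(outcomes):
    """Count each (true, identified) pair among the misidentified trials."""
    return Counter((true, found) for true, found in outcomes if true != found)


print(f"accuracy: {accuracy(OUTCOMES):.1%}")
for (true, found), count in confusions(OUTCOMES).items():
    print(f"{true} identified as {found}: {count} trial(s)")
```

The tally gives 8 correct trials out of 12. The misidentifications involve only Female 1 and Male 4 as true speakers.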