Score Fusion for Speaker Identification using MFCC and ICMC Features (Record no. 14688)
000 -LEADER | |
---|---|
fixed length control field | 02860nam a22001697a 4500 |
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER | |
Classification number | 621.36 |
Item number | J837S |
100 ## - MAIN ENTRY--AUTHOR NAME | |
Personal name | Joshi, Sonal |
245 ## - TITLE STATEMENT | |
Title | Score Fusion for Speaker Identification using MFCC and ICMC Features |
Statement of responsibility, etc | by Sonal Joshi |
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT) | |
Place of publication | IIT Jodhpur |
Name of publisher | Department of Electrical Engineering |
Year of publication | 2017 |
300 ## - PHYSICAL DESCRIPTION | |
Number of Pages | xi,58p. |
Other physical details | HB |
520 ## - SUMMARY, ETC. | |
Summary, etc | The task of Speaker Identification (SID), or Speaker Recognition, is to recognise the person from a given speech utterance, that is, to answer the question, "Whose voice is this?" An important application of SID is in forensics, to verify the identity of a suspect. Beyond forensic applications, this technology is used to improve the performance of speech recognition, to automatically adjust preferences to personal needs, as in home automation, and to identify the speaker in each segment of a teleconference or newsroom discussion (Speaker Diarization). Even though real-world applications demand robustness against a variety of practical and realistic conditions, SID systems generally perform poorly when there is a mismatch. Different recording conditions in training and testing data lead to mismatch, which can take the form of language mismatch, session mismatch, sensor mismatch, or any combination of these. To improve speaker recognition performance in mismatch scenarios, this work explores score fusion of log-likelihood scores obtained using a Gaussian Mixture Model - Universal Background Model (GMM-UBM) classifier. After an initial study of commonly used features on the TIMIT database, GMM-UBMs using Mel Frequency Cepstral Coefficients (MFCC) and the recently proposed Infinite impulse response Constant Q Mel-frequency cepstral Co-efficients (ICMC) features are scored independently. This work is motivated by the expectation that fusing systems using MFCC and ICMC at the score level will lead to a performance gain, as the two features carry complementary information. Experimental results, obtained using the IITG Multivariability Speaker Recognition Phase-I and Phase-II Database, show that the fusion results outperform the independently scored results by a significant margin for all mismatches. Reported average relative improvements in identification accuracy over the baseline MFCC system, for 128-mixture Gaussian models, are 1.99% for language mismatch, 4.56% for session mismatch, 5.38% for language and session mismatch, 204.54% for sensor mismatch, and 175.3% for sensor and session mismatch. Experimental results are also obtained using the IITG Multivariability Speaker Recognition Phase-III database, which is truly conversational data collected over phone calls. |
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM | |
Topical Term | MFCC and ICMC Features |
Topical Term | MTech Theses |
Topical Term | Department of Electrical Engineering |
700 ## - ADDED ENTRY--PERSONAL NAME | |
Personal name | Yadav, Sandeep Kumar |
942 ## - ADDED ENTRY ELEMENTS (KOHA) | |
Koha item type | Thesis |
Withdrawn status | Lost status | Damaged status | Not for loan | Collection code | Permanent Location | Current Location | Shelving location | Date acquired | Full call number | Accession Number | Price effective from | Koha item type |
---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | | Not For Loan | Reference | S. R. Ranganathan Learning Hub | S. R. Ranganathan Learning Hub | Course Reserve | 2024-01-18 | 621.36 J837S | TM00111 | 2024-01-18 | Thesis |