Score Fusion for Speaker Identification using MFCC and ICMC Features (Record no. 14688)

MARC details
000 -LEADER
fixed length control field	02860nam a22001697a 4500
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number	621.36
Item number	J837S
100 ## - MAIN ENTRY--AUTHOR NAME
Personal name	Joshi, Sonal
245 ## - TITLE STATEMENT
Title	Score Fusion for Speaker Identification using MFCC and ICMC Features
Statement of responsibility, etc	by Sonal Joshi
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Place of publication	IIT Jodhpur
Name of publisher	Department of Electrical Engineering
Year of publication	2017
300 ## - PHYSICAL DESCRIPTION
Number of Pages	xi,58p.
Other physical details	HB
520 ## - SUMMARY, ETC.
Summary, etc	The task of Speaker Identification (SID), or Speaker Recognition, is to recognise the person from a given speech utterance, that is, to answer the question "Whose voice is this?" An important application of SID is in forensics, to verify the identity of a suspect. Beyond forensics, the technology is used to improve the performance of speech recognition, to automatically adjust preferences to personal needs (as in home automation), and to identify the speaker in each segment of a teleconference or newsroom discussion (Speaker Diarization). Although real-world applications demand robustness against a range of practical conditions, SID systems generally perform poorly when there is a mismatch. Different recording conditions in training and testing data lead to mismatch, which can take the form of language mismatch, session mismatch, sensor mismatch or any combination of these. To improve speaker recognition performance in mismatch scenarios, this work explores score fusion of log-likelihood scores obtained using a Gaussian Mixture Model - Universal Background Model (GMM-UBM) classifier. After an initial study of commonly used features on the TIMIT database, GMM-UBMs using Mel Frequency Cepstral Coefficients (MFCC) and the recently proposed Infinite impulse response Constant Q Mel-frequency Cepstral Coefficients (ICMC) features are scored independently. The work is motivated by the expectation that fusing the MFCC and ICMC systems at the score level will yield a performance gain, because the two features carry complementary information. Experimental results, obtained using the IITG Multivariability Speaker Recognition Phase-I and Phase-II Database, show that the fusion results outperform the independently scored results by a significant margin for all mismatches. Reported average relative improvements in identification accuracy over the MFCC baseline, for 128-mixture Gaussian models, are 1.99% for language mismatch, 4.56% for session mismatch, 5.38% for language and session mismatch, 204.54% for sensor mismatch, and 175.3% for sensor and session mismatch. Experimental results are also obtained using IITG Multivariability Speaker Recognition Phase-III, which consists of truly conversational data collected over phone calls.
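The score-level fusion described in the summary can be illustrated with a minimal sketch. This record does not state the fusion rule, the score normalisation, or the weights used in the thesis, so everything below is an assumption: a weighted sum of z-normalised per-speaker log-likelihood scores, with hypothetical function names and made-up scores. The last helper shows one common definition of relative improvement, which may help in reading figures above 100%; whether the thesis uses exactly this definition is not confirmed here.

```python
from statistics import mean, pstdev


def z_normalise(scores):
    """Scale one system's per-speaker scores to zero mean and unit variance."""
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0:
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def fuse_scores(mfcc_scores, icmc_scores, weight=0.5):
    """Weighted-sum fusion (an assumed rule) of two systems' scores.

    Both lists hold one log-likelihood score per enrolled speaker,
    in the same speaker order. `weight` is the share given to MFCC.
    """
    if len(mfcc_scores) != len(icmc_scores):
        raise ValueError("both systems must score the same set of speakers")
    a = z_normalise(mfcc_scores)
    b = z_normalise(icmc_scores)
    return [weight * x + (1.0 - weight) * y for x, y in zip(a, b)]


def identify(scores):
    """Return the index of the speaker with the highest score."""
    return max(range(len(scores)), key=scores.__getitem__)


def relative_improvement(baseline_acc, fused_acc):
    """Relative gain in percent of fused accuracy over the baseline accuracy."""
    return 100.0 * (fused_acc - baseline_acc) / baseline_acc


# Made-up scores for three enrolled speakers (illustration only).
mfcc = [-41.0, -40.0, -45.0]   # MFCC system alone would pick speaker 1
icmc = [-30.0, -36.0, -35.0]   # ICMC system alone would pick speaker 0
fused = fuse_scores(mfcc, icmc)
print(identify(mfcc), identify(icmc), identify(fused))
```

Under this definition, a baseline accuracy of 20% rising to 60% after fusion is a relative improvement of 200%, which is how a figure such as 204.54% can arise when the baseline accuracy under sensor mismatch is low.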
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical Term	MFCC and ICMC Features

Topical Term	MTech Theses

Topical Term	Department of Electrical Engineering
700 ## - ADDED ENTRY--PERSONAL NAME
Personal name	Yadav, Sandeep Kumar
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Koha item type	Thesis

Holdings
Withdrawn status	Lost status	Damaged status	Not for loan	Collection code	Permanent Location	Current Location	Shelving location	Date acquired	Full call number	Accession Number	Price effective from	Koha item type
			Not For Loan	Reference	S. R. Ranganathan Learning Hub	S. R. Ranganathan Learning Hub	Course Reserve	2024-01-18	621.36 J837S	TM00111	2024-01-18	Thesis
