Item Details

Determination of Likelihood Ratios for Forensic Voice Comparison Using Principal Component Analysis

Issue: Vol 21 No. 1 (2014)

Journal: International Journal of Speech Language and the Law

Subject Areas: Linguistics

Abstract:

The likelihood ratio (LR) framework is gaining increasing acceptance amongst forensic speech scientists undertaking forensic voice comparison. Multivariate Kernel Density (MVKD) is one approach that has been used for calculating LRs when the number of parameters is in the region of three or four. However, this approach may suffer from robustness issues when the number of parameters is larger than this. In this paper we present an alternative to the MVKD approach, termed Principal Component Analysis Kernel Density Likelihood Ratio (PCAKLR), which takes account of within-segment correlations yet is computationally robust irrespective of the number of parameters used. We show that PCAKLR produces results comparable to those of MVKD for small numbers of parameters. Further, it can directly handle between-segment correlations and is thus an alternative to the logistic-regression fusion typically used to combine results from multiple segments.
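The abstract does not give the PCAKLR formulas, so the following is only a minimal sketch of the general idea it describes: rotate the parameters onto principal components so that they are approximately uncorrelated, compute a univariate kernel-density LR on each component, and combine the components by summing log LRs. The function names (`fit_pca`, `project`, `kernel_density`, `log10_lr`), the Gaussian same-speaker model, and the rule-of-thumb bandwidth are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def fit_pca(data):
    """Return the mean and principal axes (as columns) of `data`."""
    mean = data.mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(data, rowvar=False))
    # eigh returns eigenvalues in ascending order; reverse to descending.
    return mean, eigvecs[:, ::-1]


def project(data, mean, components):
    """Rotate centred data onto the principal axes."""
    return (data - mean) @ components


def kernel_density(x, centres, bandwidth):
    """Average of Gaussian kernels placed on `centres`, evaluated at x."""
    z = (x - centres) / bandwidth
    return np.mean(np.exp(-0.5 * z ** 2)) / (bandwidth * np.sqrt(2.0 * np.pi))


def log10_lr(suspect, questioned, background, speaker_ids):
    """Sum of per-component log10 LRs (illustrative assumption, see lead-in).

    suspect, questioned: (tokens, parameters) arrays for the two samples.
    background: (tokens, parameters) array pooled over reference speakers.
    speaker_ids: 1-D array labelling the speaker of each background token.
    """
    mean, comps = fit_pca(background)
    bg = project(background, mean, comps)
    sus = project(suspect, mean, comps)
    que = project(questioned, mean, comps)
    speakers = np.unique(speaker_ids)

    total = 0.0
    for k in range(bg.shape[1]):
        spk_means = np.array([bg[speaker_ids == s, k].mean() for s in speakers])
        within_var = np.mean([bg[speaker_ids == s, k].var(ddof=1) for s in speakers])
        q_mean = que[:, k].mean()

        # Same-speaker hypothesis: Gaussian on the difference of sample means.
        same_sd = np.sqrt(within_var * (1.0 / len(sus) + 1.0 / len(que)))
        numerator = kernel_density(q_mean, np.array([sus[:, k].mean()]), same_sd)

        # Different-speaker hypothesis: kernel density over background speaker
        # means (rule-of-thumb bandwidth), widened by the sampling variance of
        # the questioned mean.
        h = 1.06 * spk_means.std(ddof=1) * len(speakers) ** (-0.2)
        diff_sd = np.sqrt(h ** 2 + within_var / len(que))
        denominator = kernel_density(q_mean, spk_means, diff_sd)

        total += np.log10(numerator) - np.log10(denominator)
    return total
```

Because each component is handled by a one-dimensional density, the sketch never inverts a full covariance matrix, which is one plausible reading of why such an approach would stay numerically stable as the number of parameters grows. A positive return value supports the same-speaker hypothesis and a negative value supports the different-speaker hypothesis.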

Authors: Balamurali Nair, Esam Alzqhoul, Bernard John Guillemin
