Item Details

Determination of Likelihood Ratios for Forensic Voice Comparison Using Principal Component Analysis

Issue: Vol 21 No. 1 (2014)

Journal: International Journal of Speech Language and the Law

Subject Areas: Linguistics

DOI: 10.1558/ijsll.v21i1.83

Abstract:

The likelihood ratio (LR) framework is gaining increasing acceptance amongst forensic speech scientists undertaking forensic voice comparison. Multivariate Kernel Density (MVKD) is one approach that has been used for calculating LRs when the number of parameters is in the region of three or four. However, this approach may suffer from robustness issues when the number of parameters is larger. In this paper we present an alternative to the MVKD approach, termed Principal Component Analysis Kernel Density Likelihood Ratio (PCAKLR), which takes account of within-segment correlations yet is computationally robust irrespective of the number of parameters used. We show that PCAKLR produces results comparable to MVKD for small numbers of parameters. Further, it can directly handle between-segment correlations and is thus an alternative to the logistic-regression fusion typically used to combine results from multiple segments.
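The general idea in the abstract, decorrelating the parameters with PCA and then scoring each component with a one-dimensional density, can be illustrated with a rough sketch. This is not the authors' PCAKLR formulation, which the abstract does not specify. The function names, the array layout, the Gaussian same-speaker model, and the Silverman rule-of-thumb bandwidth are all assumptions made for illustration only.

```python
import numpy as np

def log_normal_pdf(x, mu, sigma):
    """Natural log of the normal density N(x; mu, sigma^2)."""
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

def pca_axes(tokens):
    """Mean and orthonormal principal axes (as columns) of pooled tokens."""
    mean = tokens.mean(axis=0)
    cov = np.cov(tokens, rowvar=False)
    _, axes = np.linalg.eigh(cov)  # eigenvectors of the symmetric covariance
    return mean, axes

def log10_lr(suspect, offender, background):
    """Illustrative log10 LR (hypothetical helper, not the published method).

    background: array of shape (speakers, tokens, params)
    suspect, offender: arrays of shape (tokens, params)
    """
    n_spk, _, n_par = background.shape
    mean, axes = pca_axes(background.reshape(-1, n_par))

    def project(x):
        # Rotate onto the principal axes so components are uncorrelated
        # in the pooled background data.
        return (x - mean) @ axes

    bg = project(background)
    sus_mean = project(suspect).mean(axis=0)
    off_mean = project(offender).mean(axis=0)

    spk_means = bg.mean(axis=1)                               # (speakers, params)
    within_sd = np.sqrt(bg.var(axis=1, ddof=1).mean(axis=0))  # pooled, per component

    # Assumed same-speaker model: the two sample means differ only by
    # within-speaker sampling noise.
    sigma_same = within_sd * np.sqrt(1.0 / len(suspect) + 1.0 / len(offender))
    log_num = log_normal_pdf(off_mean, sus_mean, sigma_same)

    # Assumed different-speaker model: Gaussian kernel density over the
    # background speaker means, with a Silverman rule-of-thumb bandwidth.
    h = 1.06 * spk_means.std(axis=0, ddof=1) * n_spk ** (-0.2)
    log_kernels = log_normal_pdf(off_mean, spk_means, h)      # (speakers, params)
    log_den = np.logaddexp.reduce(log_kernels, axis=0) - np.log(n_spk)

    # Components are treated as independent, so per-component log LRs add.
    return float(np.sum(log_num - log_den) / np.log(10.0))
```

Working in the log domain avoids numerical underflow when the two samples are far apart. Because each component is scored with a one-dimensional density, the sketch needs no inverse of a full covariance matrix, which is one plausible reading of why a PCA-based approach could stay stable as the number of parameters grows.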

Authors: Balamurali Nair, Esam Alzqhoul, Bernard John Guillemin

