Item Details

Background population: how does it affect LR based forensic voice comparison?

Issue: Vol 21 No. 2 (2014)

Journal: International Journal of Speech Language and the Law

Subject Areas: Linguistics

DOI: 10.1558/ijsll.v21i2.191

Abstract:

This article investigates to what extent, and in what ways, the size of the background population affects the outcome of likelihood ratio (LR) based forensic voice comparison. While sample size is known to affect the accuracy of statistical modelling, its specific effects in the context of forensic voice comparison are not yet understood. Forensic voice comparison analysts need to work with limited data, but it is unclear how this affects the LR-based evaluation of evidence. In this article, we report LR-based speaker comparison experiments that use variously sized datasets as the background population and features derived from the long-term F0 distribution. We examine performance in terms of accuracy (closeness to the true value) and precision (reproducibility).

Author: Yuko Kinoshita, Shunichi Ishihara

