Item Details

Traditional Versus ASR-Based Pronunciation Instruction: An Empirical Study

Issue: Vol 37 No. 3 (2020)

Journal: CALICO Journal


DOI: 10.1558/cj.40379


This paper presents a 15-week classroom study measuring the student outcomes of instructor-led pronunciation lessons versus entirely ASR-based pronunciation training. Seventy-six second-semester Spanish language learners were divided into two groups, one experimental (n=44) and one control (n=32). Over the course of six modules, both groups completed a pre- and post-study recording, as well as explicit pronunciation training sessions. These sessions included pre- and post-recordings, with either traditional or ASR pronunciation practice in between that focused on targeted phonemes. All student recordings were evaluated by native and near-native speakers for comprehensibility, nativeness, fluency, and perceived confidence. The results show that the effect of explicit and ASR instruction varies depending on the module and characteristic evaluated. ASR seems to outperform traditional instruction when targeting specific phonemes, especially in the short term, while the explicit instruction group saw longer-term gains in comprehensibility. Holistically, the data suggest that ASR-based instruction shows promise for improving certain aspects of pronunciation, but that using both techniques in tandem would be the most strategic approach to developing this fundamental aspect of learner speech. The data presented here highlight the role and effectiveness of computer-assisted pronunciation training for lower-level Spanish courses.

Authors: Christina Garcia, Dan Nickolai, Lillian Jones

