Effects of task type on morphosyntactic complexity across proficiency: evidence from a large learner corpus of A1 to C2 writings
This study investigates the effect of instructional design on (morpho)syntactic complexity in second language (L2) writing development. We operationalised instructional design as task type and drew on a large subcorpus of the EF-Cambridge Open Language Database (EFCAMDAT; Geertzen, Alexopoulou and Korhonen 2014): 669,876 writings by 119,960 learners, produced in response to 128 tasks spanning all Common European Framework of Reference for Languages (CEFR) levels.
First, the 128 task prompts were manually categorised for task type (e.g. argumentation, description). Next, developmental trajectories of syntactic complexity from A1 to C2 were established using a variety of global (e.g. mean length of clause) and specific (e.g. non-third person singular present tense verbs) measures extracted using natural language processing techniques. The effects of task type were analysed using the categorisation from the first step. Finally, tasks that showed atypical behaviour for a measure given their task type were explored qualitatively.
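As a rough illustration of the aggregation step described above, the following Python sketch computes one global measure, mean length of clause, and averages it per proficiency level and task type. It is a minimal sketch under stated assumptions, not the study's actual pipeline: the records, the counts, and the function names are hypothetical, and the word and clause counts are assumed to have already been produced by a parser.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (CEFR level, task type, word count, clause count).
# Real counts would come from automatic annotation; these are invented.
writings = [
    ("A1", "description", 40, 8),
    ("A1", "narration", 36, 6),
    ("B1", "description", 90, 12),
    ("B1", "argumentation", 120, 12),
    ("C1", "argumentation", 180, 15),
]


def mean_length_of_clause(words, clauses):
    """Words per clause for one writing; None if no clause was found."""
    return words / clauses if clauses else None


def mlc_by_group(records):
    """Average the per-writing measure within each (level, task type) cell."""
    cells = defaultdict(list)
    for level, task_type, words, clauses in records:
        mlc = mean_length_of_clause(words, clauses)
        if mlc is not None:
            cells[(level, task_type)].append(mlc)
    return {key: mean(values) for key, values in cells.items()}


table = mlc_by_group(writings)
for (level, task_type), value in sorted(table.items()):
    print(level, task_type, round(value, 2))
```

Comparing cells that share a level but differ in task type is one simple way to see a task effect separately from a proficiency effect.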
Our results partially confirm earlier experimental and corpus-based studies (e.g. subordination is associated with argumentative tasks). Beyond this, our large-scale, data-driven analysis identified specific measures that were naturally prompted by instructional design (e.g. narrations eliciting wh-phrases). We discuss which measures typically align with certain task types and highlight how instructional design relates to L2 developmental trajectories over time.
Authors: Marije Michel, Akira Murakami, Theodora Alexopoulou, Detmar Meurers
Alexopoulou, T., Geertzen, J., Korhonen, A. and Meurers, D. (2015) Exploring big educational learner corpora for SLA research: perspectives on relative clauses. International Journal of Learner Corpus Research 1(1): 96–129.
Alexopoulou, T., Michel, M., Murakami, A. and Meurers, D. (2017) Task effects on linguistic complexity and accuracy: a large-scale learner corpus analysis employing natural language processing techniques. Language Learning 67(s1): 180–208.
Baralt, M., Gilabert, R. and Robinson, P. (eds) (2014) Task Sequencing and Instructed Second Language Learning. London: Bloomsbury Publishing.
Brezina, V. and Flowerdew, L. (eds) (2017) Learner Corpus Research: New Perspectives and Applications. London: Bloomsbury Academic.
Bulté, B. and Housen, A. (2012) Defining and operationalising L2 complexity. In A. Housen, F. Kuiken and I. Vedder (eds) Dimensions of L2 Performance and Proficiency 21–46. Amsterdam: John Benjamins Publishing Company.
Byrnes, H., Maxim, H. H. and Norris, J. M. (eds) (2010) Realizing advanced foreign language writing development in collegiate education: curricular design, pedagogy, assessment [Special issue]. Modern Language Journal 94(s1).
Chen, X. and Meurers, D. (2016) CTAP: a web-based tool supporting automatic complexity analysis. Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity 113–19. Osaka, Japan. http://aclweb.org/anthology/W16-4113
Crossley, S. A. and McNamara, D. S. (2014) Does writing development equal writing quality? A computational investigation of syntactic complexity in L2 learners. Journal of Second Language Writing 26: 66–79.
Geertzen, J., Alexopoulou, T. and Korhonen, A. (2014) Automatic linguistic annotation of large scale L2 databases: the EF-Cambridge Open Language Database (EFCAMDAT). Proceedings of the 31st Second Language Research Forum (SLRF). Pittsburgh, PA: Cascadilla Press.
Kyle, K. (2016) Measuring syntactic development in L2 writing: fine grained indices of syntactic complexity and usage-based indices of syntactic sophistication. Doctoral dissertation, Georgia State University. Retrieved on 19 September 2019 from
Loewen, S. (2015) Introduction to Instructed Second Language Acquisition. New York: Routledge.
Loschky, L. and Bley-Vroman, R. (1993) Grammar and task-based methodology. In G. Crookes and S. Gass (eds) Tasks and Language Learning: Integrating Theory and Practice 123–67. Bristol: Multilingual Matters.
Meurers, D. (2015) Learner corpora and natural language processing. In S. Granger, G. Gilquin and F. Meunier (eds) The Cambridge Handbook of Learner Corpus Research 537–66. Cambridge: Cambridge University Press.
Murakami, A. and Alexopoulou, T. (2016) Longitudinal L2 development of the English article in individual learners. In A. Papafragou, D. Grodner, D. Mirman and J. Trueswell (eds) Proceedings of the 38th Annual Meeting of the Cognitive Science Society 1050–5. Austin, TX: Cognitive Science Society.
Ott, N., Ziai, R. and Meurers, D. (2012) Creation and analysis of a reading comprehension exercise corpus: towards evaluating meaning in context. In T. Schmidt and K. Wörner (eds) Multilingual Corpora and Multilingual Corpus Analysis 47–69. Hamburg Studies in Multilingualism (HSM). Amsterdam, Netherlands: John Benjamins.
Pander Maat, H., Kraf, R., van den Bosch, A. P. J., Dekker, N., Gompel, M. V., Kleijn, S. D., ... and Sloot, K. (2014) T-Scan: a new tool for analyzing Dutch text. Computational Linguistics in the Netherlands 4: 53–74. Retrieved on 19 September 2019 from
Polio, C. and Yoon, H.-J. (2018) The reliability and validity of automated tools for examining variation in syntactic complexity across genres. International Journal of Applied Linguistics 28(1): 165–88.
Tracy-Ventura, N. and Paquot, M. (eds) (in preparation) The Routledge Handbook of Second Language Acquisition and Corpora. New York: Routledge.
Vajjala, S. and Meurers, D. (2012) On improving the accuracy of readability classification using insights from second language acquisition. In Proceedings of the 7th Workshop on Innovative Use of NLP for Building Educational Applications (BEA) 163–73. Montréal, Canada: ACL. Retrieved on 19 September 2019 from
Vajjala, S. and Meurers, D. (2014) Readability assessment for text simplification: from analysing documents to identifying sentential simplifications. ITL – International Journal of Applied Linguistics 165(2): 194–222.
Vyatkina, N., Hirschmann, H. and Golcher, F. (2015) Syntactic modification at early stages of L2 German writing development: a longitudinal learner corpus study. Journal of Second Language Writing 29: 28–50.
Way, D. P., Joiner, E. G. and Seaman, M. A. (2000) Writing in the secondary foreign language classroom: the effects of prompts and tasks on novice learners of French. Modern Language Journal 84(2): 171–84.
Wieling, M. (2018) Analyzing dynamic phonetic data using generalized additive mixed modeling: a tutorial focusing on articulatory differences between L1 and L2 speakers of English. Journal of Phonetics 70: 86–116.
Wieling, M., Montemagni, S., Nerbonne, J. and Baayen, R. H. (2014) Lexical differences between Tuscan dialects and standard Italian: accounting for geographic and socio-demographic variation using generalized additive mixed modeling. Language 90: 669–92.
Wolfe-Quintero, K., Inagaki, S. and Kim, H.-Y. (1998) Second Language Development in Writing: Measures of Fluency, Accuracy, and Complexity. Manoa, Hawaii: Second Language Teaching and Curriculum Center, University of Hawaii at Manoa.
Wood, S. N. (2017) Generalized Additive Models: An Introduction with R (2nd edn). Boca Raton, FL: Chapman and Hall/CRC.
Yang, W., Lu, X. and Weigle, S. C. (2015) Different topics, different discourse: relationships among writing topic, measures of syntactic complexity, and judgments of writing quality. Journal of Second Language Writing 28: 53–67.