Item Details

Discourse Classification into Rhetorical Functions for AWE Feedback

Issue: Vol 33 No. 1 (2016) Automated Writing Evaluation

Journal: CALICO Journal

DOI: 10.1558/cj.v33i1.27047

Abstract:

NLP-based technology constitutes the backbone of automated analysis of writing and feedback on grammar, usage, mechanics, style, organization, coherence, and content. However, analysis of rhetorical intent is a challenging NLP problem that remains to be addressed in order to facilitate the learning of academic genres. This paper reports on the development of the analysis engine for the [TOOL], an AWE program designed to provide genre and discipline-specific feedback on the functional units of research article discourse. Unlike traditional NLP-based applications that categorize complete documents, [TOOL]’s analyzer categorizes every sentence in the text as both a rhetorical move and step using a 17-step schema. The paper first reviews the approaches to and implementations of automated discourse categorization in order to provide a background for the automated genre analysis methodology. Then, it describes the construction of a cascade of two support vector machine classifiers trained on a multi-disciplinary corpus of annotated Introduction texts. Lastly, it demonstrates how this categorization approach was applied to the generation of feedback on the rhetorical functions of the research article genre. This work not only demonstrates the usefulness of NLP for automated genre analysis, but also paves the road for future AWE endeavors and forms of automated feedback that could facilitate construction of functional meaning in writing.
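As an illustration only, the two-stage cascade described in the abstract can be sketched as follows. This is a minimal sketch, not the authors' analyzer: simple keyword scorers stand in for the two support vector machine classifiers, and the move labels, step labels, and cue phrases are hypothetical placeholders rather than the paper's 17-step schema. The point it shows is the cascade structure: the first stage assigns a move to a sentence, and the second stage chooses a step only among the steps belonging to that move.

```python
from typing import Dict, List, Tuple

# Hypothetical cue phrases per move (stage 1). Not taken from the paper.
MOVE_CUES: Dict[str, List[str]] = {
    "establishing_territory": ["widely", "important", "previous studies"],
    "identifying_niche": ["however", "little is known", "gap"],
    "addressing_niche": ["this paper", "we propose", "this study"],
}

# Hypothetical cue phrases per step, grouped by move (stage 2).
STEP_CUES: Dict[str, Dict[str, List[str]]] = {
    "establishing_territory": {
        "claiming_centrality": ["widely", "important"],
        "reviewing_previous_research": ["previous studies"],
    },
    "identifying_niche": {
        "indicating_gap": ["little is known", "gap"],
        "raising_question": ["whether"],
    },
    "addressing_niche": {
        "announcing_research": ["this paper", "this study"],
        "presenting_method": ["we propose"],
    },
}


def best_label(sentence: str, cues: Dict[str, List[str]]) -> str:
    """Toy stand-in for a trained classifier: return the label whose cue
    phrases occur most often; ties and zero scores fall to the first label."""
    text = sentence.lower()
    scores = {
        label: sum(text.count(cue) for cue in phrases)
        for label, phrases in cues.items()
    }
    return max(scores, key=scores.get)


def classify_sentence(sentence: str) -> Tuple[str, str]:
    """Cascade: stage 1 predicts the move, stage 2 predicts a step
    restricted to the steps of the predicted move."""
    move = best_label(sentence, MOVE_CUES)
    step = best_label(sentence, STEP_CUES[move])
    return move, step


def classify_text(sentences: List[str]) -> List[Tuple[str, str]]:
    """Label every sentence of a text, one (move, step) pair per sentence."""
    return [classify_sentence(s) for s in sentences]
```

Because the second stage sees only the steps of the predicted move, an error in the first stage propagates to the second; that trade-off is inherent to any cascade of this shape.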

Authors: Elena Cotos, Nick Pendar

References:

Anthony, L., & Lashkia, G. (2003). Mover: A machine learning tool to assist in the reading and writing of technical papers. IEEE Transactions on Professional Communication, 46 (3), 185–193. http://dx.doi.org/10.1109/TPC.2003.816789


Attali, Y. (2004). Exploring the feedback and revision features of Criterion. Journal of Second Language Writing, 14, 191–205.


Bazerman, C., Bonini, A., & Figueiredo, D. (Eds). (2009). Genre in a Changing World. Perspectives on Writing. Fort Collins, Colorado: The WAC Clearinghouse and Parlor Press. Available at http://wac.colostate.edu/books/genre/


Biber, D. (1995). Dimensions of register variation: A cross-linguistic comparison. Cambridge: Cambridge University Press. http://dx.doi.org/10.1017/CBO9780511519871


Biber, D., Connor, U., & Upton, T. (2007). Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure. Amsterdam: John Benjamins. http://dx.doi.org/10.1075/scl.28


Burstein, J. (2003). The e-rater® scoring engine: Automated essay scoring with natural language processing. In M. D. Shermis & J. Burstein (Eds), Automated essay scoring: A cross-disciplinary perspective, 113–121. Mahwah, NJ: Lawrence Erlbaum.


Burstein, J. (2009). Opportunities for natural language processing research in education. Computational Linguistics and Intelligent Text Processing, 6–27. http://dx.doi.org/10.1007/978-3-642-00382-0_2


Burstein, J., Marcu, D., & Knight, K. (2003). Finding the WRITE stuff: Automatic identification of discourse structure in student essays. IEEE Intelligent Systems: Special Issue on Natural Language Processing, 18 (1), 32–39. http://dx.doi.org/10.1109/MIS.2003.1179191


Burstein, J., Tetreault, J., & Madnani, N. (2013). The e-rater automated essay scoring system. In M. D. Shermis, & J. Burstein (Eds), Handbook of automated essay scoring: Current applications and future directions, 55–67. New York: Routledge.


Chen, C. F. E., & Cheng, W. Y. E. (2008). Beyond the design of automated writing evaluation: Pedagogical practices and perceived learning effectiveness in EFL writing classes. Language Learning & Technology, 12 (2), 94–112.


Chodorow, M., Gamon, M., & Tetreault, J. (2010). The utility of article and preposition error correction systems for English language learners: Feedback and assessment. Language Testing, 27 (3), 419–436. http://dx.doi.org/10.1177/0265532210364391


Connor, U., Upton, T., & Kanoksilapatham, B. (2007). Introduction to Move Analysis. In D. Biber, U. Connor, & T. Upton (Eds), Discourse on the Move: Using Corpus Analysis to Describe Discourse Structure, 23–42. Amsterdam: John Benjamins.


Cotos, E. (2011). Potential of automated writing evaluation feedback. CALICO Journal, 28 (2), 420–459. http://dx.doi.org/10.11139/cj.28.2.420-459


Cotos, E. (2014). Genre-based automated writing evaluation for L2 research writing: From design to evaluation and enhancement. New York: Palgrave Macmillan. http://dx.doi.org/10.1057/9781137333377


Cotos, E. (2015). AWE for writing pedagogy: From healthy tension to tangible prospects. Writing and Pedagogy, 7 (2-3), 197–231.


Cortes, V. (2013). ‘The purpose of this study is to’: Connecting lexical bundles and moves in research article introductions. Journal of English for Academic Purposes, 12 (1), 33–43. http://dx.doi.org/10.1016/j.jeap.2012.11.002


Cortes, C., & Vapnik, V. (1995). Support-Vector Networks. Machine Learning, 20 (3), 273–297. http://dx.doi.org/10.1007/BF00994018


Dikli, S. (2006). An overview of automated scoring of essays. Journal of Technology, Learning, and Assessment, 5 (1). Retrieved 18 August 2007 from http://www.jtla.org.


Dörnyei, Z., & Skehan, P. (2003). Individual differences in second language learning. In C. J. Doughty & M. H. Long (Eds), The handbook of second language acquisition, 589–630. Malden, MA and Oxford: Blackwell. http://dx.doi.org/10.1002/9780470756492.ch18


Gamper, J., & Knapp, J. (2002). A review of intelligent CALL systems. Computer Assisted Language Learning, 15 (4), 329–342. http://dx.doi.org/10.1076/call.15.4.329.8270


Garrett, N. (1987). A Psycholinguistic Perspective on Grammar and CALL. In Wm. Flint Smith (Ed.) Modern media in foreign language education: Theory and implementation, 169–196. Lincolnwood, IL: National Textbook.


Gliner, J. A., & Morgan, G. A. (2000). Research methods in applied settings: An integrated approach to design and analysis. Mahwah, NJ: Lawrence Erlbaum.


Hyland, K. (2000). Disciplinary discourses. London: Longman.


Kessler, B., Nunberg, G., & Schütze, H. (1997). Automatic detection of text genre. In Proceedings of the 35th Annual Meeting of the Association for Computational Linguistics and Eighth Conference of the European Chapter of the Association for Computational Linguistics, 32–38. Association for Computational Linguistics.


Kivinen, J., Warmuth, M., & Auer, P. (1997). The Perceptron algorithm vs. Winnow: Linear vs. logarithmic mistake bound when few input variables are relevant. Artificial Intelligence, 97 (1–2), 325–343. http://dx.doi.org/10.1016/S0004-3702(97)00039-8


Kurohashi, S., & Nagao, M. (1994). Automatic detection of discourse structure by checking surface information in sentences. In Proceedings of the 15th International Conference on Computational Linguistics, Vol. 2, 1123–1127. http://dx.doi.org/10.3115/991250.991334


Litman, D. J. (1994). Classifying cue phrases in text and speech using machine learning. arXiv preprint cmp-lg/9405014.


Mann, W. C., & Thompson, S. A. (1988). Rhetorical structure theory: Toward a functional theory of text organization. Text, 8 (3), 243–281. http://dx.doi.org/10.1515/text.1.1988.8.3.243


Madnani, N., Heilman, M., Tetreault, J., & Chodorow, M. (2012). Identifying high-level organizational elements in argumentative discourse. In Proceedings of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 20–28.


Marcu, D. (2000). The theory and practice of discourse parsing and summarization. Cambridge, MA: MIT Press.


Maron, M. E. (1961). Automatic indexing: An experimental inquiry. Santa Monica, CA: Rand Corporation.


McLachlan, G., Do, K., & Ambroise, C. (2004). Analyzing microarray gene expression data. Hoboken, NJ: John Wiley & Sons. http://dx.doi.org/10.1002/047172842X


Mitchell, T. M. (1997). Machine learning. Boston, MA: McGraw-Hill.


Mladenic, D. (1998). Turning yahoo into an automatic web-page classifier. In Proceedings of the 13th European Conference on Artificial Intelligence, 473–474.


Nagata, R., & Nakatani, K. (2010). Evaluating performance of grammatical error detection to maximize learning effect. In Proceedings of COLING, 894–900.


Pendar, N., & Cotos, E. (2008). Automatic identification of discourse moves in scientific article introductions. In Proceedings of the Third Workshop on Innovative Use of NLP for Building Educational Applications, 62–70. Columbus, Ohio: Association for Computational Linguistics. http://dx.doi.org/10.3115/1631836.1631844


Porter, M. F. (1980). An algorithm for suffix stripping. Program, 14 (3), 130–137. http://dx.doi.org/10.1108/eb046814


Saricaoglu, A., & Cotos, E. (2013). A study of the inter-annotator reliability for an AWE tool: Research Writing Tutor (RWT). Paper presented at the Technology for Second Language Learning Conference, Ames, IA.


Schilder, F. (2002). Robust discourse parsing via discourse markers, topicality and position. Natural Language Engineering, 8 (3), 235–255. http://dx.doi.org/10.1017/s1351324902002905


Sebastiani, F. (2002). Machine learning in automated text categorization. ACM Computing Surveys, 34, 1–47. http://dx.doi.org/10.1145/505282.505283


Shrout, P., & Fleiss, J. L. (1979). Intraclass correlation: Uses in assessing rater reliability. Psychological Bulletin, 86 (2), 420–428. http://dx.doi.org/10.1037/0033-2909.86.2.420


Stamatatos, E., Fakotakis, N., & Kokkinakis, G. (2000). Automatic text categorization in terms of genre and author. Computational Linguistics, 26 (4), 471–495. http://dx.doi.org/10.1162/089120100750105920


Swales, J. M. (1990). Genre analysis. Cambridge: Cambridge University Press.


Taboada, M., & Mann, W. C. (2006). Rhetorical Structure Theory: Looking back and moving ahead. Discourse Studies, 8 (3), 423–459. http://dx.doi.org/10.1177/1461445606061881


Toms, E. G., & Campbell, D. G. (1999). Genre as interface metaphor: Exploiting form and function in digital environments. In System Sciences, 1999. HICSS-32. Proceedings of the 32nd Annual Hawaii International Conference On. IEEE. http://dx.doi.org/10.1109/hicss.1999.772652


Upton, T., & Connor, U. (2001). Using computerized corpus analysis to investigate the textlinguistic discourse moves of a genre. English for Specific Purposes, 20 (4), 313–329. http://dx.doi.org/10.1016/S0889-4906(00)00022-3


Van Rijsbergen, C. J. (1979). Information Retrieval (2nd ed.). London: Butterworth.


Yang, J. C., & Akahori, K. (1998). Error analysis in Japanese writing and its implementation in a computer assisted language learning system on the World Wide Web. CALICO Journal, 15 (1–3), 47–66.