Exploring the use of machine learning to automate the qualitative coding of church-related tweets
Issue: Vol 14 No. 2 (2019)
Journal: Fieldwork in Religion
Subject Areas: Religious Studies, Linguistics
DOI: 10.1558/firn.39789
Abstract:
This article builds on previous research exploring the content of church-related tweets. It does so by examining whether the qualitative thematic coding of such tweets can, in part, be automated using machine learning. It compares three supervised machine learning algorithms to understand how useful each is at a classification task, based on a dataset of human-coded church-related tweets. The study finds that one such algorithm, Naïve Bayes, performs better than the other algorithms considered, returning precision, recall and F-measure values which each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time when the high volume of social media data, in this case Twitter data, means that the resource intensity of manual coding approaches can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology to better understand the content of online church discourse.
Authors: Anthony-Paul Cooper, Emmanuel Awuni Kolog, Erkki Sutinen