Exploring the use of machine learning to automate the qualitative coding of church-related tweets
Issue: Vol 14 No. 2 (2019)
Journal: Fieldwork in Religion
This article builds on previous research exploring the content of church-related tweets. It does so by examining whether the qualitative thematic coding of such tweets can, in part, be automated using machine learning. It compares three supervised machine learning algorithms on a classification task, using a dataset of human-coded church-related tweets, to understand how useful each algorithm is. The study finds that one of these algorithms, Naïve Bayes, performs better than the others considered, returning precision, recall and F-measure values that each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time when the high volume of social media data, in this case Twitter data, means that the resource intensity of manual coding approaches can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology to better understand the content of online church discourse.
Authors: Anthony-Paul Cooper, Emmanuel Awuni Kolog, Erkki Sutinen
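The abstract reports precision, recall and F-measure against a 70% threshold. As an illustration of how such values are typically derived when a classifier's labels are compared with human coding, the following is a minimal sketch in Python. It is not the authors' implementation: the theme names, the example labels and the helper functions `per_class_metrics` and `meets_threshold` are hypothetical, and the study's own coding scheme and tooling are not described in this abstract.

```python
from collections import Counter

THRESHOLD = 0.70  # the "acceptable threshold of 70%" mentioned in the abstract


def per_class_metrics(gold, predicted):
    """Return {label: (precision, recall, f_measure)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1      # classifier agrees with the human coder
        else:
            fp[p] += 1      # predicted label was assigned wrongly
            fn[g] += 1      # human-coded label was missed
    metrics = {}
    for label in sorted(set(gold) | set(predicted)):
        predicted_total = tp[label] + fp[label]
        gold_total = tp[label] + fn[label]
        precision = tp[label] / predicted_total if predicted_total else 0.0
        recall = tp[label] / gold_total if gold_total else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f_measure)
    return metrics


def meets_threshold(metrics, threshold=THRESHOLD):
    """True only if every precision, recall and F-measure value exceeds the threshold."""
    return all(value > threshold
               for triple in metrics.values()
               for value in triple)


# Hypothetical human-coded themes and classifier output (illustrative only).
gold = ["worship", "worship", "community", "community", "community", "other"]
predicted = ["worship", "community", "community", "community", "community", "other"]

metrics = per_class_metrics(gold, predicted)
for label, (p, r, f) in metrics.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
print("all values exceed threshold:", meets_threshold(metrics))
```

In this toy data one "worship" tweet is misclassified as "community", so recall for "worship" falls to 0.50 and the threshold check fails even though precision for that theme is perfect. This is why the three measures are reported together rather than relying on any one of them.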