Exploring the Use of Machine Learning to Automate the Qualitative Coding of Church-related Tweets

Issue: Vol 14 No. 2 (2019)

Journal: Fieldwork in Religion

Subject Areas: Religious Studies, Linguistics

DOI: 10.1558/firn.40610

Abstract:

This article builds on previous research exploring the content of church-related tweets. It does so by exploring whether the qualitative thematic coding of such tweets can, in part, be automated by the use of machine learning. It compares three supervised machine learning algorithms to understand how useful each algorithm is at a classification task, based on a dataset of human-coded church-related tweets. The study finds that one such algorithm, Naïve Bayes, performs better than the other algorithms considered, returning Precision, Recall and F-measure values which each exceed an acceptable threshold of 70%. This has far-reaching consequences at a time when the high volume of social media data, in this case Twitter data, means that the resource-intensity of manual coding approaches can act as a barrier to understanding how the online community interacts with, and talks about, church. The findings presented in this article offer a way forward for scholars of digital theology to better understand the content of online church discourse.
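As an illustration of the evaluation measures named in the abstract, the following minimal Python sketch computes Precision, Recall and F-measure for each theme by comparing human-assigned codes with a classifier's predictions, and checks each value against the 70% threshold. The theme labels, the two label lists and the helper name `precision_recall_f1` are hypothetical and chosen only for illustration; they are not the study's data, codebook or implementation.

```python
def precision_recall_f1(gold, predicted, label):
    """Return (precision, recall, F-measure) for one label.

    gold      -- codes assigned by human coders
    predicted -- codes assigned by a classifier, in the same order
    """
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)  # true positives
    fp = sum(1 for g, p in pairs if g != label and p == label)  # false positives
    fn = sum(1 for g, p in pairs if g == label and p != label)  # false negatives
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


# Hypothetical human codes and classifier output for eight tweets.
gold = ["event", "event", "prayer", "prayer", "event", "other", "prayer", "other"]
predicted = ["event", "event", "prayer", "event", "event", "other", "prayer", "prayer"]

THRESHOLD = 0.70  # the acceptability threshold mentioned in the abstract

results = {}
for theme in sorted(set(gold)):
    p, r, f = precision_recall_f1(gold, predicted, theme)
    results[theme] = (p, r, f)
    passed = all(value >= THRESHOLD for value in (p, r, f))
    print(f"{theme:7s} P={p:.2f} R={r:.2f} F={f:.2f} meets threshold: {passed}")
```

In this toy example only the "event" theme clears 70% on all three measures, which shows why the three values are reported together rather than relying on any single one.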

Authors: Anthony-Paul Cooper, Emmanuel Awuni Kolog, Erkki Sutinen

References:

Bobkowski, Piotr S., and Lisa D. Pearce

2011 Baring their Souls in Online Profiles or Not? Religious Self-disclosure in Social Media. Journal for the Scientific Study of Religion 50(4): 744–62. https://doi.org/10.1111/j.1468-5906.2011.01597.x

Burgess, Regina L.

2013 Understanding Christian Blogger Motivations: Woe unto Me If I Blog Not the Gospel. Journal of Religion, Media and Digital Culture 2(2): 1–42. https://doi.org/10.1163/21659214-90000030

Campbell, Heidi

2012 Understanding the Relationship between Religion Online and Offline in a Networked Society. Journal of the American Academy of Religion 80(1): 64–93. https://doi.org/10.1093/jaarel/lfr074

Chen, Nan-Chen, Rafal Kocielnik, Margaret Drouhard, Vanessa Peña-Araya, Jina Suh, Keting Cen, Xiangyi Zheng and Cecilia R. Aragon

2016 Challenges of Applying Machine Learning to Qualitative Coding. Paper presented at CHI 2016 Workshop on Human Centred Machine Learning, San Jose, CA, USA, 7–12 May 2016. http://chi2016.acm.org/wp/ https://doi.org/10.1109/pacificvis.2017.8031598

Cheng, Zhiyuan, James Caverlee and Kyumin Lee

2010 You Are Where You Tweet: A Content-Based Approach to Geo-locating Twitter Users. In Proceedings of the 19th ACM International Conference on Information and Knowledge Management, Toronto, Ontario, Canada, 26–30 October 2010, 759–68. https://doi.org/10.1145/1871437.1871535

Codone, Susan

2014 Megachurch Pastor Twitter Activity: An Analysis of Rick Warren and Andy Stanley, Two of America’s Social Pastors. Journal of Religion, Media and Digital Culture 3(2): 1–32. https://doi.org/10.1163/21659214-90000050

Cooper, Anthony-Paul

2014 Unwrapping Camden’s Church Tweeters: A Small-scale Thematic Study of Twitter Data. In Social Media in Social Research: Blogs on Blurring the Boundaries, edited by K. Woodfield. eBook. London: NatCen Social Research.

2017 Assessing the Possible Relationship between the Sentiment of Church-related Tweets and Church Growth. Studies in Religion/Sciences Religieuses 46(1): 37–49. https://doi.org/10.1177/0008429816664215

2018 Using Geotagged Twitter Data to Uncover Hidden Church Populations. In The Desecularisation of the City: London’s Churches 1980 to the Present, edited by D. Goodhew and A. P. Cooper, 134–47. Abingdon: Routledge. https://doi.org/10.4324/9781351167765-6

Cooper, Anthony-Paul, Joshua Mann, Erkki Sutinen and Peter Phillips

2020 Understanding London’s Church Tweeters: A Content Analysis of Church-Related Tweets Posted from a Global City. Manuscript submitted for publication.

Crowston, Kevin, Xiaozhong Liu, Eileen E. Allen and Robert Heckman

2010 Machine Learning and Rule-based Automated Coding of Qualitative Data. Paper presented at ASIST 2010, Pittsburgh, PA, USA, 22–27 October 2010.

Dann, Stephen

2010 Twitter Content Classification. First Monday 15(2).

Holmberg, Kim, Johan Bastubacka and Mike Thelwall

2016 @God Please Open Your Fridge! A Content Analysis of Twitter Messages to @God: Hopes, Humour, Spirituality, and Profanities. Journal of Religion, Media and Digital Culture 5(2): 339–55. https://doi.org/10.1163/21659214-90000085

Hutchings, Tim

2007 Creating Church Online: A Case-study Approach to Religious Experience. Studies in World Christianity 13(3): 243–60. https://doi.org/10.3366/swc.2007.13.3.243

Kaur, Gaganjot, and Amit Chhabra

2014 Improved J48 Classification Algorithm for the Prediction of Diabetes. International Journal of Computer Applications 98(22): 13–17. https://doi.org/10.5120/17314-7433

Kolog, Emmanuel Awuni

2018 Detecting Emotions in Students’ Generated Content: An Evaluation of EmoTect System. In Technology in Education. Innovative Solutions and Practices, edited by S. Cheung, J. Lam, K. Li, O. Au, W. Ma and W. Ho, 235–48. ICTE 2018: Communications in Computer and Information Science, Vol. 843. https://doi.org/10.1007/978-981-13-0008-0_22

Kolog, Emmanuel Awuni, Erkki Sutinen and Eeva Nygren

2016 Hackathon for Learning Digital Theology in Computer Science. Modern Education and Computer Science 6: 1–12. https://doi.org/10.5815/ijmecs.2016.06.01

Mitchell, Tom M.

1997 Machine Learning. New York: McGraw-Hill.

Naaman, Mor, Jeffrey Boase and Chih-Hui Lai

2010 Is It Really about Me? Message Content in Social Awareness Streams. Paper presented at CSCW 2010, Savannah, Georgia, USA, 6–10 February 2010. https://doi.org/10.1145/1718918.1718953

Scharkow, Michael

2011 Online Content Analysis Using Supervised Machine Learning—an Empirical Evaluation. Paper presented at International Communication Association (ICA) Conference 2011, Boston, USA, 20–30 May 2011.

Taheri, Sona, Musa Mammadov and A. M. Bagirov

2011 Improving Naive Bayes Classifier Using Conditional Probabilities. Paper 121 presented at 9th Australian Data Mining Conference, Ballarat, Victoria, Australia, 1–2 December 2011, 63–68.