Polysemy and word frequency: A replication
Issue: Vol 4 No. 2 (2017)
Subject Areas: Linguistics
One piece of evidence adduced by George Kingsley Zipf for his eponymous law (Zipf, 1935) and its explanation of the principle of least effort (Zipf, 1949) is the hypothesis that a word's polysemy is proportional to the square root of its frequency (Levelt, 2013). Pawley (2006), following Zipf, also proposes that 'there is a strong general correlation between frequency and the extent of polysemy'. This paper replicates Zipf's approach but with data drawn from different sources to those available to Zipf: for word frequency, Kilgarriff's list of the most frequent words in the BNC (Kilgarriff, 1995) and, as a measure of polysemy, the WordNet sense counts for the words in that list. It also takes note of the syntactic category of lexemes and uses more advanced statistical modelling than was available to Zipf. Zipf's observations are confirmed with some provisos, and their utility is examined. Explanations for the relationship remain to be established.
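The square-root hypothesis can be written as m = c · f^0.5, where m is a word's number of senses, f its frequency and c a constant. On a log-log scale this becomes a straight line with slope 0.5. The sketch below illustrates that reading only: it fits the slope by ordinary least squares on logged values, which is an assumption made for illustration and not the statistical modelling used in the paper. The function name `fit_power_law` and the data are hypothetical, with the data constructed to follow the hypothesis exactly rather than taken from the BNC or WordNet.

```python
import math


def fit_power_law(freqs, senses):
    """Fit log(senses) = log(c) + k * log(freq) by ordinary least squares.

    Returns (c, k). Under the square-root hypothesis, k should be near 0.5.
    """
    xs = [math.log(f) for f in freqs]
    ys = [math.log(m) for m in senses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    k = sxy / sxx
    c = math.exp(mean_y - k * mean_x)
    return c, k


# Synthetic illustration only: sense counts built as 0.5 * sqrt(frequency).
freqs = [100, 400, 1600, 6400]
senses = [0.5 * math.sqrt(f) for f in freqs]

c, k = fit_power_law(freqs, senses)
print(f"estimated exponent k = {k:.3f}, constant c = {c:.3f}")
```

With real frequency and sense counts, an estimated exponent well away from 0.5, or one that differs by syntactic category, would be the kind of proviso the abstract mentions.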
Authors: Koenraad Kuiper, Robert Fromont, Daniel Gerhard
Amir, Y. and Sharon, I. (1990). Replication research: A ‘must’ for the scientific advancement of psychology. Journal of Social Behavior and Personality 5 (4): 51–69.
Baayen, R. H., Shaoul, C., Willits, J., and Ramscar, M. (2015). Comprehension without segmentation: A proof of concept with naive discrimination learning. Language, Cognition, and Neuroscience 31 (1): 106–128.
Barque, L. and Chaumartin, F.-R. (2006). Regular polysemy in WordNet. LDV-Forum 21 (1): 1–14.
Chaplot, D. S., Bhattacharyya, P., and Paranjape, A. (2015). Unsupervised word sense disambiguation using Markov random field and dependency parser. Paper presented at the 29th AAAI Conference on Artificial Intelligence (AAAI-15), Austin, Texas.
Crossley, S., Salsbury, T., and McNamara, D. (2010). The development of polysemy and frequency use in English second language speakers. Language Learning: A Journal of Research in Language Studies 60 (3): 573–605.
Grimshaw, J. (1990). Argument Structure. Cambridge, MA: MIT Press.
Hernández-Fernández, A., Casas, B., Ferrer-i-Cancho, R., and Baixeries, J. (2016). Testing the robustness of laws of polysemy and brevity versus frequency. In: P. Král and C. Martín-Vide (Eds.) Statistical Language and Speech Processing. SLSP 2016. Lecture Notes in Computer Science, vol. 9918. Cham: Springer.
Levelt, W. J. M. (1989). Speaking: From Intention to Articulation. Cambridge, MA: MIT Press.
Levelt, W. J. M. (2013). A History of Psycholinguistics: The pre-Chomskian Era. Oxford: Oxford University Press.
Nation, I. S. P. (2008). Teaching Vocabulary: Strategies and Techniques. Boston, MA: Cengage Learning.
Pawley, A. (2006). Where have all the verbs gone? Remarks on the organisation of language with small, closed verb classes. Paper presented at the 11th Biennial Rice University Linguistics Symposium. Austin, Texas.
Tengi, R. I. (1998). Design and implementation of the WordNet lexical database and searching software. In: C. Fellbaum (Ed.) WordNet: An Electronic Lexical Database, 105–127. Cambridge, MA: MIT Press.
Wittgenstein, L. (1965). Philosophical Investigations. New York: The Macmillan Company.
Yang, C. (2016). The Price of Linguistic Productivity: How Children learn to break the Rules of Language. Cambridge, MA: MIT Press.
Zipf, G. K. (1949). Human Behaviour and the Principle of Least Effort. Cambridge, MA: Addison-Wesley.