What is a word?

The way I intend to develop this question is contextualized by another question-and-potential-answer, namely, Q: ‘What is language?’ and, A: ‘Language is words.’

Broadly speaking, ‘language is words’ is not a totally terrible thesis. Many frames we have for thinking about language conceive of ‘language’ as words put together, in bits or en masse, perhaps in some ordered way. At the same time, it is possible to think of things which are very like language but which might not strictly involve ‘words’. In these cases I would wager that we would tend to say that they work like language, though their objects are different, i.e. their objects are not strictly words.

In these cases as well, asking the question of what makes something word-like usually works as a re-framing device, which brings us back to the question of, ‘What is a word?’

Thinking mainly in terms of information and dependencies (what I might think of as underlying the ‘structural’ tradition), I would propose the following:

  • Form is perhaps the most basic level at which we might register something as being a word. A word is sounded or printed, is either speech or script. Any other information that we might conceivably think of as having to do with a particular word is necessarily carried by this form.
  • An assumption that I have admitted is that word-forms signify, or otherwise contain retrievable information.

At the same time, the considered combination of words appears to give rise to meaning on a higher level of complexity than random word-sequences. It could be a strong illusion, but if not, then it appears that some kinds of dependencies between words must exist. There appear to be at least two broad kinds of such dependencies.

  • The first kind involves thinking of words as forming broad categories, e.g. nouns, articles, counting words, etc. It seems valid to think about words as forming categories due to certain patterns in how they may be meaningfully combined, e.g. in English, articles modify nouns. These patterns appear to be repeated at scale, or within a degree or two of fractal complexity. The test for membership of a category is usually substitution, i.e. a word that is alleged to be an English mass noun (e.g. ‘shade’) can be replaced by another word which is known to be a mass noun, and the result remains well-formed.
  • The second kind involves words selecting other words according to a test other than categorial substitution. For example, it seems much less natural to think of ‘the red herring’ than ‘a red herring’. We may describe such dependencies as ‘associative’, or ‘collocative’, etc. These terms bespeak the suspicion that the underlying heuristic is frequency of association of words, within some sort of psychological reality, or other dynamic reality (e.g. social reality).
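
To make the second kind of dependency a little more concrete, here is a minimal sketch of what a frequency-of-association heuristic might look like. It counts which word immediately precedes ‘red herring’ in a tiny corpus. The corpus sentences and the function name `left_neighbour_counts` are my own inventions for illustration; nothing here is drawn from real usage data, and a serious treatment would need a large corpus and a proper association measure.

```python
from collections import Counter

# Toy corpus, invented purely for illustration (not real usage data).
CORPUS = [
    "that clue was a red herring",
    "the rumour turned out to be a red herring",
    "he dismissed it as a red herring",
    "the red car stopped",
]


def left_neighbour_counts(sentences, phrase):
    """Count which word immediately precedes `phrase` in each sentence.

    Hypothetical helper: tokenization is plain whitespace splitting,
    which is an assumption that only suits this toy corpus.
    """
    target = phrase.split()
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        # Start at 1 so that there is always a word to the left.
        for i in range(1, len(tokens) - len(target) + 1):
            if tokens[i:i + len(target)] == target:
                counts[tokens[i - 1]] += 1
    return counts


counts = left_neighbour_counts(CORPUS, "red herring")
print(counts.most_common())
```

In this toy corpus ‘a’ precedes ‘red herring’ every time and ‘the’ never does, even though ‘the’ and ‘a’ belong to the same category and would pass the substitution test equally well. That asymmetry is the kind of thing a categorial account does not capture and an associative one does.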

The paragraph immediately preceding happens to refer to what I think is the other way to think about what words are (i.e. not as structured information), namely that words are the artefacts of some sort of dynamic process, e.g. a cognitive process (possibly an evolved one, though my position is agnostic), a signification (semiotic) process, a social process, a discursive process, a market process (e.g. words as currency), etc. This way-of-thinking is impossible to avoid, insofar as I think that no matter how you slice it, words in use represent some sort of situationally dependent correlation between a form, an image or idea, and real objects or situations. However, it would probably be best to leave developing this idea to another post on another day.

Returning our focus to the two broad kinds of dependencies I mentioned, the point I wanted to get to was that it seems the difficulty is in deciding what the next most important level-of-distinction between kinds of dependencies is. Within either paradigm there is no difficulty in recognizing exceptions and marginal cases, and any of these can be taken (though not very usefully) as recommending the other paradigm. It is also easy to see how one paradigm is better equipped to handle certain kinds of problems than the other.

However, it is more difficult to say, for example, what qualifies as a ‘compound word’, or to define what ‘compounding’ might involve as compared to simple co-occurrence. It is easy to imagine how the attempt to develop this could be made within the second paradigm, but it also bears on the question of what the starting points are in developing some sort of categorial grammar.

Within the first paradigm, the way substitutability gives rise to derivations seems to apply fractally not just to words, phrases, and ‘upwards’, but also to parts of words. This might be taken as an indication of the importance of developing and refining a theory of derivations, with global principles applying to elementary and then more complex particles.

This place, from which I tried to ‘read’ the trends in how ideas about language are developing, is about where I wanted to go, and so it shall be where I stop.


