Wherefore art thou Singlish?

An NYT article by Gwee Li Sui about Singlish has been making the rounds this weekend, following the news from earlier this week about Singlish lexical items being added to the OED.

As a student of the linguistics of Singlish, as well as something of a Singlish advocate (read: I generally like to talk about Singlish to anyone I think might be interested), I found both to be welcome pieces of news.

But after Haikel’s repost of Alfian’s critical comments, I was also motivated to add some comments of my own, about Alfian’s comments as well as the NYT article.

1. ‘there is no standard Singlish’ (Alfian)

True enough: ‘standard’ is difficult to sanction[1]; and yet there is something that people recognize to be Singlish when they speak it and hear it.

I know I’m used to the idea that there is a Singlish grammar, and though I would agree that this is an idea deserving of critical scrutiny, I don’t think this is a difficult idea to validate.

Is there a Singlish grammar? I think we do use Singlish systematically, and there are clearly ‘wrong’ and ‘right’ ways to use it (among a given group, at a given time: language is always changing, quickly or slowly, in small ways or large). It is true that our use of it admits a greater range of variation than languages which have been more thoroughly codified, but even when we encounter variation the intuition is that some specific parameters have been varied, rather than suspect that the speaker is speaking not-Singlish, or is speaking Singlish wrongly.

Overall I think it is probably true that ‘there will be differences in the Singlish spoken by a Malay speaker, Chinese speaker or Indian speaker’, and, Alfian contends, ‘mainly in the vocabulary used’. Beyond vocabulary, I think even some grammatical constructions are more frequently used by speakers with different linguistic backgrounds as well, for instance in how indefinite subjects are introduced, e.g. ‘Got one man looking for you just now,’ vs. other ways of expressing the same proposition – but even with this example, it is not the case that a person who wouldn’t use this construction as his/her first choice wouldn’t understand the sentence, or judge it to be ungrammatical by whatever Singlish standards (that word again) they provisionally hold.

To take up the point on vocabulary, is there a Singlish lexicon? Whereas I am personally convinced that Singlish is structurally systematic and would defend the idea, I would be much less eager to attempt to designate the boundaries of a Singlish lexicon. Whether a word is a ‘Singlish’ word, as compared to being simply a Hokkien/Malay/etc. word, or ‘from’ one of these languages, is much more contentious.

I’ve also observed how Malay words like ‘koyak’ and ‘rosak’ end up being appropriate for different things for Singlish speakers who speak Malay, as compared to Singlish speakers who don’t speak Malay (mainly Chinese speakers; I actually collected data about this, based on prompting speakers to make judgments about when these words apply, in response to being shown pictures of dysfunctional objects). Whereas I learned that ‘koyak’ means something like ‘torn’ in Malay, and would apply more to, say, a deteriorating old book or a threadbare shirt, the non-Malay speakers thought it applied best to more solid or mechanical things like a keyboard or a chair.

So I think in general a good question to ask is where are we situated, relative to where our nominally ‘Singlish’ words are from, e.g. the examples of Hokkien that Alfian identified as having been presented as examples of Singlish. Growing up not speaking Hokkien, I would have likewise been prone to identifying Hokkien expressions with Singlish as well, and to that extent I found Gwee’s take on the matter, that ‘Singlish’s status grew so powerful that the Chinese dialects took refuge in it to re-seed themselves’, to ring true.

As concerns the lexicon, I could say either that the Singlish lexicon is capacious, implying that the language has somehow ‘integrated’ the vocabulary and made it ‘native’ to the platform, in some sense; or that it is really quite limited, and Singlish speakers simply accommodate the use of non-Singlish words all the time. Falling somewhere in between, one might say that it is a thoroughly bastardized language, with the latter situation contributing to the reality of the former – which would make it much like English, Singlish’s primary lexifier.

2. ‘Its syntax is drawn partly from Chinese, partly from South Asian languages.’ (Gwee)

In fact, based on the most credible linguistic scholarship (Bao 2005, Sato 2014, some others), I would say that Sinitic and Malay languages had the clearest influences on syntax.

Of the authors I cited, Bao rated the influence of Sinitic languages (Mandarin Chinese, but also Hokkien, Teochew, maybe Cantonese) as being primary, but others (like Sato) contest this and point to the influence of other local non-Sinitic languages.

South Asian languages have definitely provided lexical items that Singlish speakers use, but whereas the influence of Sinitic languages and Malay languages has been demonstrated in the literature, the influence of South Asian languages has not. That said, I can’t rule it out, not having examined the structure of South Asian languages in detail myself. Which brings us to:

2a. ‘any authority on Singlish needs to be fluent in at least four languages so as to avoid any kinds of biases that might arise from his or her own linguistic background’ (Alfian)

I agree with this, insofar as the problem is making claims about which substrate languages have most influenced the grammar of Singlish; in that case we should have surveyed a good range of potential influences, or else refer/defer to relevant sources.

On the other hand, being able to speak a language isn’t necessarily the standard a linguist would hold for being able to understand things about a language; there are many ways to think about languages, and many things about language to be interested in understanding.

The conceit of the linguist who studies the structural organization of language (I would say all linguists rely on this, but to varying degrees) is that we can understand the interrelation of grammatical systems on an abstract, universal level. A syntactician would also probably agree that having such an understanding of a language is materially different from mastering the language, but I think this shows that these are simply different knowledges. It is often the case, for instance, that someone working intensively on Javanese syntax gains some ability to converse in Javanese, but in presenting the work to someone who works on, say, Somali, the fact that the linguist studying Somali doesn’t speak Javanese is not assumed to be an impediment. If there is a difficulty, it would suggest that the data from Javanese either isn’t the most relevant to the proposed problem, or that it’s just poorly analyzed.

In trying to make sense of Singlish – what it is, where it comes from – the concept of a ‘contact language’ comes to mind. I might start an explanation with: Singlish is a language formed in a contact situation. Some of the languages (Sinitic languages, Malay) we regard as being part of the substrate, and the lexifier language (English) is regarded as the superstrate.

In looking at the syntax of Singlish, one gets the sense of an organic hybridity.

When it comes to contact languages, researchers are often interested to observe what features of the substrate end up featuring in the contact language. One situation we refer to is substrate transfer[2], which is when a certain grammatical structure or system from the substrate language is reproduced in the contact language.

Another situation we refer to is substrate reinforcement, which is when a certain grammatical structure or system is found in more than one of the major substrate languages. This could have a stabilizing effect (i.e. the feature is more likely to be retained over time), and I would speculate that it probably even promotes transfer in the first place (though some may object to me tying reinforcement with transfer).

A related idea is congruency, which is when a structure is found in both the substrate and the superstrate. This is also thought to facilitate transfer.

When talking about contact languages, the term ‘contact ecology’ tends to come up, and I find that thinking about an idea like substrate reinforcement involves the kind of thinking that tries to discern and evaluate the contribution of multiple factors that give rise to the more-or-less equilibrium state of affairs.

The idea of ‘equilibrium’ also reflects the idea that the situation under examination is fundamentally dynamic: the idea I mentioned earlier, that language is always changing, features in historical linguistics, but is also especially pertinent to contact languages.

From the work I’ve seen about Singlish syntax, I’m convinced that patterns of imperfect reinforcement and levelling give rise to a grammar that is fundamentally hybrid. Alfian cites the example of the deletion of the copular verb (‘is’, ‘are’), which he observes in Malay, but which is also found in Chinese, and which is why I think copula deletion has stuck around. Alfian’s example is, ‘They tired already but still want to play,’ (diorang dah penat tapi masih nak main), and in Chinese, I might say, ‘他们累了还要玩’ (plural-pronoun tired already still want play).
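The three notions above (transfer, reinforcement, congruency) can be read as simple conditions on where a feature is attested. The sketch below is only a toy restatement of those definitions, not an analysis: the function name `classify` and the feature table are assumptions made for illustration, with copula deletion recorded as present in Chinese and Malay (as in the example above) and absent from English.

```python
# Toy restatement of the definitions; names and table are illustrative assumptions.
SUBSTRATES = {"Chinese", "Malay"}
SUPERSTRATE = "English"

def classify(attested_in, in_contact_language):
    """Return the labels that apply to a feature.

    attested_in: set of source languages in which the feature is found.
    in_contact_language: whether the feature shows up in the contact language.
    """
    labels = []
    substrates_with_feature = attested_in & SUBSTRATES
    # Transfer: a substrate feature is reproduced in the contact language.
    if in_contact_language and substrates_with_feature:
        labels.append("transfer")
    # Reinforcement: the feature is found in more than one substrate.
    if len(substrates_with_feature) > 1:
        labels.append("reinforcement")
    # Congruency: the feature is found in both a substrate and the superstrate.
    if substrates_with_feature and SUPERSTRATE in attested_in:
        labels.append("congruency")
    return labels

copula_deletion = classify({"Chinese", "Malay"}, in_contact_language=True)
print(copula_deletion)
```

On these assumptions copula deletion comes out as both transferred and reinforced, which is the reading suggested above for why it has stuck around.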

The grammar is also still changing, because some patterns of either reinforcement or congruence leave room for ambiguity[3].

TL;DR: How Singlish is spoken does vary, but I think there is a core Singlish grammar. The syntax of Singlish is hybrid.

1. As some contend, the idea of ‘standard’ itself as applied to the languages people speak is always problematic: standards are institutions, but (fortunately) people seem to dislike hewing to just one institution at a time, all the time. 

2. Bao 2005 argues that, for instance, the aspectual system of Chinese can be observed in Singlish, though not every element was perfectly translated. (An example: Perfective aspect is marked by ‘already’ in Singlish, e.g. ‘He eat already.’ In Chinese, I might say ‘他吃了,’ which is, character-by-character, ‘male-pronoun eat perfective-aspect’. We mark tense in English morphologically, e.g. ‘He has eat-en,’ or, ‘He ate,’ and we don’t usually think of aspect as being tied into the grammar of English, but we can indicate aspect with words like ‘earlier’ or with constructions like ‘has finished eating’.)

3. An example: some speakers I consulted thought the sentence, ‘A man wore jacket,’ was not just odd, but outright ungrammatical, while other speakers thought it was perfectly fine. I hypothesized it was about an ambiguity about whether ‘a man’ should be analyzed as a subject or topic, which would affect whether a mechanism from English should be given precedence.


  • BAO ZHIMING (2005). The aspectual system of Singapore English and the systemic substratist explanation. Journal of Linguistics, 41, pp 237-267.
  • YOSUKE SATO (2014). Argument ellipsis in Colloquial Singapore English and the Anti-Agreement Hypothesis. Journal of Linguistics, 50, pp 365-401.

W1 Tuesday: Interfaces and Typologies

I missed writing yesterday’s post, and spent some of today out doing some admin (transporting items from home, getting shorn, etc.).

Two main texts I reviewed in detail, the first being Reuland and Meulen’s introductory chapter (pp.1-20) to The Representation of (In)definiteness.

My interest in the book was simply that I’d not seen noun phrases examined at this level before (basically at the level where the syntax and semantics interface, where the close examination could tell us more about the semantics than the historical discussion, valuable and irreplaceable as that is). Potentially, a direction for my research will be to look at noun gradeability, and understanding an aspect of noun phrases even for something not directly related to gradeability seemed like a useful exercise when I picked up this book some time ago.

The prior reference point I had for the idea of (in)definiteness was Russell’s theory of definite descriptions, that referring expressions he would call ‘definite descriptions’ rather than indefinite ones are only evaluable if they refer to a real entity, i.e. the reason why saying ‘the King of France’ is strange, since there is currently no such King…

In English, at least, the distinction between definite expressions and indefinite ones is captured by the alternation of ‘a’ and ‘the’, for a broad class of nouns. On a more rigorous analysis, however, the main observation is that there are clear situations where either the definite or indefinite is strongly preferred, for instance in a sentence like, ‘There ensued a/the* riot on Massachusetts Ave.’ They refer to such situations as exhibiting a definiteness effect (DE).

Interestingly, even across languages, R&M note that where DEs are observed, some semantic conditions are observed quite consistently. Likewise, some syntactic conditions also seem to be observable, e.g. the correlation between an expletive subject and an indefinite argument elsewhere in the clause.

R&M proceed to discuss three approaches to analyzing definiteness (Milsark 1977; Barwise and Cooper 1981; and finally Heim 1982). Heim’s analysis I found particularly interesting. R&M characterize Heim’s as a ‘dynamic’ approach (in contrast to Milsark’s and B&C’s static ones) in that the interpretation of an NP is analyzed as a process.

The second text from this book I dug into was David Gil’s chapter (pp.254-269), titled ‘Definiteness, Noun Phrase Configurationality, and the Count-Mass Distinction’.

Gil’s interesting result is that there are at least two clearly distinct and definable NP typologies. His thesis is that ‘NP typology is a joint product of the two covarying parameters of configurationality and the count-mass distinction.’ He notes seven typological correlates, each of which either a Type A language has while the Type B language doesn’t, or vice-versa. His examples contrasted mainly English and Japanese.

To illustrate the count-mass example, he observes that in Type A languages usually a large class of nouns e.g. ‘count nouns’ are obligatorily marked for number, whereas in Type B languages there is either no marking (Japanese, Chinese), it is optional, or it has a special interpretation or occurs in specifically constrained contexts (Turkish, Indonesian). He shows how the parameters that determine this also influence other issues like whether (in)definiteness is obligatorily marked.

The other key term in his thesis, ‘configurationality’, refers to whether noun phrases have a hierarchical structure internally. Whereas Japanese does not, English does. Correspondingly, in English it is the determiner system within which NPs are integrated, whereas Japanese cannot truly be said to have determiners. An example that illustrates this difference is how the English phrase ‘Sam’s three books that Cyril read’ can only occur in this configuration as a noun phrase, whereas in Japanese each nominal or adnominal in the phrase can grammatically appear in any order.

Another interesting piece of data Gil introduces is of adnominal distributive numerals in Japanese (sansatsuzutsu meaning potentially ‘three each’ or ‘three at a time’ or ‘in threes’ – all actually subtly different in English). Gil argues that in the phrase sansatsuzutsu no hon the adnominal distributes over the nominal head, but basically Type A languages cannot have adnominal distributive numerals because numerals are determiners and not really adjectival or nominal modifiers.

I’ve not had much experience with language typology either, but I found Gil’s analysis illuminating. It raises the question of how or whether I might do a similar analysis for Singlish in my project. I’m interested to find out my supervisor’s opinion about this analysis as well.


What is a word?

The way I intend to develop this question is contextualized by another question-and-potential-answer, namely, Q: ‘What is language?’ and, A: ‘Language is words.’

Broadly speaking, ‘language is words’ is not a totally terrible thesis. Many frames we have for thinking about language conceive of ‘language’ as words put together, in bits or en masse, perhaps in some ordered way. At the same time, it is possible to think of things which are very like language which might not strictly involve ‘words’, but in these cases I would wager that we would tend to say that they work like language, though their objects are different, i.e. their objects are not strictly words.

In these cases as well, asking the question of what makes something word-like usually works as a re-framing device, which brings us back to the question of, ‘What is a word?’

Thinking mainly in terms of information and dependencies (what I might think of as underlying the ‘structural’ tradition), I would propose the following:

  • Form is perhaps the most basic level at which we might register something as being a word. A word is sounded or printed, is either speech or script. Any other information that we might conceivably think of as having to do with a particular word is necessarily the subject of this form.
  • An assumption that I have admitted is that word-forms signify, or otherwise contain retrievable information.

At the same time, the considered combination of words appears to give rise to meaning on a higher level of complexity than random word-sequences. It could be a strong illusion, but if not then it appears that some kinds of dependencies between words must exist. About these dependencies, there appear to be at least two broad kinds.

  • The first kind involves thinking of words as forming broad categories, e.g. nouns, articles, counting words, etc. It seems valid to think about words as forming categories due to certain patterns in how they may be meaningfully combined, e.g. in English articles modify nouns. These patterns appear to be repeated at scale, or within a degree or two of fractal complexity. The test for membership of a category is usually substitution, i.e. that a word that is alleged to be an English mass noun (e.g. ‘shade’) can be substituted by another word which is known to be a mass noun.
  • The second kind involves words selecting other words according to a test other than categorial substitution. For example, it seems much less natural to think of ‘the red herring’ than ‘a red herring’. We may describe such dependencies as ‘associative’, or ‘collocative’, etc. These terms bespeak the suspicion that the underlying heuristic is frequency of association of words, within some sort of psychological reality, or other dynamic reality (e.g. social reality).
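The second kind of dependency can be made concrete with a simple count. A minimal sketch, assuming a tiny made-up corpus (the sentences below are invented for illustration and are not data): it counts which words immediately precede ‘red herring’, which is the sort of frequency-of-association heuristic that terms like ‘collocative’ point to. The function name `preceding_word_counts` is likewise just a label chosen here.

```python
from collections import Counter

# Invented toy corpus; a real study would use a large corpus instead.
corpus = [
    "that clue was a red herring",
    "the detective ignored a red herring",
    "it turned out to be a red herring",
    "she ate the red herring for lunch",
]

def preceding_word_counts(sentences, phrase):
    """Count the words that occur immediately before `phrase`."""
    target = phrase.split()
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        # Start at 1 so that there is always a preceding word to record.
        for i in range(1, len(words) - len(target) + 1):
            if words[i:i + len(target)] == target:
                counts[words[i - 1]] += 1
    return counts

counts = preceding_word_counts(corpus, "red herring")
print(counts.most_common())
```

A categorial test would treat ‘a’ and ‘the’ as interchangeable here, since both are articles; the count is what registers the preference for one over the other.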

The paragraph immediately preceding happens to refer to what I think is the other way to think about what words are (i.e. not as structured information), namely that words are the artefacts of some sort of dynamic process, e.g. a cognitive process (possibly an evolved one, though my position is agnostic), a signification (semiotic) process, a social process, a discursive process, a market process (e.g. words as currency), etc. This way-of-thinking is impossible to avoid, insofar as I think that no matter how you slice it, words in use represent some sort of situationally dependent correlation between a form, an image or idea, and real objects or situations. However, it would probably be best to leave developing this idea to another post on another day.

Returning our focus to the two broad kinds of dependencies I mentioned, the point I wanted to get to was that it seems the difficulty is in deciding what the next most important level-of-distinction between kinds of dependencies is. Within either paradigm there is no difficulty in recognizing exceptions and marginal cases, and any of these can be taken (though not very usefully) as recommending the other paradigm. It is also easy to see how one paradigm is better equipped to handle certain kinds of problems than the other.

However it is more difficult to say, for example, what qualifies as a ‘compound word’, or define what ‘compounding’ might involve as compared to simple co-occurrence. It is easy to imagine how the attempt to develop this could be made within the second paradigm, but it is also important to questions of what the starting points are in developing some sort of categorial grammar.

Within the first paradigm, how substitutability gives rise to derivations seems to apply fractally to not just words, phrases, and ‘upwards’, but also to parts of words. This might be taken as indication of the importance of developing and refining a theory of derivations, with global principles applying to elementary and then more complex particles.

This place, from which I tried to ‘read’ the trends in how ideas about language are developing, is about where I wanted to go, and so it shall be where I stop.

Linguistic Productivity: Initial Thoughts

What is linguistic ‘productivity’?

A quick search finds that the emphasis tends to be on morphological derivation, especially when it comes to newly coined words. Taking the verbal form of ‘tweet’ as an example, we might derive forms like ‘tweeting’, ‘tweeted’, etc. according to the conventions of English verbal morphology.
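To make ‘according to the conventions of English verbal morphology’ concrete, here is a minimal sketch of the regular pattern only. The function `inflect` is a hypothetical helper written for illustration; it handles the plain case and the final-‘e’ case, and deliberately ignores irregular verbs, consonant doubling, and stems ending in sibilants or ‘ee’.

```python
def inflect(verb):
    """Derive regular English verb forms for a (possibly newly coined) stem.

    Simplified on purpose: no irregular verbs and no consonant doubling,
    so it would get 'stop' -> 'stopped' wrong.
    """
    if verb.endswith("e"):
        # e.g. 'like' -> 'liking', 'liked'
        return {
            "progressive": verb[:-1] + "ing",
            "past": verb + "d",
            "third_singular": verb + "s",
        }
    return {
        "progressive": verb + "ing",
        "past": verb + "ed",
        "third_singular": verb + "s",
    }

forms = inflect("tweet")
print(forms)
```

The point of the sketch is that nothing about ‘tweet’ had to be listed in advance: the new stem simply goes through the existing rules.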

There are cases which are potentially harder to classify. How, for instance, should we describe the relationship between ‘selfie’ and ‘wefie’? Not surprisingly, derivation processes also vary by region; I remember hearing about Japanese ‘selca’ (presumably selfcamera) from my sister before ‘selfie’ became the thing it is in culture.

Looking into the original derivation of ‘selfie’ itself turned up an interesting tangential finding of its own. As observed by etymonline, the ‘-ie’ suffix is now far less common than the ‘-y’ suffix in English. It is the function of the ‘-y’ suffix itself that is the point of interest, in that it appears terribly general in its potential scope of application, acting as a noun suffix, an adjective suffix (e.g. frost → frosty), a general state/condition suffix (crampy, perhaps?), and as a kind of diminutive in pet names (e.g. kitty). This generality of scope readily suggests a range of conjectures regarding the semantic, syntactic, and even prosodic functions and constraints relevant to this suffixation.

These being my first scribblings on the subject, I don’t want to travel too far down that road, but I do want to record a few general questions that seem worth exploring:

  1. Can we identify, describe, and compare specific productive strategies?
  2. Can we measure productivity or productive force? Are there ways to measure relative preference?
  3. What could the presence of (or preference for) certain productive strategies in a particular dialect, idiom, or language tell us, if anything?
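On the second question, one measure sometimes used in corpus work is Baayen’s hapax-based productivity index: for a given affix, the number of word types with that affix that occur exactly once, divided by the total number of tokens with that affix. The sketch below is only meant to show the arithmetic. The token list is invented, the function name `hapax_productivity` is a label chosen here, and identifying ‘-y’ words by a bare string-ending test is a crude assumption that a real study would replace with proper morphological analysis.

```python
from collections import Counter

def hapax_productivity(tokens, has_affix):
    """Hapax-based productivity: (types seen once) / (all tokens), among words with the affix."""
    counts = Counter(t for t in tokens if has_affix(t))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    hapaxes = sum(1 for c in counts.values() if c == 1)
    return hapaxes / total

# Invented token list, for illustration only.
tokens = ["frosty", "frosty", "kitty", "crampy", "frosty", "kitty", "selfie", "tweet"]
p = hapax_productivity(tokens, lambda w: w.endswith("y"))
print(p)
```

Whether a measure of this kind carries over to compounding or syntactic processes, where there is no affix to count, is a separate matter that the sketch does not address.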

The context I have in mind is Singapore English, which I would characterize as a contact language where the substrate language(s) are Chinese dialects. (While lexical borrowings tend not to be from the Chinese languages, they form the ‘substrate’ in that they provide the syntactic structure.) The difficulty in applying an idea like linguistic productivity is that unlike, say, English, which has derivational morphology, in languages like Mandarin Chinese things like part-of-speech and tense/aspect would be indicated through processes like compounding, or inferred from the overall syntax of phrases and sentences. This leads us to the question of whether ‘productivity’ is potentially applicable beyond morphology at all, to syntactical processes or perhaps lexical compounding.

I’ve got a bunch of materials to read (including an article on ‘Lexical productivity versus syntactic generativity’), though I’m not sure where they might lead. This is part of a project I’ll be working on over the next few months.

The Polynesian /ko/, and a Reflection on Syntactic Categories

Most recently, I’ve been studying Hovdhaugen et al.’s Handbook of the Tokelauan Language for the preparation of a grammar sketch. My interpretation of the morphemes in the subsequent example from the text is based on their analysis. In particular, when I use the abbreviations ‘VP’ and ‘NP’ below, I refer to their explanatory framework, wherein all sentences in Tokelau may be divided into VPs and NPs only, and whether a phrase is a VP or NP depends on the central word. Each VP and NP has slots in which grammatical particles (prepositions, etc.) can appear.

I ended up looking in some detail at one particular example, which involved three variants of a sentence, each having the meaning, ‘The boys built the house.’

  1. The first variant had the phrase-order: [VP Na fau] [NP2 te fale] [NP1 e na tama] .
  2. The second variant had the phrase-order: [NP1 Ko na tama] [VP na ki latou faua] [NP2 te fale] .
  3. The third variant had the phrase-order: [NP1a Ko na tama] [VP na fau] [NP1b e ki latou] [NP2 te fale] .

Some of the elements that appear in the above examples, interpreted:

  • NP2, /te fale/, might be glossed as ‘the house’.
  • The main element of NP1, /na tama/, might be glossed as ‘the boys’. ‘na’ here is analyzed by Hovdhaugen et al. as a plural article.
  • The main element of VP, /na fau/, includes the morpheme /fau/ for ‘build’. ‘na’ in this case is analyzed as a past-tense marker (distinct from the article). The form ‘faua’ we see in the second sentence is necessitated by the inclusion of the pronoun /ki latou/ in the verb phrase.
  • The words /ki latou/ are the third-person plural pronoun (‘them’). It appears in the VP of the second sentence, and its own NP in the third sentence.
  • The word /e/ which we see in the first sentence in NP1 and the third sentence in NP1b was interpreted as a preposition-of-sorts meaning ‘by’. In the above examples, it communicates that ‘na tama’ are the agents.

The element which I did not cover in the above list is /ko/. In Hovdhaugen’s grammar, /ko/ is described as occurring in sentences where an NP is the initial phrase. Ergo, we do not see /ko/ in the first sentence, in which the VP is the initial phrase, but it appears in the second and third sentences.
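The distributional claim about /ko/ can be checked mechanically against the three variants. A minimal sketch: the bracketed strings are copied from the list above, while the helper names (`parse`, `np_initial`, `has_ko`) are made up for illustration. Three sentences are of course far too few to establish the generalization; the code only shows that the examples are consistent with it.

```python
import re

# The three variants as given above, in labelled-bracket notation.
variants = [
    "[VP Na fau] [NP2 te fale] [NP1 e na tama]",
    "[NP1 Ko na tama] [VP na ki latou faua] [NP2 te fale]",
    "[NP1a Ko na tama] [VP na fau] [NP1b e ki latou] [NP2 te fale]",
]

def parse(bracketed):
    """Split '[LABEL w1 w2] ...' into a list of (label, [words]) pairs."""
    phrases = []
    for label, body in re.findall(r"\[(\S+)\s+([^\]]+)\]", bracketed):
        phrases.append((label, body.lower().split()))
    return phrases

def np_initial(phrases):
    """True if the first phrase of the sentence is labelled as an NP."""
    return phrases[0][0].startswith("NP")

def has_ko(phrases):
    """True if the word 'ko' occurs anywhere in the sentence."""
    return any("ko" in words for _, words in phrases)

results = [(np_initial(p), has_ko(p)) for p in map(parse, variants)]
print(results)
```

In each pair the two values agree: /ko/ is present exactly in the NP-initial variants.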

Regarding the significance of /ko/, one account that the authors present is that between a sentence with a /ko/-initial NP in the sentence-initial position and a similar sentence with the VP in the sentence-initial position, the /ko/-initial sentence has the discursive function of giving information about the subject who participated in event x, whereas the VP-initial sentence describes event x, wherein the ‘subject’ was a participant.

One interpretation of this difference is that /ko/ has the function of indicating the subject of the sentence. Relatedly (but not necessarily), one might say that /ko/ marks nominative case.

But the difficulty is precisely in characterizing its function. The category of ‘subject’ itself is difficult to specify, and it is not clear whether the system of markings in Tokelau should indeed be characterized as a system of cases.

The /ko/ morpheme itself is part of several Polynesian languages, and as such has been treated in the academic literature before. Among these descriptions of Polynesian languages, /ko/ has been interpreted as a predicate marker, as well as a ‘copular preposition’. The notion of ‘copular preposition’ is potentially resolvable with Hovdhaugen et al.’s grammar, in that they propose a broadly defined ‘Preposition’ slot in noun phrases. The idea that /ko/ marks a predicate is potentially more difficult to resolve, in that Hovdhaugen et al. analyze it as an element of a noun phrase specifically, whereas ‘predication’ itself presupposes a concept of predicate and argument.

Moving away from a discussion of abstract categories for the moment, in the context of the example sentences, one might imagine the effect of ‘ko’-involving constructions to be commensurate with the effect of constructing English sentences beginning with, ‘It is…’ (Credit to my Prof. for this discussion point.) For example, the third sentence above (‘Ko na tama na fau e ki latou te fale’) might be constructed as, ‘It is the boys who built the house,’ or even, ‘It is the boys, they built the house.’ (As compared to the default, ‘The boys built the house.’)

The ‘it is’ construction in English itself seems to me to reflect aspects of both the copular preposition and predicate marker proposals, in that they both serve to isolate some function or aspect of the subject, which thereby allows it to be foregrounded in the sentence. (In fact, I think it is fair to say that both the quality of the ‘it’ and the ‘is’ in this construction are recognized as exceptional in some way; we call this particular ‘it’ the expletive ‘it’, and ‘is’ in English is what we have traditionally called a copula, and is in any case not a typical ‘verb’.)

On a more fundamental level, I see the concept of predication and the function of a copula as operating on that troublesome boundary between noun-iness and verb-iness.

Earlier today I also happened to attend a senior’s comps talk on ambiguity and syntactic reanalysis. A brief illustrative example of both concepts is with the sentence, ‘The old man the boat.’ While it seems wildly ungrammatical on an initial reading, there is a grammatical interpretation (which I assume you have come to).

One theory of what causes that initial cognitive difficulty is that we construct the syntactic relations between the elements as we read the sentence, and upon encountering a difficulty in continuing the construction, we have to reanalyze the sentence. It was, however, a later point in the talk that seemed relevant to the problem of characterizing the function of /ko/. The presenter quoted a study wherein sentences like, ‘While John hunted the deer ran into the woods,’ were shown to subjects, after which the subjects were asked if John hunted the deer. The result was that many subjects answered in the affirmative. (In contrast, almost all of my friends in the linguistics department at the comps talk answered in the negative; I think there is a simple explanation for this discrepancy, which is that we’d been prompted to process things syntactically by having considered earlier ambiguous examples in the talk.)

That most respondents put ‘hunt’ and ‘deer’ together was thoroughly unsurprising, and that this superseded a syntactic judgment which (I believe) they were capable of making suggests that syntax is more superficial than fundamental. For instance, the categories of ‘noun’ and ‘verb’ are treated in our syntax as so clearly distinct that we may decisively call ‘man’ in ‘The old man the boat,’ a verb. By the same theory, we may conceive of the syntactically correct way to put ‘hunt’ and ‘deer’ together.

However, would we be able to make the same claim as decisively about the syntax of another language? Like the copula ‘is’ and the expletive ‘it’ in English, the Polynesian /ko/ seems to occur at that troublesome noun/verb interface, insofar as it seems potentially to mark a subject, suggest an agent (in Theta theory, any verb may be analyzed as having a certain configuration of nominal arguments, for example in how the verb ‘nag’ suggests a nagger and someone who is nagged), or even assign nominative case. Potentially none of these accounts are coherent, if our categories ‘noun’ and ‘verb’ need to be constructed differently.



This post is about a particle of uncertain function in the language we’ve been studying in ‘Field Methods’. It was observed both as a sentence-particle, as well as part of some verbs, and also as a noun-suffix. The question is about whether these are related, and what the relation might be.

In Parts I – III, I survey some of the relevant data. In Parts IV and V, I propose two hypotheses about the particle. In Part VI, I conclude that there may be two distinct ‘so’ elements, and that each may give rise to an unexpected relationship between data-points, but that neither unexpected relationship may have to do with directionality.

I. ‘sɔ’ in a verb

Doing a quick ctrl+F of ‘sɔ’ in the Swadesh list main sheet, we see it come up in the verbs ‘walk’ and ‘flow’. ‘walk’ is given as [sɔʔɔd], but in the sentences it is changed to [usʌɔte]:

[ħælimo wʔheɪ usʌɔte sɪ dɔχsehɔ] (Halimo walked quickly, [dɔχsehɔ] meaning quickly.)
[ħælimo wʔheɪ usʌɔte sɪ tartibə] (Halimo walked slowly, [tartibə] meaning slowly.)

For ‘flow’, it comes up as:

[ħælimo di:keɾɑ wu: sɔdea] (Halimo’s blood flows, [di:keɾɑ] meaning ‘her blood’.)

One interesting point about the verbs for ‘flow’ and ‘walk’ is their superficial similarity in terms of consonants and vowels, as well as meaning; it was noted in the spreadsheet that the consultant said that the word she supplied for ‘flow’ was an inexact translation, and that it is also used to mean ‘walk’.

II. ‘walk’ and ‘flow’

We have two sentences for ‘walk’ in the spreadsheet and one from Moodle.

a. [ħælimo wʔheɪ usʌɔte sɪ dɔχsehɔ/tartibə] – Halimo walked quickly/slowly.
b. [wəhħɐ̃n dʊl soʔəde kɛɪmɐhɐ] – I was walking in the woods.
c. [wuhħɐn kʌsi/ku soɜdeɪ dukaːnkə] – I walk from/to the store.

From our study of verbal inflection, it seems to me that the ‘-adə-’ suffix/infix indicates progressive tense or aspect. In (a), the irregular form might be due to the verb inflection reflecting gender.

At the same time, we have a sentence for ‘flow’ from the spreadsheet.

d. [ħælimo di:keɾɑ wu: sɔdea]

Here the consultant observed that [sɔdea] was an inexact translation of ‘flow’, and that it was also used to mean ‘walk’.

These two data-points suggest that if ‘sɔ’ isn’t a morphological root, it might be an etymological one. On the broadest level of meaning, it could encode something like change in general (of position, flux). 

III. ‘so’ in verbs, conjugations, and sentences

ctrl+F for ‘so’ turns up more results, as part of a verb, conjugation, or in a sentence as an apparently separate morpheme.

One tendency is for it to turn up in sentences which involve a change in position: going in the house (sentence), walking in the woods (verb), or falling down (sentence).

A second trend is in sentences with the word ‘will’, e.g. ‘will wash the dog’, ‘will see the snake’. This is similar to the sentence with ‘so’ about swimming in the immediate future which we elicited in class.

A third interesting sentence is about the bag that Farah brings with him to school:

[fɐɾɐħ ʃɪndʌdiso wəhe lɐkħɐbte skul]

[ʃɪndʌdiso] here was glossed as ‘his bag’ in the spreadsheet, but from some other tests to do with possession, I think it probably means something more like ‘with his bag’.

The fourth instance was also interesting, this being a sentence about Farah never fearing snakes:

[fɐɾɐħ wɐlʌgis mʌ kɐʕɐbsodo mʌs.so]

[wɐlʌgis] probably means ‘never’, while [kɐʕɐbso] was glossed as ‘to fear’ in the spreadsheet. We actually see ‘so’ twice in this sentence, the other being in [mʌs.so].
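The ctrl+F search above can be made a little more systematic. What follows is only a rough sketch, in Python, which sorts each hit by where ‘so’ sits in the word: as a free-standing particle, or at the start, middle, or end of a longer word. The transcriptions are copied from sentences quoted in this post; the function names and the four position labels are my own, and a purely orthographic match like this says nothing about whether the hits are actually the same morpheme.

```python
from collections import defaultdict

# Transcriptions copied from sentences quoted in this post.
SENTENCES = [
    "gorigɐ be so gælijɛn",
    "gorigɐ be so gʌlɐ:n",
    "fɐɾɐħ ʃɪndʌdiso wəhe lɐkħɐbte skul",
    "fɐɾɐħ wɐlʌgis mʌ kɐʕɐbsodo mʌs.so",
    "wəhħɐ̃n dʊl soʔəde kɛɪmɐhɐ",
]


def classify(word, target="so"):
    """Return where `target` occurs in `word`, or None if it does not occur.

    Only one label is reported per word, with this priority:
    particle, then initial, then final, then medial.
    """
    if target not in word:
        return None
    if word == target:
        return "particle"
    if word.startswith(target):
        return "initial"
    if word.endswith(target):
        return "final"
    return "medial"


def tally(sentences, target="so"):
    """Group every word containing `target` by its position label."""
    groups = defaultdict(list)
    for sentence in sentences:
        for word in sentence.split():
            position = classify(word, target)
            if position is not None:
                groups[position].append(word)
    return dict(groups)


if __name__ == "__main__":
    for position, words in tally(SENTENCES).items():
        print(position, words)
```

Sorting the hits this way only separates the candidates for the sentence-particle from the candidates for the noun-suffix and the verb-internal element; deciding whether they are related still depends on elicitation.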

IV. Anticipation, fear of snakes

One hypothesis that I am considering is simply that ‘so’ as a sentence-particle encodes the anticipation of change. This is related to the conjecture about its etymological significance.

We see ‘so’ in sentences referring to the expectation of change (‘will wash the dog’, ‘will swim’ in the immediate future, etc.).

If ‘to fear’ was given as the translation of [kɐʕɐbso] (rather than written as a gloss), then this would seem to be a nice dovetail as far as etymological meaning is concerned. 

Generalizing this to the verbs for ‘walk’ and ‘flow’, I might say that there is an etymological relation between ‘so’ as a sentence particle and ‘so’ in some movement-verbs.

V. Noun-suffix, and a particle for subjective perspective

‘so’ in the sentence about the bag ([ʃɪndʌdiso]) and the second instance of ‘so’ in the sentence about the snake ([mʌs.so]) seem to be related to each other at the level of being noun-suffixes, indicating a noun in a specific kind of subsidiary relation to the subject. An interesting test might be to see if the word-order is flexible in either sentence, as this might lend support to the hypothesis that the suffix indicates some theta relation. Might we be able to say [fɐɾɐħ mʌs.so wɐlʌgis  mʌ kɐʕɐbsodo], for instance?

On the other hand, this suffix wouldn’t seem to be related to anticipation of change.

At the same time, if it indicates some kind of relation between subjects, then its appearance in the sentences about going into a house might be explicable in that way. The sentences about going into a house are as follows:

[gorigɐ be so gælijɛn] – They went into the house.
[gorigɐ be so gʌlɐ:n] – They go into the house.

In Part III I observed one trend about ‘so’ turning up in sentences about movement or a change in position, but this would appear to be a false lead, in light of the conjectures made so far. Whereas the sentence-particle in the sentences about falling and going into the house would be related, the ‘so’ element in the verb for ‘walk’ would be distinct.

VI. Summary 

I have speculated that ‘so’ might operate syntactically to indicate some specific relation between subjects. The relevant tests here might be about flexible word order between clauses, for example in sentences that would have to employ relative clauses in English.

At the same time, I suggest that this operation of ‘so’ might be distinct from the ‘so’ indicating anticipatory tense/aspect/modality (e.g. ‘will swim’), and that this anticipatory ‘so’ might be etymologically related to some basic verbs. Something that suggests this relation was the ambiguity about the translation of the word ‘flow’.

It might be the case, then, that despite appearances ‘so’ isn’t related to directionality or movement, but rather to the anticipation of change, while there is a distinct ‘so’ that is a syntactic marker having to do with subjective perspective.



I digress, here, but as I was conversing via keyboard with someone, the word ‘self-depredation’ was used instead of ‘self-deprecation’. Unsure about whether my interlocutor was referencing physically self-destructive behavior or not, I replied with a link to a dictionary entry, as a way of asking for clarification – probably not the best way to clarify, but I happened to be writing this at the same time…

At the same time, I came across the following argument about the relation between ‘self-deprecation’ and ‘self-depreciation’:

Self-deprecation is the act of reprimanding oneself. The term is almost always used incorrectly, to refer to self-depreciation, which means belittling oneself.

I’m not sure I buy this anonymous Wikipedia editor’s argument here; by this argument, how do we know when a self-deprecating person is self-depreciating, rather than merely verbally self-deprecating?

Etymology is not a useful guide here either, in that the sense of modern words based on precari (‘to pray’) may have changed quite a bit. Etymologically, to ‘deprecate’ means to pray away, in advance, some evil. Dictionaries now give the meaning of ‘deprecate’ as to criticize. Compare this to ‘imprecation’, which, now as ever, refers to a kind of curse; fashions are quite different, however, in that prayer isn’t promoted as a vehicle for cursing very much anymore.

All in all, the superficial and semantic similarity between ‘self-depredation’, ‘self-deprecation’, and ‘self-depreciation’ might be suggestive of some relation, but one would be hard put to supply a serious analysis of it. One would be liable to be far wrong in most cases, or so I imagine.

Predication in an Alien Language

I’m doing ‘Field Methods in Linguistics’ this term, in which we’re working on understanding an unfamiliar language, through direct interviews with our consultant (a native speaker of the language). For our first short paper, we had to propose a basic analysis of the sentence structure. I ended up coming up with an initial theory that I’m quite happy with, and I thought I’d write about how I came to it.

In a nutshell, my theory is that every sentence in the language has an obligatory predicate-marker before the predicate proper. As such, the form of a simple sentence would be something like:

(Subject) – pred – Predicate

By my hypothesis, the predicate marker cannot be omitted, and is always present in some form. The predicate itself may either be a verb, or an adjective (in the sense of ‘is red’). While the predicate marker might be thought of as functioning like ‘be’ in some ways, it doesn’t carry tense; in the language we’re studying it is the verb which realizes the tense/aspect, although the predicate marker is involved in various kinds of agreement. Another interesting feature is that by my hypothesis, it is never absent, so it is akin to having an ‘affirmative’ declaration with every predicate, except, of course, when the sentence expresses something other than ‘affirmative’ (e.g. negative, question, etc.).
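As a rough illustration only, the template can be written down as a small Python data structure in which the subject is optional but the marker slot can never be left empty. The class names, the gloss-style labels (PRED.AFF, PRED.NEG, PRED.Q) and the English stand-ins for words are all placeholders of my own, not forms from the language.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

SEP = " – "  # separator used in the template above


class Marker(Enum):
    """Hypothetical values for the obligatory predicate-marker slot."""
    AFFIRMATIVE = "PRED.AFF"
    NEGATIVE = "PRED.NEG"
    QUESTION = "PRED.Q"


@dataclass(frozen=True)
class Sentence:
    predicate: str                       # verb or adjective, as an English gloss
    marker: Marker = Marker.AFFIRMATIVE  # always filled: defaults to affirmative
    subject: Optional[str] = None        # may be dropped

    def gloss(self) -> str:
        """Render the sentence as (Subject) – pred – Predicate."""
        parts = [self.subject, self.marker.value, self.predicate]
        return SEP.join(p for p in parts if p is not None)


if __name__ == "__main__":
    print(Sentence("walk", subject="Halimo").gloss())
    print(Sentence("red").gloss())                   # subject dropped
    print(Sentence("walk", Marker.NEGATIVE).gloss())
```

The design choice that matters here is that the marker has a default value but no way of being absent, which mirrors the claim that an ‘affirmative’ declaration accompanies every predicate unless something else takes its place.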

But my main motivation for writing this post was to recount some of the ideas and theories that guided my thought-process. What I hypothesized to be a predicate-marker is the so-called [w-]-morpheme that we’ve been discussing in class.

  1. In one of the first few classes, a classmate mentioned a language where both affirmative and negative conditions are marked. (In contrast, in English, we only distinguish ‘waste’ and ‘do not waste’, without having to say ‘do waste’.)
  2. My lab partner had observed that negation in a sentence patterned with the appearance of the [m-] morpheme and the omission of the [w-] morpheme, in a way that was reminiscent of ‘do’-support in English (e.g. ‘He says.’ vs. ‘He does not say.’).
  3. Some data about how to form questions was also discussed in class, and it was noticed that the [w-] morpheme was also absent.
  4. In a later class, a classmate raised the example of the sentences for ‘I melted the butter.’ and ‘The butter was melted by me.’ I noticed that it was possible to omit both the subject and the object (‘I’/‘me’ and ‘butter’), which left only the predicate for ‘melt’ (in its presumably conjugated form) with a preceding morpheme.

I had also observed a few things from the data.

  1. There is a long form of the [w-] morpheme and a shorter form. I noticed that the longer form did not appear with adjectives and verbs with no direct object.
  2. I also noticed that the subject could be dropped if it was either a pronoun, or if the predicate was an adjective. With the butter example, it appeared to me that the subject was omitted in the spoken sentence, but that it was semantically present, in that it seems clearly discernible from the spoken context.
  3. I noticed that the negative marker and the question marker patterned together.

Fragments from these observations and discussions about some points of syntax conspired to bring the notion of predication to mind, and brought me back to my formal logic and computer logic lessons from last term. I was also fortunate to have been looking at adjectives in lab sessions, in that the predication of properties (‘is red’, etc.) was on my mind. After observing how subjects could be dropped, and how question-marker and negation patterned together vis-a-vis the [w-]-marker, I was led to the hypothesis described above.

All in all, this was good fun. On the one hand, the computational and logical paradigm made the patterning comprehensible, but on the other hand more organic questions about register (formality and informality) and semantics (negation is so tricky) also played a part.