Week 8 (Thursday): Characterizing NPs in the Data

This front has not been idle, though I did go through two very uncertain weeks, which were followed by two weeks of reorienting and conceptualizing.

I’m using this blog again just to sort out some thoughts from looking through the data I am compiling. I don’t think I will be able to tag everything exhaustively, but I need to note some general trends and describe how some of the categories are emerging.

Possessive Determiners

At least from a quick first look, NPs introduced by possessive determiners (pd) are the most common. Their properties tend to be [+def, +spec], and this preference is quite evident. It may be worth comparing this to the CoSiB corpus; a search may not be too difficult to implement, since pd is a closed class. If pd turns out to be very common, this would be a variational feature of note: definiteness gets marked this way preferentially, rather than by the article.
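Since pd is a closed class, the kind of search mentioned above could be sketched roughly as follows. This is only a minimal sketch under assumptions: the word lists, the function name `count_forms`, and the sample sentence are all hypothetical, and nothing here reflects the actual format of the CoSiB corpus. Note also that a plain word match over-counts, since ‘her’ can be an object pronoun and ‘his’ a possessive pronoun, so hits would still need manual checking.

```python
import re
from collections import Counter

# Assumed closed-class lists; extend with variety-specific forms if needed.
POSSESSIVE_DETERMINERS = ("my", "your", "his", "her", "its", "our", "their")
ARTICLES = ("the", "a", "an")


def count_forms(text, forms):
    """Count whole-word, case-insensitive occurrences of each form in text."""
    pattern = re.compile(r"\b(" + "|".join(forms) + r")\b", re.IGNORECASE)
    return Counter(match.group(1).lower() for match in pattern.finditer(text))


# Invented sample text, standing in for a transcript from the corpus.
sample = "My mother took her bag and the keys. Their car is at my house."

pd_counts = count_forms(sample, POSSESSIVE_DETERMINERS)
article_counts = count_forms(sample, ARTICLES)

print("possessive determiners:", dict(pd_counts))
print("articles:", dict(article_counts))
```

Comparing the two totals per speaker or per text would give a first rough indication of whether definiteness is marked by pd more often than by the article.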

Unusual nouns and N-Ellipsis

I haven’t been able to find many unmarked bare common nouns that are ambiguous between a definite and an indefinite reading. Many are mass or kind nouns, or seemingly proper nouns. A possible exception is ‘O’-level, which looks like a proper noun. Part of the case for interpreting it as a common noun is that it can be pluralized, but that argument probably won’t work. It usually means ‘~ results’ or ‘~ exams’ too, so it is easier to interpret it as N-ellipsis (nominal ellipsis).


Pro-drop is very prevalent, even mid-sentence. Comparing mid-sentence pro-drop with the usual kinds of ellipsis in English, we see that in English it is usually verbs, modals, their respective phrases, dependent clauses, or entire clauses that are gapped or sluiced, but not NPs.

‘Got’ construction

Some sentences have the structure ‘Got NP?’, and it seems that pro-drop applies, i.e. ‘e got NP?’. However, ‘Got NP’ can also occur in topic position, e.g. ‘Got one time they went KL.’ In the latter construction, ‘Got’ functions in a way comparable to expletive ‘there’.


I abandoned this post halfway; I decided I probably should not do an exhaustive analysis of trends, since there are too many dimensions and too much data to tag.

Now it’s essentially Week 10. I’m about 5k words into the Overleaf document. Some way to go.

