The 2nd dimension.

Wednesday, January 4, 2017

The lexical category of particles.


 "Particles" have no part of speech.


Earlier descriptions of subcategorization

In that first generation of great young descriptivists from MIT, Robert Lees used arbitrary, artificial category symbols to express the restrictions that tree neighbors place on heads. I'm not sure I've got the following example exactly right, and I suspect Lees wrote it tongue-in-cheek:
  • Vt32 -> "give" (Robert Lees, The Grammar of English Nominalizations)
Realizing that CFG places no restriction on the names of non-terminal symbols (after all, he invented CFG), Chomsky proposed a more natural notation that states, within the category names of heads, what near neighbors can be present:
  • V -> CS (Chomsky, Aspects of the Theory of Syntax)
means that the non-terminal "V" is replaced by a complex symbol (CS), a set of feature specifications saying which tree sisters can be present for the specific verb that will replace this symbol.

However, this proposal of Chomsky's has an odd property that makes it seem artificial to me: a subcategorization restriction has to be stated twice. You wind up with trees in which, for instance, a transitive verb is dominated by a symbol whose name says it is transitive and is also followed by an NP object. We shouldn't have to say it twice.
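
To make the redundancy concrete, here is a toy tree in Python. The feature "+__NP" is a schematic stand-in of my own for "takes an NP object", not Chomsky's actual notation: the verb's node name records that it is transitive, and the NP object sits right next to it in the tree anyway.

# A toy tree illustrating the redundancy. "+__NP" is schematic, not
# Chomsky's notation.
tree = ["S",
        ["NP", "I"],
        ["VP",
         ["V[+__NP]", "tidied up"],   # the node name says "transitive" ...
         ["NP", "the room"]]]         # ... and the NP object is there as well

def yield_of(node):
    """Read the terminal words off a toy tree node."""
    if isinstance(node, str):
        return node
    return " ".join(yield_of(child) for child in node[1:])

print(yield_of(tree))   # I tidied up the room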

A better way is simply not to use special categories like "V", "N", "P" for heads. There is no syntactic evidence from language that I know of for their existence (though there may be morphological evidence). Their only purpose is to let us keep rules about words separate from rules that describe phrase structure. I guess the reason for this is that dictionaries are customarily different books from grammars. But that is not evidence from language.

Categorial Grammar (CG) does not require category symbols for heads comparable to "V", "N", "P" and so on. This lets us avoid the issue of how to handle the subcategorization of "V", ...

Illustration of Categorial Grammar

Logicians concerned with the logic of natural language have a fondness for CG, invented by the logician Ajdukiewicz, because the grammatical structure of expressions corresponds in a straightforward way to their semantic structure. Grammatical categories are structures built up with the slash connective, which gives the category of a syntactic function after the slash and the category of the function value before the slash:

"I tidied up the room", S  
    /            \  
"I", NP    "tidied up the room", S/NP  
                    /               \  
         "tidied up", (S/NP)/NP    "the room", NP 
 
Each node of the CG tree for a syntactic structure consists of a constituent and its category. In place of a "V" for the category of "tidied up", we get the category "(S/NP)/NP", which can be read as saying that this constituent has two NP dependencies which, when satisfied, will yield a constituent of category S. There is no "V" or "VP". In defense of the CG version of grammatical structure, notice that it correctly predicts that "and" cannot conjoin a transitive verb, of category (S/NP)/NP, with an intransitive verb, of category S/NP, because only expressions of the same category can be conjoined.
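
For readers who like to see the bookkeeping spelled out, here is a minimal sketch in Python of how slash categories combine. It assumes only what the tree above shows: a functor of category A/B applied to an argument of category B yields a constituent of category A.

from dataclasses import dataclass
from typing import Union

Cat = Union[str, "Slash"]          # atomic categories are plain strings

@dataclass(frozen=True)
class Slash:
    result: Cat                    # category of the value (before the slash)
    arg: Cat                       # category of the argument (after the slash)

def apply_functor(functor: Cat, argument: Cat) -> Cat:
    """Function application: A/B applied to B gives A."""
    if isinstance(functor, Slash) and functor.arg == argument:
        return functor.result
    raise ValueError(f"cannot apply {functor} to {argument}")

tv = Slash(Slash("S", "NP"), "NP")   # "tidied up": (S/NP)/NP
vp = apply_functor(tv, "NP")         # "tidied up the room": S/NP
s  = apply_functor(vp, "NP")         # "I tidied up the room": S
print(s)                             # S

# Coordination requires identical categories, so (S/NP)/NP and S/NP
# cannot be conjoined:
print(tv == Slash("S", "NP"))        # False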

This illustration does not get right the linear order of a grammatical function expression and its argument expression. More careful formulations of CG can deal with that detail. Emmon Bach proposed a special version of the slash operation, "right wrap", which we could use to get the NP "the room" to the left of "up" when "tidied up" combines with "the room".
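
Here is a rough sketch of the idea behind right wrap, on one common statement of it (the argument expression is inserted after the first word of the functor expression); I am not reproducing Bach's exact formulation.

def right_wrap(functor_words, argument_words):
    """Insert the argument expression after the first word of the functor."""
    return functor_words[:1] + argument_words + functor_words[1:]

print(" ".join(right_wrap(["tidied", "up"], ["the", "room"])))
# tidied the room up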

I find the slash operators that are required to make CG work to be artificial, but there is a way to adapt CFG to get the advantages of CG. Noticing some similarity between the information provided at each node of a CG tree and that provided in a rewrite rule of PSG, let us redo the CG tree above with PS rules at each node:

Categorial Grammar partially converted to PSG

S -> "I tidied up the room"  
    /            \  
NP -> "I"    S -> NP "tidied up the room"  
                    /                \  
         S -> NP "tidied up" NP    NP -> "the room"
 
In this form, what would be a derivation in a CFG has been expressed as a tree, with each non-leaf node produced by applying one daughter rewrite rule to the other. Since the constituent and the category at each node of the CG tree are no longer given as two separate parts, if we still wish to refer to a constituent and a category we must define them as, respectively, the pronunciation part and the non-pronunciation part of a PS rule.
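
Here is a minimal sketch of this rules-as-nodes idea, assuming a rule is just a pair of a left-hand symbol and a right-hand list that mixes terminal words with non-terminals. Combining two daughter rules means substituting one rule's right-hand side for an occurrence of its left-hand symbol inside the other rule. I substitute into the rightmost occurrence so as to reproduce the tree above; which occurrence ought to be chosen is exactly the issue taken up in (1) and (2) below.

NONTERMINALS = {"S", "NP"}

def combine(outer, inner):
    """Substitute rule `inner` into the rightmost matching non-terminal of `outer`."""
    lhs, rhs = outer
    inner_lhs, inner_rhs = inner
    i = len(rhs) - 1 - rhs[::-1].index(inner_lhs)
    return (lhs, rhs[:i] + inner_rhs + rhs[i + 1:])

def constituent(rule):
    """The pronunciation part of a rule: its terminal words, in order."""
    return " ".join(w for w in rule[1] if w not in NONTERMINALS)

tv   = ("S", ["NP", "tidied", "up", "NP"])   # S -> NP "tidied up" NP
obj  = ("NP", ["the", "room"])               # NP -> "the room"
subj = ("NP", ["I"])                         # NP -> "I"

vp = combine(tv, obj)     # S -> NP "tidied up the room"
s  = combine(vp, subj)    # S -> "I tidied up the room"
print(constituent(s))     # I tidied up the room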

This revision is unlike a real CG derivation tree in two respects. (1) There is no way to pick out which non-terminal in a PS rule represents the argument. (2) The rewrite operation of CFG is not technically a function, since applying it can give ambiguous results.
As for (1), in an earlier post, How can we describe the vertical dimension ..., I argued that the non-terminals of CFG have a place along a height scale, and that when information about grammatical relations is added, we can pick out the argument of a PS rule as the lowest, or most oblique, non-terminal. In the above illustration, we now have

S1 -> NP1 "tidied up" NP2
 
where NP2 is the lowest point and is consequently the non-terminal representing the argument. As for (2), in that post I also described how the rewrite operation of CFG can be reinterpreted as a substitution function. So now we have a full reconstruction of CG.
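
Under those two assumptions, a sketch of the full reconstruction: the non-terminals carry a height ranking (lower means more oblique), the argument slot of a rule is its lowest non-terminal, and substitution into that slot is an ordinary function. The height numbers here are my own stand-ins for the scale argued for in that earlier post.

HEIGHT = {"S1": 3, "NP1": 2, "NP2": 1}      # lower number = more oblique

def argument_slot(rule):
    """The lowest (most oblique) non-terminal in the rule's right-hand side."""
    nts = [w for w in rule[1] if w in HEIGHT]
    return min(nts, key=HEIGHT.get)

def substitute(outer, filler_words):
    """Replace the argument slot of `outer` with the given words."""
    lhs, rhs = outer
    i = rhs.index(argument_slot(outer))
    return (lhs, rhs[:i] + filler_words + rhs[i + 1:])

rule = ("S1", ["NP1", "tidied", "up", "NP2"])
print(argument_slot(rule))                   # NP2
print(substitute(rule, ["the", "room"]))
# ('S1', ['NP1', 'tidied', 'up', 'the', 'room'])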

Representation of discontinuous constituents

As a side benefit of this reworking of CG, it becomes possible to describe discontinuous constituents without making any special assumptions:

S1 -> NP1 "tidied" NP2 "up"
 
Note that "up" is given no category here. There is no longer anything corresponding to the "P" of CFG, nor does "up" have any category in the sense of "category" suggested above, namely the non-pronunciation part of a PS rule.
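
A quick check that the same machinery handles the discontinuous rule directly; the slot for NP2 simply sits between "tidied" and "up", and "up" never needs a category of its own.

rule_rhs = ["NP1", "tidied", "NP2", "up"]       # S1 -> NP1 "tidied" NP2 "up"
fillers = {"NP1": ["I"], "NP2": ["the", "room"]}

words = []
for item in rule_rhs:
    words.extend(fillers.get(item, [item]))     # fill the slots, keep plain words

print(" ".join(words))   # I tidied the room up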
