The 2nd dimension.

Wednesday, January 4, 2017

The vertical dimension of grammatical structure.

Substitution versus Replacement

Hans Reichenbach in his classic Elements of Symbolic Logic distinguishes between substitution for a variable and replacement of an expression. Substitution is uniform, in the sense that the same value must be substituted for each instance of a variable in a formula, while replacement is not uniform. When a symbol that occurs at several places in a formula is replaced, each instance can be replaced by something different.

In the classic theory of PSG (by which I mean context free phrase structure grammar), the abstract symbols are replaced, not substituted. In my own adaptation of PSG, however, substitution rather than replacement is used. In the formal systems we're all most familiar with, such as high-school algebra, it is substitution that rules as we proceed step by step to prove results in the system. The rule of substitution is connected with our notion of a variable representing some particular value. We can substitute various values for the variable that appears at one or more places in a formula.

It appears, at first glance, that in phrase structure grammar, it is replacement that we need, rather than substitution. For instance, suppose we begin with a phrase structure rule
  S -> NP gives NP NP
to describe the structure of a sentence "John gives Mary the book". If we used substitution to derive the particular NPs in the example, we could get
  John gives John John
  The book gives the book the book
  Mary gives Mary Mary
which is of course not what we want. Instead, in PSG, we use a rule of replacement, so we can replace the symbol NP with three different values.
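
To make the contrast concrete, here is a minimal sketch in Python (my own illustration, not part of the theory), treating a formula as a list of tokens; the function names are mine:

  # Reichenbach's two operations, on a formula represented as a token list.
  def substitute(formula, var, value):
      # Uniform: every occurrence of var receives the same value.
      return [value if tok == var else tok for tok in formula]

  def replace(formula, var, values):
      # Not uniform: each occurrence of var may be rewritten differently.
      values = iter(values)
      return [next(values) if tok == var else tok for tok in formula]

  formula = ["NP", "gives", "NP", "NP"]
  print(substitute(formula, "NP", "John"))
  # ['John', 'gives', 'John', 'John']
  print(replace(formula, "NP", ["John", "Mary", "the book"]))
  # ['John', 'gives', 'Mary', 'the book']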

But wait! Isn't that hasty? In the general formula for an indirect object sentence, there aren't really three NPs, since the arguments of "gives" differ along the vertical dimension. Really, the general form here is:
  S -> NP1 gives NP3 NP2
using 1, 2, 3, in roughly the way they are used in Relational Grammar. (I should mention that the canonical treatment in RG actually treats the NP3 in my example as a 2 which has been advanced from an original 3.)

So I propose that PSG is an algebraic system of the familiar sort with constants (in place of the terminal symbols of ordinary PSG) and variables (in place of the non-terminal symbols of ordinary PSG). It looks at first like a different sort of system with "rewrite rules" only when we neglect the vertical dimension which distinguishes variables bearing different grammatical relations.

Because natural languages appear not to have any constructions like the hypothetical *"John gives John John" mentioned above, I also adopt as a general principle the Stratal Uniqueness Law (hereafter SUL) of Relational Grammar, which, in my version, does not permit multiple instances of the same variable with the same grammatical relation to occur at the same level of analysis (in the same "stratum", that is).

Weak Generative Capacity

The weak generative capacity of a grammatical theory is the set of strings it characterizes as sentences of the language the theory purports to describe. Not everyone thinks this is a matter of any importance, but nonetheless, that is what I am concerned with here.

A language which is generated by a PSG (i.e., a context free phrase structure grammar) is, by definition, a context free language. Whether natural languages are context free languages is controversial, but I think they are. The variety of generative grammar with a vertical dimension that I will describe below generates only context free languages.

However, if it were not for the SUL (stratal uniqueness law), this would not be so. A grammar with variables, constants, and assignments of strings to variables can generate languages that are not context free. Below is a simple example. To distinguish assignments of string values to variables from phrase structure rules, I use an arrow "->" for a phrase structure rule (or "production") and an equals sign "=" for a string assignment rule.

This grammar,
  1. S = AA
  2. A = a
  3. A = aA
  4. A = b
  5. A = bA
generates the "copy language", each of whose sentences is some string of a's and b's followed by a copy of that string: {abab, bbabbbab, aaaabaaaab, ...}. The sentences generated are the strings assigned to the variable S.

For instance, from rule 3, using the equality in rule 4, I deduce that A = ab; then, using this in rule 1, I can substitute ab for A to get S = abab. It works like a very primitive algebraic system.
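
To check this mechanically, here is a small sketch (my own illustration) that enumerates the strings this grammar assigns to S, up to a length bound on A:

  # Rules 2-5 assign to A every nonempty string of a's and b's; rule 1,
  # S = AA, then substitutes the SAME string for both instances of A,
  # since substitution is uniform.
  from itertools import product

  def a_values(max_len):
      # All values derivable for A, up to the length bound.
      for n in range(1, max_len + 1):
          for chars in product("ab", repeat=n):
              yield "".join(chars)

  def sentences(max_len):
      # Values of S: uniform substitution into S = AA gives w + w.
      for w in a_values(max_len):
          yield w + w

  print(list(sentences(2)))
  # ['aa', 'bb', 'aaaa', 'abab', 'baba', 'bbbb']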

The copy language is known not to be context free -- it cannot be generated by any PSG. (This fact is the basis of Shieber's famous demonstration that the Swiss-German dialect he studied has a non-context free construction.)

Thus, if the theory is to be context free, in the sense that only context free languages can be generated, grammars like the above example must be disallowed. The SUL does this, since it prohibits rule 1 of the example: to conform to the SUL, a rule of the grammar may not have multiple instances of the same variable, so the two instances of the variable A in rule 1 are not legal.

Vertical Levels

There are a lot of pieces to this theory, and it's hard to figure out what should come first, but I think it is time now to give an informal description of the various levels, before trying to be precise about formal aspects of the grammar. I begin by giving a name to this variation of phrase structure grammar, "2psg", which is short for 2 dimensional phrase structure grammar. The first dimension is the ordinary left-to-right, or earlier-to-later, ordering of symbols described by concatenation. The second dimension is the vertical dimension given in tree structures, with least embedded parts of a structure coming near the top of a tree and most embedded parts coming down closer to the leaves at the bottom of a tree diagram.

Every grammatical phrase type -- S, NP, Aux, Adv, PP, ... -- has five different flavors, which are associated with varying heights in a tree diagram. The flavors are 0, 1, 2, 3, Cho, where 0 is highest, 3 is lowest, and Cho (for the chomeur of Relational Grammar) has variable height.

There is a lawful relationship between the height of a variable and the string assigned to that variable: in ordinary constructions not involving raising, a variable cannot have as value a string containing variables of greater height. For instance, an S2 (an infinitive or POSS-ing nominalization) cannot have as value (or "contain") an NP1 (a subject). I will defer formalizing this requirement, but I mention it here because it may be intuitively helpful in interpreting the following informal remarks about variable heights.

So here are notes about what I've worked out about the variables in this theory, for English grammar only, so far.

NP0 -- a vocative.
NP1 -- a subject.
NP2 -- a direct object.
NP3 -- an indirect object.
NP -- a chomeur (unclear, but perhaps the displaced original NP2 in a double object construction).
S0 -- a root sentence, in the sense of Emonds. Does not like to be embedded.
S1 -- an ordinary finite declarative clause. (Cannot contain NP0, Aux0, ...)
S2 -- an infinitive or gerund nominalized clause. (Cannot contain NP1, ...)
S3 -- a derived nominalization. (Cannot contain NP2, NP1, ...)
S -- an S chomeur, "that"-clause complement
Adv0 -- a performative adverb, referring to the speaker's act. (E.g. the adverb in "Frankly, my dear, I don't give a damn.")
Adv1 -- a "sentential" adverb, such as "probably/necessarily/..." that qualifies the truth of a sentence
Adv2 -- a moral adverb (Vendler's term), which expresses a judgment, approval, or assigns responsibility
Adv2 -- a manner adverb, which describes the way something happens
Adv3 -- a degree adverb, such as "completely", "almost"
Adv -- not known
Aux0 -- inverted auxiliary verb
Aux1 -- uninverted finite auxiliary (includes modal auxiliaries)
Aux2 -- nonfinite progressive "be/been"
Aux3 -- nonfinite passive "be/been"
Aux -- nonfinite perfect auxiliary "have" (chomeur displaced from past tense)
PPx -- It's not clear to me how to describe time and place PPs, so I'll skip these, except for the PP passive by-phrase chomeur, which works out especially nicely in this theory.

Derived Phrase Structure Rules

As PSG was classically presented, a demonstration that a sentence (or some other category) is generated consists in giving a sequence of strings, the first of which is S (or some other initial symbol), in which each subsequent string is derived from the preceding one by using a phrase structure rule to replace some non-terminal matching the left side of the rule with the string on the right side of the rule. This is not very convenient for stating the part the vertical dimension plays in 2psg.

Instead, suppose we allow phrase structure rules derived from those already in the grammar to shorten phrase structure derivations. Here is the rule for deriving a new phrase structure rule: from a rule with a non-terminal symbol A in the string on its right side, and another rule A -> x (where x is a string of terminal and non-terminal symbols), replace A in the first rule with x to get the new rule.

Here is a simple little example to illustrate. Given the PSG
  S -> NP VP  
  NP -> John  
  NP -> Mary  
  VP -> V NP  
  V -> loves  
we can derive new phrase structure rules:
  VP -> loves NP  
  VP -> loves Mary
  S -> NP loves Mary
  S -> John loves Mary  
  ... and so on. 
 
In a more interesting example, an infinity of new phrase structure rules will be derived. If we want to know whether a sentence is generated by this grammar, we need only look among the phrase structure rules to see whether the sentence is on the right side of a phrase structure rule with S on its left side.
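
The rule-deriving procedure is easy to mechanize. Here is a sketch (mine; the explicit non-terminal set and the bound on rounds are assumptions for this toy, non-recursive grammar):

  # Derive new rules: from X -> ...A... and A -> x, obtain X -> ...x... .
  # A rule is a pair (lhs, rhs), with rhs a tuple of symbols.
  NONTERMINALS = {"S", "NP", "VP", "V"}

  def derive(rules, max_rounds=5):
      rules = set(rules)
      for _ in range(max_rounds):
          new = set()
          for lhs, rhs in rules:
              for i, sym in enumerate(rhs):
                  if sym not in NONTERMINALS:
                      continue
                  for lhs2, rhs2 in rules:
                      if lhs2 == sym:
                          new.add((lhs, rhs[:i] + rhs2 + rhs[i + 1:]))
          if new <= rules:       # nothing further derivable
              break
          rules |= new
      return rules

  grammar = {("S", ("NP", "VP")), ("NP", ("John",)), ("NP", ("Mary",)),
             ("VP", ("V", "NP")), ("V", ("loves",))}

  # "John loves Mary" is generated: it appears as the right side of a
  # derived rule with S on the left side.
  print(("S", ("John", "loves", "Mary")) in derive(grammar))   # True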

So, we no longer need the mechanism of a phrase structure derivation to describe how a PSG generates a sentence, given that we can derive new phrase structure rules. With this preliminary out of the way, I can state the condition which ensures in 2psg that constituents which are lower along the vertical direction are more deeply embedded in a constituent tree.

Derived Assignments
 
Here is the preceding example PSG given to illustrate derived phrase structure rules, but made into a 2psg by using assignments of string values to variables, deriving new assignments by substituting values previously assigned to variables for those variables, and by making some other minor changes:

Basis (lexicon):
  1. S0 = S1  
  2. S1 = NP1 loves NP2  
  3. NP1 = John  
  4. NP2 = Mary  
  5. NP1 = Mary  
  6. NP2 = John 
 
Some other assignments derived by substituting values of variables:
  7. S1 = NP1 loves Mary (using the value assigned in 4 to substitute in 2)  
  8. S1 = John loves Mary (using the value assigned in 3 to substitute in 7)  
  9. S0 = John loves Mary (using the value assigned in 8 to substitute in 1) 
 
In principle, the constants in this example are phonemes, but since phonology is not part of this discussion, I have not written out the phonemic forms of the English words in the example for the time being; it doesn't matter here. Please take the words and phrases in conventional orthography in this and other examples as standing for the appropriate strings of phonemes.

Earlier, I gave as my goal the formulation of a revision to PSG which describes the vertical dimension of natural language. So far, I'm on track. Corresponding to the above example of a 2psg, there is a PSG which generates the same language, which illustrates that 2psg is also a context free theory. I can find a PSG which describes the same set of sentences (well, there are only 4) by replacing the "=" sign in the string assignments with arrows, referring to the variables as non-terminal symbols, referring to the constants (the phonemes) as terminal symbols, and treating the derivation in 7.-9. as giving derived phrase structure rules, of the sort that were given earlier.

Other requirements of PSG carry over here, with terminological adjustments. The numbers of basic assignments, of variables, and of constants are all finite.

I have several revisions to make yet before arriving at a characterization of 2psg, but I shall try to preserve this property of the theory: it is essentially a variety of PSG, that is, context free phrase structure grammar.

Categorial Functions

In Categorial Grammar (CG), popular among logicians interested in natural language, the structures of language expressions are given as pronunciations (or spellings) together with their categories. A tree is built by combining the pronunciations of daughter nodes somehow, perhaps by concatenation, to get the pronunciation of the mother node, and by applying the category of one daughter, considered as a function, to the category of the other daughter, considered to be an argument of that function.

I am now just a heartbeat away from a version of this CG theory, which I need in order to formulate the notion of a constituent structure tree with a vertical dimension corresponding to the height of variables.

Using the example of the previous section, I begin by writing assignment statements as small trees, notated as labeled bracketings, with the variable to which a value is assigned as the mother node, written immediately after the left bracket. I call these "forms".

Basis (lexicon):
  1. [S0 S1]  
  2. [S1 NP1 loves NP2]  
  3. [NP1 John]  
  4. [NP2 Mary]  
  5. [NP1 Mary]  
  6. [NP2 John] 
 
And now I want to think of the derivation of new assignments, now called forms, as done by applying a function (the form in which the substitution is made) to an argument form which says what string will be substituted for what variable. Using the usual "function(argument)" notation for the application of a function to an argument, the derivation of the earlier example now looks like this:
  7. [S1 NP1 loves Mary] = [S1 NP1 loves NP2]([NP2 Mary])  
  8. [S1 John loves Mary] = [S1 NP1 loves Mary]([NP1 John])  
  9. [S0 John loves Mary] = [S0 S1]([S1 John loves Mary]) 
 
Putting the derivation in 7.-9. into tree form gives my approximation to a constituent structure tree of the usual sort:

   [S0 John loves Mary]  
      /            \  
9. [S0 S1]       [S1 John loves Mary]  
                    /              \  
8.         [S1 NP1 loves Mary]  [NP1 John]  
                  /         \   
7.      [S1 NP1 loves NP2]  [NP2 Mary] 
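
This function-application style is nearly executable as it stands. Here is a sketch (my own encoding; the class name Form and the token representation are assumptions), in which calling a form on an argument form performs the substitution:

  # A form is a mother variable plus a value: a list of tokens, some of
  # which are variables.  Applying a form to an argument substitutes the
  # argument's value for the argument's variable, uniformly.
  class Form:
      def __init__(self, var, value):
          self.var, self.value = var, list(value)

      def __call__(self, arg):
          new = []
          for tok in self.value:
              new.extend(arg.value if tok == arg.var else [tok])
          return Form(self.var, new)

      def __repr__(self):
          return "[%s %s]" % (self.var, " ".join(self.value))

  r1 = Form("S0", ["S1"])
  r2 = Form("S1", ["NP1", "loves", "NP2"])
  r3 = Form("NP1", ["John"])
  r4 = Form("NP2", ["Mary"])

  print(r2(r4))          # 7. [S1 NP1 loves Mary]
  print(r2(r4)(r3))      # 8. [S1 John loves Mary]
  print(r1(r2(r4)(r3)))  # 9. [S0 John loves Mary]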
 
In CG, the category of a form is written separately from the pronunciation part, and while there is no apparent need to do that here, for the sake of comparing theories, I make the definitions:

The pronunciation part of a form (that is, the constants, which are phonemes) is a constituent, and the remainder of the form (the variable part) is the category.

For the case where the constant part of a form is a continuous string of phonemes (in general, it need not be), I use the notation [var ... __ ...] for a category, where the underline stands for the constituent. For instance, in the above example, for the form [S1 NP1 loves NP2], the category is [S1 NP1 __ NP2], abstracting away the constituent "loves".

Cyclic Conditions on Substitution
  1. A variable of a form is not subject to substitution when the form has some other variable of lesser height (that is, with greater obliqueness). This condition is needed to make the connection between the height of a variable (0, 1, 2, 3) and the height of a constituent in the derivation tree. It corresponds to the requirement in transformational grammar that processing starts at the bottom of a constituent structure tree.
  2. A form to which any rule, such as substitution, is still applicable may not be an argument of a substitution function. This condition corresponds to the requirement in transformational grammar that cyclic transformations start applying at the bottom of the constituent structure tree. It is required in 2psg to prevent violations of the SUL (stratal uniqueness law) from arising through the substitution of string values for variables.
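
Condition 1 lends itself to a small mechanical test. A sketch (my formalization attempt; it counts the mother variable among the form's variables and ignores chomeurs, which carry no number):

  # Condition 1: a variable may be substituted for only if no other
  # variable of the form (the mother variable included) has lesser
  # height, i.e. a numerically greater relation.
  def height(var):
      return int(var[-1])          # 'NP2' -> 2

  def is_variable(tok):
      return tok[-1:].isdigit()    # assumption: numbered tokens are variables

  def substitutable(mother, value, var):
      others = [t for t in [mother] + value if is_variable(t) and t != var]
      return all(height(u) <= height(var) for u in others)

  value = ["NP1", "loves", "NP2"]
  print(substitutable("S1", value, "NP2"))   # True: NP2 may be filled first
  print(substitutable("S1", value, "NP1"))   # False: NP2 is more oblique

This predicts the order in the earlier derivation: NP2 is substituted for in step 7 before NP1 in step 8.
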
Coordination

Most ordinary two-part coordinations can be described by this rule:
  • Constituents can be derived by putting "and" between two constituents of the same category, and the category of the new coordinate constituent is the same as that of each of the two original constituents.

In PSG, this is usually taken to characterize a schema of phrase structure rules, such as V -> V and V, for example. However, such an account cannot be carried over into 2psg: for one thing, a form [V V and V] would break the SUL, since there is more than one instance of the same variable in the form; for another, there is no grammatical type V (nor is there a VP, V-bar, N, or N-bar); and for yet another, that phrase structure rule is wrong anyway.
The phrase structure rule is wrong, because we cannot coordinate verbs with different valences:
  1. John wiped the window.
  2. John disappeared into the mist.
  3. *John wiped and disappeared the window.
Accordingly, I take the above rule for coordination as a basic rule of 2psg (not a form), so that we can describe the coordination of verbs of the same valence, making use of the previously given definitions of "constituent" and "category" as, respectively, the constant and variable parts of a form.

Here is an example:
  1. Given forms [S1 NP1 wiped NP2] and [S1 NP1 broke NP2], since these have the same category [S1 NP1 __ NP2], we can
  2. form a constituent "wiped and broke" of this same category, [S1 NP1 __ NP2],
  3. which is the form [S1 NP1 wiped and broke NP2],
  4. and substituting in this using [NP2 the window] then [NP1 John], gives
  5. "John wiped and broke the window" of category [S1 __].
Eversion

When a finite clause, an S1, becomes oblique, an S2 or S3, what happens to the arguments it contains of lesser obliqueness? In raising-to-subject constructions, we would have a form containing an argument of lesser obliqueness than the form itself. For instance:
  [S2 NP1 to explode]
where the cyclic conditions on substitution I gave above prohibit substituting for the variable NP1, since the form has a more oblique variable, S2. When this form is made an argument of
  [S1 seemed S2]
the movement of the NP1 up into the higher S1, giving
  [S1 NP1 seemed [S2 to explode]]
can be thought of as a solution to this difficulty, since now the original subject of "explode" has found a home in the higher clause where it can be legally substituted for -- it resides in an S1, a finite clause, which is no more oblique than NP1, a subject.
I refer to such a change when part of a form must move outside it as "eversion" -- the form is turned partially inside out.

Topicalization

Several questions about why topicalization works the way it does can be answered in 2psg. I begin with:
  Why topics are raised
A topic gives what a sentence is about, and this concerns the performance of a speech act, so we expect topics to have a grammatical relation 0, like vocatives and performative adverbs.
 [S1 NP1 ate NP2 on Sundays] becomes by topicalization of the object of "ate":  
 [S1 NP1 ate NP0 on Sundays] and by applying this to the argument [NP1 we]:  
 [S1 we ate NP0 on Sundays]  
however, this does not give a pronounceable form, because the variable NP0 is less oblique than the variable S1. The form can, though, be an argument of the function [S0 S1]:
  [S0 we ate NP0 on Sundays] = [S0 S1]([S1 we ate NP0 on Sundays]) by substitution  
and now, since NP0 is no less oblique than S0, it can be substituted for by, say, [NP0 beans]. This gives a constituent structure:

  [S0 we ate beans on Sundays]  
            |                \
[S0 we ate NP0 on Sundays]  [NP0 beans]  
    /           \
[S0 S1]   [S1 we ate NP0 on Sundays]  
                 |                  \
     [S1 NP1 ate NP0 on Sundays]     [NP1 we]  
                 |  by topicalization  
     [S1 NP1 ate NP2 on Sundays]
 
In this example, "on Sundays" is really an argument, but I suppressed some detail to simplify the example. Also, before substituting for NP0, the constituent "we ate ... on Sundays" is a discontinuous constituent, but I am not sure that is actually possible -- it may be that the NP0 has to be moved to the end or to the beginning, to make the remainder a continuous constituent:
  We ate on Sundays, beans.   
  Beans, we ate on Sundays. 
 
At any rate, the natural place for performative constituents in English is at the beginning of a clause, so at least the latter is an option.

So, topics are dependencies that can only be satisfied in root sentences, S0, and when an embedded argument becomes a topic, the topic has to be "everted" -- moved out of its embedded position in the sentence structure.
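
The blocking and then unblocking of NP0 in this derivation can be checked with the height test sketched under the cyclic conditions; repeating the helpers so the sketch stays self-contained:

  # NP0 cannot be substituted for inside an S1 form, but it can once the
  # form is wrapped in S0 (everted to the root sentence).
  def height(var):
      return int(var[-1])

  def is_variable(tok):
      return tok[-1:].isdigit()

  def substitutable(mother, value, var):
      others = [t for t in [mother] + value if is_variable(t) and t != var]
      return all(height(u) <= height(var) for u in others)

  inner = ["we", "ate", "NP0", "on", "Sundays"]
  print(substitutable("S1", inner, "NP0"))   # False: S1 is more oblique than NP0
  print(substitutable("S0", inner, "NP0"))   # True: in S0, [NP0 beans] may apply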
