(31) Mary_i [INFL kiss-en e_i]
When (31) is fed into the PF component, spelling rules finally generate the expression Mary was kissed.
The interest here is that, of the two positions occupied by Mary and e respectively, the former has Case while the latter has a θ-role. Thus the chain (Mary, e), marked by coindexing in (31), satisfies both criterion (26) and the Case filter (29). Think of Mary as heading this chain and e as referentially dependent on Mary. For a simple sentence such as John likes Mary, each of John and Mary is assigned both Case and θ-role; thus, John and Mary may be thought of as single-membered chains. This analysis establishes a close link between θ-theory and Case theory: there seems to be a relationship of mutual satisfaction. Suppose we capture this by imposing a Visibility Condition (Chomsky 1986, 94).
(32) A chain is visible for θ-marking only if it is assigned Case.
So we now have three licensing conditions: the θ-criterion (26), the Case Filter (29), and the Visibility Condition (32). They seem to say similar things in slightly different ways. Can they be put together in a better package?
It is clear that once we have the interactive condition (32), much of the effect of (29) is already captured. The effect of (32) is to make sure that a chain will not be θ-marked unless it is Case-marked, which, in fact, is the requirement of (29). Since an interactive principle is always preferable to an isolated one, we adopt (32) and give up (29). The notions of chain and visibility may now be used to formulate a modified θ-criterion (33).
(33) A chain has at most one θ-position; a θ-position is visible in its maximal chain.
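As an informal illustration, the Visibility Condition (32) and the modified criterion (33) can be sketched as checks over a chain. This is only a toy encoding, not part of the theory: the names Position, is_visible, and satisfies_33 are hypothetical, and the sketch assumes that a chain counts as Case-marked when at least one of its positions has Case.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Position:
    """One position in a chain (hypothetical encoding for illustration)."""
    label: str
    has_case: bool
    has_theta: bool


def is_visible(chain: List[Position]) -> bool:
    # Assumption: a chain is Case-marked if some position in it has Case,
    # which is how (32) is read in this sketch.
    return any(p.has_case for p in chain)


def satisfies_33(chain: List[Position]) -> bool:
    # First clause of (33): at most one theta-position per chain.
    theta_count = sum(1 for p in chain if p.has_theta)
    if theta_count > 1:
        return False
    # Second clause of (33): a theta-position must be visible in its chain.
    if theta_count == 1 and not is_visible(chain):
        return False
    return True


# The passive chain (Mary, e) from (31): Mary has Case, e has the theta-role.
passive_chain = [Position("Mary", True, False), Position("e", False, True)]
# A single-membered chain, as for John in "John likes Mary".
single_chain = [Position("John", True, True)]
# A chain with a theta-position but no Case anywhere: not visible.
caseless_chain = [Position("e", False, True)]

print(satisfies_33(passive_chain), satisfies_33(single_chain), satisfies_33(caseless_chain))
```

On this encoding, the passive chain and the single-membered chain pass, while the Case-less chain fails, which mirrors the way (32) absorbs the work of the Case Filter (29).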
The s-structure (31), we saw, was formed due to the movement/raising of an NP. Movement, surely, is always from an A-position for, by now, we are familiar with a basic P&P idea that movement is always forced by licensing requirements that are not met at a d-structure, which, we saw, is essentially a θ-structure. This leaves two options: an element moves either to another A-position or to an A-bar position. NP-movement, an example of which we saw, is typically an A-to-A movement. There are other varieties of A-to-A movement that I put aside.
2.3.2.2. Wh-Movement
A typical example of A-to-A-bar movement is the movement of wh-phrases (WPs) in questions and relative clauses.
We will look at the properties of the first kind of movement. There is a pretheoretical intuition that WPs are NPs that are sometimes, and in some sense, best thought of as direct objects of verbs: John ate what? is a natural response to the remark John ate five cockroaches. At other times, a WP is naturally viewed as the Subject of a sentence: Who ate five cockroaches? is also a natural response to the preceding remark. So the d-structures for sentences (34)-(36) will be something like (37)-(39), respectively.
(34) I wonder who John saw
(35) What is it easy to do today
(36) Who ate what
(37) [S-bar [S I wonder [S-bar [S John [VP INFL see who]]]]]
(38) [S-bar [S It is easy [VP to do what] today]]
(39) [S-bar [S Who [VP Infl eat what]]]
A number of interesting facts emerge. In the actual sentence (35) the WP has moved from the VP-internal position in (38) to the front of the main clause. For (34) as well the WP moves in (37), but it moves to the front of the embedded clause. In sentence (36) it seems that the first WP has not moved since it is already in a clause-front position; the second WP in (36) stays at the VP-internal position. How do we explain these movements or apparent lack of them? When WPs do move, where do they move to? A host of other questions arise, as we will see.
Although a WP is naturally viewed as a Subject or a direct Object, we cannot think of a WP as a (specific) agent or theme or goal. We will see shortly that WPs are ultimately interpreted as quantifiers. Hence they cannot continue to occupy the position of the complement of a verb where a specific thematic interpretation is typically assigned. Similar remarks apply to the Subject position. This is one of many arguments, leading away from the Standard Theory (Chomsky 1965), that suggest that a d-structure representation is not the proper vehicle for semantic interpretation since it suppresses the nonthematic character of wh-elements (Chomsky 1977).
A WP thus must move from the visible positions in (34)-(36) to a non-visible position. Since A-positions are typically visible in a chain, a WP moves to an A-bar position, namely, a Comp position that is a clause-external position. The character of this movement is controversial, and it animates much discussion in the Minimalist Program (Hornstein 1995; Johnson 2000; Boeckx 2006). I note some of the controversies as I proceed.
In English, a WP raises to a Comp position. Which one? At this point the Subjacency Principle of Bounding Theory plays a crucial role. The theory requires the concept of bounding nodes: a single application of Move-α may not cross more than one bounding node. Intuitively, a bounding node determines the extent to which instances of Move-α apply in one stroke; in that sense, the principle has a "least-effort" flavor, to which I return in chapter 5. Initially, NP and S were taken to be the bounding nodes in English; other languages may have S-bar and NP as bounding nodes. Insofar as this is true, the module is possibly parameterized. I have assumed, following Chomsky 1986, that Subjacency applies between d- and s-structures; others argue that it applies between s-structure and LF. This issue, as well as the issue of where ECP applies (see below), is somewhat moot in the light of the Minimalist Program since, as noted, MP does not have s-structures.
In (38), there is one bounding node to cross for the only available Comp position at the front of the sentence; hence the WP adjoins to the front in one bound. In (37), the WP adjoins to the first of the available Comp positions: a possible "hopping" movement to the higher Comp is barred due to the lexical properties of wonder. In (39), the Subject WP adjoins to the only available Comp position, forcing the other WP to remain in situ at the s-structure, so this is how the sentence is going to be pronounced. The movement of who forces "superiority" (i.e., the leftmost WP moves first); this again has a "least-effort" flavor. The other WP what must now be assigned a specific θ-role contrary to the nonthematic character of such phrases. We will see that even this phrase will move covertly in LF to avoid the problem. Movement as usual creates coindexed trace elements. The resulting s-structures corresponding to (34)-(36), then, are roughly as follows:
(40) [S I wonder [S-bar [Comp Who_i [S John Infl see e_i]]]]
(41) [S-bar [Comp What_i [S it is easy to do e_i today]]]
(42) [S-bar [Comp Who_i [S e_i Infl eat what]]]
So far two basic kinds of empty categories have been postulated: those that are projected by the lexicon essentially to satisfy θ-theory and those that are created by movement. The latter kind, namely, trace, again subdivides into two categories: those created by A-to-A movement and those created by A-to-A-bar movement; let us call them "NP-trace" and "wh-trace" respectively. We also saw briefly that the Subject position of infinitival clauses is sometimes occupied by an empty element PRO. Some languages, for example Spanish and Hindi (but not English), have an additional empty element called (small) pro, which sometimes occurs as the Subject of finite clauses in the so-called null-subject languages. We will see that all these empty categories are classified into four basic kinds.
This proliferation of empty categories, forced throughout by theory as we saw, creates a problem for the language learner. Since these are not phonetically realized, how does the child interpret them? In fact, how does the child know that they are there? Speaking roughly, but quite correctly, interpretation of a sentence ultimately accrues from the meanings of words, which the child has to learn independently in any case. Thus, one of the principal goals of the P&P framework is to shift the child's burden only to the learning of words while the rest of the business of interpretation is placed on the universal principles of the computational system (Wasow 1985). Empty categories do not seem to fit this explanatory strategy.
The preceding way of stating the problem itself suggests how the problem is to be addressed. Empty elements will not be a problem if there are principled ways in which each empty element is shown to be linked to some phonetically realized element; in so linking, the empty element will be endowed with some "proxy" interpretation. In other words, the natural general idea is that empty elements be viewed, across the board, as dependent elements whose antecedents are ultimately some independently interpretable items. The qualification "ultimately" is related to the concept of maximal chain mentioned in connection with the revised θ-criterion (33). A chain may have more than two members: John seems e_1 to have been hit e_2 by a car contains the chain (John, e_1, e_2) headed by John. The last empty element e_2 ultimately receives its semantic interpretation via John. What the theory needs to do is to give a naturalistic account of which empty category is linked to what antecedent to receive which interpretation. In this way, the "burden" will remain with the computational system (Chomsky 1988, 90-91).
2.3.2.3. Binding Theory
In fact, there are also phonetically realized dependent elements in languages that require a similar account. Pronouns as in John thought that he [John] needed a shave and reflexive pronouns as in John decided to shave himself are paradigmatic examples of such dependent elements (he may have a disjoint reference as in John thought that he [Bill] needed a shave). So ideally, instead of treating empty categories in a separate block, the theory should explain the general phenomenon of dependency. Perhaps, still more generally, the theory simply gives an account of how the NPs in s-structure are distributed. Much of this ideal is fulfilled in Binding theory, though some of the residual problems with empty categories are treated separately in the Empty Category Principle (ECP).
Let us assume that all A-positions are freely indexed at s-structure, perhaps barring those, if any, that are already indexed by Move-α (which, we saw, may index some A-bar positions as well).
(43) Definitions: For categories α and β,
a. α A-binds β just in case: (i) α c-commands β, and (ii) α and β are coindexed arguments.
b. The governing category for α is the smallest NP or S containing α and the governor of α.
(44) Typology of arguments:
+a, −p: anaphor (himself, each other, NP-trace)
−a, +p: pronominal (he, him, them, pro)
−a, −p: r-expression (John, the man, wh-trace)
+a, +p: pronominal anaphor (PRO)
(45) Principles of A-binding:
Principle A: An anaphor is bound in its governing category.
Principle B: A pronominal is free in its governing category.
Principle C: An R-expression is A-free.
The preceding definitions and principles of Binding theory have natural explanations as follows. The general and, therefore, the minimal definition (43a) imposes two natural conditions: (i) a dependent element must occur in the domain of its antecedent and, since c-command is the widest grammatically salient concept of a domain, an antecedent must at least c-command its dependent; and (ii) among all the argument-NPs that occur within this domain, only those that are specifically related to each other count, so that an antecedent and its dependent(s) must be coindexed.
Thus the minimal definition maximally captures the concept of grammatical binding as distinguished from, say, pragmatic binding.
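The two conditions of definition (43a) can be made concrete with a small sketch over a toy constituent tree. Everything here is an illustrative assumption rather than part of the theory as stated: the Node class, the flat tree for John shaved himself, and the particular version of c-command used (the first branching node dominating the antecedent must dominate the dependent).

```python
class Node:
    """A toy tree node; index and is_argument are hypothetical annotations."""

    def __init__(self, label, children=(), index=None, is_argument=False):
        self.label = label
        self.children = list(children)
        self.index = index
        self.is_argument = is_argument
        self.parent = None
        for child in self.children:
            child.parent = self


def dominates(a, b):
    # Proper dominance: a is some ancestor of b.
    node = b.parent
    while node is not None:
        if node is a:
            return True
        node = node.parent
    return False


def c_commands(a, b):
    # Assumed definition: neither node dominates the other, and the first
    # branching node above a dominates b.
    if a is b or dominates(a, b) or dominates(b, a):
        return False
    node = a.parent
    while node is not None and len(node.children) < 2:
        node = node.parent
    return node is not None and dominates(node, b)


def a_binds(a, b):
    # (43a): (i) a c-commands b, and (ii) a and b are coindexed arguments.
    coindexed = a.index is not None and a.index == b.index
    return c_commands(a, b) and a.is_argument and b.is_argument and coindexed


# [S John_i [VP shaved himself_i]]
john = Node("John", index="i", is_argument=True)
himself = Node("himself", index="i", is_argument=True)
vp = Node("VP", [Node("shaved"), himself])
sentence = Node("S", [john, vp])

print(a_binds(john, himself), a_binds(himself, john))
```

In this toy tree John A-binds himself, but not the other way around, since the first branching node above himself is VP, which does not contain John. Coindexing alone is therefore not enough; the structural condition does real work.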
Additional restrictions are needed to specify the binding relationships for various subclasses of NPs. To that end, two things are needed. First, we need some concept of "local domain," obviously narrower than c-command, to serve as a unit for computing dependencies for various subclasses of arguments. This is achieved in definition (43b). Second, we need a principled way of partitioning the class of arguments to be distributed. This is stated in (44). Following the paradigmatic cases of phonetically realized dependent elements as mentioned above, we think of two basic features with binary options: anaphoric (±a) and pronominal (±p). This generates four categories, as shown. I have also listed some suggestive examples alongside each category.
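The way two binary features yield exactly four categories can be spelled out mechanically. The following is a minimal sketch; the function names classify and feature_label and the Boolean encoding of the features are assumptions made for illustration, while the category names are those of (44).

```python
from itertools import product

# Keys are (anaphoric, pronominal) feature values, following (44).
CATEGORY_NAMES = {
    (True, False): "anaphor",
    (False, True): "pronominal",
    (False, False): "r-expression",
    (True, True): "pronominal anaphor",
}


def classify(anaphoric: bool, pronominal: bool) -> str:
    """Return the category of (44) for a given pair of feature values."""
    return CATEGORY_NAMES[(anaphoric, pronominal)]


def feature_label(anaphoric: bool, pronominal: bool) -> str:
    """Render a feature pair in the +a/-p style used in (44)."""
    sign_a = "+" if anaphoric else "-"
    sign_p = "+" if pronominal else "-"
    return f"{sign_a}a, {sign_p}p"


# Enumerate every combination of the two binary features.
for anaphoric, pronominal in product((True, False), repeat=2):
    print(f"{feature_label(anaphoric, pronominal)}: {classify(anaphoric, pronominal)}")
```

The enumeration makes it plain that the combination +a, +p is generated along with the others, so the typology itself predicts a category that must somehow be both anaphoric and pronominal.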
It is obvious that the category of pronominal anaphors cannot be treated in Binding theory since the definition of the category (+a, +p) requires that it be treated both as an anaphor and as a pronominal and, hence, that it be both bound and free in its governing category. The way out of this contradiction is to suggest that definition (43b) does not apply to this category; that is, the category is ungoverned. Since it is ungoverned it cannot receive Case. But Case theory requires that every phonetically realized NP must receive Case; hence, this category cannot be phonetically realized; that is, it is empty. The element that simultaneously meets these conditions is the empty element PRO that occurs as the Subject of an infinitival clause.
Returning to Binding theory, the rest of the cases are covered individually as required by (45). Principles A and B are quickly verified by the following examples.
(46) a. [S John_i shaved himself_i]
b. [NP John_i's shaving himself_i]
c. *Bill_i said that [S John shaved himself_i]
d. *[S John_i shaved him_i]
e. *[NP John_i's shaving him_i]
f. Bill_i said that [S John shaved him_i]
g. [S John_i shaved him_j]
Except for (46g), definition (43a) is satisfied in these cases since α, which is either John or Bill, is coindexed with β, which α c-commands. In each case, α of definition (43b), which is either an anaphor himself or a pronoun him, is governed by the verb shave, and both this element and its governor are contained in the smallest S or NP as indicated by bracketing; so definition (43b) is satisfied as well.
In (46a) and (46b), the argument John correctly binds the anaphor himself, showing that both S and NP are governing categories. The anaphor cannot be bound by an argument outside these categories as (46c) shows. Hence Principle A is satisfied in both directions. Similar arguments extend to NP-trace. We put aside subtle issues regarding indexing that arise for such apparent failures of Principle A as [the children]_i thought that [S [NP pictures of [each other]_i] were on sale], called "long-distance binding," in which an anaphor each other is bound outside its governing category S (Chomsky 1986, 173).
The ungrammatical structures (46d) and (46e) show, on the other hand, that the pronoun him is not bound (that is, it is free) in either category. The pronoun is bound by Bill in (46f) but Bill lies outside the governing category. In contrast, (46g) is fine because John and him are not even coindexed; hence him is disjoint and definition (43a) does not apply. Therefore, Principle B is not violated. These examples suggest that anaphors and pronouns have a complementary distribution in that the domain in which an anaphor must find an antecedent to be licensed is the domain in which a pronoun must not have an antecedent.
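The verification of Principles A and B against the sentential examples in (46) can be replayed in a rough sketch. The encoding is an assumption throughout: trees are simplified to S, VP, and word-level leaves; every argument is a leaf tagged with a kind; the governing category is approximated as the smallest NP or S above the dependent, on the assumption that its governor (the verb) is its sister; and names such as Node, clause, and obeys_binding are hypothetical.

```python
class Node:
    """Toy tree node; leaves carrying a `kind` stand in for argument NPs."""

    def __init__(self, label, *children, index=None, kind=None):
        self.label = label
        self.children = list(children)
        self.index = index
        self.kind = kind  # "anaphor", "pronominal", "r-expression", or None
        self.parent = None
        for child in self.children:
            child.parent = self


def ancestors(node):
    found = []
    while node.parent is not None:
        node = node.parent
        found.append(node)
    return found


def c_commands(a, b):
    # Assumed definition: the first branching node above a dominates b.
    if a is b or a in ancestors(b) or b in ancestors(a):
        return False
    branching = next((n for n in ancestors(a) if len(n.children) > 1), None)
    return branching is not None and branching in ancestors(b)


def arguments(node):
    if node.kind is not None:
        yield node
    for child in node.children:
        yield from arguments(child)


def governing_category(dep):
    # Simplification of (43b): smallest NP or S containing the dependent.
    return next(n for n in ancestors(dep) if n.label in ("NP", "S"))


def bound_within(dep, domain):
    # Is dep A-bound, in the sense of (43a), by an argument inside domain?
    return any(x.index is not None and x.index == dep.index and c_commands(x, dep)
               for x in arguments(domain))


def obeys_binding(dep):
    if dep.kind == "anaphor":        # Principle A
        return bound_within(dep, governing_category(dep))
    if dep.kind == "pronominal":     # Principle B
        return not bound_within(dep, governing_category(dep))
    return not bound_within(dep, ancestors(dep)[-1])  # Principle C


def name(word, index=None):
    return Node(word, index=index, kind="r-expression")


def clause(subject, verb, obj):
    return Node("S", subject, Node("VP", Node(verb), obj))


himself_a = Node("himself", index="i", kind="anaphor")
ex_46a = clause(name("John", "i"), "shaved", himself_a)
himself_c = Node("himself", index="i", kind="anaphor")
ex_46c = clause(name("Bill", "i"), "said", clause(name("John"), "shaved", himself_c))
him_d = Node("him", index="i", kind="pronominal")
ex_46d = clause(name("John", "i"), "shaved", him_d)
him_f = Node("him", index="i", kind="pronominal")
ex_46f = clause(name("Bill", "i"), "said", clause(name("John"), "shaved", him_f))
him_g = Node("him", index="j", kind="pronominal")
ex_46g = clause(name("John", "i"), "shaved", him_g)

for tag, dep in [("46a", himself_a), ("46c", himself_c), ("46d", him_d),
                 ("46f", him_f), ("46g", him_g)]:
    print(tag, obeys_binding(dep))
```

Under these simplifications the sketch accepts (46a), (46f), and (46g) and rejects (46c) and (46d), in line with the judgments reported above. The NP cases (46b) and (46e) are left out only because they would require further assumptions about NP-internal structure.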
R-expressions contrast with both anaphors and pronominals. Consider (47a) in which an anaphor each other, a reciprocal, and (47b) in which a pronoun them are correctly bound, obeying Principles A and B respectively.
(47) a. The musicians_i like [each other]_i
b. The musicians_i wanted John to like them_i
Replacement of the bound element by an r-expression, say, the men, however, yields ungrammatical expressions (48a) and (48b).
(48) a. *The musicians_i like [the men]_i
b. *The musicians_i wanted John to like [the men]_i
Both (48a) and (48b) are fine if the men has a different index. So r-expressions are not only not A-bound in the governing category, they are not A-bound at all; r-expressions are A-free. The expressions in (48) thus violate Principle C. The importance of the qualification "A" in "A-free" is shown in (49b), where an r-expression the fool has an "antecedent" John but the relationship lies beyond Binding theory. In contrast, (49a) again shows a violation of Principle C (Chomsky 1986, 79).
(49) a. *John_i didn't realize that [the fool]_i had left the headlights on
b. John_i turned off the motor, but [the fool]_i had left the headlights on
Interestingly, wh-traces are also r-expressions. Consider the examples in (50).