Of Minds and Language
Authors: Pello Salaburu; Massimo Piattelli-Palmarini; Juan Uriagereka
Juan Uriagereka
As all of you know, every time I listen to a talk by Randy Gallistel, I think I have made a career mistake - I should have studied a different animal. But anyway, in the interests of interdisciplinarity, I will talk about human animals, in particular a puzzle that arises in them when considered from the minimalist viewpoint. This might offer a perspective that could be interesting for the general issues of evolution and cognition that we have been discussing.
As all of you know, in the minimalist program we seek what you may think of as internal coherence within the system - but in its natural interactions with other components of mind (its interpreting interfaces). That is, we are interested in their effective integration. The puzzle that I would like to talk about arises with a phenomenon that is actually quite central in language, namely the hierarchy of Case features - that is, the order in which values are decided within the Case system. I will give you one concrete example, but the phenomenon is general, and the reason the puzzle arises is because the hierarchy involves uninterpretable information, to the best of anybody's knowledge. That is, a fortiori, this type of information cannot be explicated in terms of interface conditions based on interpretability. There are very interesting stories out there for hierarchies that arise in terms of interpretable information. For instance Hale and Keyser (1993, 2002) gave us an approach to thematic hierarchy from just that point of view. But the problem I am concerned with is different. We have interesting interpretations of thematic hierarchy, but whatever featural arrangement we are going to have in terms of uninterpretable Case, such an arrangement simply cannot be the consequence of interpretive properties. So where does it come from?
I'll illustrate this through some examples in Basque, using just the abstract Basque pattern, with English words. So, in Basque, in a simple transitive structure,
(1)    [S NP.subj [VP NP.obj  V  agrO.Trans-Aux.agrS]]
       John.subj  Mary.obj  loved  'he.has.her'
       'John has loved Mary' = 'John loves Mary'
you have an NP with subject Case, an NP with object Case, and then of course you have a verb and, importantly, an auxiliary in the transitive format, something like V-have, which shows agreement with subject and object. In turn, when the expression is intransitive (in particular unaccusative),
(2)    [S NP.obj [VP t  V  agrO.Intrans-Aux]]
       John.obj       arrived  'he.is'
       'John is arrived' = 'John arrived'
then the subject, which arguably displaces from inside the verb phrase, gets object Case, and verbal agreement is now of the intransitive sort (something like V-be), determined by that single argument.
Now things quickly get more complicated in an interesting way. The facts are taken from both Laka's work on split ergativity and San Martín's thesis, adapting earlier work by Hualde and Ortiz de Urbina (2003) (for a presentation and the exact references, see Uriagereka 2008). In essence, when you have a transitive verb, but the object of the sentence is now another sentence - for instance a subject-controlled sentence, like
(3)    [S NP.obj [VP [S...]  V  agrO.Intrans-Aux]]
       John.obj  [to lose weight]  tried  'he.is'
       'John tried to lose weight'
- all of a sudden, it is as if the object is no longer there! The object clause is still interpreted, but it no longer behaves as a true complement, and now the subject NP gets object Case, as if the structure were unaccusative, and even the auxiliary exhibits the unaccusative pattern we saw for (2), agreeing with a singular element. This is despite the fact that semantically you clearly have two arguments. In effect, instead of saying 'John has tried to lose weight,' you must say the equivalent of 'John is tried to lose weight.'
So, in a nutshell, when the complement clause is a true subordinate clause (you have to make sure the clause is truly subordinate, not a paratactic complement, which behaves like any NP), for some reason it is pushed out of the structural Case system and shows up without structural Case. And then the subject, which again has a perfectly normal subject interpretation, nonetheless receives object Case. So a way to put this is that true clauses, though they are fine thematic arguments, just do not enter into this system of Case. It is nominal phrases that require Case for some reason, and they do so on a first-come, first-served basis. Simply, the first nominal (not the first interpreted argument) in a derivational cycle is the one that receives object Case, regardless of how "high" it is in terms of its thematic interpretation. So this Case distribution is just at right angles with semantics, in the broadest sense.
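The first-come, first-served logic can be made concrete with a small sketch. This is only an illustration of the generalization just described, not a claim about any published formalism; the function name and the category labels are my own, and "dative" here simply stands for whatever value is left over after the two structural Cases are used.

```python
def assign_structural_case(arguments):
    """Assign Case values in derivational order within one cycle.

    `arguments` is a list of (label, category) pairs in the order the
    derivation encounters them, e.g. [("Mary", "NP"), ("John", "NP")].
    True subordinate clauses (category "S") are skipped: they receive
    no structural Case, though they are fine thematic arguments.
    """
    case_values = ["object", "subject"]  # first nominal gets object Case
    result = {}
    used = 0
    for label, category in arguments:
        if category == "S":              # clauses escape the Case system
            result[label] = None
        elif used < len(case_values):    # first-come, first-served
            result[label] = case_values[used]
            used += 1
        else:
            result[label] = "dative"     # valued only after the other two
    return result

# Transitive (1): the object nominal comes first in the derivation.
assign_structural_case([("Mary", "NP"), ("John", "NP")])
# Clausal complement (3): the clause is skipped, so the subject nominal
# is the *first nominal* and receives object Case, as in unaccusatives.
assign_structural_case([("to lose weight", "S"), ("John", "NP")])
```

Run on the transitive pattern, the sketch gives Mary object Case and John subject Case; run on the clausal-complement pattern, it gives John object Case, mirroring the Basque facts in (1)-(3).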
Now, an immediate question to reflect on is why it is that NPs (or more technically DPs) are subject to this structural Case system, while clauses get away without Case. This is shown by comparing (3) with a version of that same sentence having the same semantics, but where the complement clause is substituted by a pronoun:
(4)    [S NP.subj [VP that.obj  V  agrO.Trans-Aux.agrS]]
       John.subj  that.obj  tried  'he.has.her'
       'John tried that'
Now everything is as we would expect it: the subject gets subject Case and the object gets object Case, as is normal in a transitive construction. So, first, what is the difference between (3) and (4), if their interpretation is not the answer? Second, how does this Case valuation mechanism enable the system to "know" that the first element in need of Case has been activated and that a Case value has already been assigned, so that the next item that needs Case (which, everyone agrees, the grammar cannot identify interpretively, remember) can then use the next Case value?
I should say that the situation described is not at all peculiar to Basque. These hierarchies, with pretty much the same sorts of nuances, show up in virtually all other languages, if you can test relevant paradigms (Bresnan and Grimshaw 1978, Harbert 1983, Silverstein 1993; Uriagereka 2008 attempts an account of this sort of generalization). There is at least one parameter that masks things for more familiar languages (whether the first Case value assigned is inside or outside the verb phrase, basically), but if you take that into account, you find puzzles similar to the one just discussed literally all over the place. Which is why, in the end, we think of Case as an uninterpretable feature.[1]
Compounding the problem as well is the notorious issue posed by dative phrases, in virtually all languages. Dative Case valuation happens to be determined, for some bizarre reason, after those two Cases I was talking about, although structurally, dative clearly appears in between them. Moreover, whereas there is only one subject and one object Case within any given derivational cycle, as just discussed, you actually can have multiple datives in between. It is almost as if you have a family affair: first the mother, last the father, and in between a bunch of children. Except this ordering is neither a result of obvious interface conditions, nor of simple derivational activation.
Anyway, this is the picture I am going to keep in mind, and in essence this strange state of affairs is what the language faculty has evolved, for some reason. For our purposes here (and I think this is probably not too far off), you must have a first or mother Case - a domain where there happens to be a parameter, as I said, depending on whether that mother Case is assigned at the edge of the VP or not. And you must have a last, or father Case, if you will - which, depending on the parameter finessing the manifestations of the mother Case, comes up at the TP or further up. And then you have what you may think of as a default Case, or, to use a third family name, the child Cases that are associated with an entirely separate system involving something like prepositions. This Case value is basically used only when the mother and the father Cases have been used, first and last, respectively. That is the hierarchy I have in mind.
These are central, although of course not logically necessary, generalizations that the derivation is clearly sensitive to. And I really mean the derivation in a serious sense, for the hierarchy is actually evaluated at each derivational cycle, meaning that every time you get, basically, a new clausal domain, you have to start all over. It is really like a family affair, relations being reset with each new generation. But it is extremely nuanced too: not simply interpretive (you must distinguish arguments depending on whether they are nominal or clausal) and not simply activated in, as it were, chronological order. True, “mother-Case” comes first, but what shows up structurally next, “child-Case,” is clearly not what simply comes next in the derivation, which is “father-Case.” We know that because in many instances there simply are no “child-Cases,” and then it is only the father/mother-Case duality that shows up. So while this Case valuation system clearly has configurational consequences (association with the VP level or the TP level, for instance), it just cannot be seriously defined by going bottom-up in structure, the way we do, for instance, for the thematic hierarchy.
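The "family affair" bookkeeping, including the reset at each new cycle, can be sketched as a tiny state machine. Again, this is a hedged illustration of the valuation order described above (mother first, father second, children only afterwards and freely repeatable), not an implementation of any particular theory; the class and its labels are invented for the example. Note that the sketch models derivational *valuation* order only, not the structural positions, which, as noted, do not line up with it.

```python
class CaseCycle:
    """Bookkeeping for Case valuation within one derivational cycle."""

    def __init__(self):
        # The ledger is reset with each new clausal domain ("generation").
        self.mother_used = False
        self.father_used = False

    def next_case(self):
        """Value the next nominal that needs Case within this cycle."""
        if not self.mother_used:
            self.mother_used = True
            return "mother (object-type)"
        if not self.father_used:
            self.father_used = True
            return "father (subject-type)"
        # Only once mother and father are spent do child Cases appear,
        # and there is no limit on how many children a cycle can have.
        return "child (dative-type)"

cycle = CaseCycle()
sequence = [cycle.next_case() for _ in range(4)]
# A new clausal domain starts the family affair over:
fresh = CaseCycle()
fresh.next_case()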
That, I should say, has an important immediate consequence, consistent with a comment in Chomsky's 2005 paper.[2] If something like this is right, then the architecture of a syntactic derivation simply cannot be of the sort that accesses interpretation fully online. The system must have enough derivational memory to keep a whole cycle in active search space, so that it knows whether, for that cycle, the mother-Case valuation has already been used, so that the father one is then deployed; or when the father-Case valuation has been accessed, then you move into child Case. Without a memory buffer to reason this way, this system makes no sense. This is what Chomsky calls a "phase"-based derivation, and the present one is a pretty strong argument for it.
What role is this Case valuation playing within cycles to start with - why is it there? Here is where I am going to offer some speculations from a project that Massimo Piattelli-Palmarini and I have been working on (see Piattelli-Palmarini and Uriagereka 2004, 2005; more work is in progress). If you take all this seriously, the particular possibility I will mention has interesting consequences for the issues we have been talking about in this conference. The general question can be posed this way. If you look at the archeological record, early sapiens prior to our own species seem to have exhibited very elaborate causal behaviors, presupposing some kind of computational thought. There should be little doubt about that, especially if, following Randy Gallistel's arguments, we are willing to grant complex computational thoughts to ants or jays. But there surely is a difference between thinking and somehow sharing your thoughts, for thought, as such, can be a pretty multi-dimensional construct.
In grammatical studies alone we have shown that a simple sentence structure like the one I am using right now involves at least one "dimension" for the string of words of arbitrary length, another for labeling/bracketing going up in the phrase-marker, possibly another one for complex phrasal entanglements that we usually get through transformations and similar devices, and I would even go as far as to accept a fourth "dimension" dealing with the sorts of information-encoding that we see deployed in the very rich phenomenon of antecedence and anaphora. So these four "dimensions" at least. But as Jim Higginbotham insightfully observed in 1983, all of that has to be squeezed into the one-dimensional channel of speech.[3] Some of you might be telepathic, but I at least, and I'd say most of us, have to share our thoughts in this boring way I am using, through speech, and that compression probably implies some information loss.
This actually has consequences for a very interesting study that Marc Hauser and Tecumseh Fitch did a couple of years ago with tamarins. If I understood the experiment, the tamarins failed to acquire anything that involved relevant syntactic types, and I mean by that simple context-free grammars. They only succeeded in acquiring simpler finite-state tasks, with no "type/token" distinctions. I want to put it in those terms because I want to be neutral about what it is that you can and cannot do as you organize your thoughts in progressively more complex computational terms.
The very simplest grammars one can think of, finite-state ones, are so rudimentary that they cannot use their own resources to code any sort of groupings, and thus have no way to express, in themselves, very elementary classifications. One could imagine other external ways to impose classifications, but the point is they would not be internal to the grammatical resources, at that level of complexity. In a grammar, it is only as you start going up in formal complexity that you can use grammatical resources - the technical term is a "stack" - to possibly group other symbols into sets of a certain type. So there is a possible issue, then: such a type/token distinction must be significant in the evolution of our line of language, and we want to figure out what sort of leap in evolution allowed us to do that, but not the tamarins. Could it have anything to do with Higginbotham's "compression" problem? In other words, could the tamarins - or other apes, or sapiens other than ourselves in evolutionary history - have been capable of real type/token distinctions in thought, but not in sharing that thought through a unidimensional channel that depends on the motor system?
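The formal point about finite-state versus stack-based resources is standard, and can be shown with the textbook pair of languages that separates the two levels: (ab)^n, which a finite-state device handles by pure iteration, versus a^n b^n, which requires a stack to group the a's as one unit and match them against the b's. (This is, incidentally, the kind of contrast the Fitch and Hauser tamarin study probed; the code below is just my own minimal illustration of the mathematics.)

```python
def finite_state_ab_star(s):
    """Finite-state recognizer for (ab)*: two states, no memory.

    It iterates but cannot count or group - no internal resource
    represents 'all the a's so far' as a unit of any type.
    """
    state = 0
    for ch in s:
        if state == 0 and ch == "a":
            state = 1
        elif state == 1 and ch == "b":
            state = 0
        else:
            return False
    return state == 0

def stack_anbn(s):
    """Stack-based recognizer for a^n b^n (context-free power).

    The stack is precisely the grammatical resource that groups
    the a-tokens into a set matched as a whole against the b's.
    """
    stack = []
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:            # no a's allowed after the b's begin
                return False
            stack.append(ch)      # store each 'a' as part of one group
        elif ch == "b":
            seen_b = True
            if not stack:
                return False
            stack.pop()           # match a 'b' against a stored 'a'
        else:
            return False
    return not stack              # accept only if the group is used up

finite_state_ab_star("ababab")    # True: pure iteration suffices
stack_anbn("aaabbb")              # True: needs the stack's memory
stack_anbn("aabbb")               # False: counts do not match
```

No finite-state machine, however many states it has, recognizes a^n b^n for unbounded n; the stack is the minimal internal resource that buys that grouping, which is the type/token distinction at issue.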
I do not know, but the matter bears on what I think is a very unimaginative criticism that some researchers have recently raised against Chomsky, Hauser, and Fitch (see Jackendoff and Pinker 2005, and Fitch et al. 2005 for a response). One version of the problem goes like this. Gallistel has shown that many animals execute elaborate computational tasks, which might even reach the context-free grammars of thought that I was alluding to a moment ago. Now simply looking at the fossil record, coupled with the detailed genetic information that we are beginning to get on them as well, tells us a similar story about pre-sapiens, or at any rate pre-sapiens-sapiens - further grist for Gallistel's mill (see Camps and Uriagereka 2006 for details and references). So here is the issue being raised: how can anyone claim that the defining characteristic of the "full" language faculty is recursion, when recursion may be a hallmark of all those computational layers of complexity that we have to grant to other thinking creatures? How can they have truly complex thoughts if they lack recursion?