Of Minds and Language
Authors: Massimo Piattelli-Palmarini; Juan Uriagereka; Pello Salaburu
However, even more than being ready for what they might encounter in language, children need to have expectations about what they are not going to encounter. This is very important for limiting the vast number of potential hypotheses that they might otherwise entertain. Even in constrained linguistic theories which admit only a finite class of possible grammars, that still amounts to a lot of grammars for children to test against their language sample. We don't want them to waste their time on hypotheses that could not be true. Let's consider an example of movement, such as:
(1) Which of the babies at the daycare center shall we teach ASL?
There is a missing (i.e., phonologically null) indirect object between teach and ASL, and an overt indirect object (which of the babies at the daycare center) at the front of the sentence, not in its canonical position. Let's suppose a learner has put two and two together and has recognized this as a case of movement: the indirect object has moved to the front of the sentence. Now why has it moved to the front? Please imagine that this is the first time that you have ever encountered a sentence with overt movement (you are a very small child), and you think perhaps the phrase was moved because it is a plural phrase, or because it is an animate phrase, or a focus phrase, or because it is a very long phrase, or, maybe, because it is a wh-phrase. Some of these are real possibilities that a learner must take seriously: in Hungarian questions, a wh-phrase is fronted because it is a focus; in Japanese a wh-phrase can be fronted by scrambling, motivated by length or by its relation to prior discourse. But other ideas about what motivated this movement are nothing but a waste of time; an infant without innate assistance from UG might hypothesize them and then would have much work to do later, to establish that they're incorrect and start hypothesizing again. So it helps a great deal to know in advance what couldn't be the case. To help us think this through, I'm going to make up my own universal principle: in natural language, there is no such thing as a process of fronting plural noun phrases. That is to say: a plural noun phrase may happen to be fronted, but not because it's plural; number is not a motivating factor for movement. Maybe I'm wrong, but let's pretend for the moment that this is a guaranteed universal. Then it is good for children to know it, because that makes one less hypothesis they will have to explore.
Similar points apply at all stages of learning. Imagine now a child who has correctly hypothesized that the noun phrase in our English example was fronted qua wh-phrase, not because it is plural, etc. He still needs to know how far he can generalize from this one instance, how broad he should assume this wh-fronting phenomenon to be. Do all wh-phrases front in this language? Or is it only [+ animate] wh-phrases that do, or only non-pronominal wh-phrases, or wh-phrases with oblique case, etc.? I'll assume here that part of the innate knowledge that children have is that wh-movement is sometimes sensitive to case; there are languages in which nominative but not accusative arguments can move in relative clauses.[6] But I'm supposing that wh-movement is never sensitive to number. So if a child hears a question with a singular fronted wh-phrase, he can safely assume that it is equally acceptable to have plural fronted wh-phrases, and vice versa: number is not even a conditioning factor on movement (at least, on A-bar movement). This is another fact that is very useful to know; it eliminates another hypothesis the child would otherwise have wasted time on. Note that it's a quite specific fact. There are other phenomena which are constrained by number. Obviously, anything involving number agreement is bound to be, but also some unexpected things. For example, the construction:
(2) How tall a man is John?
has no plural counterpart. You can't say:
(3) *How tall men are John and Bill?
That's not English. Nor is:
(4) *How tall two men are John and Bill?
where it's clear that the movement of how tall isn't vacuous. So there is an odd little bit of number sensitivity here. A wh-adjunct like how tall can be fronted within its DP (which is then fronted in the clause), but that process is sensitive, it seems, to singular vs. plural. There are also phenomena that, unlike wh-movement,[7] are sensitive to whether a constituent is pronominal. In some Scandinavian languages, for example, scrambling treats pronouns differently from non-pronominal elements. So here too, there's specific information that a learner would benefit from knowing in advance.
The general point is that if learners didn't have innate knowledge about which properties can and cannot condition wh-movement or any other linguistic phenomenon, then they would have to check out all the possibilities just in case. Many of you have probably read Steven Pinker's first book on language acquisition.[8] It is a very fat book, because what Steve was trying to do in it was to show how a child would set about checking all the possible hypotheses about which features condition a linguistic phenomenon. One of several examples he worked on was the English double NP dative construction, comparing acceptable and unacceptable instances such as:
(5) I gave Susan the book.
(6) *I donated the library a book.
The second example can only be expressed as I donated a book to the library, with a prepositional phrase. Which verbs permit the double NP? It takes an enormous number of pages to explain how the child would check out, one by one, all the possible features and feature combinations that might govern the extent of the double NP pattern. According to what was being proposed at that time, the key features were that the verb had to be monosyllabic (or to be of Germanic, not Romance origin; or to be prosodically one foot), and its semantics had to be such that the indirect object became the possessor of the direct object of the event described in the sentence. Pinker noted that the range of potential constraints on lexical alternations is large and heterogeneous, and you can imagine how far down in the child's priority list this particular combination of constraints would be. Clearly it would take a substantial amount of testing (as Pinker illustrates in detail) to discover which are the properties that matter in any particular case. Worse still: in the absence of innate guidance, a learner could imagine that there might be equally idiosyncratic phonological and semantic conditions on any linguistic pattern observed in the input. There would be no way to find out without trying. To be on the safe side, therefore, the child would have to go through the whole laborious procedure of checking and testing in every case, even for phenomena to which no such conditions apply at all. Surely this is not what children do. But if they don't, then it seems they must have advance knowledge of what sorts of conditions might be relevant where (e.g., no language requires the verb of a relative clause to be monosyllabic).
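To get a rough feel for the cost of checking every feature combination, here is a minimal Python sketch. It is only an illustration, not Pinker's procedure: the feature names are hypothetical labels loosely based on the properties mentioned above, the two "excluded" features are invented for the example, and the assumption that a condition is simply a conjunction of features is mine.

```python
from itertools import combinations


def candidate_conditions(features):
    """Return every non-empty combination of features a learner might test."""
    return [
        combo
        for size in range(1, len(features) + 1)
        for combo in combinations(features, size)
    ]


# Hypothetical feature inventory (labels invented for this sketch).
ALL_FEATURES = [
    "monosyllabic",
    "germanic_origin",
    "one_foot",
    "possession_change",
    "subject_is_plural",
    "verb_begins_with_vowel",
]

# Assumption for illustration only: innate knowledge rules these two out
# as possible conditions on the double NP pattern.
EXCLUDED = {"subject_is_plural", "verb_begins_with_vowel"}

unguided = candidate_conditions(ALL_FEATURES)
guided = candidate_conditions([f for f in ALL_FEATURES if f not in EXCLUDED])

# Number of candidate conditions without and with the innate exclusions.
print(len(unguided), len(guided))
```

Under these assumptions, each feature known in advance to be irrelevant roughly halves the number of combinations left to test, which is the kind of saving the argument above depends on.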
I do not know precisely how UG prepares children for acquisition challenges such as these. But that is what I am shopping for. I want to know how UG could alert children in advance to what is likely to happen in their target language, what could happen, and what definitely could not. A learner who overlooked a conditioning feature on a rule would overgeneralize the rule. And it is not just rules that are the problem; the same is true in a parameter-setting system if it offers competing generalizations over the same input examples. Overgeneralization can cause incurable errors for learners who lack systematic negative evidence. It follows that learners should never overlook a conditioning feature. But we have also concluded that they can't afford to check out every potential feature for every linguistic phenomenon they encounter. Concrete knowledge of what can and cannot happen in natural languages at this level of detail would thus be very valuable indeed for learners. Yet linguists interested in universals and innateness mostly don't map out facts at this level of detail. Why not? Perhaps just because these undramatic facts are boring compared with bigger generalizations. To be able to propose a broad structural universal is much more exciting. But another reason could be that these facts about what can be relevant where in a grammar don't seem to qualify as true universals, perhaps not even as parameterized universals unless parameters are more finely cut and numerous than is standardly assumed.[9]
Therefore it appears that we may need a different concept, an additional concept, of what sorts of linguistic knowledge might be innate in children, over and above truly universal properties of languages. To the extent that there are absolute universals, that's splendid for acquisition theory; it clearly contributes to explaining how children can converge so rapidly on their target language. No learning is needed at all for fully universal facts. But it may be that there are also “soft” universals; that is, universal tendencies that tolerate exceptions though at a cost. This would be a system of markedness, which gives the child some sort of idea of what to expect in the default case but also indicates what can happen though it is a little less likely, or is a lot less likely, or is very unlikely indeed.
There certainly has been work on syntactic markedness. Noam has written about it in several of his books, including in his discussions of the P&P model,[10] but not a great deal of research on markedness has actually been done in this framework.[11] We don't have a well-worked-out system of markedness principles that are agreed on. Some linguists are leery of the whole notion. Markedness can be very slippery as a linguistic concept. What are the criteria for something being marked or unmarked? What sort of evidence for it is valid? (Is it relevant how many languages have the unmarked form? Is the direction of language change more compelling? Or tolerance of neutralization, or ease of processing, etc.?[12]) On the other hand, if we could manage to build a markedness theory, it would provide just what is needed to reduce labor costs for learners. It can chart the whole terrain of possible languages, with all potential details prefigured in outline to guide learners' hypotheses. Perhaps this is extreme, but my picture is that all of the things that can happen in a natural language are mapped out innately, either as absolute principles with parameters, or with built-in markedness scales that represent in quite fine detail the ways in which languages can differ.[13] What learners have to do is to find out how far out their target language is on each of the various markedness scales. They start at the default end, of course, and if they find that that isn't adequate for their language sample they shift outward to a more marked position that does fit the facts.[14]
To illustrate how this would work, let's consider which verbs are most likely to bridge long-distance extraction, such as wh-movement out of a subordinate clause. In some languages no verbs do: there is no long-distance extraction at all. In languages that do have long-distance extraction, the bridge verbs will certainly include verbs like say and think. English allows movement of a wh-element over the verb say in an example like:
(7) Who did you say that Mary was waving to?
In some languages, such as Polish, that's about as far as it goes; there is movement across say but not across consider or imagine. In English the latter are acceptable bridge verbs, and perhaps also regret, but we draw the line at resent and mumble. It seems that there is a universal list of more-likely and less-likely bridge verbs, and different languages choose different stopping points along it, although we may hope that it is not a mere list, but reflects a coherent semantic or focus-theoretic scale of some sort.[15]
If children were innately equipped with this scale, Polish learners could acquire extraction over say without overgeneralizing it to imagine, and English learners could acquire extraction over say and imagine without overgeneralizing it to resent. A different scale seems to control which verbs permit the passive. It's not the same set in every language, but it also doesn't differ arbitrarily. In all languages the verbs most likely to passivize are action verbs like push or kill. Languages differ with respect to whether they can passivize perception verbs. We can do so in English, for example:
(8) The boy was seen by the policeman
but many languages cannot; perception verbs are evidently further out than action verbs on the markedness scale for passive. Further out still are verbs of possession and spatial relation. Another example concerns the contexts in which binding-principle exceptions are possible, such as local binding of pronouns. This is extremely unlikely in direct object position, but less unlikely for oblique arguments of the verb; the more oblique an argument is, the less tightly the binding theory seems to hold. Thus a learner can fairly safely ignore the possibility of binding exceptions in some contexts, and yet know to keep an eye out for them in other contexts.[16]
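The default-first learning procedure described above can be sketched in a few lines of Python, using the bridge-verb illustration. This is a toy under stated assumptions rather than a worked-out learning model: the grouping of verbs into tiers and the exact ordering of the tiers are my guesses based on the verbs mentioned in the text, and the function names are hypothetical.

```python
# Hypothetical markedness scale for bridge verbs, least marked first.
# The tiers follow the verbs named in the text; the exact ordering and
# grouping are assumptions made for this sketch.
BRIDGE_SCALE = [
    {"say", "think"},
    {"consider", "imagine"},
    {"regret"},
    {"resent", "mumble"},
]


def learn_cutoff(observed_bridge_verbs, scale=BRIDGE_SCALE):
    """Return how many tiers of the scale the learner licenses.

    The learner starts at 0 (the default: no long-distance extraction)
    and shifts outward only as far as positive evidence requires.
    Verbs that are not on the scale are ignored.
    """
    cutoff = 0
    for verb in observed_bridge_verbs:
        for tier_index, tier in enumerate(scale):
            if verb in tier:
                cutoff = max(cutoff, tier_index + 1)
    return cutoff


def licensed_verbs(cutoff, scale=BRIDGE_SCALE):
    """Return the set of verbs at or below the learner's current cutoff."""
    return set().union(*scale[:cutoff])


# A learner who hears extraction only over "say" stops at the first tier.
polish_like = licensed_verbs(learn_cutoff(["say"]))

# A learner who also hears extraction over "imagine" moves one tier out.
english_like = licensed_verbs(learn_cutoff(["say", "imagine"]))
```

Because the learner only ever moves outward on positive evidence, it never licenses resent on the basis of say and imagine, so it needs no negative evidence to retreat from an overgeneralization. Note that in this sketch hearing say also licenses think, since they share a tier; whether real learners generalize that way is a separate empirical question.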
My conclusion is that if we insist on absolute universals only, we will forgo a great deal of wisdom that all of us possess, as linguists, concerning the “personality” of natural language. We have to assume, I think, that children have that knowledge too, because otherwise they couldn't do the formidable job they do in acquiring their language. So here is my plea, my consumer's request to the “pure” (theoretical and descriptive) linguists who work on universals: Please tell us everything that is known about the sorts of patterns that recur in natural languages, even if it is unexciting, even if it is squishy rather than absolute, even if it has the “scalar” quality that I've suggested, so that we can pack it all into our learning models. They will work a whole lot better if we can do that. If we bring these facts out into the open, not just the rather small number of absolute universals, and the parameters that allow for broad strokes of cross-language variation, but all the many partial and minor trends, we will thereby strengthen the innateness hypothesis for language acquisition. I should add one comment on that last point, however. For my purposes, my selfish consumer purposes, it doesn't matter at all whether the universal trends are specific to language or whether they are general cognitive tendencies. They may be narrowly language-bound in origin, or very general psychological or biological propensities. It would be of great interest to know which is the case. Certainly we should look to see whether some of the curious trends I have cited can be derived from more general underpinnings, linguistic or otherwise. But as long as they exist, whatever their source, they will do what's needed for psycholinguistics to explain why it doesn't take a child a lifetime to learn a language.