Thursday, August 9, 2012

Gass & Selinker Universal Grammar

6.2 Universal Grammar
The UG approach to second language acquisition begins from the perspective of learnability. The assumption of innate universal language properties is motivated by the need to explain the uniformly successful and speedy acquisition of language by children in spite of insufficient input. In this section, we deal with UG principles, UG parameters, and lexical and functional categories.
In UG theory, universal principles form part of the mental representation of language, and it is this mental grammar that mediates between the sound and meaning of language. Properties of the human mind are what make language universals the way they are. As Chomsky (1995, p. 167) noted: “The theory of a particular language is its grammar. The theory of languages and the expressions they generate is Universal Grammar (UG); UG is a theory of the initial state S0 of the relevant component of the language faculty.” The assumption that UG is the guiding force of child language acquisition has long been maintained by many, but only in the past two decades has it been applied to second language acquisition. After all, if properties of human language are part of the mental representation of language, it is assumed that they do not cease being properties in just those instances in which a nonnative language system is being employed.
The theory underlying UG assumes that language consists of a set of abstract principles that characterize core grammars of all natural languages. In addition to principles that are invariable (i.e., all languages have them) are parameters that vary across languages. Cook (1997, pp. 250–251) made an interesting analogy between driving a car and principles and parameters:
Overall there is a principle that drivers have to keep consistently to one side of the road, which is taken for granted by all drivers in all countries. Exceptions to this principle, such as people driving down motorways on the wrong side, rate stories in the media or car chases in action movies. The principle does not, however, say, which side of the road people should drive on. A parameter of driving allows the side to be the left in England and Japan, and the right in the USA and France. The parameter has two values or “settings”—left and right. Once a country has opted for one side or the other, it sticks to its choice: a change of setting is a massively complex operation, whether it happens for a whole country, as in Sweden, or for the individual travelling from England to France. So, a universal principle and a variable parameter together sum up the essence of driving. The principle states the universal requirement on driving; the parameter specifies the variation between countries.
How does UG relate to language acquisition? If children have to learn a complex set of abstractions, there must be something other than the language input to which they are exposed that enables them to learn language with relative ease and speed. UG is postulated as an innate language facility that limits the extent to which languages can vary. That is, it specifies the limits of a possible language. The task for learning is greatly reduced if one is equipped with an innate mechanism that constrains possible grammar formation. Before relating the question of UG to SLA, we turn briefly to issues from child language acquisition to explain the basic argumentation of this theory.
The theoretical need for an innate language faculty is based on a negative argument. The claim is that, on the basis of language input alone, children cannot attain the complexities of adult grammars. Innate linguistic properties fill in where the input fails. What does it mean to say that the input is insufficient? It is not merely an antibehaviorist notion that argues against an input/output scheme. Rather, it is based on the fact that children come to know certain properties of grammar that are not obviously learnable from input, as illustrated by the following examples from English discussed by White (1989):
(6-1) I want to go.
(6-2) I wanna go.
(6-3) John wants to go but we don’t want to.
(6-4) John wants to go but we don’t wanna.
(6-5) Do you want to look at the chickens?
(6-6) Do you wanna look at the chickens?
(6-7) Who do you want to see?
(6-8) Who do you wanna see?
Examples 6-1 to 6-8 show the range of possibilities for changing want to to wanna. However, there are many times in English where the sequence want to cannot be replaced by the informal wanna, as in 6-9 to 6-12:
(6-9) Who do you want to feed the dog?
(6-10) *Who do you wanna feed the dog?
(6-11) Who do you want to win the race?
(6-12) *Who do you wanna win the race?
Without prior information to guide learners, it would be difficult to determine the correct distribution of want to versus wanna in informal English. The input does not provide sufficiently specific information about where to use wanna and where not to use it. White explained that there are principles of UG involving question formation to account for the distribution of these English forms. Briefly, sentence 6-7 can be represented by something like You want to see X and 6-9 by something like You want X to feed the dog. Note the location of X, the element about which a question is being asked. In 6-9, but not in 6-7, the question is about an element (X) that is placed between want and to. This is what effectively blocks contraction. In 6-7, want and to are adjacent, thereby allowing contraction; that is, no intervening element blocks it. Importantly, the input alone does not provide this information. This argument is called the poverty of the stimulus.
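The adjacency condition just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the underlying forms are written as flat token lists, the marker "X" stands for the questioned element, and the function name can_contract is invented for this sketch. It is not a claim about how White or UG theory formalizes the constraint.

```python
# Hypothetical representation: each underlying form is a flat list of tokens,
# with "X" marking the position of the element being questioned.
GAP = "X"

def can_contract(underlying):
    """Return True if 'want' is immediately followed by 'to' somewhere."""
    return any(
        first == "want" and second == "to"
        for first, second in zip(underlying, underlying[1:])
    )

# "You want to see X" underlies 6-7; "You want X to feed the dog" underlies 6-9.
see = ["you", "want", "to", "see", GAP]
feed = ["you", "want", GAP, "to", "feed", "the", "dog"]

print(can_contract(see))   # True: want and to are adjacent
print(can_contract(feed))  # False: X intervenes between want and to
```

The point of the sketch is that the blocking element is visible only in the underlying representation, not in the surface string Who do you want to feed the dog?, which is why the input alone does not reveal it.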
One could, of course, argue that direct or indirect intervention is indeed forthcoming and that one does not need innateness to explain language acquisition. However, in most instances, the language-learning environment does not provide information to the child concerning the well-formedness of an utterance (Chomsky, 1981, 1986), or even when it does, it provides information only about the ungrammatical (or inappropriate) utterance, not about what needs to be done to modify a current hypothesis. Furthermore, as we saw in chapter 5 (section 5.1), even with explicit correction, children’s grammars are often impervious to change.
Theoretically, there are two kinds of evidence available to learners as they make hypotheses about correct and incorrect language forms: positive evidence and negative evidence. Positive evidence comes from the speech learners hear/read and thus is composed of a limited set of well-formed utterances of the language being learned. When a particular sentence type is not heard, one does not know whether it is not heard because of its impossibility in the language or because of mere coincidence. It is in this sense that the sentences of a language that provide the input to the learner are known as positive evidence. It is on the basis of positive evidence that linguistic hypotheses can be made. Negative evidence, on the other hand, is composed of information to a learner that his or her utterance is deviant with regard to the norms of the language being learned. We provide more detail on this in chapter 10. For now, suffice it to say that negative evidence can take many forms, including direct correction, such as That’s not right, or indirect questions, such as What did you say?
The child language literature suggests that negative evidence is not frequent (see Brown and Hanlon, 1970, and the theoretical arguments by Baker, 1979), is often ignored, and can therefore not be a necessary condition for acquisition. Because positive evidence alone cannot delineate the range of possible and impossible sentences, and because negative evidence is not frequently forthcoming, there must be innate principles that constrain a priori the possibilities of grammar formation.
In sum, Universal Grammar is “the system of principles, conditions, and rules that are elements or properties of all human languages” (Chomsky, 1975, p. 29). It “is taken to be a characterization of the child’s prelinguistic state” (Chomsky, 1981, p. 7). Thus, the necessity of positing an innate language faculty is due to the inadequate input, in terms of quantity and quality, to which a learner is exposed. Learning is mediated by UG and by the L1, as we will see below.
How does this relate to second language acquisition? The question is generally posed as an access-to-UG problem. Does the innate language faculty that children use in constructing their native language grammars remain operative in second language acquisition? More recently, this question is formulated as an issue of initial state. What do second language learners start with?
6.2.1 Initial state
The question posed in this section is: What is the nature of the linguistic knowledge with which learners begin the second language acquisition process? That is, what is the unconscious linguistic knowledge that learners have before receiving L2 input, or, to take a variant of the question, what are early L2 grammars like? The two variables influencing this debate are transfer (i.e., the availability of the first language grammar) and access to UG (i.e., the extent to which UG is available).
Two broad views are discussed here: the Fundamental Difference Hypothesis (Bley-Vroman, 1989; Schachter, 1988), which argues that what happens in child language acquisition is not the same as what happens in adult second language acquisition, and the Access to UG Hypothesis, which argues that the innate language facility is alive and well in second language acquisition and constrains the grammars of second language learners as it does the grammars of child first language learners. We take a look at each of these positions, the latter in actuality being made up of several branches.
6.2.1.1 Fundamental Difference Hypothesis
As was seen in chapters 4 and 5, much of the work in second language acquisition was driven by the notion that first and second language acquisition involve the same processes. This is not to say that differences were not noted; rather, proposals to account for these differences were made with an attempt to salvage the major theoretical claim of L1 and L2 similarities.
The Fundamental Difference Hypothesis starts from the belief that, with regard to language learning, children and adults are different in many important ways. For example, the ultimate attainment reached by children and adults differs. In normal situations, children always reach a state of “complete” knowledge of their native language. In second language acquisition (at least, adult second language acquisition), not only is “complete” knowledge not always attained, it is rarely, if ever, attained. Fossilization, representing a non-TL stage, is frequently observed (Han, 2004; Long, 2007).
Another difference concerns the nature of the knowledge that these two groups of learners have at the outset of language learning. Second language learners have at their command knowledge of a full linguistic system. They do not have to learn what language is all about at the same time that they are learning a specific language. For example, at the level of performance, adults know that there are social reasons for using different language varieties. What they have to learn in acquiring a second language system is the specific language forms that may be used in a given social setting. Children, on the other hand, have to learn not only the appropriate language forms, but also that there are different forms to be used in different situations.
Related to the idea that adults have complete knowledge of a language system is the notion of equipotentiality, expressed by Schachter (1988).
She pointed out that children are capable of learning any language. Given exposure to the data of a language (i.e., the input), a child will learn that language. No language is easier to learn than another; all languages are equally learnable by all children. This is not the case with second language learners. Spanish speakers have less difficulty learning Italian than they do Japanese. If language relatedness (perceived or actual) were not a determining factor in ultimate success, we would expect all learners to be equally able to learn any second language. This is not borne out by the facts.
One final difference to mention is that of motivation and attitude toward the target language and target language community (see chapter 12 for a fuller discussion). It is clear that, as in any learning situation, not all humans are equally motivated to learn languages, nor are they equally motivated to learn a specific language. Differential motivation does not appear to impact a child’s success or lack of success in learning language. All human beings without cognitive impairment learn a first language.
In sum, the basic claim of the Fundamental Difference Hypothesis is that adult second language learners do not have access to UG. Rather, what they know of language universals is constructed through their NL. In addition to the native language, which mediates access to UG, second language learners make use of their general problem-solving abilities. Second language learners come to the language-learning situation knowing that a language contains an infinite number of sentences; that they are capable of understanding sentences they have never heard before; and that a language has rules of syntax, rules of combining morphemes, limits on possible sounds, and so forth. With specific regard to syntax, learners know that languages can form questions and that the syntax of questions is related to the syntax of statements. They know that languages have a way of modifying nouns, either through adjectives or relative clauses.
This information is gleaned by means of knowing that the NL is this way and by assuming that these facts are a part of the general character of language rather than a part of the specific nature of the native language. Thus, the learner constructs a pseudo-UG, based on what is known of the native language. It is in this sense that the NL mediates knowledge of UG for second language learners.
6.2.1.2 Access to UG Hypothesis
The opposing view to the Fundamental Difference Hypothesis is the Access to UG Hypothesis. The common perspective is that “UG is constant (that is, unchanged as a result of L1 acquisition); UG is distinct from the learner’s L1 grammar; UG constrains the L2 learner’s interlanguage grammars” (White, 2003, p. 60). White (2003) outlines five positions with regard to the initial state of second language learning; the first three take the first language as the basis of the initial state and the second two take UG as the initial state: (1) Full Transfer/Full Access, (2) Minimal Trees, (3) Valueless Features, (4) Initial Hypothesis of Syntax, and (5) Full Access (without transfer).
Before beginning the discussion of access to UG, it is important to make one further distinction and that is between lexical and functional categories. In addition to principles, part of the innate language component consists of lexical and functional categories. Lexical categories are the categories that we learn about in school: nouns, adjectives, verbs, adverbs, and so forth. These can be thought of as content words. Functional categories, on the other hand, are words that serve particular functions (e.g., articles, possessives) or they may be categories consisting of grammatical morphemes (e.g., plurals, tense markers).
Functional categories can be thought of as grammatical elements that in a sense form the glue of a sentence. Examples of functional categories are determiners (e.g., a, the, our, my, this), complementizers (e.g., if, whether, that), and grammatical markers (past tense endings, case markings, plural endings, and gender marking). These differ from lexical categories in a number of ways. In general, functional categories represent a fixed set of words in a language, whereas lexical categories can be added to as the need arises (consider the recent addition to the English lexicon of the word dotcom, as in dotcom industry or in the recent Time magazine headline “Doom stalks the dotcoms”).
However, the most important distinction has to do with whether or not a class of words is associated with lexical properties. Prepositions, for example, though typically having the functional category characteristic of a fixed set of words in a language, are best thought of as a part of the lexical category. This is so because prepositions are often associated with such roles as agent (who does what to whom), patient (who is the recipient of the action), and location. For example, in English the preposition by can be associated with an agent in passive sentences (John was kissed by Mary), and the preposition in can take on the role of location (John was kissed in the park).
We now turn to different conceptualizations of the roles of the L1 and UG as possible starting points for L2 acquisition.
L1 AS THE BASE
1 Full Transfer/Full Access
This position assumes that the starting point is the L1 grammar, but that there is full access to UG during the process of acquisition (Schwartz, 1998; Schwartz and Sprouse, 1994, 1996, 2000; Whong-Barr, 2005). The learner is assumed to use the L1 grammar as a basis but to have full access to UG when the L1 is deemed insufficient for the learning task at hand. L1 and L2 learning differ, and there is no prediction that learners will eventually attain complete knowledge of the L2.
2 Minimal Trees Hypothesis
Recall that in the previous position, full transfer/full access, learners draw on both the L1 and UG. The first option was to draw on the L1 and, where that was insufficient, to draw on UG. The Minimal Trees Hypothesis also maintains that both L1 and UG are available concurrently (Vainikka and Young-Scholten, 1994, 1996a, 1996b). However, the L1 grammar that is available contains no functional categories, and these categories, initially, are not available from any source. The emergence of functional categories is not dependent on the L1 and hence there is no transfer; rather, they emerge in response to L2 input. The development of functional categories of learners from different languages will be the same. On this view, learners may or may not reach the final state of an L2 grammar, depending on what is available through the L1 and what is available through UG. They should be able to reach the final state of an L2 grammar with regard to functional categories.
3 Valueless Features
This is the most technical of the hypotheses and will be dealt with in the least detail. In essence, the claim is that there is weak transfer (Eubank 1993, 1993/1994, 1996). The L1 is the primary starting point. Unlike the Minimal Trees Hypothesis, both functional and lexical categories are available from the L1, but the strength of these features is not available. There are consequences of feature strength in areas such as word order. Acquisition involves acquiring appropriate feature strength of the L2. Learners should be able to fully acquire the L2 grammar.
UG-BASED
4 The Initial Hypothesis of Syntax (Platzack, 1996)
This position maintains that, as in child language acquisition, the starting point for acquisition is UG.
5 Full Access/No Transfer
This position maintains that, as in child language acquisition, the starting point for acquisition is UG (Epstein, Flynn, and Martohardjono, 1996, 1998; Flynn, 1996; Flynn and Martohardjono, 1994). There is a disconnection between the L1 and the developing L2 grammar. A prediction based on this position is that L1 and L2 acquisition will proceed in a similar fashion, will end up at the same point, and that all L2 acquisition (regardless of L1) would proceed along the same path. Learners should be able to reach the same level of competence as native speakers. If there are differences, they are performance-related rather than competence-related.
In the following sections, we examine data that bear on these issues of access to UG. There are two types of relevant data: data relating to UG principles that are invariant, and data relating to UG parameters that vary across languages.
6.2.2 UG principles
White (1989) reported on a study by Otsu and Naoi (1986) dealing with the principle of structure dependence. The basic concept behind this principle is that linguistic principles operate on syntactic (or structural) units. That is, most importantly, according to this view, what makes language knowledge different from other types of knowledge is the notion of structure dependency; language is not just a string of unstructured segments. White pointed out that this accounts for the grammatical question in 6-14 and the ungrammaticality of 6-15.
(6-13) The boy who is standing over there is happy.
(6-14) Is the boy who is standing over there happy?
(6-15) *Is the boy who standing over there is happy?
The rule for question formation makes reference to the subject, which in the case of 6-13 is a complex subject consisting of a determiner phrase (the boy) and a relative clause (who is standing over there). The rule does not make reference to a nonstructural unit, such as “the first verb.” Thus, yes/no questions are formed by moving the main verb to the front of the sentence, not by moving the first verb in the sentence to the front (as in 6-15).
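The contrast between a structure-dependent rule and a purely linear one can be made concrete with a small Python sketch. It assumes, purely for illustration, that sentence 6-13 has already been divided into a subject constituent, a main-clause verb, and a predicate; the function names are invented here and do not come from White or from Otsu and Naoi.

```python
# Assumption for illustration: 6-13 is pre-parsed into a subject constituent,
# the main-clause verb, and the remaining predicate.
subject = ["the", "boy", "who", "is", "standing", "over", "there"]
main_verb = "is"
predicate = ["happy"]

def structure_dependent_question(subject, main_verb, predicate):
    """Front the main-clause verb, leaving the complex subject intact."""
    return [main_verb] + subject + predicate

def linear_question(words):
    """Front the first 'is' in the flat word string, ignoring structure."""
    i = words.index("is")
    return [words[i]] + words[:i] + words[i + 1:]

flat = subject + [main_verb] + predicate

print(" ".join(structure_dependent_question(subject, main_verb, predicate)))
# is the boy who is standing over there happy   (6-14)
print(" ".join(linear_question(flat)))
# is the boy who standing over there is happy   (6-15, ungrammatical)
```

Only the first function needs access to constituent structure, which is the sense in which the rule is said to be structure dependent.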
Otsu and Naoi tested knowledge of structure dependency among Japanese learners of English. In Japanese, questions are formed by adding a question particle to the end of a sentence. No word-order changes are made. The learners tested knew how to form simple questions and passed a test showing knowledge of relative clauses, but they had no knowledge of question formation involving complex subjects. It was hypothesized that if a UG principle, structure dependence, were operative, it could not have come into the learner language system through the L1 as the L1 does not have a principle of structure dependence relevant to question formation. Thus, the only way the principle of structure dependence could have come into the learners’ second language grammar is through direct access to UG. In general, the results of this study support the notion that learners’ grammars are constrained by principles of UG, in this case the principle of structure dependence.
Another study relevant to the issue of UG principles is one by Schachter (1989). She tested the principle known as subjacency, which limits the amount of movement that can take place within sentences. Consider the following contrived conversation:
Speaker 1: I agree with the idea that David loves Mary Jo.
Speaker 2: I didn’t hear you. *Who do you agree with the idea that David loves?
The ungrammaticality of Who do you agree with the idea that David loves? is due to the fact that, in English, movement of the question word from the position of the original noun phrase (Mary Jo) to its new sentence-initial position is constrained by the distance and intervening syntactic structures between the two positions. In Speaker 2’s sentence, the necessary syntactic relationships cannot hold; that is, the movement rule is violated and, hence, the sentence is ungrammatical.
Schachter (1989) tested knowledge of this principle by eliciting grammaticality judgments by native speakers of Indonesian, Chinese, and Korean learning English. In a separate article, Schachter (1990) added a group of Dutch speakers to her database. The languages in question have different requirements on subjacency. In Korean, there is no evidence of subjacency; in Chinese and Indonesian, there is some evidence of subjacency, although in both of these languages wh- movement is more limited than in English; and in Dutch, subjacency restrictions are much the same as in English. The results of Schachter’s study suggest that the Dutch speakers recognize that English is constrained by the principle of subjacency; the results for the other groups are not as clear. The Korean-speaking learners, in keeping with the no-access position, were not constrained by subjacency. The Chinese and Indonesian speakers behaved more English-like than the Korean speakers, but their interlanguage grammars could not be said to be constrained by the principle of subjacency.
A third example comes from White’s (2003) discussion of the results of studies based on the Empty Category Principle (ECP) (Chomsky, 1981). In essence, the ECP is a way of accounting for asymmetry found in the use or nonuse of case particles. Examples can be seen from Japanese in 6-16, 6-17, and 6-18.
(6-16) John ga sono hon o yonda.
       John NOM that book ACC read-PAST
       “John read that book.”
(6-17) John ga sono hon yonda.
       John NOM that book read-PAST
(6-18) *John sono hon o yonda.
       John that book ACC read-PAST
6-16 is grammatical with both a nominative and an accusative case marker; 6-17 is possible with a nominative case marker and no accusative case marker, but 6-18 is ungrammatical because it has only an accusative case marker, but no nominative case marker. Kanno (1996) investigated whether beginning learners of Japanese were able to recognize this discrepancy, arguing that, if they recognized the asymmetry in the early stages of learning, one could assume that the ECP functions in early second language learning. Both L2 learners and native speakers of Japanese accepted accusative case drop sentences more than nominative case drop. This suggests that ECP does in fact function in the early grammars of L2 learners.
Thus, with regard to UG principles, there is conflicting evidence as to whether learners have direct access to UG, have access through the NL, or have no access at all.
6.2.3 UG parameters
There are certain linguistic features that vary across languages. These are expressed through the concept of linguistic parameters. Parameters have limited values. In learning a first language, the data a child is exposed to will determine which setting of a parameter that child will select. Although parameters, unlike the principles we saw earlier, are not invariable, they are limited, thereby easing the burden on the child. In other words, if parameters exist, the child’s task is eased, because there is a limited range of options to choose from.
The issue for second language acquisition is the determination of whether and how a given linguistic parameter can be reset. Let’s assume a parameter with two values. Let’s further assume a native speaker with a NL setting in one way who is learning a second language with a setting in another way. If UG is available to that learner, there should be little difficulty in resetting the parameter because the speaker has access to both settings through UG. If UG is operative only through the L1 (as the Fundamental Difference Hypothesis suggests), then we would expect only those features that are available through the L1 to manifest themselves in the L2. Finally, if UG is not operative at all, we would expect none of the UG features to be available.
One of the most interesting aspects related to the concept of parameters is that they involve the clustering of properties. Once a parameter is set in a particular way, all related properties are affected. In other words, there are consequences for other parts of the grammar. We examine one such parameter, known as the pro-drop parameter. This parameter encompasses a number of properties, namely (a) the omission of subject pronouns, (b) the inversion of subjects and verbs in declarative sentences, and (c) that-trace effects—that is, the extraction of a subject (leaving a trace) out of a clause that contains a complementizer. A language will either have all of these properties or none of them. Languages like Italian and Spanish are [+pro-drop] and have all of the associated properties, whereas English and French are [−pro-drop], having none of them. Examples from English and Italian that illustrate the differences follow:
English
Obligatory use of subject pronouns
She is going to the movies this evening.
*is going to the movies this evening
Laura has arrived.
*has arrived Laura
Whom did you say came?
*Whom did you say that came?

Italian
Omit subject pronouns
Va al cinema stasera.
goes to the movies this evening
Subject–verb inversion
È arrivata Laura.
is arrived Laura
That-trace
Chi hai detto che è venuto?
who you said that is come?
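The clustering claim, that a single parameter setting brings a whole set of properties with it, can be sketched as a small data structure. The class and property names below are invented for this illustration; the four languages and their settings are the ones named in the text. The sketch encodes the theoretical prediction only and says nothing about whether learners actually treat the properties as a unit.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Grammar:
    """A grammar reduced to one binary setting (hypothetical model)."""
    pro_drop: bool

    def properties(self):
        # Under the clustering claim, all three properties follow
        # from the one setting rather than being fixed separately.
        return {
            "null_subjects": self.pro_drop,
            "subject_verb_inversion": self.pro_drop,
            "subject_extraction_across_that": self.pro_drop,
        }

GRAMMARS = {
    "Italian": Grammar(pro_drop=True),
    "Spanish": Grammar(pro_drop=True),
    "English": Grammar(pro_drop=False),
    "French": Grammar(pro_drop=False),
}

for name, grammar in GRAMMARS.items():
    print(name, grammar.properties())
```

Because the three properties are computed from one field, the model cannot represent a language with null subjects but no inversion, which is exactly the "all or none" prediction stated above.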
White (1985) and Lakshmanan (1986) presented data from Spanish and French learners of English (White) and Spanish, Japanese, and Arabic learners of English (Lakshmanan) on precisely these three structures. White found that the learners did not recognize these three structures as related. Although there was a difference in judgments of acceptability between the Spanish and the French speakers on the first type of sentences (i.e., those with and without overt subject pronouns), there was no difference between the two groups on the other two types of sentences. Thus, these learners did not see these three properties as a unified parameter. Lakshmanan’s results were similar. Her groups of learners responded similarly to the first two sentence types but differently with regard to the third, again suggesting that these properties were not seen by these learners as unified under the umbrella of a single parameter.
There is evidence, however, that is more compelling with regard to the clustering of properties. Hilles (1986) assumed different properties of the pro-drop parameter in her investigation of the acquisition of English by a native speaker of Spanish named Jorge: (a) obligatory pronoun use; (b) use of nonreferential it, as in weather terms (it’s raining, it’s pouring) and use of nonreferential there, as in There is rain in the forecast; and (c) use of uninflected modals (e.g., must, could). Hilles showed that these three features were related in the speech of her learner. Specifically, there was an inverse relationship between Jorge’s lack of referential subject use and the appearance of modal verbs. As Jorge began to use subject pronouns in English (i.e., as his null-subject use went down), he also began to use modals as noninflected forms. Hilles hypothesized that the triggering factor for the switch from [+pro-drop] to [−pro-drop] was the use of nonreferential subjects. This was an indication that this learner had truly understood the mandatory nature of subjects in English.
Park (2004) analyzed pronominal subjects and objects. She observed that Spanish speakers learning English frequently drop subject pronouns, whereas Korean speakers learning English frequently drop object pronouns. She attempted to account for this discrepancy through the interpretability of agreement features in the native languages.
The results of research on L2 parameters, like those of the research on principles discussed in section 6.2.2, are mixed. There are data supporting the view that UG constrains the grammars that learners can come up with; there are data arguing against this position. Thus, the answer to the question of whether L2 acquisition is fundamentally the same as L1 acquisition is no; the answer to the question of whether L2 acquisition is fundamentally different from L1 acquisition is also no. Although it may be the case that universal principles (either typological or formal) guide L2 acquisition, it is also the case that there are areas of conflict between NL and TL grammars yielding grammars that fall beyond the domain of what would be predicted if the only constraining factor were universals. However, White (2003, p. 149), following her discussion of parameters, concludes that “[d]espite conflicting evidence and conflicting theories, results from several studies suggest that interlanguage grammars conform to parameters of UG.”
Within the Minimalist framework (Chomsky, 1995, 2000, 2002), the lexicon assumes great importance. Parameterization within the Minimalist Program is no longer in the syntax, but in the lexicon. Most of the constraints on language described earlier in terms of complex principles and parameters now fall out of a handful of general constraints on movement and the specific information stored in the lexicon of individual languages. Furthermore, most of the parametric variation relates to grammatical features such as tense and agreement. When we think of learning vocabulary, what we typically think of is learning the “meanings” of words (e.g., what the word chair refers to or what subterfuge means). But knowing that, for example, break is defined as “to disjoin or reduce to pieces with sudden or violent force” (American Heritage Dictionary) is only part of what we know about the word break. Knowing a word entails much more than that, and the additional knowledge is as important as any other piece of knowledge we have of language. For example, we also know that the verb break is irregular in its past tense formation, whereas love is not. We know that a sentence such as
(6-19) Harvey broke the glass jar.
is a good English sentence, but 6-20 is not.
We know that some words require objects (hit), other words allow objects but do not require them (eat), and still other words disallow objects (sleep). This is part of what we know about a language. Within Minimalism, parameters are part of the lexicon and language learning is largely lexical learning.
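The idea that this kind of knowledge is stored word by word can be pictured with a small sketch. The Python below is only an illustration under simplifying assumptions: the dictionary names, the three labels (required, optional, disallowed), and the regular past-tense rule are hypothetical conveniences, not part of Minimalist theory or of any formal analysis cited here.

```python
# Toy lexical entries. The labels and structure are illustrative
# assumptions; they are not taken from any formal linguistic theory.
OBJECT_FRAME = {
    "hit": "required",      # hit requires an object
    "eat": "optional",      # eat allows but does not require one
    "sleep": "disallowed",  # sleep disallows an object
}

# Irregular past-tense forms are stored with the word itself.
IRREGULAR_PAST = {"break": "broke"}


def past_tense(verb: str) -> str:
    """Return a stored irregular form if there is one, else add -(e)d."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    return verb + "d" if verb.endswith("e") else verb + "ed"


def licenses_object(verb: str, has_object: bool) -> bool:
    """Check a verb's object requirement against a candidate sentence."""
    frame = OBJECT_FRAME[verb]
    if frame == "required":
        return has_object
    if frame == "disallowed":
        return not has_object
    return True  # optional: fine with or without an object
```

In this sketch the grammar itself contains no rule about break or sleep; everything that distinguishes them sits in the entries, which is the sense in which learning is lexical.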
An example of how parametric variation is attached to the lexicon comes from the use of reflexives. Given an English sentence such as 6-21:
(6-21) The mother told the girl to wash herself.
speakers of English recognize that the word herself must refer to the girl. But the same is not true in sentence 6-22, where her can refer to the mother or to someone else.
(6-22) The mother told the girl to wash her.
Thus, the word herself in English contains information about possible antecedents. Other languages choose different options. For example, in Japanese, one reflexive form, zibun, can be ambiguous, as in 6-23 (from Lakshmanan and Teranishi, 1994):
(6-23) John-wa Bill-ga kagami-no naka-de zibun-o mita to itta.
John-TOP Bill-NOM mirror-GEN inside-LOC self-ACC saw that said
“John said that Bill saw self in the mirror.” (Either John or Bill can have seen himself.)
In 6-24, the reflexive zibun-zisin removes the ambiguity.
(6-24) John-wa Bill-ga kagami-no naka-de zibun-zisin mita to itta. “John said that Bill saw himself in the mirror.”
(John cannot have seen himself.)
Languages thus contain information in the lexicon that signals grammatical relationships.
There are two important questions that are in need of resolution. (a) Are universals the major organizing factor of learner language grammars? (b) If so, are the two types of universals discussed here and in the subsequent chapter only variants of one another, or is one a more appropriate model than the other?

6.2.4 Falsification
In trying to come up with a parsimonious account of how second languages are acquired, it is necessary to have a theory that will explain (and predict) the facts of learner grammars. In order to determine the accuracy of our theories, an important consideration is the issue of falsification. Our theory must predict what will occur and what will not occur. It is only in this way that we can test the accuracy of our hypotheses. In other words, our theories need to be falsifiable based on the data.
Learner languages are highly complex systems and, to some extent, are unique, making it difficult to make absolute predictions. Thus, it is more appropriate to think about probabilistic predictions. Unlike L1 grammars, no two individuals have the same L2 grammar, and hence there is no way of predicting what will happen to a grammar when new information is added, causing changes in the existing system. One might think of this as the kaleidoscope factor. Each kaleidoscope pattern differs. Any change in the system (a shake or twist to the kaleidoscope) will result in a different unpredictable pattern.10 One can make certain predictions, but given the many factors involved in a kaleidoscope (does one twist the box or shake it, how hard, etc.?), one cannot make absolute predictions. One can only establish guidelines within which all of the images are likely to fall.
The advantage of research within a UG framework is that, because it is based on a well-defined linguistic theory, more accurate predictions can be made, although the arguments made earlier regarding absolute versus probabilistic predictions still hold (see also Pinker, 1987).
When there are counterexamples (that is, when the predictions are not borne out), there are various approaches one can take: (a) assume a no-access to UG position, as we have seen with regard to the Fundamental Difference Hypothesis; (b) attribute the results to methodological problems; (c) attribute the results to an undefined performance component; (d) attribute the results to mapping factors; or (e) assume the theory is false.
Within UG the fifth possibility has been common. Because the predictions are based on theoretical constructs that are abstractions (that thus have to be argued rather than empirically verified), and because the theory is in a state of development, there is little concrete evidence that one can bring to bear to show that the linguistic analysis of a principle or parameter is indeed the correct one. Thus, if one maintains the assumption that second language grammars are natural grammars, then SLA data can be brought into the arguments in the field of linguistics in the determination of linguistic principles and parameters.
Because of the changing nature of the linguistic constructs on which it is based, UG-based research is difficult to falsify. Upon being confronted with data apparently contradicting the predictions of UG access, it is equally possible to argue that the underlying linguistic formulation was the incorrect one.
To illustrate this point, reconsider the discussion of the pro-drop parameter. We noted that there were differing views as to what constituted the appropriate clusters in this parameter. In White’s study, the predicted clusterings were not evidenced in the data. A possible conclusion she comes to is:
It is of interest that some recent proposals suggest that the possibility of VS word order [i.e., subject inversion] is not, in fact, part of the pro-drop parameter, but derives from other principles of grammar (Chao 1981; Safir 1982; Hyams 1983), a position that these results would be consistent with.
(White, 1985, p. 59)
Thus, rather than assuming a no-access position, White suggests the possibility that the parameter has been inaccurately described.
Yet another way of viewing the falsification problem is to allow for violations of universals, as these violations are temporary, given the ever-changing nature of learner languages. UG then serves as a “corrective mechanism” (see Sharwood Smith, 1988). A violation is only to be taken as a serious violation if it can also be shown that the person’s interim system (i.e., his or her learner language) has stabilized. This would mean that most cross-sectional studies would have to be eliminated, because it is only with longitudinal data that we can determine whether a grammar has stabilized/fossilized or not. There is an added difficulty here. As we have no independent means of determining whether stabilization/fossilization has taken place, we can never know when we are confronted with a stabilized grammar and when we are not. Thus, if we are to take this view, we cannot determine whether or not universal principles are violated. But if the principles are followed, then we can conclude that second language grammars are constrained by the particular principles. If the principles are not followed, there is little that can be concluded. We have no way of determining with certainty that the principles are permanently not followed.
If we consider the initial-state discussion earlier in this chapter, it is clear that there are difficulties defining what is meant by initial-state. For example, how early must data be to be relevant? First day of exposure, first utterance? What about a period of nonproduction before production begins? Is this relevant? Does it exist? If these data are relevant, then is there any way of falsifying certain claims (for example, whether functional categories are in place or not)? Or, to think about the Valueless Feature Hypothesis, if research is conducted with early learners (say, those in their first year or semester of study) and they had acquired feature strength, does that mean that the semester or year of exposure was sufficient to acquire strength or does it mean that they started with specified feature strength, rendering the hypothesis false?
To take a similar example, recall that one of the questions in UG-based research is the extent to which functional categories are available in early stages of learning. For example, it is frequently the case that there is little morphological marking in early L2 production, suggesting the absence of functional categories. However, plural marking is often absent at very late stages of SLA, making it difficult to maintain that omission is solely due to an absence of functional categories. Therefore, on the surface, one might consider a certain type of data as evidence of falsification whereas different explanations might be plausible for the same phenomenon in different contexts.
6.3 Transfer: the UG perspective
In chapters 4 and 5 we discussed historical and current views of transfer respectively. Conducting SLA research within a paradigm such as the one discussed in this chapter necessitates a reconsideration of the concept of transfer. The question arises: What new insights do recent linguistic approaches and, in particular, theoretical paradigms provide regarding the old concept of transfer?
White (1992) provided detail on this issue. She notes four areas that make current views of the phenomenon of transfer truly different from earlier conceptualizations, particularly those embodied in the framework of contrastive analysis. We deal with three of these areas here: levels of representation, clustering, and learnability.
6.3.1 Levels of representation
Within a theory of Universal Grammar, our knowledge of syntax is best represented by positing different levels of grammatical structure. To simplify matters, assume that there is an underlying structure and a surface structure. To understand the difference, consider 6-25:
(6-25) Visiting relatives can be boring.
This sentence can be parsed in one of two ways, each with a different meaning.
(6-26) When I visit relatives, I am bored.
(6-27) Relatives who visit me can be boring.
The two different meanings are a result of two different underlying syntactic structures that can be computed for sentence 6-25.
If sentences have multiple levels of representation, one can imagine that transfer could occur not just on the basis of surface facts, but also on the basis of underlying structures (see Tarone, Frauenfelder, and Selinker, 1976).
6.3.2 Clustering
With regard to clustering, recall that within a UG theory claiming that learning involves setting/resetting of parameters, there are properties that cluster together within a parameter. Within this framework (as with typological universals, discussed in chapter 7), one is concerned with how multiple properties of language do or do not behave in a like fashion. Further, there is evidence that mixed values are adopted for multivalued parameters and continuous linguistic features (for examples, see Broselow and Finer, 1991; Gass, 1984).
Within earlier approaches to transfer (particularly a contrastive analysis approach), there was no way to show how related structures were linked in the minds of second language learners. A model that involves structural relatedness therefore represents an innovative approach to language transfer.
6.3.3 Learnability
A UG perspective on SLA is heavily dependent on arguments of learnability. In particular, the issue of positive evidence is central because learners construct grammars on the basis of the input (the positive evidence to which the learner is exposed) together with principles of UG. But there are some language structures that may be in a superset/subset relationship. In fact, a learning principle, the Subset Principle, has been proposed that ensures that language learning can proceed on the basis of input alone. When there are multiple possibilities in a language, the child learner adopts the most restrictive grammar possible so that she or he can proceed to learn the appropriate forms on the basis of input alone. If she or he were to assume a superset grammar, there would be no way to retreat from that grammar on the basis of positive evidence alone. Consider adverb placement in French and English. In French, adverbs can be positioned in a greater number of places than in English. In English, sentence 6-28 is ungrammatical, whereas the French counterpart is not.
(6-28) *The man is drinking slowly his coffee.
If an English child were to start with a grammar that allowed all possibilities for adverb placement, it would be difficult to learn on the basis of positive evidence (input) alone that the grammar was actually more restrictive.
Looking at this across languages, we can see that the input necessary for the learner may be different depending on the superset/subset relationship of the two languages in question on a particular feature. A French learner of English has to learn that 6-28 is ungrammatical (and, in fact, this is learned late and is characteristic of a French person speaking English), whereas an English learner of French only has to hear the broader range of possibilities to know that French has more possibilities for adverb placement.
Where positive evidence is readily available, allowing a learner to reset a parameter, little transfer (and, when present, of short duration) is predicted (as in the case of the L2 being a superset of the L1). On the other hand, when positive evidence will not suffice to provide learners with adequate information about the L2, possibly necessitating negative evidence, transfer is predicted (as when the L2 is a subset of the L1).
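The asymmetry between the two learning directions can be made concrete with a small sketch. The Python below treats a grammar as nothing more than a set of permitted word-order patterns, which is a deliberate oversimplification; the two pattern labels and the function names are assumptions made for illustration and follow the simplified French/English contrast described above rather than a full account of adverb placement in either language.

```python
# Toy grammars as sets of permitted adverb-placement patterns.
# The labels are illustrative assumptions, not a real analysis.
SUBSET = frozenset({"S V O Adv"})                 # English-like
SUPERSET = frozenset({"S V O Adv", "S V Adv O"})  # French-like


def learn_from_positive_evidence(hypothesis, input_patterns):
    """Update a grammar using only sentences the learner hears.

    Hearing a pattern can add it to the grammar. Not hearing a
    pattern never removes it, because absence from the input is
    not evidence of ungrammaticality.
    """
    grammar = set(hypothesis)
    for pattern in input_patterns:
        grammar.add(pattern)
    return frozenset(grammar)


def input_can_correct(hypothesis, target):
    """True if the target language produces at least one pattern
    that the current hypothesis excludes."""
    return bool(target - hypothesis)
```

Starting from SUBSET and hearing SUPERSET input, the learner ends up with the larger grammar. Starting from SUPERSET and hearing SUBSET input, nothing in the input contradicts the hypothesis, so the grammar stays too permissive; this is the situation in which negative evidence may be needed and transfer is predicted.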
6.4 Phonology
Another area where SLA and linguistics intersect is phonology. The study of L2 phonology is not unlike other areas of L2 acquisition in that it attempts to account for the patterns of knowledge and use of L2 learners, in this case of pronunciation and perception. It is commonly accepted that the native language origin of a second language speaker is often identifiable by his or her accent. In fact, nonnative speaker pronunciation is often the source of humor, as in the case of comedians mimicking particular accent types, or in cartoon characters adopting nonnative accents.
The acquisition of a second language phonology is a complex process. An understanding of how learners learn a new phonological system must take into account linguistic differences between the NL and the TL systems as well as universal facts of phonology. Phonology is both similar to and different from other linguistic domains. It is similar to what we have seen in other parts of language in that some of a learner’s pronunciation of the second language is clearly attributable to the NL, whereas some is not. It is different in that not all of the concepts relevant to syntax are applicable to phonology. For example, avoidance is a common L2 strategy used when a syntactic construction is recognizably beyond one’s reach. Thus, if a learner wants to avoid passives, it is relatively easy to find an alternative structure to express the same concept. However, if a learner wants to avoid the sound [ð], as in the word the in English, it would be virtually impossible. Phonology differs from syntax in that in the former, but not the latter, most people can detect the linguistic origin of a speaker (although see arguments in Ioup, 1984, relating to “syntactic accent”).
Table 6.1 Hierarchy of phonological difficulty
NL            TL
0             Obligatory    (difficult)
0             Optional
Optional      Obligatory
Obligatory    Optional
Obligatory    0
Optional      0
Optional      Optional
Obligatory    Obligatory    (easy)
Source: Adapted from R. Stockwell and J. Bowen. The Sounds of English and Spanish. Chicago: University of Chicago Press. Reprinted by permission.
As discussed in chapter 4, in its simplest form, the Contrastive Analysis Hypothesis did not make accurate predictions. It did not predict why speakers of language X learning language Y would have difficulty on a given structure, whereas speakers of language Y learning language X did not have difficulty on that same structure. These discrepancies were also evident in phonology. As an example, consider Stockwell and Bowen’s (1965) proposed hierarchy of difficulty (Table 6.1). The hierarchy (ordered from most difficult to least difficult) attempts to make predictions of difficulty based on whether or not phonological categories are absent or present and, if present, whether they are obligatory or optional. Thus, if a learner comes from a language that has no phonemic contrast between two sounds (e.g., /l/ and /r/) and is learning a language where that contrast is obligatory, she or he will have difficulty. However, if the first language and the target language both have the same contrast, there will be little difficulty in learning.
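Because the hierarchy is an ordered list of NL/TL pairings, its predictions can be read off mechanically. The following Python sketch encodes the eight pairings of Table 6.1 in order; the lowercase labels, the use of "0" for an absent category, and the function name are illustrative assumptions, and the ranks express only relative order, not a measured degree of difficulty.

```python
# The eight NL/TL pairings of Table 6.1, from most to least difficult.
# "0" stands for a category that is absent from the language.
HIERARCHY = [
    ("0", "obligatory"),
    ("0", "optional"),
    ("optional", "obligatory"),
    ("obligatory", "optional"),
    ("obligatory", "0"),
    ("optional", "0"),
    ("optional", "optional"),
    ("obligatory", "obligatory"),
]


def difficulty_rank(nl_status: str, tl_status: str) -> int:
    """Return the position in the hierarchy (1 = most difficult)."""
    return HIERARCHY.index((nl_status, tl_status)) + 1
```

On this encoding, a learner whose NL lacks the /l/ and /r/ contrast and whose TL makes it obligatory falls at rank 1, whereas a learner whose NL and TL both have the contrast as obligatory falls at rank 8.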
6.2 Universal Grammar
How does UG relate to language acquisition? If children have to learn a complex set of abstractions, there must be something other than the language input to which they are exposed that enables them to learn language with relative ease and speed. UG is postulated as an innate language facility that limits the extent to which languages can vary. That is, it specifies the limits of a possible language. The task for learning is greatly reduced if one is equipped with an innate mechanism that constrains possible grammar formation. Before relating the question of UG to SLA, we turn briefly to issues from child language acquisition to explain the basic argumentation of this theory.
The theoretical need for an innate language faculty is based on a negative argument. The claim is that, on the basis of language input alone, children cannot attain the complexities of adult grammars. Innate linguistic properties fill in where the input fails. What does it mean to say that the input is insufficient? It is not merely an antibehaviorist notion that argues against an input/output scheme. Rather, it is based on the fact that children come to know certain properties of grammar that are not obviously learnable from input, as illustrated by the following examples from English discussed by White (1989):
(6-1) I want to go.
(6-2) I wanna go.
(6-3) John wants to go but we don’t want to.
(6-4) John wants to go but we don’t wanna.
(6-5) Do you want to look at the chickens?
(6-6) Do you wanna look at the chickens?
(6-7) Who do you want to see?
(6-8) Who do you wanna see?
Examples 6-1 to 6-8 show the range of possibilities for changing want to to wanna. However, there are many times in English where the sequence want to cannot be replaced by the informal wanna, as in 6-9 to 6-12:
(6-9) Who do you want to feed the dog?
(6-10) *Who do you wanna feed the dog?
(6-11) Who do you want to win the race?
(6-12) *Who do you wanna win the race?
Without prior information to guide learners, it would be difficult to determine the correct distribution of want to versus wanna in informal English. The input does not provide sufficiently specific information about where to use wanna and where not to use it. White explained that there are principles of UG involving question formation to account for the distribution of these English forms. Briefly, sentence 6-7 can be represented by something like You want to see X and 6-9 by something like You want X to feed the dog. Note the location of X, the element about which a question is being asked. In 6-9, but not in 6-7, the question is about an element (X) that is placed between want and to. This is what effectively blocks contraction. In 6-7, want and to are adjacent, thereby allowing contraction; that is, no intervening element blocks it. Importantly, the input alone does not provide this information. This argument is called the poverty of the stimulus.
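The adjacency condition just described can be stated as a simple check over the underlying representation. The Python sketch below takes a string such as You want X to feed the dog, with X marking the questioned element as in the text, and asks whether want and to are adjacent. The function name and the whitespace-based tokenization are illustrative assumptions; this is a toy restatement of the condition, not an implementation of any UG principle.

```python
def can_contract(underlying: str) -> bool:
    """Return True if "want" is immediately followed by "to".

    The input is an underlying representation in which the
    questioned element is written as X in its original position.
    Any word between "want" and "to", including X, blocks
    contraction to "wanna".
    """
    tokens = underlying.lower().split()
    return any(
        first == "want" and second == "to"
        for first, second in zip(tokens, tokens[1:])
    )
```

Applied to You want to see X (the representation of 6-7), the check succeeds and wanna is possible; applied to You want X to feed the dog (the representation of 6-9), it fails. The point of the argument is that the surface strings a child hears do not contain X, so the condition cannot be read off the input directly.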
One could, of course, argue that direct or indirect intervention is indeed forthcoming and that one does not need innateness to explain language acquisition. However, in most instances, the language-learning environment does not provide information to the child concerning the well-formedness of an utterance (Chomsky, 1981, 1986), or even when it does, it provides information only about the ungrammatical (or inappropriate) utterance, not about what needs to be done to modify a current hypothesis. Furthermore, as we saw in chapter 5 (section 5.1), even with explicit correction, children’s grammars are often impervious to change.
Theoretically, there are two kinds of evidence available to learners as they make hypotheses about correct and incorrect language forms: positive evidence and negative evidence.2 Positive evidence comes from the speech learners hear/read and thus is composed of a limited set of well-formed utterances of the language being learned. When a particular sentence type is not heard, one does not know whether it is not heard because of its impossibility in the language or because of mere coincidence. It is in this sense that the sentences of a language that provide the input to the learner are known as positive evidence. It is on the basis of positive evidence that linguistic hypotheses can be made. Negative evidence, on the other hand, is composed of information to a learner that his or her utterance is deviant with regard to the norms of the language being learned. We provide more detail on this in chapter 10. For now, suffice it to say that negative evidence can take many forms, including direct correction, such as That’s not right, or indirect questions, such as What did you say?
The child language literature suggests that negative evidence is not frequent (see Brown and Hanlon, 1970, and the theoretical arguments by Baker, 1979), is often ignored, and can therefore not be a necessary condition for acquisition. Because positive evidence alone cannot delineate the range of possible and impossible sentences, and because negative evidence is not frequently forthcoming, there must be innate principles that constrain a priori the possibilities of grammar formation.
In sum, Universal Grammar is “the system of principles, conditions, and rules that are elements or properties of all human languages” (Chomsky, 1975, p. 29). It “is taken to be a characterization of the child’s prelinguistic state” (Chomsky, 1981, p. 7). Thus, the necessity of positing an innate language faculty is due to the inadequate input, in terms of quantity and quality, to which a learner is exposed. Learning is mediated by UG and by the L1, as we will see below.
How does this relate to second language acquisition? The question is generally posed as an access-to-UG problem. Does the innate language faculty that children use in constructing their native language grammars remain operative in second language acquisition? More recently, this question is formulated as an issue of initial state. What do second language learners start with?
6.2.1 Initial state
The question posed in this section is: What is the nature of the linguistic knowledge with which learners begin the second language acquisition process? That is, what is the unconscious linguistic knowledge that learners have before receiving L2 input, or, to take a variant of the question, what are early L2 grammars like? The two variables influencing this debate are transfer (i.e., the availability of the first language grammar) and access to UG (i.e., the extent to which UG is available).
Two broad views are discussed here: the Fundamental Difference Hypothesis (Bley-Vroman, 1989; Schachter, 1988), which argues that what happens in child language acquisition is not the same as what happens in adult second language acquisition, and the Access to UG Hypothesis, which argues that the innate language facility is alive and well in second language acquisition and constrains the grammars of second language learners as it does the grammars of child first language learners. We take a look at each of these positions, the latter in actuality being made up of several branches.
6.2.1.1 Fundamental Difference Hypothesis
As was seen in chapters 4 and 5, much of the work in second language acquisition was driven by the notion that first and second language acquisition involve the same processes. This is not to say that differences were not noted; rather, proposals to account for these differences were made with an attempt to salvage the major theoretical claim of L1 and L2 similarities.
The Fundamental Difference Hypothesis starts from the belief that, with regard to language learning, children and adults are different in many important ways. For example, the ultimate attainment reached by children and adults differs. In normal situations, children always reach a state of “complete” knowledge of their native language. In second language acquisition (at least, adult second language acquisition), not only is “complete” knowledge not always attained, it is rarely, if ever, attained. Fossilization, representing a non-TL stage, is frequently observed (Han, 2004; Long, 2007).
Another difference concerns the nature of the knowledge that these two groups of learners have at the outset of language learning. Second language learners have at their command knowledge of a full linguistic system. They do not have to learn what language is all about at the same time that they are learning a specific language. For example, at the level of performance, adults know that there are social reasons for using different language varieties.3 What they have to learn in acquiring a second language system is the specific language forms that may be used in a given social setting. Children, on the other hand, have to learn not only the appropriate language forms, but also that there are different forms to be used in different situations.
Related to the idea that adults have complete knowledge of a language system is the notion of equipotentiality, expressed by Schachter (1988).
She pointed out that children are capable of learning any language. Given exposure to the data of a language (i.e., the input), a child will learn that language. No language is easier to learn than another; all languages are equally learnable by all children. This is not the case with second language learners. Spanish speakers have less difficulty learning Italian than they do Japanese. If language relatedness (perceived or actual) were not a determining factor in ultimate success, we would expect all learners to be equally able to learn any second language. This is not borne out by the facts.
One final difference to mention is that of motivation and attitude toward the target language and target language community (see chapter 12 for a fuller discussion). It is clear that, as in any learning situation, not all humans are equally motivated to learn languages, nor are they equally motivated to learn a specific language. Differential motivation does not appear to impact a child’s success or lack of success in learning language. All human beings without cognitive impairment learn a first language.
In sum, the basic claim of the Fundamental Difference Hypothesis is that adult second language learners do not have access to UG. Rather, what they know of language universals is constructed through their NL. In addition to the native language, which mediates access to UG, second language learners make use of their general problem-solving abilities. Second language learners come to the language-learning situation knowing that a language contains an infinite number of sentences; that they are capable of understanding sentences they have never heard before; and that a language has rules of syntax, rules of combining morphemes, limits on possible sounds, and so forth. With specific regard to syntax, learners know that languages can form questions and that the syntax of questions is related to the syntax of statements. They know that languages have a way of modifying nouns, either through adjectives or relative clauses.
This information is gleaned by means of knowing that the NL is this way and by assuming that these facts are a part of the general character of language rather than a part of the specific nature of the native language. Thus, the learner constructs a pseudo-UG, based on what is known of the native language. It is in this sense that the NL mediates knowledge of UG for second language learners.
6.2.1.2 Access to UG Hypothesis
The opposing view to the Fundamental Difference Hypothesis is the Access to UG Hypothesis. The common perspective is that “UG is constant (that is, unchanged as a result of L1 acquisition); UG is distinct from the learner’s L1 grammar; UG constrains the L2 learner’s interlanguage grammars” (White, 2003, p. 60). White (2003) outlines five positions with regard to the initial state of second language learning; the first three take the first language as the basis of the initial state and the second two take UG as the initial state: (1) Full Transfer/Full Access, (2) Minimal Trees, (3) Valueless Features, (4) Initial Hypothesis of Syntax, and (5) Full Access (without transfer).
Before beginning the discussion of access to UG, it is important to make one further distinction and that is between lexical and functional categories. In addition to principles, part of the innate language component consists of lexical and functional categories. Lexical categories are the categories that we learn about in school: nouns, adjectives, verbs, adverbs, and so forth. These can be thought of as content words. Functional categories, on the other hand, are words that serve particular functions (e.g., articles, possessives) or they may be categories consisting of grammatical morphemes (e.g., plurals, tense markers).
Functional categories can be thought of as grammatical elements that in a sense form the glue of a sentence. Examples of functional categories are determiners (e.g., a, the, our, my, this), complementizers (e.g., if, whether, that), and grammatical markers (past tense endings, case markings, plural endings, and gender marking). These differ from lexical categories in a number of ways. In general, functional categories represent a fixed set of words in a language, whereas lexical categories can be added to as the need arises (consider the recent addition to the English lexicon of the word dotcom, as in dotcom industry or in the recent Time magazine headline “Doom stalks the dotcoms”).4
However, the most important distinction has to do with whether or not a class of words is associated with lexical properties. Prepositions, for example, though typically having the functional category characteristic of a fixed set of words in a language, are best thought of as a part of the lexical category. This is so because prepositions are often associated with such roles as agent (who does what to whom), patient (who is the recipient of the action), and location. For example, in English the preposition by can be associated with an agent in passive sentences (John was kissed by Mary), and the preposition in can take on the role of location (John was kissed in the park).
We now turn to different conceptualizations of the roles of the L1 and UG as possible starting points for L2 acquisition.
L1 AS THE BASE
1 Full Transfer/Full Access
This position assumes that the starting point is the L1 grammar, but that there is full access to UG during the process of acquisition (Schwartz, 1998; Schwartz and Sprouse, 1994, 1996, 2000; Whong-Barr, 2005). The learner is assumed to use the L1 grammar as a basis but to have full access to UG when the L1 is deemed insufficient for the learning task at hand. L1 and L2 learning differ, and there is no prediction that learners will eventually attain complete knowledge of the L2.
2 Minimal Trees Hypothesis
Recall that in the previous position, full transfer/full access, learners draw on both the L1 and UG. The first option was to draw on the L1 and, where that was insufficient, to draw on UG. The Minimal Trees Hypothesis also maintains that both L1 and UG are available concurrently (Vainikka and Young-Scholten, 1994, 1996a, 1996b). However, the L1 grammar that is available contains no functional categories, and these categories, initially, are not available from any source. The emergence of functional categories is not dependent on the L1 and hence there is no transfer; rather, they emerge in response to L2 input. The development of functional categories of learners from different languages will be the same. On this view, learners may or may not reach the final state of an L2 grammar, depending on what is available through the L1 and what is available through UG. They should be able to reach the final state of an L2 grammar with regard to functional categories.
3 Valueless Features
This is the most technical of the hypotheses and will be dealt with in the least detail. In essence, the claim is that there is weak transfer (Eubank 1993, 1993/1994, 1996). The L1 is the primary starting point. Unlike the Minimal Trees Hypothesis, both functional and lexical categories are available from the L1, but the strength of these features is not available. There are consequences of feature strength in areas such as word order. Acquisition involves acquiring appropriate feature strength of the L2. Learners should be able to fully acquire the L2 grammar.
UG-BASED
4 The Initial Hypothesis of Syntax (Platzack, 1996)
This position maintains that, as in child language acquisition, the starting point for acquisition is UG.
5 Full Access/No Transfer
This position maintains that, as in child language acquisition, the starting point for acquisition is UG (Epstein, Flynn, and Martohardjono, 1996, 1998; Flynn, 1996; Flynn and Martohardjono, 1994). There is a disconnection between the L1 and the developing L2 grammar. A prediction based on this position is that L1 and L2 acquisition will proceed in a similar fashion, will end up at the same point, and that all L2 acquisition (regardless of L1) would proceed along the same path. Learners should be able to reach the same level of competence as native speakers. If there are differences, they are performance-related rather than competence-related.
In the following sections, we examine data that bear on these issues of access to UG. There are two types of relevant data: data relating to UG principles that are invariant, and data relating to UG parameters that vary across languages.
6.2.2 UG principles
White (1989) reported on a study by Otsu and Naoi (1986) dealing with the principle of structure dependence. The basic concept behind this principle is that linguistic principles operate on syntactic (or structural) units. That is, most importantly, according to this view, what makes language knowledge different from other types of knowledge is the notion of structure dependency; language is not just a string of unstructured segments. White pointed out that this accounts for the grammatical question in 6-14 and the ungrammaticality of 6-15.
(6-13) The boy who is standing over there is happy.
(6-14) Is the boy who is standing over there happy?
(6-15) *Is the boy who standing over there is happy?
The rule for question formation makes reference to the subject, which in the case of 6-13 is a complex subject consisting of a determiner phrase (the boy) and a relative clause (who is standing over there). The rule does not make reference to a nonstructural unit, such as “the first verb.” Thus, yes/no questions are formed by moving the main verb to the front of the sentence, not by moving the first verb in the sentence to the front (as in 6-15).
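The contrast between a linear rule ("move the first verb") and a structure-dependent rule can be made concrete with a small Python sketch. The function names and the hand-made division of 6-13 into subject, auxiliary, and predicate are assumptions introduced only for illustration; they are not part of White's or Otsu and Naoi's analysis.

```python
# Toy illustration of structure dependence in yes/no question formation.
# The sentence is split by hand into (subject, auxiliary, predicate);
# that hand parse is an assumption made for this example.

def linear_rule(words):
    """Structure-independent rule: front the first 'is' in the word string."""
    i = words.index("is")
    return ["is"] + words[:i] + words[i + 1:]


def structural_rule(subject, aux, predicate):
    """Structure-dependent rule: front the main-clause auxiliary,
    leaving the (possibly complex) subject untouched."""
    return [aux] + subject + predicate


subject = "the boy who is standing over there".split()
aux = "is"
predicate = ["happy"]
declarative = subject + [aux] + predicate  # sentence 6-13

# The linear rule yields the ungrammatical string in 6-15.
print(" ".join(linear_rule(declarative)))
# The structural rule yields the grammatical question in 6-14.
print(" ".join(structural_rule(subject, aux, predicate)))
```

The linear rule reaches into the relative clause because it only sees a string of words; the structural rule cannot make that error because the subject is treated as a single unit.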
Otsu and Naoi tested knowledge of structure dependency among Japanese learners of English. In Japanese, questions are formed by adding a question particle to the end of a sentence. No word-order changes are made. The learners tested knew how to form simple questions and passed a test showing knowledge of relative clauses, but they had no knowledge of question formation involving complex subjects. It was hypothesized that if a UG principle, structure dependence, were operative, it could not have come into the learner language system through the L1 as the L1 does not have a principle of structure dependence relevant to question formation. Thus, the only way the principle of structure dependence could have come into the learners’ second language grammar is through direct access to UG. In general, the results of this study support the notion that learners’ grammars are constrained by principles of UG, in this case the principle of structure dependence.
Another study relevant to the issue of UG principles is one by Schachter (1989). She tested the principle known as subjacency, which limits the amount of movement that can take place within sentences. Consider the following contrived conversation:
Speaker 1: I agree with the idea that David loves Mary Jo.
Speaker 2: I didn’t hear you. *Who do you agree with the idea that David loves?
The ungrammaticality of Who do you agree with the idea that David loves? is due to the fact that, in English, movement of the question word from the position of the original noun phrase (Mary Jo) to its new sentence-initial position is constrained by the distance and intervening syntactic structures between the two positions.5 In Speaker 2’s sentence, the necessary syntactic relationships cannot hold; that is, the movement rule is violated and, hence, the sentence is ungrammatical.
Schachter (1989) tested knowledge of this principle by eliciting grammaticality judgments by native speakers of Indonesian, Chinese, and Korean learning English. In a separate article, Schachter (1990) added a group of Dutch speakers to her database. The languages in question have different requirements on subjacency. In Korean, there is no evidence of subjacency; in Chinese and Indonesian, there is some evidence of subjacency, although in both of these languages wh- movement is more limited than in English; and in Dutch, subjacency restrictions are much the same as in English. The results of Schachter’s study suggest that the Dutch speakers recognize that English is constrained by the principle of subjacency; the results for the other groups are not as clear. The Korean-speaking learners, in keeping with the no-access position, were not constrained by subjacency. The Chinese and Indonesian speakers behaved more English-like than the Korean speakers, but their interlanguage grammars could not be said to be constrained by the principle of subjacency.6
A third example comes from White’s (2003) discussion of the results of studies based on the Empty Category Principle (ECP) (Chomsky, 1981). In essence, the ECP is a way of accounting for asymmetry found in the use or nonuse of case particles. Examples can be seen from Japanese in 6-16, 6-17, and 6-18.
(6-16) John ga sono hon o yonda.
John NOM that book ACC read-PAST
“John read that book.”
(6-17) John ga sono hon yonda.
John NOM that book read-PAST
(6-18) *John sono hon o yonda.
John that book ACC read-PAST
6-16 is grammatical with both a nominative and an accusative case marker; 6-17 is possible with a nominative case marker and no accusative case marker, but 6-18 is ungrammatical because it has only an accusative case marker, but no nominative case marker. Kanno (1996) investigated whether beginning learners of Japanese were able to recognize this discrepancy, arguing that, if they recognized the asymmetry in the early stages of learning, one could assume that the ECP functions in early second language learning.7 Both L2 learners and native speakers of Japanese accepted accusative case drop sentences more than nominative case drop. This suggests that ECP does in fact function in the early grammars of L2 learners.
Thus, with regard to UG principles, there is conflicting evidence as to whether learners have direct access to UG, have access through the NL, or have no access at all.
6.2.3 UG parameters
There are certain linguistic features that vary across languages. These are expressed through the concept of linguistic parameters. Parameters have limited values. In learning a first language, the data a child is exposed to will determine which setting of a parameter that child will select. Whereas parameters are not invariable, as we saw with principles, they are limited, thereby easing the burden on the child. In other words, if parameters exist, the child’s task is eased, because there is a limited range of options to choose from.
The issue for second language acquisition is the determination of whether and how a given linguistic parameter can be reset. Let’s assume a parameter with two values. Let’s further assume a native speaker with a NL setting in one way who is learning a second language with a setting in another way. If UG is available to that learner, there should be little difficulty in resetting the parameter because the speaker has access to both settings through UG. If UG is operative only through the L1 (as the Fundamental Difference Hypothesis suggests), then we would expect only those features that are available through the L1 to manifest themselves in the L2. Finally, if UG is not operative at all, we would expect none of the UG features to be available.
One of the most interesting aspects related to the concept of parameters is that they involve the clustering of properties. Once a parameter is set in a particular way, all related properties are affected. In other words, there are consequences for other parts of the grammar. We examine one such parameter, known as the pro-drop parameter. This parameter encompasses a number of properties, namely (a) the omission of subject pronouns, (b) the inversion of subjects and verbs in declarative sentences, and (c) that-trace effects—that is, the extraction of a subject (leaving a trace) out of a clause that contains a complementizer. A language will either have all of these properties or none of them. Languages like Italian and Spanish are [+pro-drop] and have all of the associated properties, whereas English and French are [−pro-drop], having none of them. Examples from English and Italian that illustrate the differences follow:
Subject pronouns
Italian (omit subject pronouns): Va al cinema stasera.
    goes to the movies this evening
English (obligatory use of subject pronouns): She is going to the movies this evening.
    *is going to the movies this evening

Subject–verb inversion
Italian: È arrivata Laura.
    is arrived Laura
English: Laura has arrived.
    *has arrived Laura

That-trace
Italian: Chi hai detto che è venuto?
    who you said that is come?
English: Whom did you say came?
    *Whom did you say that came?
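The clustering claim can be sketched in Python: one parameter setting determines all three properties at once, and a learner's judgments either pattern together or they do not. The class, the property labels, and the learner judgments below are hypothetical and introduced only for illustration; they are not data from any of the studies discussed here.

```python
from dataclasses import dataclass

# Property labels are illustrative names for (a), (b), and (c) above.
PROPERTIES = ("null_subjects", "subject_verb_inversion", "that_trace_extraction")


@dataclass(frozen=True)
class Grammar:
    name: str
    pro_drop: bool  # the single parameter setting

    def allows(self, prop: str) -> bool:
        """Under the clustering claim, every property follows from one setting."""
        if prop not in PROPERTIES:
            raise ValueError(f"unknown property: {prop}")
        return self.pro_drop


def patterns_as_cluster(judgments: dict) -> bool:
    """True if all three properties receive the same judgment."""
    return len({judgments[p] for p in PROPERTIES}) == 1


italian = Grammar("Italian", pro_drop=True)
english = Grammar("English", pro_drop=False)

# Invented judgments for a hypothetical learner of English:
# null subjects rejected, the other two structures still accepted.
hypothetical_learner = {
    "null_subjects": False,
    "subject_verb_inversion": True,
    "that_trace_extraction": True,
}

print(patterns_as_cluster({p: italian.allows(p) for p in PROPERTIES}))
print(patterns_as_cluster(hypothetical_learner))
```

A learner profile like the invented one, where the properties come apart, is the kind of pattern that would count against treating the three properties as a single parameter.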
White (1985) and Lakshmanan (1986) presented data from Spanish and French learners of English (White) and Spanish, Japanese, and Arabic learners of English (Lakshmanan) on precisely these three structures.8 White found that the learners did not recognize these three structures as related. Although there was a difference in judgments of acceptability between the Spanish and the French speakers on the first type of sentences (i.e., those with and without overt subject pronouns), there was no difference between the two groups on the other two types of sentences. Thus, these learners did not see these three properties as a unified parameter. Lakshmanan’s results were similar. Her groups of learners responded similarly to the first two sentence types but differently with regard to the third, again suggesting that these properties were not seen by these learners as unified under the umbrella of a single parameter.
There is evidence, however, that is more compelling with regard to the clustering of properties. Hilles (1986) assumed different properties of the pro-drop parameter in her investigation of the acquisition of English by a native speaker of Spanish named Jorge: (a) obligatory pronoun use; (b) use of nonreferential it, as in weather terms (it’s raining, it’s pouring) and use of nonreferential there, as in There is rain in the forecast; and (c) use of uninflected modals (e.g., must, could). Hilles showed that these three features were related in the speech of her learner. Specifically, there was an inverse relationship between Jorge’s lack of referential subject use and the appearance of modal verbs. As Jorge began to use subject pronouns in English (i.e., as his null-subject use went down), he also began to use modals as noninflected forms. Hilles hypothesized that the triggering factor for the switch from [+pro-drop] to [−pro-drop] was the use of nonreferential subjects. This was an indication that this learner had truly understood the mandatory nature of subjects in English.
Park (2004) analyzed pronominal subjects and objects. She observed that Spanish speakers learning English frequently drop subject pronouns, whereas Korean speakers learning English frequently drop object pronouns. She attempts to account for this discrepancy through the interpretability of agreement features in the native languages.
The results of research on L2 parameters, like those of the research on principles discussed in section 6.2.2, are mixed. There are data supporting the view that UG constrains the grammars that learners can come up with; there are data arguing against this position. Thus, the answer to the question of whether L2 acquisition is fundamentally the same as L1 acquisition is no; the answer to the question of whether L2 acquisition is fundamentally different from L1 acquisition is also no. Although it may be the case that universal principles (either typological or formal) guide L2 acquisition, it is also the case that there are areas of conflict between NL and TL grammars yielding grammars that fall beyond the domain of what would be predicted if the only constraining factor were universals. However, White (2003, p. 149), following her discussion of parameters, concludes that “[d]espite conflicting evidence and conflicting theories, results from several studies suggest that interlanguage grammars conform to parameters of UG.”
Within the Minimalist framework (Chomsky, 1995, 2000, 2002), the lexicon assumes great importance. Parameterization within the Minimalist Program is no longer in the syntax, but in the lexicon. Most of the constraints on language described earlier in terms of complex principles and parameters now fall out of a handful of general constraints on movement and the specific information stored in the lexicon of individual languages. Furthermore, most of the parametric variation relates to grammatical features such as tense and agreement. When we think of learning vocabulary, what we typically think of is learning the “meanings” of words (e.g., what the word chair refers to or what subterfuge means). But knowing that, for example, break is defined as “to disjoin or reduce to pieces with sudden or violent force” (American Heritage Dictionary) is only part of what we know about the word break. Knowing a word entails much more than that, and the additional knowledge is as important as any other piece of knowledge we have of language. For example, we also know that the verb break is irregular in its past tense formation, whereas love is not. We know that a sentence such as
(6-19) Harvey broke the glass jar.
is a good English sentence, but 6-20 is not.
We know that some words require objects (hit), other words allow objects but do not require them (eat), and still other words disallow objects (sleep). This is part of what we know about a language. Within Minimalism, parameters are part of the lexicon and language learning is largely lexical learning.
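The idea that grammatical knowledge is stored with individual words can be sketched as a small Python lexicon. The dictionaries and function names are hypothetical, the entries cover only the verbs mentioned in the text, and the regular past-tense rule is a deliberately crude assumption.

```python
# A toy lexicon: grammatical information is stored word by word.
# Only the verbs named in the text are listed; the layout is an assumption.
OBJECT_FRAME = {
    "hit": "required",      # requires an object
    "eat": "optional",      # allows but does not require an object
    "sleep": "disallowed",  # disallows an object
}

IRREGULAR_PAST = {"break": "broke"}  # break is irregular; love is not


def past_tense(verb: str) -> str:
    """Return a stored irregular form, else apply a crude regular rule."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    return verb + "d" if verb.endswith("e") else verb + "ed"


def object_pattern_ok(verb: str, has_object: bool) -> bool:
    """Check whether a verb's use with or without an object fits its entry."""
    frame = OBJECT_FRAME[verb]
    if frame == "required":
        return has_object
    if frame == "disallowed":
        return not has_object
    return True  # optional: both patterns are fine


print(past_tense("break"), past_tense("love"))
print(object_pattern_ok("hit", has_object=False))
print(object_pattern_ok("eat", has_object=False))
```

On this view, learning a language largely means filling in entries like these, which is the sense in which language learning is lexical learning.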
An example of how parametric variation is attached to the lexicon comes from the use of reflexives. Given an English sentence such as 6-21:
(6-21) The mother told the girl to wash herself.
speakers of English recognize that the word herself must refer to the girl. But the same is not true in sentence 6-22, where her can refer to the mother or to someone else.
(6-22) The mother told the girl to wash her.
Thus, the word herself in English contains information about possible antecedents. Other languages choose different options. For example, in Japanese, one reflexive form, zibun, can be ambiguous, as in 6-23 (from Lakshmanan and Teranishi, 1994):
(6-23) John-wa Bill-ga kagami-no naka-de zibun-o mita to itta.
John-TOP Bill-NOM mirror-GEN inside-LOC self-ACC saw that said
“John said that Bill saw self in the mirror.”
(Either John or Bill can have seen himself.)
In 6-24, the reflexive zibun-zisin removes the ambiguity.
(6-24) John-wa Bill-ga kagami-no naka-de zibun-zisin mita to itta.
“John said that Bill saw himself in the mirror.”
(John cannot have seen himself.)
Languages thus contain information in the lexicon that signals grammatical relationships.
There are two important questions that are in need of resolution. (a) Are universals the major organizing factor of learner language grammars? (b) If so, are the two types of universals discussed here and in the subsequent chapter only variants of one another, or is one a more appropriate model than the other?

6.2.4 Falsification
In trying to come up with a parsimonious account of how second languages are acquired, it is necessary to have a theory that will explain (and predict) the facts of learner grammars. In order to determine the accuracy of our theories, an important consideration is the issue of falsification. Our theory must predict what will occur and what will not occur. It is only in this way that we can test the accuracy of our hypotheses. In other words, our theories need to be falsifiable based on the data.
Learner languages are highly complex systems and, to some extent, are unique, making it difficult to make absolute predictions. Thus, it is more appropriate to think about probabilistic predictions. Unlike L1 grammars, no two individuals have the same L2 grammar, and hence there is no way of predicting what will happen to a grammar when new information is added, causing changes in the existing system. One might think of this as the kaleidoscope factor. Each kaleidoscope pattern differs. Any change in the system (a shake or twist to the kaleidoscope) will result in a different unpredictable pattern.10 One can make certain predictions, but given the many factors involved in a kaleidoscope (does one twist the box or shake it, how hard, etc.?), one cannot make absolute predictions. One can only establish guidelines within which all of the images are likely to fall.
The advantage of research within a UG framework is that, because it is based on a well-defined linguistic theory, more accurate predictions can be made, although the arguments made earlier regarding absolute versus probabilistic predictions still hold (see also Pinker, 1987).
When there are counterexamples—that is, when the predictions are not borne out—there are various approaches one can take: (a) assume a no-access to UG position, as we have seen with regard to the Fundamental Difference Hypothesis; (b) attribute the results to methodological problems; (c) attribute the results to an undefined performance component; (d) attribute the results to mapping factors; or (e) assume the theory is false.
Within UG the fifth possibility has been common. Because the predictions are based on theoretical constructs that are abstractions (that thus have to be argued rather than empirically verified), and because the theory is in a state of development, there is little concrete evidence that one can bring to bear to show that the linguistic analysis of a principle or parameter is indeed the correct one. Thus, if one maintains the assumption that second language grammars are natural grammars, then SLA data can be brought into the arguments in the field of linguistics in the determination of linguistic principles and parameters.
Because of the changing nature of the linguistic constructs on which it is based, UG-based research is difficult to falsify. Upon being confronted with data apparently contradicting the predictions of UG access, it is equally possible to argue that the underlying linguistic formulation was the incorrect one.
To illustrate this point, reconsider the discussion of the pro-drop parameter. We noted that there were differing views as to what constituted the appropriate clusters in this parameter. In White’s study, the predicted clusterings were not evidenced in the data. A possible conclusion she comes to is:
It is of interest that some recent proposals suggest that the possibility of VS word order [i.e., subject inversion] is not, in fact, part of the pro-drop parameter, but derives from other principles of grammar (Chao 1981; Safir 1982; Hyams 1983), a position that these results would be consistent with.
(White, 1985, p. 59)
Thus, rather than assuming a no-access position, White suggests the possibility that the parameter has been inaccurately described.
Yet another way of viewing the falsification problem is to allow for violations of universals, as these violations are temporary, given the ever-changing nature of learner languages. UG then serves as a “corrective mechanism” (see Sharwood Smith, 1988). A violation is only to be taken as a serious violation if it can also be shown that the person’s interim system (i.e., his or her learner language) has stabilized. This would mean that most cross-sectional studies would have to be eliminated, because it is only with longitudinal data that we can determine whether a grammar has stabilized/fossilized or not. There is an added difficulty here. As we have no independent means of determining whether stabilization/fossilization has taken place, we can never know when we are confronted with a stabilized grammar and when we are not. Thus, if we are to take this view, we cannot determine whether or not universal principles are violated. But if the principles are followed, then we can conclude that second language grammars are constrained by the particular principles. If the principles are not followed, there is little that can be concluded. We have no way of determining with certainty that the principles are permanently not followed.
If we consider the initial-state discussion earlier in this chapter, it is clear that there are difficulties defining what is meant by initial-state. For example, how early must data be to be relevant? First day of exposure, first utterance? What about a period of nonproduction before production begins? Is this relevant? Does it exist? If these data are relevant, then is there any way of falsifying certain claims (for example, whether functional categories are in place or not)? Or, to think about the Valueless Feature Hypothesis, if research is conducted with early learners—say, those in their first year or semester of study—and they had acquired feature strength, does that mean that the semester or year of exposure was sufficient to acquire strength or does it mean that they started with specified feature strength, rendering the hypothesis false?
To take a similar example, recall that one of the questions in UG-based research is the extent to which functional categories are available in early stages of learning. For example, it is frequently the case that there is little morphological marking in early L2 production, suggesting the absence of functional categories. However, plural marking is often absent at very late stages of SLA, making it difficult to maintain that omission is solely due to an absence of functional categories. Therefore, on the surface, one might consider a certain type of data as evidence of falsification whereas different explanations might be plausible for the same phenomenon in different contexts.
6.3 Transfer: the UG perspective
In chapters 4 and 5 we discussed historical and current views of transfer respectively. Conducting SLA research within a paradigm such as the one discussed in this chapter necessitates a reconsideration of the concept of transfer. The question arises: What new insights do recent linguistic approaches and, in particular, theoretical paradigms provide regarding the old concept of transfer?
White (1992) provided detail on this issue. She notes four areas that make current views of the phenomenon of transfer truly different from earlier conceptualizations, particularly those embodied in the framework of contrastive analysis. We deal with three of these areas here: levels of representation, clustering, and learnability.
6.3.1 Levels of representation
Within a theory of Universal Grammar, our knowledge of syntax is best represented by positing different levels of grammatical structure. To simplify matters, assume that there is an underlying structure and a surface structure. To understand the difference, consider 6-25:
(6-25) Visiting relatives can be boring.
This sentence can be parsed in one of two ways, each with a different meaning.
(6-26) When I visit relatives, I am bored.
(6-27) Relatives who visit me can be boring.
The two different meanings are a result of two different underlying syntactic structures that can be computed for sentence 6-25.
If sentences have multiple levels of representation, one can imagine that transfer could occur not just on the basis of surface facts, but also on the basis of underlying structures (see Tarone, Frauenfelder, and Selinker, 1976).
6.3.2 Clustering
With regard to clustering, recall that within a UG theory claiming that learning involves setting/resetting of parameters, there are properties that cluster together within a parameter. Within this framework (as with typological universals, discussed in chapter 7), one is concerned with how multiple properties of language do or do not behave in a like fashion. Further, there is evidence that mixed values are adopted for multivalued parameters and continuous linguistic features (for examples, see Broselow and Finer, 1991; Gass, 1984).
Within earlier approaches to transfer (particularly a contrastive analysis approach), there was no way to show how related structures were linked in the minds of second language learners. Nonetheless, a model that involves structural relatedness clearly represents an innovative approach to language transfer.
6.3.3 Learnability
A UG perspective on SLA is heavily dependent on arguments of learnability. In particular, the issue of positive evidence is central because learners construct grammars on the basis of the input (the positive evidence to which the learner is exposed) together with principles of UG. But, there are some language structures that may be in a superset/subset relationship. In fact, a learning principle, the Subset Principle, has been proposed that ensures that language learning can proceed on the basis of input alone. When there are multiple possibilities in a language, the child learner adopts the most restrictive grammar possible so that she or he can proceed to learn the appropriate forms on the basis of input alone. If she or he were to assume a superset grammar, there would be no way to retreat from that grammar. Consider adverb placement in French and English. In French, adverbs can be positioned in a greater number of places than in English. In English, sentence 6-28 is ungrammatical, whereas the French counterpart is not.
(6-28) *The man is drinking slowly his coffee.
If an English child were to start with a grammar that allowed all possibilities for adverb placement, it would be difficult to learn on the basis of positive evidence (input) alone that the grammar was actually more restrictive.
Looking at this across languages, we can see that the input necessary for the learner may be different depending on the superset/subset relationship of the two languages in question on a particular feature. A French speaker learning English has to learn that 6-28 is ungrammatical (in fact, this is learned late, and sentences like 6-28 are characteristic of a French person speaking English), whereas an English speaker learning French only has to hear the broader range of possibilities to know that French has more possibilities for adverb placement.
Where positive evidence is readily available, allowing a learner to reset a parameter, little transfer (and, when present, of short duration) is predicted (as in the case of the L2 being a superset of the L1). On the other hand, when positive evidence will not suffice to provide learners with adequate information about the L2, possibly necessitating negative evidence, transfer is predicted (as when the L2 is a subset of the L1).
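The superset/subset reasoning can be sketched in Python by treating each grammar as the set of adverb placements it permits. The placement labels and the two sets are simplifying assumptions that follow the description in the text; they are not a full account of French or English adverb placement.

```python
# Each grammar is modeled as the set of adverb placements it allows.
# The labels and set contents are illustrative assumptions.
ENGLISH = frozenset({"S V O Adv"})
FRENCH = frozenset({"S V O Adv", "S V Adv O"})  # "S V Adv O" is the pattern in 6-28


def predict_transfer(l1: frozenset, l2: frozenset) -> str:
    """Apply the prediction stated in the text to an L1/L2 pair."""
    if l1 < l2:  # L2 is a proper superset: input shows the extra options
        return "little transfer"
    if l2 < l1:  # L2 is a proper subset: input never rules options out
        return "transfer"
    return "undetermined"  # no superset/subset relation to reason from


def not_excluded_by_input(l1: frozenset, l2: frozenset) -> frozenset:
    """L1 patterns that are absent from the L2 and that positive evidence cannot remove."""
    return l1 - l2


print(predict_transfer(ENGLISH, FRENCH))       # English speaker learning French
print(predict_transfer(FRENCH, ENGLISH))       # French speaker learning English
print(sorted(not_excluded_by_input(FRENCH, ENGLISH)))
```

For the French speaker learning English, the leftover pattern is exactly the one in 6-28: hearing English never shows that it is impossible, which is why negative evidence may be needed.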
6.4 Phonology
Another area where SLA and linguistics intersect is phonology. The study of L2 phonology is not unlike other areas of L2 acquisition in that it attempts to account for the patterns of knowledge and use of L2 learners, in this case of pronunciation and perception. It is commonly accepted that the native language origin of a second language speaker is often identifiable by his or her accent. In fact, nonnative speaker pronunciation is often the source of humor, as in the case of comedians mimicking particular accent types, or in cartoon characters adopting nonnative accents.
The acquisition of a second language phonology is a complex process. An understanding of how learners learn a new phonological system must take into account linguistic differences between the NL and the TL systems as well as universal facts of phonology. Phonology is both similar to and different from other linguistic domains. It is similar to what we have seen in other parts of language in that some of a learner’s pronunciation of the second language is clearly attributable to the NL, whereas some is not. It is different in that not all of the concepts relevant to syntax are applicable to phonology. For example, avoidance is a common L2 strategy used when a syntactic construction is recognizably beyond one’s reach. Thus, if a learner wants to avoid passives, it is relatively easy to find an alternative structure to express the same concept. However, if a learner wants to avoid the sound [ð], as in the in English, it would be virtually impossible. Phonology differs from syntax in that in the former, but not the latter, most people can detect the linguistic origin of a speaker (although see arguments in Ioup, 1984, relating to “syntactic accent”).
Table 6.1 Hierarchy of phonological difficulty

NL            TL
Ø             Obligatory    (difficult)
Ø             Optional
Optional      Obligatory
Obligatory    Optional
Obligatory    Ø
Optional      Ø
Optional      Optional
Obligatory    Obligatory    (easy)
Source: Adapted from R. Stockwell and J. Bowen. The Sounds of English and Spanish. Chicago: University of Chicago Press. Reprinted by permission.
As discussed in chapter 4, in its simplest form, the Contrastive Analysis Hypothesis did not make accurate predictions. It did not predict why speakers of language X learning language Y would have difficulty on a given structure, whereas speakers of language Y learning language X did not have difficulty on that same structure. These discrepancies were also evident in phonology. As an example, consider Stockwell and Bowen’s (1965) proposed hierarchy of difficulty (Table 6.1). The hierarchy (ordered from most difficult to least difficult) attempts to make predictions of difficulty based on whether or not phonological categories are absent or present and, if present, whether they are obligatory or optional. Thus, if a learner comes from a language that has no phonemic contrast between two sounds (e.g., /l/ and /r/) and is learning a language where that contrast is obligatory, she or he will have difficulty. However, if the first language and the target language both have the same contrast, there will be little difficulty in learning.
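The hierarchy amounts to a lookup from an NL/TL pairing to a rank, which a few lines of Python can make explicit. The ordering below is read off Table 6.1; the use of None for an absent category and the function name are assumptions made for this sketch.

```python
# Rows of Table 6.1 as (NL status, TL status), most difficult first.
# None stands for a category that is absent from the language.
HIERARCHY = [
    (None, "obligatory"),
    (None, "optional"),
    ("optional", "obligatory"),
    ("obligatory", "optional"),
    ("obligatory", None),
    ("optional", None),
    ("optional", "optional"),
    ("obligatory", "obligatory"),
]


def difficulty_rank(nl, tl) -> int:
    """Return the position in the hierarchy (1 = most difficult)."""
    return HIERARCHY.index((nl, tl)) + 1


# No contrast in the NL, obligatory contrast in the TL (the /l/ and /r/ case).
print(difficulty_rank(None, "obligatory"))
# The same obligatory contrast in both languages.
print(difficulty_rank("obligatory", "obligatory"))
```

The lookup returns only a relative rank; it says nothing about why a given pairing is hard, which is part of the criticism of the Contrastive Analysis Hypothesis raised in the text.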