Are generative grammarians abandoning innateness?
A recent blog post by Martin Haspelmath has the very Buzzfeed title: “Some (ex-)generative grammarians who are abandoning innateness”. The actual post then goes on to nuance this somewhat but Haspelmath still takes these individuals to be abandoning core tenets of generative grammar and asks:
“Are these linguists who are abandoning most of Chomsky’s programme from the 1960s through 1990s still “generative grammarians”, or are they ex-generative grammarians? How can they continue to work with the assumption of uniform building blocks, if these are not innate? I am as puzzled as I was back in 2018.”
I’ll try here to clear up some of this puzzlement. It’s a long post so …
tl;dr: yes, they are still generative grammarians; the reason they work with current theories is that they recognize that theoretical posits are placeholders for future understanding. Whether a particular phenomenon is explained by a language-specific innate property of the mind, or by a general cognitive capacity, or by a law of nature is something to find out, and working with current generative theories is a good way to do that.
First a few remarks about Innateness. There’s no argument that we bring innate capacities to bear when we learn languages. There is, however, a question about what these capacities are and whether they are specific to language. In a paper I wrote in response to Adele Goldberg many years ago, I distinguished three possibilities when thinking about syntax and innateness: (i) when a child is learning a language, every aspect of their innate cognitive capacities can be brought to bear in learning the syntax of their language, so that learning the syntax of a language is like learning other complex skills; (ii) cognition is innately structured so that only some aspects of the child’s innate cognitive capacities can be used for learning syntax, but these are all capacities that are used outside of language learning too; (iii) as well as the capacities used in (ii), there are some capacities that are not used elsewhere in cognition, or at least, if they are, their use there is derivative of language (counting comes to mind as a possibility). These are innate and unique to language.
Option (i) is, I think, a non-starter, for the reasons I discussed in my 2019 book, Language Unlimited (soon to be out in paperback I hear!).
The distinction between options (ii) and (iii) is, of course, Hauser, Chomsky and Fitch’s (2002) distinction between the Broad and Narrow Faculties of Language.
1980s-style generative grammar (say, Government and Binding Theory) developed theories that attributed quite a lot of content and structure to the Narrow Faculty of Language: D-structure, S-structure, C-command, government, X-bar Theory, Binding Theory, particular categories and features, and all the rest of what fascinated me as an undergraduate in the 1980s. I still fondly remember reading, in the staff-room of the Edinburgh Wimpy fast food restaurant, clad in my stripy red and white dungarees and cap, van Riemsdijk and Williams’ fascinating 1986 textbook, much to the perplexity of my co-workers!
Government and Binding Theory was, I think, empirically successful, at least in terms of enriching our understanding of the syntax of lots of different languages and, though I guess Haspelmath and I differ here (see our debate in Haspelmath 2021 and Adger 2021), in uncovering some deep theoretical generalizations about human language in general. The concepts the theory made available, and the questions it raised, led to both a broader and a deeper type of syntactic investigation. It wasn’t to everyone’s taste, and many people developed alternatives, which, as anyone who’s read my views on this knows, I consider to be all to the good. I was myself, while working in that Wimpy, a budding unification categorial grammarian, sceptical of GB, though intrigued by it.
Keeping to generative grammar internal criticisms and putting aside for the moment challenges from outside that field, two things militate against simply stopping with the successes of Government and Binding as a theory of syntax. One is methodological: we should assume that the organization of things in the world is simpler than it looks because that’s been a successful strategy for deepening understanding in the past; the other is more phenotypical: how did the Narrow Faculty of Language get so complex, given the apparently brief evolutionary time over which it appeared?
Minimalism: Hence the appearance, almost three decades ago, of Minimalism. I first encountered it through the 1992 MIT Occasional Papers version of Chomsky’s paper A Minimalist Program for Linguistic Theory. That paper attempted to reduce the complexity of what is in the Narrow Faculty of Language (forgive the anachronism) by removing D-Structure and S-structure, by creating a version of the Binding Theory that held of essentially semantic representations, and by arguing that restrictions on syntactic dependencies were a side effect of restrictions that are likely to hold of computational systems in general. It also proposed reducing cross-linguistic variation to what has to be learned anyway, the morphological properties of lexical items, following a much earlier idea of my colleague Hagit Borer’s. The approach within Minimalist syntax since then has generally followed this broad path, attempting to build a theory of the Narrow Faculty of Language that is much “reduced”.
Reduction: There are two types of reduction we see over the history of minimalism. The first is what leads Haspelmath to wonder whether Norbert Hornstein (!!!) is an ex-generative grammarian, and it involves seeking the explanation for aspects of syntax outside of the syntactic system itself. The second is, I think, what leads to his puzzlement.
Type 1 reduction: The first type concerns the description and explanation of phenomena. Haspelmath points to a paper by Julie Legate which argues that there is nothing in the Narrow Faculty of Language that constrains the way that agents and themes operate in passives. Legate concludes that the promotion of themes and the demotion of agents are independent factors, so that the Narrow Faculty of Language doesn’t have to constrain their interaction. This is good, from a minimalist point of view, as it means that what is posited as part of the Narrow Faculty of Language is reduced tout court.
A slightly different example is Amy-Rose Deal’s work on ergativity, also quoted by Haspelmath. In her 2016 paper, Deal argues for a syntactic analysis of a person split in Nez Perce case assignment (basically 1st and 2nd person get nominative as opposed to ergative case). To reconcile this with other work which accounts for the same pattern in other languages through morphological rules, as opposed to syntactic structure, Deal suggests that there is something extra-grammatical at the heart of person splits. This is, as she notes, a possibility in minimalist syntactic theory, and it again reduces what is posited as part of the Narrow Faculty of Language. In this case, though, unlike Legate’s conclusions about the passive, there is still something doing the job of constraining the typology; that something is simply outside of the Narrow Faculty of Language.
Aside: Personally, I think that an alternative, where both the morphology and syntax are a side effect of some deeper syntactic fact, is still worth considering and I’ve never understood why, under a functionalist account, we don’t find splits between first and second persons, but I’m no expert in this area, and Deal’s proposal is certainly not inconsistent with Minimalist syntax.
In any event, a quick glance at both Legate’s and Deal’s papers shows a fair amount of appeal to language-specific mechanisms of the kind Haspelmath finds puzzling, a point I return to below.
A final example of attempted reduction comes from my own work with Peter Svenonius on bound variable anaphora, where we are more explicit about how primitives that seem to be in the Narrow Faculty of Language may actually be due to what Chomsky in his 2005 paper calls Third Factor properties: these Third Factor properties are not part of the Narrow Faculty of Language nor are they attributable to the information in the data a child learning their language is exposed to. In our 2015 paper, Peter and I proposed a minimalist system to capture the constraints on when a pronoun can, and cannot, be bound by a quantifier, and we then subjected the various primitives of that system to the question of what language-external systems might be responsible for them. We suggested that, for example, the notion of “phase” in syntax could be seen as an instantiation of the periodicity of computational systems more generally (Strogatz and Stewart 1993). We also suggested that the notion of spellout of syntactic copies could be connected to general cognitive mechanisms for keeping track of single objects in different temporal or spatial locations (Leslie et al. 1998). The idea is that these properties are part of the Broad Faculty of Language but are obligatorily co-opted by the Narrow one. On this perspective the Narrow Faculty of Language is basically a specification of which cognitive capacities are co-opted by language and how they are co-opted.
Type 2 reduction: Now to the second type of reduction. Imagine we have a phenomenon that appears to require a great deal of rich structure in the Narrow Faculty, say the features responsible for pronominal systems in languages of the world. In such a case we can try to reduce the amount of structure and content of the Narrow Faculty of Language by improving the theory itself. This means making the primitives of the theory fewer and more abstract, but with wider empirical reach. This is the point made by Kayne in the quote Haspelmath gives:
“Cross-linguistically valid primitive syntactic notions will almost certainly turn out to be much finer-grained than any that Haspelmath had in mind.” (Kayne 2013: 136, n. 20)
My go-to example here is my colleague Daniel Harbour’s theory of person and number features. Harbour develops an approach to explaining the typology of person-number systems in pronouns that reduces the rich range of types of system to the interaction of three features. His crucial insight is that these features are functions which can take other features as their arguments. This allows him to derive the empirical effects of the highly structured feature geometries that had been used by researchers like Ritter, Harley, Cowper and others to capture pronoun typologies. His system is much sparser in what it posits, but it has the same (in fact, he argues, better) empirical coverage.
Another example is Chomsky’s reduction of movement and phrase structure to a single mechanism. Within GB and early Minimalism the two were always assumed to be distinct aspects of syntax and so the theory claimed that the Narrow Faculty of Language included two distinct operations. Chomsky’s 2004 simplification of Merge in his paper Beyond Explanatory Adequacy (apologies, can’t find a link) meant that the same mechanism was responsible for both types of structure, so that the specification of what is in the Narrow Faculty is reduced.
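For readers who like to see this kind of unification concretely, here is a minimal sketch in Python. It is a toy of my own devising for this post, not Chomsky’s formalism: the names merge, contains and merge_kind are hypothetical, lexical items are modelled as plain strings (so two occurrences of the same word count as the same object, which is a simplification), and labels, features and linear order are ignored entirely. The point it tries to illustrate is just that one operation builds both kinds of structure; the only difference is where the second argument comes from.

```python
def merge(a, b):
    """Combine two syntactic objects into the unordered set {a, b}."""
    return frozenset({a, b})


def contains(whole, part):
    """True if `part` is `whole` itself or occurs somewhere inside it."""
    if whole == part:
        return True
    if isinstance(whole, frozenset):
        return any(contains(member, part) for member in whole)
    return False


def merge_kind(a, b):
    """Classify an application of merge (illustrative labels only).

    'internal' (movement-like): one argument is already inside the other.
    'external' (phrase-structure-like): the two arguments are independent.
    """
    if contains(a, b) or contains(b, a):
        return "internal"
    return "external"


# Ordinary structure building: three independent items combined step by step.
vp = merge("see", "what")
tp = merge("you", vp)

# A movement-like step: "what" is already inside tp, and the very same
# merge function applies to it again.
cp = merge("what", tp)
```

Nothing in merge itself knows about movement: merge_kind("see", "what") comes out as external and merge_kind("what", tp) as internal, but both structures are built by the same function.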
This second approach to reduction doesn’t remove richness by attributing it to other aspects of cognition; rather, it improves the theory itself. It leads inevitably to high degrees of abstraction. The complexity of the empirical phenomena is argued to arise from the behaviour of simple elements interacting. Crucially, these simple elements do not directly correspond to aspects of the observable phenomena. To the extent we see the same elements involved in the explanation of very distinct phenomena, we have a kind of explanation that relies on abstraction.
Abstraction: One thing I’ve noticed in my (fun but sometimes frustrating) Twitter conversations with Martin Haspelmath, Adele Goldberg and others is an argument that goes as follows:
“…but look at all this ridiculous stuff you put in your trees. Just no!”
I think of this as the Argument from Allergy to Abstraction: the trees are too complex in structure, they have too many things in them that seem to come from nowhere, and far too many null things. It’s not reasonable to think that all that complexity is built into syntax.
But abstraction is a valuable mode of explanation.
Trees are, of course, not the primitives of the theory. A complex tree can be built out of simple things. As I pointed out in Language Unlimited, the fractal shapes of a Romanesco cauliflower can be given by a simple equation. The whole point of generative grammars as a formalism is that they generate unbounded structures from some very minimal units and modes of combination. That’s what makes them good models for syntax. Simpler syntax (to steal a phrase) is not about making the trees simple, it’s about making the system that generates them simple. Syntax is the system, not the output.
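To make the system-versus-output point concrete, here is a toy sketch in Python. Everything in it is an assumption for illustration: the function names build and size are made up, the words are arbitrary, and the trees are bare nested pairs with no categories or features, so this is in no way a serious grammar. It only shows that a fixed, tiny system can produce trees of any size.

```python
def build(depth):
    """Return a nested-pair tree with `depth` levels of clausal embedding."""
    tree = ("it", "rains")
    for _ in range(depth):
        # The same step is reapplied each time: embed the clause built so far.
        tree = ("Anna", ("thinks", ("that", tree)))
    return tree


def size(tree):
    """Count the nodes in a tree: every word and every pair counts as one."""
    if isinstance(tree, tuple):
        return 1 + sum(size(child) for child in tree)
    return 1
```

Calling build(0) gives a three-node tree, and each further level of embedding adds six more nodes, so build(50) is a very large tree indeed. The definition of build, though, never gets any bigger, and that is where the simplicity lives.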
Let’s assume that the trees aren’t the issue then. What seems to exercise Haspelmath is the categories and operations, and this is the question we started with.
Haspelmath’s Puzzlement: Haspelmath is puzzled by why generative syntacticians, even those like Legate and Deal who have rejected the innateness of certain properties by following the first type of reduction, quite happily pepper their analyses with categories like DP and vP, and with dotted lines expressing movement or long distance Agree dependencies. What justifies these? The only justification Haspelmath sees for using such categories and relations in syntactic analysis is that they are innate. But those same individuals seem to have a cavalier attitude to innateness in general.
Placeholders for a better understanding: I’ve struggled for a while to make sense of Haspelmath’s worry here. I think it comes down to how abstraction requires you to be comfortable with theoretical uncertainty and change. If your categories are abstract, they are grounded by just their explanatory capacity, and since understanding is always changing, you’d better be prepared for your abstract categories to change too.
From my perspective (and I think the perspective of most if not all generative theoretical syntacticians), DP, vP, Agree and all the rest are then very likely to be placeholders for a better future theory. The analyses that use them are not meant to be the final word. They are stepping stones which have taken us further than we were, but there’s a long way to go still.
Perhaps vP (or D, or even Merge) will indeed end up being a necessary component of the Narrow Faculty of Language. Perhaps, though, it will end up being dissolved into a number of yet more abstract primitives. Perhaps it will end up being the interaction of some third factor property with the syntax. Perhaps it will just be plain wrong, to be replaced by a totally different view of categories, just as phrase structure rules have vanished from Minimalism. It doesn’t matter right now though. It serves as an anchor for the analysis, a stable point that the relevant generalizations can hook on to, and a crystallization of a set of claims that can be challenged. We hope, of course, that our theoretical posits are the right ones, but realistically they’re surely not.
This is the point of theory: it gives us a momentary platform which we can use to find the next, and then the next steps, or which we can dismantle because it turned out to be wrong. It improves understanding, but gives no final answer (at least not at the stage we are at in linguistics). I think generativists are generally comfortable with that kind of approach to theory because a major mode of explanation relies on abstract theory. The concepts we work with are good enough to enhance our understanding, and drive new empirical discoveries, and open up new questions to be answered, new theories to be developed.
Haspelmath asks how generativists can “continue to work with the assumption of uniform building blocks, if these are not innate”.
The answer is, as Gillian Ramchand, José-Luis Mendívil-Giró, Radek Šimík, and Dennis Ott all said in one way or another in the quotes in Haspelmath’s blog: we don’t really care about DPs or vPs or whatever being innate. The theory (or family of theories) we use is our best guess, but we’re happy when what we thought had to be in the Narrow Faculty of Language can be explained by some third factor (as long as that explanation is at least as empirically adequate as the one we had before).
My answer to the question “Are generative grammarians abandoning innateness?” is: we’ve continually been abandoning (and adopting, and abandoning again) particular suggestions for what is in, or is not in, the Narrow Faculty of Language. That’s the nature of the field and it’s always been so.
Chomsky, back in 1980, writes this in response to the philosopher Hilary Putnam, who is complaining about Chomsky’s “Innateness Hypothesis”:
“For just this reason I have never used the phrase “the innateness hypothesis” in putting forth my views, nor am I committed to any particular version of whatever Putnam has in mind in using this phrase (which, to my knowledge, is his and his alone) as a point of doctrine. As a general principle, I am committed only to the “open-mindedness hypothesis” with regard to the genetically determined initial state for language learning (call it S0), and I am committed to particular explanatory hypotheses about S0 to the extent that they seem credible and empirically supported.”
This has always been the attitude of generative linguists: we adopt particular hypotheses as long as they do explanatory work for us, and if it seems they have no explanation outside of the Narrow Faculty of Language (what Chomsky calls S0 in the quote), that’s where we put them. If they are superseded by alternative hypotheses that are Third Factor, that’s all to the good. This is why the individuals Haspelmath mentions in his post are not ex-generativists, and it’s why they work with those theoretical ideas which seem to them to be “credible and empirically supported”.