UofA

Spring 2004

Ling/Phil 596D: Topics in Linguistics and Philosophy

Heidi Harley and Massimo Piattelli-Palmarini

Compositionality

 

Wednesday January 14

Handout 1 (M. Piattelli-Palmarini)

 

Introductory remarks: Compositionality, the very idea

 

Let's start with the so-called Frege's Principle. (A vast collection of quotations from authors using it, and variously rephrasing it, can be found on a website maintained by Francis Jeffry Pelletier at the University of Alberta: http://www.ualberta.ca/~jeffp/FregeQuotes.html. See also his interesting paper, and other papers, in the special issue on compositionality of the Journal of Logic, Language and Information, 2001, Vol. 10, issue 1.) For the sake of clarity, let's start with a very strong version, one nobody wants to endorse:

 

Very Strong Compositionality Principle (VSCP): The meaning of a compound expression is exhaustively determined by the meanings of its components, plus their mode of composition.

 

This is too strong for several reasons:

 

(a)   "Exhaustively determined" is too strong. Indexicals, pronouns and quantifiers, for instance, are allowed to pick their referents contextually, even within a reasonably strong version of compositional semantics.

(b)  Some components may well have no meaning by themselves. For instance, contiguous sentential components that are not constituents (if the, was my) and arbitrary "split" components (John … her …) have no semantic value, and do not contribute as such to the semantic value of a sentence that contains them. Some notion of "canonical" or "standard" component must be introduced. The proviso "plus their mode of composition" can filter away such perverse choices, but only if we interpret "plus" as not being just plus. This is our next point.

(c)   It is not always the case that we are able to specify the meaning of all canonical components (phrasal constituents), and then (or anyway separately) examine their (syntactic) mode of composition. Notorious examples to the contrary (illustrations of the need for a simultaneous morpho-lexical-syntactic decomposition) are selectional restrictions, idioms, anaphors, traces, and cases of ellipsis.

 

Let's amend VSCP and move to a (defensibly) strong principle:

 

Strong Compositionality Principle (SCP): The meaning of a compound expression is systematically derivable from the meaning of its proper constituents, given the syntactic architecture of the expression.
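What SCP claims can be sketched very concretely. The following toy fragment is my own illustration, not any author's formalism: a tiny lexicon and a single mode of composition (function application), with meanings computed bottom-up from the syntactic tree and nothing else.

```python
# A minimal sketch of SCP (toy lexicon and toy model, invented for
# illustration): the meaning of a compound is derived from the meanings of
# its proper constituents, guided only by the syntactic structure.

# Lexicon: word -> meaning. Names denote individuals; predicates denote
# functions (ultimately to truth values).
LEXICON = {
    "John":   "john",
    "Mary":   "mary",
    "sleeps": lambda x: x in {"john"},          # only John sleeps in this model
    "likes":  lambda y: lambda x: (x, y) in {("john", "mary")},
}

def meaning(tree):
    """Meaning of a tree = a function of the meanings of its immediate parts.
    A leaf looks up the lexicon; a branching node composes its daughters."""
    if isinstance(tree, str):                   # lexical item
        return LEXICON[tree]
    left, right = (meaning(t) for t in tree)    # meanings of the two constituents
    # Mode of composition: whichever daughter is a function applies to the other.
    return left(right) if callable(left) else right(left)

# [S [NP John] [VP sleeps]]
print(meaning(("John", "sleeps")))              # True
# [S [NP John] [VP [V likes] [NP Mary]]]
print(meaning(("John", ("likes", "Mary"))))     # True
print(meaning(("Mary", ("likes", "John"))))     # False
```

Note that the interpreter never consults context, world knowledge, or the sentence as a whole: given the constituents' meanings and the tree, the compound's meaning is fixed. The weakenings below relax exactly this.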

 

Why this is still a strong (though, I think, defensible) principle is best seen by examining several conceivable ways of weakening it, while still remaining within a theory of semantics that can pass for compositional.

 

(1)  Weakening systematicity and derivability. The speaker-hearer's tacit knowledge of syntax is supplemented with parsing strategies, rules of use, pieces of guesswork (about the speaker's desires and intentions), pragmatic know-how, and relevant items of knowledge of the situation and of the world at large. We obtain the following weaker version: A Pragmatically Weaker Compositionality Principle (PWCP): The meaning of a compound expression is constrained by the meaning of its proper constituents, given the syntactic architecture of the expression. The term "constrained" signals that meaning is neither exhausted nor strictly determined. Such versions (recent history teaches us) usually deny the competence/performance distinction. (For a thesis close to this weaker version, see (Fodor, 2003).)

 

(2)  Weakening the role of syntax. Constituents in a canonical order may suffice to assign thematic roles (see (Grodzinsky, 2000; Townsend and Bever, 2001)), or the meaning of the whole expression is primary (à la earlier Frege), and we have strong contextualism. We obtain the following weaker version: A Contextually Weaker Compositionality Principle (CWCP): The meaning of a compound expression is derivable from the meaning that its proper constituents have in the expression. [Heidi's comment: Isn't this version circular? How can one determine the meaning a constituent has in an expression and then determine the meaning of the expression? Can one determine the meaning a constituent has in an expression without knowing the meaning of the expression? Actually, now that I say it, I think I understand: I suppose one can, as long as 'meaning in an expression' is purely form-triggered, which I suppose is what they intend. MPP: Not so sure. This is a local (sentence-relative) criterion, and maintains that the sentence as a whole is primary, and the meaning of its constituents derived. There are such meanings, and they do contribute to the meaning of the sentence, but they cannot be derived antecedently. Mathematico-logical expressions are the best example: dF(x) = G(x)dx (Frege's own example) shows that the differentials are first interpreted in the expression as a whole, and only subsequently interpreted in isolation. In natural languages, I think, this amounts to the thesis that the meaning of the sentence as a whole has to be analyzed first (think of sentence-final "and I did too", or (remember the song?) "You are every thing and every thing is you").]

 

(3)  Weakening the parts-whole relations altogether. Various proposals by Montague, Dummett, Pelletier, Hodges and others (see the special issue on compositionality of the Journal of Logic, Language and Information, 2001, Vol. 10, no. 1). The more generic thesis is that the meaning of the compound expression is a function of the meanings of its canonical components, possibly a different function for different types of composites. Functionally Weaker Compositionality Principle (FWCP): The meaning of a compound expression is some function of the meanings of its proper constituents. [A finite number of different functions for a finite repertoire of different sentence-types.]

 

Arguments for compositionality

 

(1) Substitutivity  

This is a centerpiece of Frege's Ueber Sinn und Bedeutung (On Sense and Reference) (Frege, 1892/1972, 1948). (For an exacting formal treatment, and a critique, see (Hodges, 2001); for an accurate historical analysis, see (Janssen, 2001); for a later revision, see (Carnap, 1947).) Intuitively, focusing on natural languages, the core fact is that often (though not always, see infra) canonical components of the same class (DPs, VPs, APs) can be substituted inside composite expressions, not only preserving meaningfulness, but modifying meaning predictably and systematically. The sameness/difference of the contribution of the parts to the meaning of the composite expression is subject to equivalence classes, and this can be rendered formally with a relation of congruence (see Hodges, 2001).

 

The difference in meaning between The professor met Sally and The dentist met Sally is exhausted by the difference between the meanings of The professor and The dentist. The same applies to the difference in meaning between John sold a car and John washed a car, and between She is a blond pianist and She is a German pianist. Systematicity is the key notion here. The exhaustiveness of the contribution of the meanings of the components to the difference in the meanings of the compound is an important, but by no means universal, factor. It does not apply, for instance, to notorious cases such as

John is easy to please.

John is eager to please.

Moreover, the substitutivity of same-class constituents has notorious exceptions, such as:

They ate a whole chicken.

*They dined a whole chicken.

[Heidi's comment: but perhaps it's incorrect to call 'eat' and 'dine' members of the same class, since 'dine' does not in fact occur in the same syntactic frame as 'eat' (V-DP), but rather in a V-PP frame -- how do we determine 'same class'? To take a more extreme example, is the non-substitutability of 'appeal' for 'eat' an instance of a notorious exception?

they ate a sandwich

*they appealed a sandwich

MPP: I was trying to be "naïve" here (eat-dine). Of course, if one introduces fine syntactic machinery, the criteria for substitutivity become more stringent.]

But the pervasiveness of substitutivity (with systematically predictable effects on meaning) in countless other cases indeed makes compositionality central to the semantics of natural languages.
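The substitutivity intuition can be made concrete in a toy model (all denotations below are invented for illustration; this is an informal gloss on Hodges' congruence, not his formal apparatus): swapping same-class DPs inside a frame always yields a meaningful sentence, and the effect on truth value is carried entirely by the denotation of the swapped part.

```python
# A toy illustration of substitutivity: same-class DPs substitute inside the
# frame '<DP> met Sally', and the sentence meaning tracks only the swapped
# part's denotation. Model and denotations are invented for this sketch.

MET_SALLY = {"the_prof", "the_dentist"}        # extension of "met Sally"

DENOTATION = {
    "the professor": "the_prof",
    "the dentist":   "the_dentist",
    "the student":   "the_student",            # did not meet Sally in this model
    "the chair":     "the_prof",               # co-denoting with "the professor"
}

def met_sally(subject_dp):
    """Meaning of '<DP> met Sally', composed from the DP and VP meanings."""
    return DENOTATION[subject_dp] in MET_SALLY

# Substitution preserves meaningfulness; the truth value changes predictably,
# and only via the denotation of the substituted constituent:
print(met_sally("the professor"))   # True
print(met_sally("the student"))     # False
# Co-denoting DPs are congruent: substitution preserves the sentence meaning.
print(met_sally("the professor") == met_sally("the chair"))   # True
```

The eat/dine cases above are precisely what this sketch idealizes away: it presupposes that we already know which items count as "same class" for a given frame.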

 

(2) Generativity

Under a different wording (creativity), this was forcefully pointed out by Frege in works published posthumously.

 

"…a sentence consists of parts, which must somehow contribute to the expression of the sense of the sentence, so they themselves must somehow have a sense. [. . . ] The possibility for us to understand sentences which we have never heard before, is evidently based on this, that we construct the sense of a sentence from parts, which correspond to the words." (Frege, Letter to Jourdain, 1914/1976, cited in (Janssen, 2001))

 

"It is wonderful what language can achieve. With a few sounds and combinations of sounds it is capable of expressing a huge number of thoughts, and in particular also thoughts which have never before been grasped or expressed by any man. What makes these achievements possible? The fact that thoughts are built up from building blocks of thoughts. And these building blocks correspond to groups of sounds, out of which the sentence expressing the thought is built up, so that the construction of the sentence out of the parts of the sentence corresponds to the construction of a thought out of parts of thoughts. And we may call the part of the thought the sense of that part of the sentence which corresponds to it, in the same way as a thought can be conceived of as the sense of the sentence." (Frege, Logik in der Mathematik, 1914/1969, also cited in (Janssen, 2001))

 

To a modern ear, this is basically also Chomsky's argument (inspired by the work of von Humboldt) of the infinite use of finite means. Without compositionality, the capacity of any speaker-hearer to produce and understand a potential infinity of novel sentences would remain inexplicable. This will also enter with force in the Hume-Fodor treatment (see next week's handout).

 

(3) Disambiguation and transparency

The emergence of Logical Form in Generative Grammar as a distinct level of representation-derivation (for an early systematic treatment, see (May, 1985)) was largely determined by the desirability of postulating that, at some level, all interpretive relations (notably quantifier-variable binding, quantifier scope, co-reference, and the placement of empty categories) become visible (i.e. fully explicit) to the mind. At LF, there are no more ambiguities, no more implicit components, no more tacit co-indexing. All is laid bare for the mind to "see". The compositionality of natural languages and the characteristics of LF are intimately intertwined. Indeed, the very existence of LF follows directly from a strong version of compositionality for natural languages. Relaxing strict compositionality would entail questioning the very existence of LF. (What if there is no linguistic level at which all components are made explicit? I am indebted to Jerry Fodor for pointing out this inter-dependency to me very clearly.)

The argument for compositionality from disambiguation and transparency goes (approximately) like this: Some sentences are ambiguous (allow for more than one interpretation). But there are syntactic transformations of those sentences that generate selectively synonymous expressions which eliminate ambiguity (heavy NP-shift, extraction, cliticization, pronominalization, there-insertion, etc.). They are characteristically synonymous under one, but not under the other, interpretation.

 

Every man loves a woman is a notorious example. One interpretation, and one only, is synonymous with

There is a woman, such that every man loves her.

The other interpretation, and the other only, is synonymous with

For every man, there is a woman (possibly different for each man) such that he loves her.

These syntactic transformations, indeed, bring us a step closer to two canonical, distinct logical forms. This is not a fortuitous coincidence. We have two sentences that "just happen to sound alike" (an expression I borrow from Jim Higginbotham). They sound identical only when taken at face value (at the level of SS, or surface structure, in an older terminology), but not when analyzed at a deeper level (LF). At that level there is no ambiguity any more; everything is laid bare.
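The two readings can be checked against a small model. The sketch below (model and names invented by me for illustration) computes both logical forms of Every man loves a woman and shows that they come apart: in this model every man loves some woman, but no single woman is loved by every man.

```python
# Two unambiguous LFs for the single ambiguous string "Every man loves a
# woman", evaluated in a toy model (individuals and the LOVES relation are
# invented for this illustration).

MEN   = {"al", "bob"}
WOMEN = {"cora", "dana"}
LOVES = {("al", "cora"), ("bob", "dana")}   # each man loves a different woman

# LF 1 (every > a): for every man there is some (possibly different) woman
# such that he loves her.
wide_every = all(any((m, w) in LOVES for w in WOMEN) for m in MEN)

# LF 2 (a > every): there is one woman such that every man loves her.
wide_some = any(all((m, w) in LOVES for m in MEN) for w in WOMEN)

print(wide_every)   # True in this model
print(wide_some)    # False: no single woman is loved by all the men
```

The ambiguity lives only in the surface string; each LF, once fixed, evaluates to exactly one truth value in any model.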

 

[Heidi's comment: Even the kind of example that doesn't invoke LF or transformations at all -- Mary saw the man with the telescope -- but merely constituent structure, can be useful in beginning to make the argument that string-ambiguity is only an illusion; strong compositionality predicts that when considered structurally, all sentences have one and only one interpretation.]

 

The argument is now ready to snap shut: Since thoughts cannot be ambiguous, and the meaning of a sentence is a thought, something must happen before the expression is "delivered" for interpretation to the mind (to the conceptual-intentional system). What actually happens is the derivation of one, and only one, logical form. In the case of ambiguous sentences, which one is for the hearer to decide freely (maybe under the influence of his/her knowledge of context, of some piece of guesswork, etc.), but it can only be one or the other. The selectivity of the syntactic synonyms reveals this forced choice. But, of course, it cannot be that only ambiguous sentences undergo this derivation. Every sentence has a logical form, and this is where strict compositionality fully shines through.

 

(4) Extractability (also due to Frege (1896/1976), under a different wording; see (Janssen, 2001))

Again intuitively: A given proper constituent often (though not always, see supra) makes the same contribution to the meaning of different composite expressions. We have to conclude, therefore, that it has a meaning by itself (on its own, für sich: "Er muss … für sich eine Bedeutung haben"), regardless of the composite meaning of those expressions, and of the meanings of the other parts that are present in those expressions.

 

(5) Knowledge of language versus knowledge in general

The computational apparatus that governs natural languages has remarkable properties of automatism and "bullheadedness". Once activated, it cannot but proceed all the way up (i.e. up to the derivation of logical form). The semantics of natural languages has features of modularity.

 

A remarkable example is the automatic processing of syntactically well-formed pseudo-sentences, composed of nonsense open-class words, but with all real morphemes and real closed-class words in the correct positions. (The best known example in English is Lewis Carroll's Jabberwocky. The Florentine writer and anthropologist Fosco Maraini has written equivalent poems in Italian. No doubt the process can be reproduced in any language, with the same results.)

 

             'Twas brillig, and the slithy toves

                   Did gyre and gimble in the wabe:

                   All mimsy were the borogoves,

                   And the mome raths outgrabe.

 

Let's now ask: Where did the slithy toves gyre and gimble? What was all mimsy?

There are very straightforward answers to such questions. We cannot refrain from computing the obvious answers. The machinery derives bullheadedly the canonical constituents and the thematic roles, and also inflection and tense, and maybe more, depending on the input.

(A. Moro, S. Cappa et al. have verified, on the basis of fMRI imaging data, that well-formed pseudo-sentences like these activate brain areas involved in syntactic comprehension, something that non-well-formed strings of pseudo-words fail to do). 

 

The automatic, mandatory derivation of basic logical-form dependencies even in such anomalous cases is strong evidence for compositionality. The linguistic apparatus cannot refrain from "composing", syntactically and semantically.

 

Coda

Let's, therefore, take the distinction between knowledge of language and general knowledge seriously. In other words, let's acknowledge the autonomy of syntax, the competence/performance distinction, and the separation between lexical meanings and encyclopedic knowledge. The interpretation of linguistic expressions is a complex and multi-faceted process, but we have reasons to believe that it is made up of separate components. The semantics of natural languages can legitimately, and productively, be restricted to the compositional component of the interpretive process. Other interpretive processes simply "don't belong".

 

Clear and uncontroversial cases are, for instance: Yesterday he opened the window. This means what it means, and it is happily left to pragmatics to decide who that male person, known to the speaker and hearer, actually is, on which calendar date this feat actually happened, and which window (also known to the speaker and the hearer) was the object of the action of opening. (We will examine, however, interesting presumed challenges to this "division of labor" later on in this class.)

 

We want to keep tacit knowledge of language separate from other cognitive domains and skills, notably those that guide our guessing of people's feelings, motivations and purposes, our social know-how, our general knowledge of things in the world. We do not expect (indeed, we do not want) this semantic theory to explain, for instance, when and why something has been said ironically, or out of malevolence, nor to trace a boundary between meanings that are acceptable in the light of, say, physical laws, and those that are not.

 

There is a risk of circularity here that we have to avoid: We do not want to define the domain of relevant meanings (those that fall under this version of semantics) as restricted to the speaker-hearer's knowledge of language, and then define knowledge of language as that knowledge which halts at the boundaries of these meanings. We claim that this partition is not arbitrary, that it cuts the domain of interpretation at its natural joints. We will have a lot more to say about this in the following weeks, but let's consider here and now some comforting facts:

 

(a)   We have good theories of syntactic knowledge that sharply demarcate it from general (encyclopedic) knowledge. Therefore, interpretive processes (semantic derivations) that are exhaustively and automatically determined right at the interface with syntax are ipso facto guaranteed to be distinct from general knowledge.

 

(b)   There are many items of information that we may care to know about, but that cannot find their place within the lexical meanings, and/or that cannot be accommodated by the syntax of a single sentence. These, therefore, lie beyond the purview of this semantic theory (for instance, the texture of the objects appearing in the sentence, the normal versus exceptional nature of the action conveyed by the verb, the amount of effort it requires, whether it's pleasant or unpleasant, the likely consequences of its occurrence, etc.). This optional additional information can only be added by means of adjunctions, or by the insertion of more sentences. The syntax-semantics of the single sentence cannot accommodate it. This brings us to the next point:

 

(c)   We have to make a principled distinction between intra-sentential compositionality and inter-sentential (discourse-related) compositionality. Reference and meaning may well remain fixed across sentences (And then he…, because it was…, but that was what…) almost ad infinitum. Obviously, sentences also do "compose" in a text, and in discourse. However, the special character of intra-sentential compositionality is to be found in the extremely limited availability of nodes and "slots" (thematic roles) that can assign meanings, and in the rigidity of the kinds of meanings that they assign (the paramount importance of stressing this datum is especially linked to the work of Kenneth Hale and Jay Keyser (Hale and Keyser, 1993, 2002)). We will hear a lot more about this from Heidi.

 

Conclusions (for the time being):

To put it bluntly, as far as linguistic meanings are concerned, the cognitive apparatus at large adds nothing of its own. It "receives" the meaning of a linguistic expression for what it is, as delivered by the linguistic apparatus, and then, depending on a host of other factors, decides how to integrate it, if needed (although there are doubtless some rigid rules governing this process as well, for instance in terms of pronominal reference resolution -- even at this level, integration of meaning is not infinitely contextually malleable), what to do with it, to which use this meaning is to be put. The assumption of strict compositionality for natural languages is central to this style of doing semantics. It is far from being a truth of reason, and (as we will see) far from being unanimously accepted. It is surely conceivable, in the abstract, that intelligent beings not too dissimilar from ourselves might have adopted as their natural language a huge host of conventional non-compositional messages (one if by land, two if by sea). But we are not such creatures. There is nothing of interest to be gained (pace the semioticians) from inserting natural languages into the league of symbolic communications at large. No communication that is not strictly compositional has anything to tell us about the semantics of natural languages.

The compositionality of natural languages is both a blessing and a curse. A blessing, because (as we shall amply hear from Heidi) it allows us to give body and substance, and intelligibility, to a rich, subtle analysis of linguistic meanings as intimately connected with syntax. A curse, because it is sometimes a narrow straitjacket for the speaker (UG prevents us from saying things we would want to say), and for the theoretician (it requires ingenuity, and serious effort, to explain why certain meanings indeed are strictly determined by the derivation of LF, without anything being added on its own by the interpretive apparatus -- see Hornstein and Uriagereka, 2003 and Piattelli-Palmarini and Uriagereka, 2003 for the thorny case of binary quantifiers, such as most). It would also help, sometimes, to open the boundary between lexical meanings and encyclopedic knowledge (Ernest LePore pointed out to me that it's hard for a strict lexicalist to explain why we all understand that a smoker is someone who smokes regularly, while a killer is such even if he/she has only killed once in a lifetime [and that a singer is likely someone who earns a living by being one]).

The analysis of this mixed blessing will occupy us for the weeks to come.

 

 

Selected references:

Special issue on compositionality, Journal of Logic, Language and Information, 2001, Vol. 10, no. 1.

Carnap, R. (1947). Meaning and Necessity. Chicago, IL: The University of Chicago Press.

Fodor, J. A. (2003). Hume Variations. Oxford: Clarendon Press/Oxford University Press.

Frege, G. (1892/1972). On Sense and Reference. In Davidson, D. & G. Harman (Eds.), The Logic of Grammar (pp. 116-128). Encino, CA: Dickerson.

Frege, G. (1948). On Sense and Reference. The Philosophical Review 57, pp. 207-230. [Original reference: (1892) Ueber Sinn und Bedeutung, Zeitschrift für Philosophie und philosophische Kritik, Vol. 100, pp. 25-50. Reprinted in Peter Ludlow (Ed.) (1997), Readings in the Philosophy of Language, Cambridge, MA: The MIT Press, pp. 563-583.]

Grodzinsky, Y. (2000). The neurology of syntax: Language use without Broca's area. Behavioral and Brain Sciences 23, pp. 1-71.

Hale, K., & S. J. Keyser. (1993). On argument structure and the lexical representation of semantic relations. In Keyser, S. J. & K. Hale (Eds.), The View from Building 20. Cambridge, MA: The MIT Press.

Hale, K., & S. J. Keyser. (2002). Prolegomenon to a Theory of Argument Structure. Cambridge, MA: The MIT Press.

Hodges, W. (2001). Formal features of compositionality. Journal of Logic, Language and Information 10(1), pp. 7-28.

Janssen, T. M. V. (2001). Frege, contextuality and compositionality. Journal of Logic, Language and Information 10(1), pp. 115-136.

May, R. (1985). Logical Form: Its Structure and Derivation. Cambridge, MA: The MIT Press.

Townsend, D. J., & T. G. Bever. (2001). Sentence Comprehension: The Integration of Habits and Rules. Cambridge, MA: Bradford Books/The MIT Press.