Hyper-Evaluativity
Jim Pryor
NYU
Rough Draft 2 - 11 June 2010
Feedback very welcome.

Abstract: Predicates are "hyper-evaluative" when they depend not just on the semantic values (be they intensional or more fine-grained) of their individual arguments, but also on the way those arguments are "coordinated" or "wired." I examine motivations and semantic implementations for such predicates, drawing from linguistics and computer science.

Section 1
---------

It's hard to do epistemology while staying resolutely non-Fregean. Suppose John believes and accepts:

(1) Cicero praised Tully.

Why then does he refrain from inferring:

(2) Someone praised himself.

Can we explain John's restraint, without bringing in some kind of Fregean machinery? Can we explain how his restraint may be rational? Presumably we'll also want to distinguish between the epistemic situation John is in, and a situation that justifies him in accepting:

(3) Tully praised Cicero.

But it's not easy to do this if we count (1) and (3) as the same belief.

Let's not fuss right now with the semantics of attitude *reports*. My concern is with how fine-grained the representational systems we think with themselves have to be. Some philosophers, like Nathan Salmon, are resolutely non-Fregean about the former but have much in common with Fregeans about the latter. (There are also important disagreements.) Doesn't it look like *anyone* will have to go *somewhat* in that direction? Won't we have to be somewhat Fregean about the representational systems we think with, to explain why subjects who accept (1) don't immediately also accept (3) and (2)?

I'll argue that the machineries usually associated with "Fregeanism" are the wrong tools for prising (1) apart from (3) and (2). I won't get too specific about what counts as "Fregeanism." On some definitions, the machinery *I* want to sell you might itself be counted as a heterodox form of it. On the other hand, my machinery is one that self-labeled Russellians/Millians have invoked in good conscience. So whether it should be counted as Fregean or as Russellian is a job for subtle lexicography. I'm disposed to think it's neither.

To bring out the limits of orthodox Fregeanism with respect to (1)-(3), let's set aside for the moment what it's *rational* for John to believe. Instead think about what it could be *intelligible* from John's perspective to believe---even when the beliefs so had *may fail* to be rational. So, John is thinking about how talented various of his students are. He judges:

(4) Alice is smarter than Betty, and also smarter than ... and also smarter than Alice, and ...

Whoops, he lost track of who he was comparing there. So he made a mistake. Now I'm supposing he really did think something that, in part, involved Alice being smarter than Alice; it wasn't a case where he had Alyssa in mind but just said "Alice" by mistake. That can happen too, but I'm supposing the former is what happened here. In thinking (4) John may well be exhibiting a momentary cognitive defect. But it does seem he could think it. At the same time, he hasn't completely lost his senses. He's not exhibiting the kind of irrationality he'd need to judge:

(5) Alice is smarter than Betty, and also smarter than ... and also smarter than herself, and ...

I trust this characterization of John's thinking sounds intuitively natural, even if it's not clear yet what the difference could be between judging (4) and judging (5). We have here a difference in cognitive significance, and we'd like to know how to think about it.
It may be that a sober, fully rational version of John wouldn't hear either (4) or (5) as more informative than the other. But they differ in cognitive significance for John, *as he is now*---and in theorizing about the mind it's essential we think about and have good models for cases where we fall short of being fully rational, too. So, questions of John's *rationality* aside, how can we even allow for the *possibility* of the judgments (4) and (5) differing for him?

Fregeanism doesn't offer much help here. It doesn't matter how descriptively rich and specific John's way of thinking of Alice is: for he was deploying *one and the same* mode of presentation of her in each argument place when thinking (4). Well, perhaps we could make modes of presentation ephemeral, tied specifically to the token occurrent uses now being made of them. In that case the two thinkings of "Alice" could have different modes of presentation. But at the same time we'd be making these modes of presentation unrepeatable and effectively useless. They'd no longer help explain why the inference "Alice is smart, so Alice exists" is rationally sanctioned: the sense of the first "Alice" would then differ from that of the second, and it would be a cognitive risk to assume these two "Alice" thinkings coreferred.

You may want to say that in (4), John must be using "Alice" as two different names in his thinking. I don't encourage that, but I don't mind it either. However, I see no reason to think there's any *qualitative* difference between the ways he's thinking of Alice in the two cases; the only available differences are so tied to the token uses that they'd get in the way of modes of presentation doing the explanatory work they're supposed to do. So however many "Alice" names we count here, a Fregean diagnosis doesn't look too promising. For arguments in the same spirit, see Kit Fine 2007's "two Bruces" example (pp. 36-7 and 71), and his "higgledy-piggledy" Mates Puzzle (pp. 129-31).

Section 2
---------

Let R be a dyadic predicate, and h and p be two coreferential names. A familiar and plausible line of thought says the way we represent the world when we assert or think this:

(6) h Rs itself [e.g. h illuminates itself]

or this, using "^x" to represent lambda abstraction:

(7) (^x: Rxx) h

or this:

(8) exists x: x=h & Rxx

involves representing some self-Ring to be going on. That is, what we say or think in these cases *represents* the phenomenon of reflexivity. I'm gesturing at an idea here; I don't mean that any of these assert something *about* reflexivity.

Fine 2007 (pp. 39ff) draws an attractive distinction between representing objects *as* the same, as he says we do in:

(9) Hesperus = Hesperus.

and merely representing them *as being* the same, as we do in:

(10) Hesperus = Phosphorus.

In the first case, he says, no one who understands the claim "can sensibly raise" the question whether it's the same object that's involved. In the other case, this question *can* be sensibly raised, even if we know the answer. Sometimes I've found it helpful to say that in (6)-(9), the coreference is semantically *de jure*, whereas in claims of the form:

(11) Rhp

or even the identity claim (10), it's at best semantically *de facto*. It's natural to hear *some* kind of difference between these last claims and the earlier ones---even for philosophers with predominantly Russellian intuitions.

I do, however, want to register some caution about whether claims like (9), or more generally, anything of the form:

(12) Rhh

should always be understood reflexively.
This is disputed. And the John/Alice case from Section 1 seemed to exhibit a difference between (12) and (6). The view I'll eventually endorse says that (12) may be ambiguous between an understanding which is reflexive, and equivalent to (6), and an understanding which is not. When John has lost track of the fact that it's one and the same student he's then comparing, he's judging the unreflexive form of (12). On the other hand, when John thinks in a way that he recognizes as licensing the inference to (8), his thought is reflexive. When we assert or think: (11) Rhp generally it seems we *don't* represent any self-Ring or reflexivity, even if we're antecedently sure that h *is* p. Or at least, so there is some temptation to say. It's natural to understand even this: (13) h=p & Rhp as failing to represent reflexivity in the way that (6)-(8) do. But whether that's correct must await our settling the logic for these phenomena; as we'll see, it's not straightforward whether (13) should entail (7) or (8). Section 3 --------- The ideas just floated have been tempting to many philosophers who otherwise prefer their propositions rather coarse-grained: built up of bare objects and properties, with logical structure as the only glue or scaffolding. Putnam characterized phenomena of the sort we're considering as *part* of logical structure: "'Greek' and 'Hellenes' are synonymous. But 'All Greeks are Greek' and 'All Greeks are Hellenes' do not *feel* quite like synonyms. But what has changed? Did we not obtain the second sentence from the first by 'putting equals for equals'? The answer is that the *logical structure* has changed." (pp. 153-4 in Salmon & Soames, ed. 1988) [Putnam, Synonymy and the Analysis of Belief Sentences, Analysis 14 (1954), 114-22] So far, we've only heard suggestions of a distinctive logical structure, of *representing* the world in a distinctive way that manifests reflexivity. That doesn't yet mean there's a difference in anything's *truth-conditions*. But if there really are representational differences here, it's natural to suppose some things we assert or think could be sensitive to those differences. Are there any predicates R such that (6) might differ in *truth-value* from (11) and (14)? (6) h Rs itself (11) Rhp [as before, h and p are coreferential, and we may suppose, have the same individual semantic values] (14) exists x: x=p & Rhx If there are, I'll call such predicates "hyper-evaluative." This is a distinctive kind of hyper-intensionality. If there are such predicates, they depend on more than just the "values" of their individual arguments---be those values extensions, or intensions, or even something finer-grained. They even depend on more than just the cognitive associations or modes of presentation of their individual arguments, if there be such. They depend in particular on how those arguments are "coordinated" or "wired" together. If there are semantic features that track coordination relations in this way, they will be part of the language's semantics too. So why do I describe this as being "hyper-" a term's semantic value, rather than as being a hitherto unacknowledged aspect of semantic value? * It will help to first think about the Montague-inspired strategy of taking names to (at least sometimes) have generalized-quantifier meanings. That is, sometimes a name's semantic contribution is not just an entity from the domain (say, Alice), but rather a property of predicate semantic values (say, the property of containing Alice). 
Even on such views, it remains appropriate to call Alice the name's bearer or "referent." Montague didn't propose that where we thought we had been talking *about Alice*, we were really talking *about* properties of predicate values instead. Rather, his proposal was that the notions of semantic value and reference should come apart here. I think we must be open to that---and probably we should understand Russellianism or Millianism in a way that's compatible with it. What's central to those views shouldn't be the question whether semantic values are just referents (a Montagovian treatment of names says they aren't) but rather whether semantic values are *built up out of* more than just their referents: whether they have further extra-logical components (a Montagovian treatment of names can agree they don't). (Some ways of treating names as predicates would also be compatible with Russellianism, broadly understood.)

* However, the extra semantic features needed to track coordination relations shouldn't be assimilated into this same notion of semantic value. It will prove useful to have a notion of an expression's "value" that best fits the traditional, uncoordinated way of doing semantics, as well as having semantic features that additionally encode coordination between values. This is all orthogonal to the question whether a term's value is just an entity from the domain, or it's a property, or a generalized quantifier meaning, or what have you. Expect at least three levels of semantic features here: referents, semantic values (which may or may not be different from referents), and whatever encodes the reflexive or coordination phenomena we've been considering. I call these last aspects of meaning "hyper-evaluative" to oppose them to the middle notion---which computer scientists also call "values."

* In *some* way, reflexive or coordination phenomena have to do with how terms (or their occurrences) relate to each other, rather than with facts about the terms in isolation. Fine 2007 argues more specifically that it's essential to understand these phenomena as coming from a "relationist" semantics, rather than from semantic features that terms (or their occurrences) have intrinsically. This is a specific commitment of Fine's view, which we will see to be only one particular proposal about how to implement a hyper-evaluative semantics. I don't want it to be definitional of hyper-evaluativity that it must be explained with features that couldn't be possessed by terms (or their occurrences) taken individually. That will be hashed out differently by different implementations of these ideas.

Natural language predicates like "it's manifest/certain/patent that ..." and other epistemic terms are good candidates to be hyper-evaluative. That is, I think we can understand a sense in which each of the (a) claims is true, but the (b) claims false, even when we're sure that Hesperus is Phosphorus.

(15a) It's manifest that Hesperus = itself.
(15b) It's manifest that Hesperus = Phosphorus.

[see Fine 2007, pp. 48, 56 and 136n14 on "manifest consequence"]

(16a) Hesperus is indubitably as massive as itself.
(16b) Hesperus is indubitably as massive as Phosphorus.

In the literature, ordinary attitude verbs are often taken to be hyper-evaluative too. Consider Mark Richard's phone booth case from 1983. [Mark Richard, "Direct Reference and Ascriptions of Belief", JPL 12 (1983), 425-52] In that case, the speaker doesn't realize that the woman he sees out the window is the same woman he's addressing on the phone.
Among other things Richard uses this case to show, he claims it is natural to count the report (17) as true but neither (18) nor (19) as true: (17) I believe I can alert you to her danger. (18) I believe I can alert you to your danger. (19) I believe I can alert her to her danger. despite the fact that "you" and "her" in the speaker's mouth are coreferential. So Richard counts: (20) ... believes ...h...h... as reporting a reflexive attitude, and as sometimes differing in truth-value from: (21) ... believes ...y...h... even though 'y' and 'h' are directly referential, and in the context corefer. I presume Putnam would have agreed. Consider a variation of Richard's example, where a speaker is disposed to accept (19) but not (17). In the 1983 paper, Richard takes (19) to permit exportation, that is, to entail: (22) exists x: x=her & I believe I can alert x to her danger. and since the first 'her' occurs extensionally and corefers with 'you', that entails: (23) exists x: x=you & I believe I can alert x to her danger. Richard also grants that (23) entails (17). So we can go from (19) to (17), but not in the reverse direction: (19) I believe I can alert her to her danger. (17) I believe I can alert you to her danger. (That is, when (19) is true, (17) *is true*; it does not follow--and in the variant case I posited, it is not true that---the subject will be disposed to *accept* (17).) In Richard's 1990 book, however, his view is different. There he still allows (19) to export to (22) and so also to (23), but now he denies that we can always reimport from (23) to (17). (See p. 152-3.) That is, earlier he hadn't taken: (21) ... believes ...y...h... to report an *absence* of coordination in the subject's beliefs; but by 1990 he was taking it to (sometimes) report that. [Details: He says a "correlation" between the reporter's terms and the subject's terms *may* map y,h in a that-clause to h',h' in a sentence that the subject accepts. So when the subject would accept "I can alert her to her danger," (17) *may* be true, even though the subject wouldn't accept its complement clause. But it won't be true for every correlation. On the other hand, because a "correlation" is a function, it *must* map h,h in a that-clause to h',h' in an accepted sentence. So when the subject *wouldn't* accept 'I can alert her to her danger,' (22) may be true but (19) won't be. See pp.139ff, and pp216ff for subtleties about demonstratives vs names.] In the mid-80s, Nathan Salmon, Scott Soames, and David Kaplan were also wrestling with these questions. Kaplan introduced the device of "wired" structured propositions in talks he delivered at that time. [Forget/check whether this material was published.] King would go on to use this device in the 1990s. [citations] Salmon 1986 [Reflexivity] and Soames 1987a [Direct reference, propositional attitudes, and semantic content] and 1987b [Substitutivity] rejected the Richard/Putnam idea that: (24) ...h...h... by itself says anything reflexive, or that: (20) ... believes ...h...h... (25) ... believes ...x...x... attribute reflexive beliefs. It's only when some *binding element* occurs underneath the belief term, and binds the relevant argument places, as in: (26) ... believes (exists x: ...x...x... & x=h) or: (27) ... believes ((^x: ...x...x...) h) that a reflexive belief is attributed. If the argument places are only bound from the outside, as in: (28) exists x: ... believes ...x...x... they claim no reflexive belief is yet being attributed. [There are no formulas numbered 29 or 30.] 
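The functional character of Richard's 1990 "correlations" (from the bracketed details above) can be made vivid with a toy sketch. What follows is my own illustration in Python, not Richard's formalism: a that-clause is modeled as a tuple of the reporter's terms, a correlation as a dictionary (hence a function) from the reporter's terms to the subject's terms, and a report counts as true just in case some correlation maps the that-clause onto a tuple the subject accepts. The only point it is meant to capture is the functional one: distinct terms "you" and "her" may be sent to one and the same subject term, but a single recurring term "her" cannot be sent to two different ones.

    from itertools import product

    def report_true(that_clause, subject_terms, accepted):
        # True if some correlation maps the that-clause onto an accepted tuple.
        reporter_terms = sorted(set(that_clause))
        for images in product(subject_terms, repeat=len(reporter_terms)):
            corr = dict(zip(reporter_terms, images))     # a correlation is a function
            if tuple(corr[t] for t in that_clause) in accepted:
                return True
        return False

    subject_terms = ["you'", "her'"]

    # A subject who accepts only the uncoordinated "I can alert you' to her' danger":
    accepted = {("you'", "her'")}
    print(report_true(("you", "her"), subject_terms, accepted))  # True: the (17)-style report
    print(report_true(("her", "her"), subject_terms, accepted))  # False: "her" can't go two ways

    # A subject who accepts the coordinated "I can alert her' to her' danger":
    accepted = {("her'", "her'")}
    print(report_true(("you", "her"), subject_terms, accepted))  # True: y,h may map to h',h'
    print(report_true(("her", "her"), subject_terms, accepted))  # True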
Section 4
---------

We'll return to the history in a bit. Let's interpose some thoughts about the dispute we're seeing so far. My own sense of this dispute is that natural language does not uniformly support either side. I share Richard's intuitions about (17)-(19) in the case he describes, and also his intuition that this pair does not sound equivalent:

(31) exists planet x and planet y: x = y and Johannes said he saw x rise, then y rise, then x set, then y set.

(32) exists planet x and planet y: x = y and Johannes said he saw x rise, then y rise, then y set, then x set.

(Richard 1990, p. 153). At the same time, I think *sometimes* recurrence of the same term in an attitude report does not contribute towards the report being about a reflexive attitude. In "Variabilism," Sam Cumming builds a case with the characters from "Love's Labour's Lost." Biron is courting Rosaline, and is dancing with Katherine, who has successfully disguised herself as Rosaline. The third girl Maria reports:

(33) Biron thinks Katherine is Rosaline.

[Cumming, "Variabilism", Phil Review 117 (2008), 525-54; at p. 529.]

And in fact, continuing the dialogue, we might report the fact that all the girls have been successful in their disguises, as:

(34) Each girl fooled Biron, so he didn't know she was she.

This continuation makes trouble for what Cumming is attempting; we'll return to it later. But for present purposes note that despite the recurrence of "she", we're precisely not reporting the presence or absence of reflexive knowledge. Biron hasn't been fooled so far as to stop attributing self-identity to anyone.

Even when *reflexive pronouns* are used, I don't think this guarantees it's a report of a reflexive attitude. Every morning, I discover amazing small sculptures have been put together out of household junk while I slept. I'm keenly interested to meet the artist. Finally, one night I set up a hidden videocam. The next morning I discover *I* have been the one doing this, in fits of sleep-walking. Bemused, I report:

(35) All along I had been hoping that I'd meet *myself* (can you believe it?)

Here I'm *not* saying that all along I had a reflexive hope, a hope whose agent and quarry were conceived to be one and the same. So in some of these

(20*) ... believes/hopes/knows ...h...h...

cases I'm tempted to agree that a reflexive attitude is attributed; in other cases not. But neither do I agree with Salmon and Soames that it's only underneath a binding operator such as "exists x" or "^x" that we see the intuitive phenomena we're describing. One large worry, which I'll now develop, is that natural language scope barriers may constrain the presence of such operators more tightly than the intuitive phenomena would require. [Barker and Shan argue we can get binding in situ, without needing binding operators to move to any higher scope. That work is very exciting; but I will ignore it here.]

For example, offhand it doesn't seem that "Bob" should have any higher scope here than "Jane" does:

(36) Jane and Bob each gave Bob a present.

Yet, if we want to formalize this in a way that exhibits the only kind of reflexivity Salmon and Soames allow, we'd have to do:

(37) (^x: (^y: y gave x a present) each of Jane, x) Bob

So it looks like we'd have to choose between (i) what's said in (36) not *saying* that it's the same Bob in the two cases, at least no more than this does:

(38) Jane and Bob each gave Mr Smith a present.

where "Mr Smith" happens also to name Bob. Or (ii) (36)'s really having a funny logical form.
Or (iii) Salmon and Soames being wrong that two arguments can only be represented as coordinated when bound by a lambda or the like. Salmon and Soames push for (i). For now, I'll just say that's unsatisfying. There are ways for a belief to represent the world that underwrite a subject's willingness to existentially generalize on two argument places simultaneously; and ways that don't. Such willingness would ordinarily be present when accepting (36), as much as it would when accepting claims of the form: (7) (^x: Rxx) h As I've already said, we can't expect recurrence of the term "Bob" to *guarantee* the presence of this representational coordination (and neither can we expect difference of terms, as in (38), to guarantee its absence). But I'd expect this coordination to at least be *possible*, and indeed ordinary, for a belief one has in accepting (36). Salmon and Soames' strategy precludes that, unless (36) really does turn out to have the logical form (7). I want to make the unsatisfyingness of their strategy more vivid, by highlighting more limits on where we'd be able to employ it. Often it won't just be offhand scope judgments, like the thought that (36) doesn't have the logical form (37), that stand in the way. It will instead be widely-accepted linguistic generalizations. For example, binding terms like "every boy" cannot scope outside of the "there to be" construction in: (39) Sue wants there to be farmers at every boy's picnic, and hopes he will take pictures. There's no way to understand that sentence with "he" being bound by "every boy", as there would be if "every boy" could scope out over the conjunction. Of course we can explicitly express the latter judgment: (40) Every boy is such that Sue wants there to be farmers at his picnic, and hopes he will take pictures. But the surface form (39) doesn't permit a reading where the binding term assumes (or "moves to") the position it has explicitly in (40). So now suppose you want to attribute a belief of the following sort to Nathan: (50) [that] Sue wants there to be farmers at Jack's party, and hopes Jack will take pictures. and you want the attribution to be reflexive. You want to report Nathan as believing it's the same Jack that Sue's two attitudes concern. It's tempting to think you can do this just by reporting belief in (50). But the generalization we're considering tells us that (50) cannot be understood as: (51) (^x: Sue wants there to be farmers at x's party, and hopes x will take pictures) Jack. Why not? Because "Jack" in (50) is in the same position as "every boy" in (39), and our generalization tells us that's not a surface position whose occupant can assume the wide scope indicated in (40) and (51). The linguistic generalization we're relying on here may be disputable, but it isn't to be lightly shrugged off, either. Of course, Salmon and Soames *can* shrug off what I called "tempting" here. They'll just deny that (50) itself can be used to attribute a reflexive belief. A reporter may instead only *convey* that his subject *would accept some sentence* like (50). Those paths are well-worn and we won't follow them. I just observe that there is some intuitive cost to not being allowed to report a reflexive or coordinated belief with the surface form (50). As I said, it's tempting to think we can do that. 
Of course, in forms like (50) we *can* replace the second "Jack" with pronouns anaphoric on the first "Jack", and we can also do this when the first "Jack" is instead an indefinite:

(52) Sue wants there to be farmers at a donkey's party, and hopes it will take pictures.

But this observation is one linguists struggle to *reconcile with* the fact that no binding element at the surface position of "a donkey" is assuming the wide scope indicated in (40) and (51). That's what the whole business of explaining donkey-anaphora amounts to.

There are other accepted scope barriers which testify in the same direction. For example, binding elements seem unable to scope outside of determiner phrases. To see this, consider:

(53) Several politicians spy on someone from every city.

We're interested only in the readings where "every city" takes wider scope than "someone." The reading where there's a single group of spying politicians is available:

(54) several politicians p: (every city c: someone x from c: p spy on x)

The reading where in every city, there's a different focus of some spying also seems available:

(55) every city c: someone x from c: (several politicians p: p spy on x)

But, as the reader should confirm, the reading where every city has a few, perhaps unfocused, spyings taking place does not seem available:

(56) every city c: (several politicians p: (someone x from c, perhaps a different x for each p: p spy on x))

This generalization comes from Larson 1987 "Quantifying into NPs," and is widely accepted. The now predominant explanation is that "every city" cannot scope outside of the phrase "someone from every city." That phrase must as a whole either take scope over or under "several politicians."

But now consider the following, read so as to state that there's a single group of politicians:

(57) Several politicians spy on someone from each of Chicago and London, and get away with it because their privacy protections are inadequate.

If this states there's a single group of politicians, not different groups for each city or each spying, then "someone from each of Chicago and London" isn't taking scope over "several politicians." And so, by the rule we extracted above, no binding element introduced at the surface position of "Chicago and London" can be assuming wide scope over "several politicians" either. And so---though clearly *there is anaphora* in the final clause on "Chicago and London"---the logical form we're working with cannot be:

(58) (^c: (several politicians p: someone x from c: p spy on x, and get away with it because c's privacy protections are inadequate)) each of Chicago and London

The anaphora we see in the final clause of (57) is like the donkey-anaphora in (52). It's not to be explained by a wide-scope binding element like "^c" in (58). English's scope barriers speak against (57) being able to have the logical form of (58). As before, the linguistics here may be disputable, but I understand the view I'm setting out to be the mainstream, and any alternatives would need to be assessed carefully. [Will add more citations]

Relative clauses provide a third scope barrier. Here:

(59) Ralph knows that someone loves everyone.

there is a reading where "everyone" takes wider scope than "someone"---that is, it may be a different lover for each of the everyones. But here:

(60) Ralph knows someone who loves everyone.

that reading is no longer available. The predominant explanation is that "everyone" cannot scope outside of the relative phrase "who loves everyone."
That constraint will also govern the "when ..." clauses in the following: (61) The days when Jane criticizes every student are days when he's unhappy. This sentence invites the question "who's he?" because "he" can't be read as bound by "every student." "Every student" occurs in a surface position that can't be understood as moving to a wide enough scope to do that. We can explicitly express the missing reading: (62) Every student is such that the days when Jane criticizes him are days when he's unhappy. But (61) can't itself be understood that way. Now, as before, let's replace "every student" with a singular term: (63) The days when Jane criticizes Jack are days when he's unhappy. Here the "he" in the main clause *can* be anaphoric on "Jack," but that's a challenge theorists work to explain, because they assume it's *not* a case where any binding element at the surface position of "Jack" can be moving to wide enough scope to be doing all the needed binding. That is, just as (61) can't be understood as (62), (63) can't be understood as: (64) (^x: The days when Jane criticizes x are days when x is unhappy) Jack To all this, Salmon and Soames can shrug and say we need to live with there really being no reflexivity represented in: (36) Jane and Bob each gave Bob a present. (50) Sue wants there to be farmers at Jack's party, and hopes Jack will take pictures. (52) Sue wants there to be farmers at a donkey's party, and hopes it will take pictures. (57) Several politicians spy on someone from each of Chicago and London, and get away with it because their privacy protections are inadequate. (63) The days when Jane criticizes Jack are days when he's unhappy. And similarly, no reflexive attitudes reported in belief reports where those are the complement clauses. But as I said before, that is an intuitive cost. [Added: If one did want to accommodate reflexivity using only the resources Salmon and Soames allow, Fine 2007 makes another good complaint. Consider: (70) Cicero loves Tully and not: Tully loves Cicero. Which of these should we understand the reflexive logical form of (70) to be? (71) (^x: ^y: Lxy and not Lyx) Cicero Tully (72) (^y: ^x: Lxy and not Lyx) Tully Cicero There doesn't seem to be any motivated answer; but neither does it seem that there really should be *two* reflexive claims whose surface form is (70). Then again, Salmon and Soames are most naturally understood to be rejecting, or trying to explain away the intuitive phenomenon that Fine and I are trying to capture, not to explain it.] Section 5 --------- Perhaps the biggest disappointment with the Salmon and Soames strategy is that, once you start thinking about the kind of coordination between argument places we see in reflexive claims, it becomes very tempting to posit that between argument places in different propositions, too, as in: (74) Some days Jane criticizes Jack. Criticism makes him unhappy. (75) Alice is F. So Alice exists. Of course there can be cases where you consider the two *sentences* in (75) and are unsure whether it's the same term "Alice" recurring. Then it's not clear whether you're considering a valid argument. But I've never had sympathy for the worry that this shows we can never be sure about validity. (Perhaps we can't, but if so it's not for this reason.) I want to say: it's not *the sentences* I primarily judge to be valid. It's an argument *pattern*, and in the pattern I'm considering, it's *given* whether the same singular term occurs in each premise. 
The natural model for this way of thinking is one where argument places in different propositions can be de jure linked, in the way we've been thinking they can be coordinated in single reflexive propositions. But that's not something that any binding element like "^x" or "exists x" can accomplish.

Section 6
---------

I don't think what we've seen Richard say about coordination can be the whole story. As I've said, I think claims like:

(12) Rhh

can receive both reflexive and non-reflexive readings. Or: we should at least want a formal notation that can express two readings here. More on this later.

Also, I agree with Soames 1987b, which argues that the resources Richard offers block only *some* intuitively troubling Frege-problem cases. (We're considering here only those parts of Richard's view that have to do with reflexivity, not his whole theory from 1990. Also, his 1983 view applies to demonstratives and variables only, not to directly referential terms, like names, that are insensitive to contexts and assignments. I ignore that here.) For example, as Soames reports, these propositions still come out equivalent on a direct reference view, even one enhanced with the ability to distinguish reflexive from non-reflexive propositions:

(76) Superman is stronger than Clark Kent.

(77) Clark Kent is stronger than Superman.

(This was pointed out by Lewis in response to a lecture Kaplan gave. Compare our (1) and (3).) Also, if this is true ['h' and 'p' abbreviating 'Hesperus' and 'Phosphorus' throughout...]:

(78) The ancients said that: Fh & not Fp.

then although this won't follow:

(79) The ancients said that: Fh & not Fh.

this will:

(80) The ancients said that: Fh and also that: not-Fh.

And (80) seems intuitively nearly as troubling as (79). It helps little to be assured that, though it's true, (80) is a different proposition from:

(81) The ancients said that: Fh and also that: not-Fp.

(See also Soames's note 25: "It does not help to be told that the Ancients did not believe that Hesperus was not Hesperus, if it is granted that they did believe that Hesperus was not Phosphorus and that Phosphorus was Hesperus.")

Soames also argues that Venus could truly report what the ancients said in (78) with:

(82) The ancients said that: I am F & I am not F.

but if we accept that, it seems to entail:

(83) exists x: the ancients said that: x is F & x is not F.

which is just the kind of report that Richard's theory is meant to block. (See Soames' McX case at p. 117, also the extension of the same idea to Venus on pp. 118-9.)

I agree with Soames that (82) should be true relative to Venus' context, when understood so that the two occurrences of 'I' are uncoordinated. And I also accept the entailment to (83), but only when the coordination between the two occurrences of 'x' is broken. This is the same idea we saw in my continuation of Cumming's "Love's Labour's Lost" case before:

(34) Each girl fooled Biron, so he didn't know she was she.

How might we formally represent these broken coordinations? We'll discuss this in more detail later. For present purposes, one can think of it like this. Suppose G[y->x] is a complex expression where all free occurrences of y have been replaced with x, and those occurrences of x may be coordinated with other occurrences of x elsewhere in G.
Here is how to convert G[y->x] into an expression that breaks any such coordinations with the other occurrences of x (but leaves all the y-replacing occurrences of x still coordinated with each other): (84) exists y: y=x & G or this (which would be better if we want to work in a free logic and allow x to be non-referring): (85) every y: y=x horseshoe G For example, let G be 'x is indubitably as smart as y'; then G[y->x] will be: (86) x is indubitably as smart as x and understand the two occurrences of x in the latter to possibly be coordinated. We get a version with the coordination broken by using schema (85). In this case, that amounts to: (87) every y: y=x horseshoe (x is indubitably as smart as y). Even if the two occurrences of x in (87) may still be coordinated, we suppose predicates will only be sensitive to coordination among their own arguments. (Fine resists this, see later.) And nothing has been done here to introduce a coordination between any occurrence of y and those of x. So G's own arguments are now uncoordinated. That's a way to break coordination, and we can understand something like it to have gone on in (82), (83), and (34). Of course I don't think any of these have the underlying logical form of (85). Rather, we will later discuss how to do this with a semantically simple operation. But (85) gives us a way to think about what's going on in terms that are now more familiar. The phenomenon illustrated in examples like (34) is very interesting. I understand it to be a case where a variable is still referentially bound by a quantifier but its coordination with other terms also bound by the quantifier has been broken. So: referential binding does not entail coordination. (Though it may always *introduce* coordination.) This is another respect in which I oppose the Salmon-and-Soames-inspired strategy of trying to account for the intuitive phenomena only in terms of binding operators. Resuming the main thread: I agreed with Soames that there should be true (uncoordinated) readings of (82) and (83): (82) [Venus speaking] The ancients said that: I am F & I am not F. (83) exists x: the ancients said that: x is F & x is not F. but I also think there should be false readings. Or at least, even if there aren't false readings of these forms *in English*, we have an intuitive understanding of a representational difference here, and we should want a formalism that can express false (coordinated) claims of these forms, or of some forms in this neighborhood, as well. I think many (all?) of the troubling Frege-Problem cases Soames says Richard's theory leaves unhandled will be handleable if we develop the resources to have cross-propositional coordination. One of Soames' major points is that "reports of propositions asserted are not semantically required to preserve the logical structure or cognitive perspectives of the sentences used to assert them" (p 118). I have sympathy for this. If we don't use resources dedicated to reporting coordination---as natural language may not, or anyway, may always have disambiguations that do not---then: (88) ... said ...h...h... can be used to correctly report the assertion of a sentence whose logical structure was: (89) ...h...p... In that respect, I agree the Richard view is too inflexible. But my interest is to explore semantic resources which *are* dedicated to reporting coordination, and whose correlates of (88) could not be used to correctly report an assertion of (89). So my sympathies are very much with the spirit of Richard's approach. 
Section 7 --------- Despite the linguistic excursions of Section 4, I'm not going to try to settle which natural language phenomena exploit or are sensitive to coordinated argument places. My concern is instead with what we need to buy into to *get* this representational capacity into a language, even a formal language. I think our cognitive systems use such a capacity, even if natural language does not. (But natural language probably does too.) As a theorist, I'd like to at least have a well-understood formalism for *happily expressing* the kinds of differences exhibited in the John/Alice case from Section 1. We'll consider various ways to semantically implement this: some from philosophy, more from linguistics, even more from computer science. Chris Barker (NYU Linguistics) and I are teaching a seminar this fall on "What Philosophers and Linguists Can Learn from Computer Science But Didn't Know to Ask." Of course philosophical logicians will already know nearby work in CS; but we're aiming to show how ideas familiar in CS directly bear on more mainstream, somewhat-less-technical inquiries in our fields, too. (Ideas that may not be familiar to everyday programmers, but are familiar in theoretical CS, especially the areas of functional programming and type theory.) The present inquiry is one such example. We'll see next how certain complex expressions in some programming languages are hyper-evaluative. And the techniques for doing semantics for such expressions are better-explored over there. (No matter where one looks, though, it's a challenge to abstract away from the details of particular implementations, and get at what the fundamental semantic commitments are. We won't even ourselves overcome this, but it's something we'll aim for.) Section 8 --------- Let's talk through a primitive programming language, to see how hyper-evaluativity arises in that setting. We'll make up the syntax ourselves, for pedagogical efficiency. But everything we do here is straightforwardly expressible in Scheme or many other languages. We'll begin with a language fragment that's purely "declarative" or "functional." That is, it just consists of expressions like 1+2 and more complicated things of that sort. The language's *interpreter*---that is, the software that processes programs we write in this language---takes care of *evaluating* the complex expressions we give it, and delivering us the result. But the language doesn't yet itself contain anything that counts as an *imperative*. 1+2 is not an order or command. The only imperatives on the scene are external to the language. For instance, we do order the interpreter to evaluate our program. It delivers the result that the language's semantics determines that to be. But the only thing the interpreter needs to interpret are complex expressions like 1+2. There aren't any orders or commands in the language that need to be interpreted in the same way. I emphasize this because later we will introduce imperative elements into the language, and the differences this brings along will be important. Many people's folk conception of computation is already imperatival. But it's important to realize that's just one kind of element that a language may have or lack; and important to pay close attention when it arrives on the scene. This element is intimately connected to the phenomenon of hyper-evaluativity that we're trying to understand. The most familiar paradigm of a purely functional programming language is Church's untyped lambda calculus. 
(Typed lambda calculi are also purely functional, but less familiar.) In the lambda calculus, everything is either a function, or an inert unevaluable simple. That makes things much more fun. But we'll keep things simple and boring here, and allow ourselves primitives like the natural numbers, arithmetical functions, and so on. We'll also allow ourselves a primitive syntax for forming ordered tuples: (1, 2, 3) is a 3-tuple, and (1, 2, 1+2) evaluates to the same 3-tuple. One can build these up by hand in the untyped lambda calculus, but as I said, we'll keep this as boring and straightforward as possible.

Let's introduce variables into our programming language. Suppose I want to evaluate (1+2+10, 1+2+20) but I'm lazy and I don't want to rewrite the 1+2. So we'll give ourselves the ability to do this:

    let x be 1+2 in
    (x+10, x+20)

This will evaluate to the 2-tuple (13, 23), just as you'd expect. We can also do fancier things like this:

    let x be 1+2 in
    let y be 10 in
    (x+y, x+20)

which also evaluates to (13, 23). Now I'll mention one more subtlety before we move on. What if we do this:

    let x be 4 in
    let x be 1+2 in
    (x+10, x+20)

This will evaluate to (13, 23), too. That is, the innermost binding of x trumps any outer bindings. We'll call this *shadowing*. It's basically the same as happens when you say in predicate logic "every x: (Fx & exists x: Gx)", reusing the same variable x. It *looks* a bit like something else that's going to happen later, but what comes later will be importantly different.

What we're doing with these let-expressions is basically just supplying a value as an argument to a lambda abstract. That is, a claim like this:

    let x be EXPRESSION in
    BODY

is doing nothing more than this:

    (^x: BODY) (EXPRESSION)

It's just a syntax that's easier to think about---and extend.

The next thing we introduce will be definitions of our own complex functions:

    let f be (function x: x+3) in
    (f(10), f(19+1))

This will evaluate to (13, 23) too. When f is applied to the value 10, it binds f's parameter x to 10, and then returns the evaluation of x+3, which is 13. Then when f is applied to the value 19+1, it binds f's parameter x to *that* value. There are theoretically interesting issues about whether it first evaluates 19+1 to 20, and then binds x to 20, or whether it leaves that evaluation to do until later. We'll suppose it does the evaluation first. So now f returns the evaluation of x+3, which is 20+3, that is 23. So now our evaluation of (f(10), f(19+1)) has reduced to (13, 23), and we're done.

Functions can be arbitrarily complex. When appropriate, we'll break them up onto several lines, as here:

    let f be (function x:
        let y be x+3 in
        (x, y)
    ) in
    (f(10), f(19+1))

This will evaluate to the 2-tuple of 2-tuples ((10,13), (20,23)). Functions can take more than one variable:

    let f be (function x, y:
        x + y + 1
    ) in
    (f(10, 2), f(20, 2))

This will evaluate to (13, 23). (In functional programming, multiple arguments are standardly implemented by "Currying," but that won't matter to what we're doing here.)

We allow functions to see not just their own internal variables, but also any variables from their surrounding environment, too:

    let y be 3 in
    let f be (function x: x+y) in
    (f(10), f(20))

This will evaluate to (13, 23). If we shadow any variables (remember that?), then, just as before, the innermost binding wins.

    let y be 2 in
    let f be (function x:
        let y be 3 in
        x+y
    ) in
    (f(10), y, f(20))

evaluates to (13, 2, 23). Note that the y in the final line is evaluated as 2 not as 3.
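None of this functional fragment requires anything exotic; it can be transcribed directly into an ordinary language. Here is a sketch in Python (my choice of language for illustration; the toy syntax above isn't itself executable), with the let-constructions rendered as applications of lambda abstracts in just the way described:

    # "let x be 1+2 in (x+10, x+20)" is applying a lambda abstract to 1+2:
    print((lambda x: (x + 10, x + 20))(1 + 2))                 # (13, 23)

    # Nested lets:
    print((lambda x: (lambda y: (x + y, x + 20))(10))(1 + 2))  # (13, 23)

    # Shadowing: the innermost binding of x trumps the outer one:
    print((lambda x: (lambda x: (x + 10, x + 20))(1 + 2))(4))  # (13, 23)

    # Defining and applying our own function; the argument 19+1 is evaluated
    # to 20 before it is bound to the parameter x:
    f = lambda x: x + 3
    print((f(10), f(19 + 1)))                                  # (13, 23)

    # Functions can see variables from their surrounding environment:
    y = 3
    g = lambda x: x + y
    print((g(10), g(20)))                                      # (13, 23)

    # Shadowing again: the y inside h doesn't disturb the outer y:
    y = 2
    h = lambda x: (lambda y: x + y)(3)
    print((h(10), y, h(20)))                                   # (13, 2, 23)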
This is because that occurrence of y is not in the "scope" of the expression that rebinds y to 3. At that final line, the original binding to 2 is still in effect. This is basically the same as happens when you say in predicate logic: "every x: Fx & (exists x: Gx) & Hx." The "Hx" is evaluated with x bound again by the universal quantifier.

Now we're ready to introduce our first important novelty. This is what's known in CS as "mutation" or a "side-effect." It will look like this:

    let y be 2 in
    let f be (function x:
        change y to y+1 then
        x+y
    ) in
    (f(10), y, f(19))

What this means is that, in the process of evaluating f, we *rewrite* the value then assigned to y. This is not the same as the re-binding of y that happened in the previous program, where we merely shadowed y. In the shadowing case, outside of the function f, the rebinding of y was no longer in effect. But in the current case, when we rewrite the value assigned to y, the new value *sticks*. Until we change it again (or we leave the scope of y's original binding; but we won't do that).

What this last program evaluates to will depend on whether the f(10) or the f(19) gets evaluated first. Let's settle on evaluations always going left-to-right. So first we evaluate f(10). When the interpreter gets to the line "change y to y+1", it begins by evaluating the expression y+1. This now has the value 3. We then rewrite the part of memory where we're holding the contents of y. So now y will have the value 3 instead of 2. We then evaluate x+y, which is 10+3, and we return 13. So now f(10) has evaluated to 13. We continue on to evaluate y. Now y is still 3! So now we've partially evaluated our final line to (13, 3, f(19)) and we have to evaluate the last f(19). When the interpreter gets to the line "change y to y+1", it again begins by evaluating y+1. Now this is 4. We then rewrite y to be 4 instead of 3. We then evaluate x+y, which is 19+4, and we return 23. So now our program has been fully evaluated to (13, 3, 23).

Here we finally have introduced a fundamentally imperatival element into our language. It's not hard to *emulate* what's going on here while staying purely functional, but having the ability to change y like this as a native capacity of the language is a significant milestone. It's not entirely a good thing; nor is it entirely bad. But theoretically it makes for very important differences.

Now we're ready to introduce our second important novelty. Note the second-to-last line in this program:

    let y be 2 in
    let x be y in
    let w alias y in
    (y, x, w)

This program will evaluate to (2, 2, 2). So far, the "let w alias y" line seems to work much the same as the "let x be y" line. In each case, the newly-introduced variable ends up having the value 2, which is the same value y had. The difference between the two will only show up when we combine aliasing with mutation. Consider:

    let y be 2 in
    let x be y in
    let w alias y in
    change y to y+1 then
    (y, x, w)

This will evaluate to (3, 2, 3). The interpreter begins by setting up our y, x, and w variables just as in the previous program. When the interpreter gets to the "change y to y+1" line, y, x, and w all begin with the value 2. Now the interpreter evaluates y+1, which is 3. It rewrites the part of memory where it's holding the contents of y. So now y is 3. x is still 2. The only relation x had to y was that x was bound to a value that was a function of the value y then had. In this case, it was just the function y, but it could have also been another function, such as y+1.
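The two key programs just walked through (the mutation example and the aliasing-plus-mutation example) can be emulated in an ordinary language, which may help make the contrast concrete. The following sketch is mine, in Python: mutation is modeled with assignment to a module-level variable, and aliasing is modeled with an explicit one-slot cell that two names share (the class name Cell is just a label introduced for the sketch, not anything the toy language has).

    # The mutation example: f rewrites y's value, and the change sticks.
    y = 2

    def f(x):
        global y
        y = y + 1              # "change y to y+1"
        return x + y

    print((f(10), y, f(19)))   # (13, 3, 23): y is already 3 when read mid-tuple

    # The aliasing example, emulated with an explicit one-slot cell.
    class Cell:
        def __init__(self, contents):
            self.contents = contents

    y2 = Cell(2)               # let y be 2
    x2 = y2.contents           # let x be y: x merely copies the value y now has
    w2 = y2                    # let w alias y: w is another name for y's very cell
    y2.contents = y2.contents + 1           # change y to y+1
    print((y2.contents, x2, w2.contents))   # (3, 2, 3)

The difference between x2 and w2 in this sketch is the difference the next paragraphs go on to describe.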
But w on the other hand stands in a new, different relation to y. It doesn't just contain a *copy* of the value y had at some stage in the program's evaluation. Instead, we've introduced w to be an *alias* or synonym for y. So whatever the interpreter does to y will thereby also have been done to w. Since we've changed y's value to 3, w has also thereby been changed to 3, behind the scenes.

We couldn't have said something like:

    ... let w alias y+1 in ...

because the aliasing declaration needs to introduce a coordination between w and *another variable*, not just an expression that's a function of some variable. Well, we could allow that and let it mean something like this:

    ... let anonymous_variable be y+1 in
    let w alias anonymous_variable in ...

but if we have no independent way to use the anonymous_variable, this wouldn't differ in practice from just using:

    ... let w be y+1 in ...

The combination of aliasing and mutation that we're seeing here is ubiquitous in programming. You can do aliasing *without* mutation, but in practice there wouldn't be much point. Your aliasing declarations would have the same effects as your let bindings. (As we just said, there may just be more restrictions on what can appear on the right-hand side of an aliasing.)

The combination of aliasing and mutation also introduces hyper-evaluativity into our programming language. To see this, we need to introduce one final tweak; but now we're back to only a conceptually modest step. Notice that doing this:

    let f be (function y: BODY) in
    ... f(EXPRESSION) ...

is essentially just doing this:

    let y be EXPRESSION in
    ... BODY ...

Call those two programs Alpha and Beta. Now what we want is something that stands to the following:

    let w alias y in
    ... BODY ...

in the same way that Alpha stands to Beta. We'll write it like this:

    let f be (function alias w: BODY) in
    ... f(y) ...

What does this new syntax let us do? Consider the following:

    let f be (function alias w:
        change w to w + 1 then
        w + 2
    ) in
    let y be 1 in
    (f(y), y)

When the leftmost f(y) is evaluated, the interpreter enters the function f, and instead of just binding w to *a copy of* y's value, it makes w be a temporary alias for y. So now when we change w's value to w + 1, that is y + 1, that is 2, the value of y will also thereby be changed, behind the scenes. The function call then returns the value w + 2, which, at this stage in the program's evaluation, is 2 + 2. So now our final line has partially evaluated to (4, y). At this point y has the value 2 not the value 1 it started with. So our final line evaluates to (4, 2).

Now for the main event. Here our function will take multiple arguments, an option we mentioned a while back. This time both of the arguments are alias arguments.

    let h be 1 in
    let p be 1 in
    let f be (function alias x, alias y:
        change x to x + 1 then
        let z be x + y in
        change x to x - 1 then
        z
    ) in
    (f(h, p), f(h, h))

What happens? We begin by evaluating f(h, p). The interpreter makes x be a temporary alias for h and y be an alias for p; at this point all of these variables will have the same value 1. We then change x (and so also h) to 2. We then evaluate the expression x + y, which at this stage is 2 + 1, that is 3. We then evaluate x - 1, which is 1, and for the hell of it we change x (and so also h) back to 1. That doesn't change the value of z, which is still 3. We return that value. So now we've partially evaluated the final line and we have (3, f(h, h)). At this stage both h and p are again 1. We now go evaluate f(h, h).
This time we make x and y both be aliases for h. All three variables will have the same value 1. We then change x (and so also h, and so also y) to 2. We then evaluate x + y, which at this stage is 2 + 2, that is 4. We then change x (and so also h and y) back to 1. z remains at 4, and that's what we return. So now our final result is (3, 4). Notice what's happened. When we evaluated f(h, p) and f(h, h), h and p had the same value. They coreferred to the value 1. However, neither h nor p was "aliased" to the other. These variables were not semantically coordinated. They just *happened* to have the same value. (Well, it wasn't an accident, since the program is deterministic. But it's what I earlier called "semantically de facto" coreference.) And now the expressions f(h, p) and f(h, h) evaluate differently. That is, their extension is sensitive not merely to the values of their arguments, but also to what coordination exists between those arguments. If we had aliased y to h and then called f(h, y), it would have given us the same result as f(h, h). Here we see hyper-evaluativity in what I hope is a familiar shape. As I said, the machinery underwriting this is ubiquitous in programming. The standard semantics for what we've done here is what I'll discuss later as a "proxy semantics." The variable h isn't directly associated with the value 1, instead it's assigned an index into a heap of memory, and then that index is associated with the value 1 (by writing that value to that position in the memory heap). But even when that's what happens underneath the hood, the number 1 is still what we call "h's value." After all, if you evaluate h+1, the result is the number 2, not some position in a memory heap. With a proxy semantics, you will have enough structure to track hyper-evaluativity. You count two terms as coordinated or aliased when they don't just have the same value, but they're also associated with the same proxy. However, this is more structure *than we need* for hyper-evaluativity itself; and it opens us up to some philosophical doubts that we'll engage with later. Strictly speaking, you'd only need a proxy semantics to implement *mutation*. [I'm reading about other more subtle ways to implement mutation as well, with delimited continuations. But proxies are the standard way to do it.] And it's possible to have hyper-evaluativity *without* mutation---for instance, if the language had primitive hyper-evaluative functors or predicates. As we'll see in Section 13, we can give a semantics in that case without needing to bring proxies into the story. It's just that in practice, programming languages always do get their hyper-evaluativity from mutation. In some languages it's possible to inspect and manipulate not just h's value, but also the proxy h is assigned. Introducing this further capacity into a language has some pros and cons on the programming side. Philosophically, I think it just gets in the way. Philosophically, it's most useful to think about languages which are just expressive *enough* to introduce the representational capacity we're interested in, and no more. We especially want to avoid mixing up the representational capacity with its underlying implementation, if the details of that implementation aren't essential to the phenomenon. This is why I think it's useful to think about impoverished languages like the one I've presented. Sometimes the term "pointers" is used to talk about the aliasing/mutation machinery we've been looking at here. 
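Before getting more careful about that terminology, here is the main event gathered into a runnable sketch. Python has no native alias parameters, so this is again my own emulation: an explicit one-slot cell stands in for what the toy language does natively, and a parameter "passed by reference" simply receives the cell itself rather than a copy of its contents.

    class Cell:
        def __init__(self, contents):
            self.contents = contents

    # "function alias w: change w to w+1 then w+2", applied to y:
    def g(w):                        # w receives y's cell, not a copy of its value
        w.contents = w.contents + 1  # change w (and so y) to w+1
        return w.contents + 2

    y = Cell(1)
    print((g(y), y.contents))        # (4, 2)

    # The main event: both parameters are alias parameters.
    def f(x, y):
        x.contents = x.contents + 1      # change x to x+1
        z = x.contents + y.contents      # let z be x+y
        x.contents = x.contents - 1      # change x back
        return z

    h = Cell(1)
    p = Cell(1)                      # h and p merely happen to hold the same value
    print((f(h, p), f(h, h)))        # (3, 4): same values in, different results out

In effect the cells here are playing the role of the proxies just described: h's value is still the number 1 (that is what arithmetic sees), but h reaches it indirectly, through its cell, and f(h, p) and f(h, h) diverge because the second call wires both argument places to one and the same cell.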
But strictly understood, I think "pointers" are specifically (i) the aliasing/mutation machinery implemented via a proxy semantics, where (ii) the language also has the capacity to inspect and manipulate the underlying proxies. "References" is sometimes used as a more generic term, to mean the aliasing/mutation machinery, without necessarily including (ii), and perhaps also without commitment as to what the underlying implementation is. But for the most part, the usage seems to be pretty lax.

Passing arguments to functions in the way we do in:

    let f be (function alias w: BODY) in
    ... f(y) ...

is called "passing by reference"; and passing arguments to functions in the way we do in:

    let f be (function y: BODY) in
    ... f(EXPRESSION) ...

is called "passing by value." Orthodox uncoordinated semantics only considers predication where the arguments are passed by value. The values in question may be extensional, or they may be intensional. It's all still passing by value. Alias-like relationships between the arguments have no effect on the result. On the other hand, coordinated or hyper-evaluative semantics says some predication involves passing arguments by reference, not by value.

[Comment: Programming languages standardly have boolean predicates, which compare values for equality, being greater than, and so on. Languages that have native "pointers" or "references" standardly have (at least) two such equality predicates. One tests for whether two arguments have equal value. With such a predicate, this program:

    let x be 3 in
    equalvalue?(x, 1 + 2)

would evaluate to the truth-value true. The other equality predicate tests for (something like) whether its two arguments are coordinated or aliased. Let's call this predicate "hyperequal?" With such a predicate, this program:

    let y be 3 in
    let x be y in
    let w alias y in
    (hyperequal?(y, w), hyperequal?(y, x), hyperequal?(y, x+0))

would evaluate to the 3-tuple whose first member is true, whose second member is false, and whose third member either fails to evaluate (because a complex expression is syntactically incapable of being the target of an aliasing), or is false. "hyperequal?" should not be assumed to be a metalinguistic relation. It's true this relation isn't just a function of the values of its arguments. But neither does it require an ability to refer to, quantify over, or take as values the language's own expressions. As I suggested in Section 3, a reasonable natural-language expression of the "hyperequal?" predicate would be something like: "is indubitably the same as."

Confusingly, different real programming languages use the symbol "=" in different ways. Sometimes it's a binding operator, like our "let...be" in "let y be 3". Sometimes it's used as the "equalvalue?" predicate. Sometimes it's used as the "hyperequal?" predicate. Sometimes it's used as several of these, depending on context. Something else I've found confusing is that many languages have a contrast between "equivalence" and "pointer identity"; and this is related to, but *not the same as*, our contrast between equalvalue? and hyperequal?

Once a language is able to mutate variables, it's often able to express mutable values as well. For example, consider:

    let a be 1 in
    let b be 1 in
    let f be (function alias w:
        (function x: x + w)
    ) in
    let alpha be f(a) in
    let beta be f(b) in
    ...

Here alpha and beta are mutable "function closure" values. So long as a and b aren't mutated, alpha and beta will have the same extensions. So they're equivalent in a sense. (A runnable sketch of the equalvalue?/hyperequal? contrast follows; after it, we return to alpha and beta.)
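Here is the promised sketch of the two predicates, once more using the Cell emulation rather than anything native to Python: equalvalue? compares what two cells currently hold, while hyperequal? asks whether its arguments are one and the same cell. (Python's own == and is contrast runs along related, though not identical, lines.)

    class Cell:
        def __init__(self, contents):
            self.contents = contents

    def equalvalue(a, b):
        return a.contents == b.contents   # do the two cells hold the same value?

    def hyperequal(a, b):
        return a is b                     # are they one and the same cell (aliased)?

    y = Cell(3)
    x = Cell(y.contents)                  # "let x be y": x copies the value y now holds
    w = y                                 # "let w alias y": w just is y's cell

    print((hyperequal(y, w), hyperequal(y, x)))   # (True, False)
    print((equalvalue(y, w), equalvalue(y, x)))   # (True, True)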
However, this equivalence does not survive arbitrary mutations. If we say:

    ... change a to 2 then ...

but leave b alone, then alpha and beta will have different extensions. In such a case, the two values alpha and beta might be regarded as (sometimes) equivalent but not numerically identical or "pointer identical." (This isn't a standard example of this contrast; normally the notion of equivalence isn't defined for function closures. But it's an example that builds only on resources we've explained here.)

As the appearance of mutation and passing by reference in this example suggests, the contrast between equivalence and pointer identity is closely related to our contrast between equalvalue? and hyperequal? Consider the previous example extended like this:

    ... let gamma be alpha in
        let delta alias alpha in ...

Here gamma and alpha *are* pointer identical: mutations to a won't disrupt the extensional equivalence of gamma and alpha. However, gamma and alpha aren't aliased or hyperequal. If we go on to mutate one of *those variables*, it does not affect the other:

    ... change gamma to 0 then ...

On the other hand, delta and alpha are aliased or hyperequal (and so as a result are also pointer identical). End of comment]

Section 9
---------

TODO: Summarize and connect these issues where appropriate to Karttunen 1976, work deriving from Kamp and Heim, Landman's "pegs", Vermeulen's "stack semantics" for DPL, papers by Dekker, Aloni, de Bruijn, Haas-Spohn, Muskens. Other references welcome. Keep close track of when a device is needed, or used for, (i) donkey-like referential dependence; (ii) failed or confused reference (whether one-many or many-one); (iii) not just (i) and (ii) but also coordination/hyper-evaluativity.

Summarize and connect to Fiengo & May's 1994 and 2006 books.

Forbes, "The indispensability of Sinn", Phil Review 99 (1990), 535-63 proposes that the sense of a name is just "the subject of THIS mental dossier (the one hereby being employed)." Recanati's more attractive idea is that we think *with* dossiers, rather than by representing (even in sense) anything *about* them. I'm sympathetic with Fine's claim (pp. 67-8) that "mental files" are better understood as a book-keeping device for *tracking* facts about coordinated contents than as explanations of how coordination is achieved or what it consists in. Fine is discussed further below.

Re Cumming: I emphasized that what's going on when we have mutation is importantly different than mere binding, of the sort we have in lines like "let y be 3". Mutation might usefully be understood as shifting an assignment: in the way that happens, for example, in dynamic semantic treatments of indefinites or tense. I mentioned Cumming's paper "Variabilism" 2008 earlier. One main claim of that paper is that names are like variables in being bindable. Another main claim is that epistemic operators should be understood as assignment-shifting terms. Under the scope of "Biron believes", we should shift the assignments of names to match Biron's doxastic outlook. I think this is best understood as simultaneous shadowing/rebinding of many names at once, rather than as mutation, but I'm not certain. In any event, as I hinted at before, I don't think Cumming's assignment-shifting strategy suffices to explain the phenomena he's looking at. In the same way that Maria can truthfully report:

(33) Biron thinks Katherine is Rosaline.

I think others can truthfully report:

(34) Each girl fooled Biron, so he didn't know she was she.
and in this case, so long as both feminine pronouns are bound by "each girl", there doesn't seem to be any opportunity to interpret the complement clause of "didn't know" in a way that assigns the pronouns different values. We need some account of what's going on in (34), like a hyper-evaluative account, and I'm thinking the machinery that explains (34) can explain (33) too. Section 10 ---------- Let's survey different ways a semantics might encode or keep track of coordination information. The first strategy is the "proxy semantics" we already mentioned. This strategy is dominant in CS and in many linguistic accounts. The proxies are related many-one to entities in the domain of quantification; and instead of assigning entities directly to our variables, we interpose the proxies. Hence we track coordination via a kind of "indirect reference." (This is importantly different from what Fregeans mean by the same term.) Two variables are coordinated when they're associated with one and the same proxy; they're coreferential when the proxies they're associated with (whether identical or not) are associated with one and the same object from the domain. In some treatments, these proxies are the variables themselves, in others they are integers, or indices into an array, and so on. (See King < 2007, "pegs", and so on.) Various views floated in the philosophical literature have at least this much structure, and so can be seen as instantiating this implementation strategy. (Richard's 1990 "Russellian Annotated Matrices," Larson and Ludlow's 1993 "Interpreted Logical Forms," and so on.) One source of discomfort with those strategies is that the choice of indices can seem arbitrary. This can be alleviated in various ways, for example by working with equivalence classes, but the discomfort is there. Fine 2007, p.11 and 27 also complains that views of this sort make semantic values too *typographic*. I share that discomfort, but in the end I doubt his own view can retain the moral high ground about this. I think he's going to end up vulnerable to similar discomforts. More about this later. In fact, I regard the difficulty here as an instance of a more general problem, akin to (or perhaps a form of) Benacerraf's Problem. This first bothered me intensely when thinking about nodes in graph theory. A node is not the same thing as its label---sometimes we work with unlabeled graphs, or with graphs where numerically the same object labels multiple nodes. Rather a node is something whose whole nature intuitively should be exhausted by its role in organizing how different edges relate to each other (and similar things should be said about edges). We'd like to say that there's no more to the root node of this directed graph: * -----> * -----> * | v * than its numerical differences from the other nodes in the graph, and its position in the graph. Questions about its identity or difference to nodes in other graphs, or about whether it's red or prime, should just have no meaning---or perhaps it should be different from nodes in any other graph, and lack all properties like redness and primeness. Some of the intuitions here may be negotiable. For example, I'd like to say that one and the same graph depicted above *is* a subgraph of different larger graphs, not just that it's isomorphic to parts of the larger graphs. Maybe it's not possible to consistently say such things. But if it were possible, that's what I'd like to say. 
But now if we look at a standard presentation of graph theory, we'll be told that a graph is a tuple of a set of nodes, which may be numbers or apples or anything you like, it doesn't matter, together with an edge relation on those nodes with such-and-such properties, a labeling function, and so on. A construction where the nodes may be apples is very different from the intuitive conception. If we were to pursue a proxy-like semantics for a hyper-evaluative language, what we'd really want would be for the proxies to be something like our intuitive conception of nodes. What we get instead are implementations like the standard set-theoretic implementation of graph theory. That's one source of discomfort with a proxy-like semantics. I think much can be done to address this discomfort, and make the choice of proxies more natural. (For example, de Bruijn indexing of the sort we use in Section 13 helps with this.) Another source of discomfort is the thought: why should I understand these sentences: (90a) Alice is as smart as Alice (91a) Superman is stronger than Clark Kent as interpreted by these semantics, to be saying things directly about Alice and Superman, albeit in a coordination-sensitive way. Why shouldn't I understand them instead as saying: (90b) A is somehow related to something that's as smart as what A is so related to [where A is the "Alice"-proxy]. (91b) S is somehow related to something that's stronger than what CK is so related to [where S and CK are the relevant proxies]. I'm not saying the semantics conflates the (a) and (b) sentences. It would give those object-level sentences different truth-conditions. [In CS terminology, the proxies are "denoted values" but need not be "expressed values." (Compare our contrast in Section 3 between "semantic values" and "referents.") And when proxies are expressed values, as in (90b), that won't mean the same as sentences like (90a) in which they're merely denoted.] But these theories do make such worries about the meanings they describe quite salient. At root these may just be familiar indeterminacy worries---and perhaps it is an adequate response to them to say we *just have* an intuitive understanding of saying things about Alice in a coordinated way, and this is how we mathematically model the entailment relations and so on of the intuitively-understood meanings. Still, when we're trying to make intelligible a representational phenomenon that's not part of the canonical theoretical toolbox, it would be nice not to confront this sort of worry so vividly. [Jeff: your 2007 view improves on the < 2007 view by seeming less typographical. But downsides: 1. you no longer have any prospect of links across propositions, whereas the earlier view might have 2. is your approach subject to the same constraints as the Salmon/Soames view criticized in Section 4? it'd be nice to get some idea how you'd introduce linking when there's donkey anaphora 3. maybe you're best understood as having moved from one sort of proxy (the variables themselves) to another (whatever constitutes the nodes in the trees) 4. the semantics isn't specified enough yet to code it, and I'm not sure it's going to be straightforward how to finish the job. In what way exactly is the semantic value of "Fxy v Gy" a function of the semantic value of "Fxy", the semantic value of "Gy", and the pattern of variables "x,y,x" (or whatever the pattern is..."Fxy" need not be atomic)? Do the first two arguments play a genuine role? How does it work? 
I think you'll likely end up pursuing one of the other strategies described here. Not that that's a problem; I'm just trying to get clear about where you stand now. ] My own current preference is for a semantics that takes a second strategy. I'll call the second strategy a "grouped assignment function" strategy. I'll explain this strategy in detail in Section 13. It rests on two basic ideas: first, instead of thinking of semantic values as relative to assignment functions, we can equally think of them as sets of assignment functions (the ones on which a sentence is true). Second, we can work with a finer-grained sort of assignment function. Instead of merely mapping a variable to an object, an assignment function will also group that variable together with other variables that are mapped to the same object. So instead of looking like this: w --> Jack x --> Jack y --> Jack z --> Alice our assignment functions will instead look like this: {x} --> Jack {w, y} --> Jack {z} --> Alice Here we don't assign any intermediate proxies to our variables. Instead, the assignment function's groupings do similar work. Mathematically, this is not *all* that different from the first strategy. It will still be vulnerable to Fine's typographical complaint. (At least, if we construe functions in the standard way as sets of pairs. Perhaps we shouldn't do that.) Sometimes I think it does better on the arbitrary proxy worry. (Though other times it doesn't look much different to me than the view where variables are their own proxies.) It doesn't invite talk of "indirect reference"; so the worries about (90a) vs (90b) don't arise. This semantic strategy seems no less nor more vulnerable to indeterminacy worries than any ordinary semantics. A third strategy for a coordinated semantics is based on the "combinatorial" or "variable-free" semantics explored by Bealer, Jacobson, Szabolcsi, and others. The basic idea behind combinatorial logic is that variables disappear on analysis. Instead of "^x: Fx", we have just the predicate "F", whose semantic value may be a function from objects to truth-values. Instead of "^x: ^y: Fyx" we have just a predicate "C F" whose semantic value is the application of a function or "combinator" C to the function that's the value of "F". What C does is invert the order of the arguments supplied to the function value of "F". It's possible to do combinatorial logic with a very spare inventory of primitive combinators, and define others like C in terms of them. The standard primitive combinators are S and K. It's not important what these mean. I'll just observe that we should be able to turn any standard combinatorial semantics into a coordinated semantics by replacing S with different versions, some of which are understood as introducing coordination, and others not, and some as retaining existing coordinations, and others not. There never are any *variables* whose occurrences get coordinated, but we instead operate on coordinated and uncoordinated argument places in the same way that a standard combinatorial semantics does. But this is only an expectation, uninformed by any attempt to work it out. I find these approaches very interesting, but won't pursue them further here. (Fine discusses these approaches at 2007, pp. 18-21.) A fourth strategy is the "relational semantics" Fine sketches in 2007. We'll discuss this next. Section 11 ---------- I understand Fine 2007 to have three thematic components. 
The first component says things like:

* contents should *somehow* reflect coordination relations
* we should keep track of such relations across bodies of contents, too (pp. 55-6, 77-8)
* they should feed into some useful notion of consequence
* they can do much of the work that Fregean senses were supposed to do (and do it better)

There's little I disagree with about any of this. And I think other advocates of hyper-evaluative or coordinated semantics can, and will want to, subscribe to much of this too.

A second component is various ideas in the metaphysics and epistemology of semantics, such as his distinction between semantic requirements and semantic facts, what he says about transparency, his ideas about how intersubjective coordination should be understood. I'm sympathetic to much of this as well; I separate it into a second component because I think these claims may be more negotiable for an advocate of hyper-evaluative semantics.

The last component is the details of Fine's semantic implementation. This is what I understand his talk of "semantic relationism" to specifically mean. Fine argues that semantic features shouldn't be understood as built up out of any elements assigned to individual terms---not objects from the domains, nor proxy objects either (though his resistance to such approaches is left implicit). Or anything else of that sort. Instead, they should be understood as only being functions of *sequences or patterns* of terms.

Fine gives only a few hints about how this should go. He sketches semantics for a fragment of an extensional language on pp. 25-31. On pp. 53-7 he introduces some changes: (i) now names are in the language; and (ii) now the semantics generates structured contents rather than just extensions. The rest of the semantics is left an exercise for the reader. And in a way that's very satisfying. But at the same time, it leaves me unsure whether I'm understanding even such elementary matters as the binary connectives in the way he intends. Well, this should at least be *a* plausible way to understand him:

Orthodox semantics computes the semantic value of a complex formula "Fxy & Gy" in the following sort of way.

    [[ Fxy & Gy ]]
        depends in a certain way on each of
    [[ Fxy ]] and [[ Gy ]]    <-- recursive bases

As we recurse towards the base clauses, we work with smaller and smaller elements. We just have to apply base clauses to several such smaller elements. Fine's innovation is to do the recursion differently. Instead it will look like this:

    [[ Fxy & Gy ]] relative to @    <-- @ is a pattern that encodes which occurrences of the
                                        same variable are coordinated (quantificational binding
                                        has the effect that not all of them will be; but for
                                        present purposes, I'll suppress this)
        depends in a certain way on
    [[ Fxy, Gy ]]    <-- at this step we calculate the semantic value of a *sequence*; we're
                         not calculating the semantic values of several elements individually
        depends in a certain way on either of
    [[ Fxy, G, y ]] or [[ F, x, y, Gy ]]    <-- here we have a choice of which way to continue
                                                the computation, both of which deliver the same
                                                final result; contrast the orthodox semantics,
                                                where we required *each* of two computations
        which depend (in different ways) on
    [[ F, x, y, G, y ]]    <-- recursive base

At the recursive base, [[ F, x, y, G, y ]] will be a set of sequences of property extensions (let them be functions to truth-values) and objects. This set will be constrained by the coordination between the two occurrences of y: every sequence in the set must have the same object in the two positions.

An earlier step in the computation would look like this. Let's take the way that [[ Fxy, Gy ]] depends on [[ F, x, y, Gy ]]. That would come from a derived rule like this:

    [[ Fxy, Gy ]] = { <a, b> | exists f,a1,a2: a = f(a1,a2)
                               and <f, a1, a2, b> in [[ F, x, y, Gy ]] }

Note in such clauses we only need to supply *the objects* a1, a2 to the function f. This semantics is extensional. Differences in coordination are tracked in the semantic machinery, but no base predicate extensions depend on them.
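For concreteness, here is one toy rendering of that derived rule (my own modelling, not Fine's): a sequence-value is just a list of admissible tuples, and the base value [[ F, x, y, Gy ]] is stipulated by hand for a one-object domain.

    # [[ Fxy, Gy ]] = { <a, b> | exists f,a1,a2: a = f(a1,a2)
    #                            and <f, a1, a2, b> in [[ F, x, y, Gy ]] }
    def derive(base):
        # base holds tuples (f, a1, a2, b); the result holds tuples (f(a1,a2), b).
        # Only the objects a1, a2 are supplied to the extension f.
        return [(f(a1, a2), b) for (f, a1, a2, b) in base]

    cicero = "Cicero"
    praised = lambda a1, a2: (a1, a2) == (cicero, cicero)    # a toy extension for F
    # One admissible sequence for [[ F, x, y, Gy ]]: x and y both assigned Cicero,
    # and Gy already evaluated to True.
    base = [(praised, cicero, cicero, True)]
    print(derive(base))    # -> [(True, True)]

Nothing in this computation consults the coordination scheme; the extension praised sees only the objects it is given.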
An idea fundamental to Fine's whole book, though, is that this won't always be so. Some predicates like "believes" (or functors like "the proposition that") will be hyper-evaluative. They will need to differ in their extensions when supplied with differently-coordinated arguments.

It surprises me that Fine does so little to make this explicit. He does say on p. 57 that the coordination scheme now enters into semantic computations in a way it didn't before. But the discussion there makes it look like its only role is to determine what coordination links get into the structured proposition being generated. Also, he says: "There is no difference in what it takes for the sentences 'Cicero wrote about Cicero' and 'Cicero wrote about Tully' to be true, even though there is a difference in their coordinated content." (p. 59) without saying explicitly that for other sentences that is not the case. Some things we say are hyper-evaluative, and so their extension will be sensitive to coordination differences. (Acknowledging this will put some pressure on us to abandon talk of a sentence's "truth-conditions." In the general case, a predication's truth won't just depend on what its arguments are and what extra-semantic *conditions* those objects satisfy. It may depend also on how those arguments were coordinated in the predication.) Fine comes close to saying that "believes" is hyper-evaluative at p. 139n11. As I said, though, this is clearly fundamental to, and implicit in, most of his book.

[Notes:

1. Fine introduces an interesting innovation on pp. 116-17 without making explicit how it will fundamentally change the semantics hinted at earlier in the book. The idea is that 'believes[...]' can't be evaluated on its own, but "only in the context of other formulas with which it might be coordinated." That is, satisfying the open formula:

(92) x believes Fa ... x believes Ga

requires a subject to have *coordinated beliefs* about a's being F and a's also being G. What I think this demands of the semantics is that some of the semantic computations operate on multiple complex terms in a sequence (terms that aren't always adjacent) simultaneously. Though I find this idea interesting, on balance I'm disinclined to pursue it. Consider an attribution like:

(93) Each of Sam and Bess believes Fa, and Sam also believes Ga.

with the two occurrences of 'a' coordinated. Presumably the first belief-predicate "believes Fa" only occurs once, and if its semantic computation is coordinated with the computation of "believes Ga," then for Bess to satisfy the first predicate, the way she thinks of a will have to be intersubjectively coordinated with the way Sam (or she herself?) believes a is G. (Just as Fine says needs to be the case for Sam to satisfy the first predicate.) But I don't see why that must be so. Suppose Bess has a single name for a, and does not herself believe a is G.
Her name is also intersubjectively coordinated with only one of Sam's two names for a, call it a1. Sam believes a is F with both of the names she has for a, but only believes a is G with her a2 name. In this case, I'd still expect the report (93) to be true, but it looks like Fine's strategy would preclude that.

2. Even for belief reports taken singly, rather than as part of a sequence, Fine is inclined to think they should be sensitive to more than just their own arguments and coordination among them. They should also depend on coordination relations to other attitudes not part of the present conversation (pp. 120-1). I don't know whether I agree with this. But if it's true, I observe it's easier for me to imagine how it might be implemented on, for example, a proxy semantics, than on Fine's own relationist semantics.

3. On p. 139n13, Fine says approvingly that the CS notion of "pointers may...[be] essential to the correct representation of logical form." But as we mentioned, pointers are always implemented via proxies; so they look like an alternative to Fine's relationist semantics, not a vindication of it. Perhaps what he means here would be better put by saying that some ground shared between different proposals about the right semantics for hyper-evaluativity is what's essential. With this I wholeheartedly agree.

4. Trivia: For Fine, different occurrences of the same variable will always be coordinated if they're simultaneously free. I say not. Also I think Fine never has different variables being coordinated. I will do so. However, he never gave an account of how to handle lambda abstraction. If he did, he might well go the same way I do. Fine does allow typographically distinct expressions that aren't variables to be coordinated through anaphora ("John...he"); and different occurrences of a single name may or may not be coordinated.

End of notes]

Section 12
----------

Quibbles about some details aside, I don't really want to oppose Fine's relationist semantics. It's not my preferred machinery; but I'm not sure *which* of the different ways to implement hyper-evaluativity will in the end prove most satisfying. And as I said before, much of his book will appeal to *any* advocate of hyper-evaluativity. I only have two serious worries about Fine's discussion.

The first worry concerns Fine's arguments that no non-relationist semantics can properly account for his "antinomy of the variable." This is the challenge to explain how the semantic roles of x and y can be the same in "x>0" and "y>0", but different in "x>x" and "x>y". In these arguments Fine says things like this:

"The aim of [the orthodox] semantics...is to assign a semantic value to each (meaningful) expression of the language under consideration. Suppose that an expression E is syntactically derived from the simpler expressions E1,E2...En. Then the semantic value |E| of E is taken to be the appropriate function f(|E1|, |E2|...|En|) of the semantic values of the simpler expressions. Given semantic values for the lexical items of the language...the semantic value of each expression is then determined." (p. 25)

The expression "the appropriate function" here is doing a lot of work. For of course we don't want "Rab" to come out having the same semantic value as "Rba." It matters not only *which* items are being combined, but also *how* they are being combined. That will determine which function of those semantic values of "R" and "a" and "b" we work with.
And now it's not obvious to me what is allowed to count as a different way of combining and what is not. I don't take it as a given, for example, that the only options for "different ways of combining" are different permutations of those three semantic values to a single triadic function. Why should an orthodox semanticist be barred from saying that the way ">" combines with the semantic values of its arguments in "x>x" is different than the way it combines with the semantic values of its arguments in "x>y"? King endorses something like this in his 2007 p. 220n3 (with acknowledgment to discussions he and I had about it). What he says doesn't get developed, and maybe it's a move the orthodox semanticist shouldn't be allowed. But I'm not sure why that should be so. Or perhaps, when the orthodox semanticist goes down this path, what he ends up will just be something like Fine's view, perhaps in other clothing. I don't know. I'd like to know. My second worry has to do with Fine's account of the essential role coordination plays in learning. He writes: "We wish to explain how the hearer might be justified in inferring that Cicero is a Roman orator when he already knows that Cicero is Roman and is told 'Cicero is an orator,' though not when he is told 'Tully is an orator'... [I]n the first case, the proposition is not merely added to [the hearer's information] base but appropriately coordinated with the propositions in it---and, in particular, with the proposition that Cicero is Roman. But in the second case, the proposition is not coordinated with the other propositions in the base... It is evident that the inference to Cicero being a Roman orator will be justified in the first case, when the premises are coordinated, though not in the second case, when the premises remain uncoordinated." (p. 83) Now, consider: What *makes it* appropriate to coordinate incoming propositions with ones already in the base in one way rather than another? Is it up to the hearer to coordinate them however which way? Or is it a cognitive given that the propositions should be coordinated one way rather than another? What metaphysical picture should we have of this? I expect *there is* a difference between the cognitive experience of hearing incoming information coordinated one way with what you already believe, and hearing it coordinated another. But are we supposed to take that cognitive experience as a primitive explainer? I'd expect, instead, that we should have some picture of what *makes it correct* for the subject to hear information coordinated the way he does. Then on top of that we'd discuss justified mistakes, and so on. Maybe Fine has a picture of this in mind. What I'm expecting is a story that will look something like this: in addition to the set of coordinated beliefs in a subject's base, there are also some facts about the subject that amount to extra book-keeping machinery. For example, perhaps the sentences he originally acquired those beliefs by accepting (though so much detail won't be available in general). This extra book-keeping machinery might then ground the interpretation of an expanded sequence, consisting of sentences for his existing beliefs, together with the incoming sentence. And now how it's appropriate for his new belief to be coordinated with the old ones will derive from how the semantics interprets that expanded sequence. This very much fits the spirit of Fine's overall relationism. 
Alternatively, perhaps the book-keeping machinery will serve to ground facts about how the incoming sentence is inter-subjectively coordinated with attitudes had by the subject's interlocutors. In general, though, we want an account of learning that's available without other subjects. I'm not sure that extra "book-keeping machinery" will really be needed, but I expect it will. And if it is, it becomes a delicate question exactly what advantages Fine's approach still has over the "typographic" approaches he rejected. Won't the book-keeping machinery be a kind of mental typography? This is my second worry. It's just a worry. There are lots of choices to be made in developing what Fine says. But at any rate, this is what I meant earlier when I said I had doubts whether Fine's own view will retain the moral high ground about entanglements with typography. [Hovda, "Semantics as information about semantic values" mentions a worry like this as well (manuscript p. 8).] Section 13 ---------- Now I'll sketch my own preferred formalism and semantics. As I said earlier, I think it's quite open what is the philosophically most satisfying way to do this and I'm still studying the issue. (The semantics below are also rapidly evolving.) I'll think only about coordination of variables. For semantic purposes, I'm inclined to treat names as just variables which may (or may not) have extra constraints on their assignments. I think this formal strategy is neutral on questions about whether names have important semantic differences from variables, or whether names can be bound, or can have their assignments shifted by operators, as Cumming 2008 argues. I'll ignore context-sensitivity; it's orthogonal to the issues we're considering. A notational decision I've made is to make passing by reference be the default, and require a special symbol to express passing by value (and in so passing, breaking coordination). Let's use the prefix "$" for this. There's a choice where to put this symbol. Should we say: (lambda $x: BODY) y Or should we instead say: (lambda x: BODY) $y I've chosen to go the latter way. When an argument is supplied to a lambda abstract without $, its coordinations get carried along with it. For example, in: (94) Fx v (lambda z: Gz)x the argument G is supplied will still be coordinated with the argument F is supplied. This means that: (95) Fx v (lambda z: Hxz)x is not equivalent to: (96) Fx v (exists z: z=x & Hxz) If H is hyper-evaluative, it sees coordination between its arguments in (95), but not in (96). What (96) is equivalent to is: (97) Fx v (lambda z: Hxz)$x When a complex expression is supplied as an argument to a lambda abstract, the value passed in is not coordinated with any other terms, even ones happening to have the same value. We can also use the $ operation with atomic predications: F($x, $y, y) $x = y Although in many cases, such as the second, it will never make an extensional difference. I use "=" to express the familiar, fully evaluative relation of numerical identity. [Comments: 1. Where E is a formula with a single occurrence of y, which is both: (i) free, and (ii) wouldn't capture x, that is, it doesn't occur in any term of the form "lambda x: F", "exists x: F", or "every x: F", E[y->$x] should turn out equivalent to "exists y: y=x & E". When we see an expression of the form "...F($x,...)..." there will be a question about what "scope" to let the $x expand to: that is, there will be several Es such that that formula counts as an E[y->$x]. 
We adopt the convention of always interpreting that with the largest context E in which x occurs free. So:

(98) exists x: Fx v G$x

is:

(99) exists x: (Fx v Gy)[y->$x]

rather than:

(100) exists x: Fx v (Gy)[y->$x]

To write (100), we'd need to use instead:

(101) exists x: Fx v (lambda z: G$z)x

2. The $ notation isn't able by itself to express coordination-breaking operations in full generality. Using subscripts to represent coordination, we might sometimes want to say:

(102) exists x: F(x1,x1,x1,x2,x2)

Here all five variables are referentially bound by the same quantifier, but they've been broken up into two coordination groups. To do this with my notation, one needs to combine $ and lambdas, for example:

(103) exists x: (lambda z: F(x,x,x,z,z))$x

3. One doesn't want to evaluate (lambda y: ...)x substitutionally. For consider:

    (lambda y: Hxyz v (exists x: Rxy))x
               ----------------------

and suppose H or R may be hyper-evaluative. Let BODY be the underlined formula. We can't just evaluate BODY[substitute x for free ys], because then the second argument to R will be captured by the existential quantifier. We can't just evaluate BODY[substitute y for free x], assigning y the same value as x, because then we might lose track of coordinations between z and x. (The whole formula may be embedded inside another (lambda z: ...)x.) The best way to handle this is to evaluate BODY directly, but in a way that introduces a new coordination between x and BODY's ys.

end comments]

SYNTAX

* Atomic predicates F,G,H,... of adicity >0
* Variables w,x,y,z,...
* If F is a predicate of adicity n, and x1..xn are variables, then Fx1..xn is a sentence, that is, a formula with adicity 0.
* If x and y are variables, then (x = y) and (x hyperequal y) are sentences.
* Anywhere a variable x can appear, so too can $x. (Iterations are not allowed.)
* If E1 and E2 are sentences, then so too are (not E1) and (E1 or E2).
* If E is a sentence, and x1..xn are variables (free in E?), then (lambda x1..xn: E) is a predicate of adicity n where x1..xn are no longer free.
* If E is a sentence, and x a variable (free in E?), then (exists x: E) is a sentence where x is no longer free.

(E1 & E2), (E1 horseshoe E2), (every v: E), and so on can be defined in the usual way.

We'll introduce the semantics in stages. First, we'll give a semantics for an orthodox, uncoordinated and extensional language (lacking $ and hyperequal). Then we'll discuss how to extend it to a hyper-evaluative language.

STACKS

de Bruijn 1978 had the idea to eliminate variable names as follows. The "lexical depth" of an occurrence of a bound variable is how many scope levels that occurrence is away from the operator that binds it. He counts the most local scope as 0, the next closest outer scope as 1, and so on. In this example:

(104) ^y: (^x: y x) y
               1 0  0

the lexical depths are indicated below the variables. Note that y has a lexical depth of 0 in its second occurrence, but a depth of 1 in its first occurrence, because it's there more deeply embedded (inside the "^x:..." term). Using this technique, we could eliminate arbitrary variable names and just replace every variable-occurrence with an indication of its depth from the operator it's bound by:

(105) ^< >: (^< >: <1> <0>) <0>

This has some computational and metalogical advantages. We're going to draw from this technique. Doing so isn't *necessary* for a hyper-evaluative semantics, but it makes some things cleaner.
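As a quick illustration of de Bruijn's technique (my own sketch, with terms modelled as nested Python tuples; nothing in the formalism below depends on these details):

    # ('lam', v, body) is ^v: body; ('app', fun, arg) is application; a bare
    # string is a variable occurrence. Bound occurrences get replaced by their
    # lexical depth, counted as de Bruijn counts it: 0 for the most local scope.
    def debruijn(term, binders=()):
        if isinstance(term, str):                    # a variable occurrence
            return binders.index(term) if term in binders else term
        if term[0] == 'lam':
            _, v, body = term
            return ('lam', debruijn(body, (v,) + binders))   # the binder needs no name now
        _, fun, arg = term                           # an application
        return ('app', debruijn(fun, binders), debruijn(arg, binders))

    # (104)  ^y: (^x: y x) y
    ex104 = ('lam', 'y', ('app', ('lam', 'x', ('app', 'y', 'x')), 'y'))
    print(debruijn(ex104))
    # -> ('lam', ('app', ('lam', ('app', 1, 0)), 0)), i.e. (105)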
We won't require the object language to be written in form (105); instead we'll keep track in a "binding stack" of which variable symbols have been bound by which operators. We'll also have an "environment stack" that maps bound variables, specified by their lexical depth, into the domain. Variables that are never bound will be handled in the familiar way (these can be thought of as constants). These two stacks divide up the work of a standard assignment function into two components: first, mapping variable symbols to lexical depths; second, mapping lexical depths (and unbound variables) into the domain. When we move to doing hyper-evaluative semantics, this division of labor will be useful. We'll complicate the environment stack while leaving the binding stack the same.

Let's settle some general issues about our stacks. We'll understand a stack of length n into X to be a function from 0..n-1 into X. If s is such a stack, we'll let #s indicate the length of s. We'll let s[0] be the element of X that s maps 0 to, and so on. It will be useful to refer to stack indices backwards from the end; so where #s = n, we'll let s[-1] be s[n-1], s[-2] be s[n-2] and so on. It will be convenient for us to number lexical depths differently than de Bruijn does. We'll use -1 for the most local scope, -2 for the next one, and so on.

Since the environment stack needs to map into the domain not just the lexical depths of bound variables, but also free variables, we'll have the environment stack be a function from the natural numbers U the variables into the domain. I assume that no variables are natural numbers. (If they might be, then the domain of the environment stack should be a "tagged" or disjoint union of them, but I won't bother with that.) The length of the environment stack will still just be the number of integers it's defined on; the free variables don't affect its length.

If s is a stack of length n into X, and x is a member of X, then "s push x" will be the stack of length n+1 that maps #s to x and is otherwise just like s. So:

    (s push x)[-1]     = (s push x)[n]   = x
    (s push x)[-2]     = (s push x)[n-1] = s[n-1] = s[-1]
    ...
    (s push x)[-(n+1)] = (s push x)[0]   = s[0]   = s[-n]

And "s ? x" will be the first of the indices -1, -2, ..., -n such that s[s ? x] = x, or x if there's no such index. That is, s ? x searches backwards from the end of the stack to return the first (negative) index mapped to x. The push and ? operations are blind to whether a stack also maps variables or anything other than integer indices into X. Finally, if <i, j, v, ...> is a sequence of integer indices and/or variables, and e is an environment stack, we'll understand e@<i, j, v, ...> to be the sequence <e[i], e[j], e[v], ...>.
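Here is a small Python sketch of these stack operations (the modelling is mine: a stack is a list, the free variables live in a separate dict rather than in the same function, and lookup and at stand in for ? and @):

    def push(s, x):
        # "s push x": a stack one longer, with x on top.
        return s + [x]

    def lookup(s, x):
        # "s ? x": search backwards; return the first negative index mapped to x,
        # or x itself if there is none (i.e. x is free).
        for i in range(-1, -len(s) - 1, -1):
            if s[i] == x:
                return i
        return x

    def at(e, free, indices):
        # "e@<i, j, v, ...>": map each index (or free variable) into the domain.
        return tuple(e[i] if isinstance(i, int) else free[i] for i in indices)

    b = push(push([], 'x'), 'y')              # binding stack: x, then y on top
    e = push(push([], 'Alice'), 'Betty')      # environment stack, pushed in the same order
    print(lookup(b, 'y'), lookup(b, 'x'), lookup(b, 'z'))    # -> -1 -2 z
    print(at(e, {'z': 'Jack'}, [lookup(b, v) for v in ('x', 'y', 'z')]))
    # -> ('Alice', 'Betty', 'Jack')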
ORTHODOX SEMANTICS

A model M is a pair of a domain D and a lexicon L. The lexicon maps atomic predicates of arity n > 0 into sets of n-sequences of members of D. b will be a binding stack from integer indices into the set of variables. e will be an environment stack from integer indices U free variables into D.

1. Where F is an atomic predicate, [[ F ]] wrt M,b,e = L(F). The interpretation of "=" is the expected one.

2. Where F is an n-ary predicate, atomic or not, and x1..xn are variables, [[ Fx1..xn ]] wrt M,b,e will be:

    true if e@<b?x1, ..., b?xn> is in [[ F ]] wrt M,b,e
    else false

What's happening here is: we lookup each variable in the current binding stack b to see if it's recorded in the environment by its lexical depth (i.e., it's bound by some surrounding operator), or whether it's free. We map the environment onto the sequence of indices to get a sequence of objects from the domain, and check whether they are in the set which is the interpretation of the predicate.

3. Where E1 and E2 are sentences, [[ E1 v E2 ]] wrt M,b,e is:

    true if either [[ E1 ]] wrt M,b,e or [[ E2 ]] wrt M,b,e are true
    else false

Similarly for [[ not E ]].

4. Where E is a sentence, [[ exists x: E ]] wrt M,b,e is:

    true if some d in D is such that [[ E ]] wrt M, (b push 'x'), (e push d) is true
    else false

5. Where E is a sentence and x1..xn are variables (free in E?), [[ lambda x1..xn: E ]] wrt M,b,e is:

    { <d1, ..., dn> | d1, ..., dn in D and [[ E ]] wrt M, ((b push 'x1') ... push 'xn'), ((e push d1) ... push dn) is true }

This language is first-order; we don't permit any operations on n-ary predicates except supplying them with n arguments. So x1..xn are bound by a single compound lambda operator; expressions like "(lambda x: (lambda y: Fxy))" aren't well-formed. Nonetheless, we still count 'xn' as having lexical depth -1 and 'x1' as having lexical depth -n.

If E is a sentence and A is an assignment of E's free variables into D, then E counts as true on M and A just in case: [[ E ]] wrt M, a length 0 binding stack, and A taken as a length 0 environment stack is true.

HYPER-EVALUATIVE SEMANTICS

Now we let L map atomic predicates into sets of *grouped* n-sequences, and our environment will *group* the lexical depth indices and free variables that it maps into D. We'll have something like this:

    {-1}    --> Alice
    {-2, y} --> Jack
    {x}     --> Jack

A grouped function (GF) from set A into set B can be understood as a pair of a function f from A into B and an equivalence relation ~ on A, such that for any a1,a2 in A, a1~a2 implies f(a1)=f(a2). We'll call the set of As equivalent to a1 under ~ the GF's GROUPING of a1. If C is a GF from A into B and C- is a GF from A- into B-, we'll say that C- RESTRICTS C just in case (i) A-,B- are subsets of A,B respectively; (ii) for any a in A-, C- and C assign a the same element of B-; and (iii) for any a1,a2 in A-, a1 and a2 are grouped together by C- iff they're grouped together by C.

Our binding stack will be the same kind of ungrouped stack as before, and ? and push will work the same for it. With our new grouping environments, pushing operations raise the question of how the index of a newly pushed value should be grouped with existing elements in the stack's domain. We'll define "e push ungrouped x" to be the result of pushing x to the next index #e in e, such that #e isn't grouped together with any existing elements. We'll define "e push [i]" to be the result of pushing the existing value e[i] to the next index #e in e, such that #e is grouped together with i's existing group. And we'll define "e push *" to be the set of all possible pushings of an entity from D to the next index #e in e, with all legal groupings of #e together with existing groupings. That is, "e push *" is the set of all length #e+1 environment stacks that e restricts. Finally, we'll redefine e@<i, j, v, ...> so that it's now ((e push [i]) push [j]) push [v]...
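To fix ideas, here is one toy Python modelling of grouping environments and the first two pushing operations (the integer group labels and the names are mine; "e push *", which enumerates every legal grouping of the new index, is left out):

    # An environment is a pair: a list of values (indexable as before) and a
    # parallel list of group labels; two positions are grouped iff their labels match.
    def push_ungrouped(e, d):
        values, groups = e
        return values + [d], groups + [max(groups, default=0) + 1]   # a brand-new group

    def push_index(e, i):
        # "e push [i]": copy the existing value at i, grouped together with i's group.
        values, groups = e
        return values + [values[i]], groups + [groups[i]]

    def grouped(e, i, j):
        return e[1][i] == e[1][j]

    e = ([], [])
    e = push_ungrouped(e, 'Jack')     # position 0: Jack, in its own group
    e = push_index(e, 0)              # position 1: Jack, grouped with position 0
    e = push_ungrouped(e, 'Jack')     # position 2: Jack again, but in a new group
    print(grouped(e, 0, 1), grouped(e, 0, 2))    # -> True False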
Here's why we need to do this. Consider the formula:

    lambda y: Hxy

This is a 1-ary predicate. But since H may be hyper-evaluative, this predicate's interpretation needs to attend to whether what's being passed to y is or isn't grouped with x. For instance:

    (lambda y: hyperequal? x y)x

should be true, but:

    (lambda y: hyperequal? x y)z

should be false when z and x are ungrouped, even if z has the same value as x. This should also always be false:

    (lambda y: hyperequal? x y)$x

What this means is that we can't let the semantic value of:

    lambda y: Hxy

just be a set of 1-sequences of values for y. It has to be something more holistic. We're letting its semantic value instead be a set of grouping environments, defined over free variables like x as well as the lexical depth of y. In particular, the semantic value of:

    (lambda y: hyperequal? x y)

will be the set of environments that group x and -1.

Now our semantics goes as follows:

1. Where F is an atomic predicate, [[ F ]] wrt M,b,e = L(F), which will now be a set of grouping environments of length n into D. The interpretations of "=" and "hyperequal?" are the expected ones.

2. Where F is an n-ary predicate, atomic or not, and x1..xn are variables, [[ Fx1..xn ]] wrt M,b,e will be:

    true if e@<b?x1, ..., b?xn> is restricted by some member of [[ F ]] wrt M,b,e
    else false

If any of the xs is of the form $x, we use a variation. For example, suppose we have F$x1,x2,x3. Then the interpretation will be, not:

    true if ((e push [b?x1]) push [b?x2]) push [b?x3] is restricted by ...

but instead:

    true if ((e push ungrouped e[b?x1]) push [b?x2]) push [b?x3] is restricted by ...

3. Where E1 and E2 are sentences, [[ E1 v E2 ]] wrt M,b,e is:

    true if either [[ E1 ]] wrt M,b,e or [[ E2 ]] wrt M,b,e are true
    else false

Similarly for [[ not E ]].

4. Where E is a sentence, [[ exists x: E ]] wrt M,b,e is:

    true if some d in D is such that [[ E ]] wrt M, (b push 'x'), (e push ungrouped d) is true
    else false

There are three natural options for quantifiers in a hyper-evaluative language. A "substitutional" quantifier would count:

    (exists y: hyperequal? x y)

as true, because "hyperequal? x x" is true. A "purely evaluative" quantifier would validate existential generalizations only when a formula is satisfied by *every* way of referring to the same entity. With the quantifier understood in that way, the first comes out false but so too does:

    (exists y: x = y & not hyperequal? x y)

because "x = x & not hyperequal? x x" is false. A third "anonymous reference" quantifier also makes the first come out false, but makes the second come out true. It's enough to validate existential generalizations of this sort when a formula is satisfied by a new, ungrouped ("anonymous") term referring to a given object. I've chosen to go with the "anonymous reference" quantifiers. If one went instead for the substitutional quantifiers, the purely evaluative quantifiers could be defined in terms of them, as follows:

    exists-eval x: E  <==>  exists-subs y: (every-subs x: x=y horseshoe E)
                      <==>  exists-subs y: (lambda x: E) $y

5. Where E is a sentence and x1..xn are variables (free in E?), [[ lambda x1..xn: E ]] wrt M,b,e is:

    { e+ | e+ is a member of e push * ... push * [n times] and [[ E ]] wrt M, (b push 'x1' ... push 'xn'), e+ is true }
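Continuing that toy modelling (and repeating its two helpers so this sketch stands alone), here is how "=", "hyperequal?", and the "anonymous reference" existential might behave; this is my illustrative rendering, not the official clauses above:

    def push_ungrouped(e, d):
        values, groups = e
        return values + [d], groups + [max(groups, default=0) + 1]

    def grouped(e, i, j):              # hyperequal?: looks at grouping
        return e[1][i] == e[1][j]

    def equal(e, i, j):                # "=": looks only at values
        return e[0][i] == e[0][j]

    def exists_anon(e, D, body):
        # "Anonymous reference" existential: try each d in D as a fresh,
        # ungrouped witness, and evaluate the body in the extended environment.
        return any(body(push_ungrouped(e, d)) for d in D)

    D = {'Jack', 'Alice'}
    e = push_ungrouped(([], []), 'Jack')    # x sits at position -1 (the top)

    # (exists y: hyperequal? x y) comes out false on this reading:
    print(exists_anon(e, D, lambda e2: grouped(e2, -2, -1)))                            # -> False
    # (exists y: x = y & not hyperequal? x y) comes out true:
    print(exists_anon(e, D, lambda e2: equal(e2, -2, -1) and not grouped(e2, -2, -1)))  # -> True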
Section 14
----------

For a long time I've been an advocate of de re beliefs as a fundamental philosophical kind. (Though not of the satisfaction conditions for de re belief reports being such.) I've also long thought that there are interesting acquaintance requirements on these de re beliefs. I've also long had glimmerings of the ideas exposited above; but they were confused and partial. Now that I've gotten clearer in my thinking about hyper-evaluativity, I'm inclined to think it's the proper home for much of what I before wanted to hold about de re belief.

I now think like this: fix an initial class of attitudes, call them the "acquaintance" class. There will then be a natural class of other attitudes that are coordinated with the starting class. And I do think what holds this class together is philosophically fundamental; the unity of such classes will be important to philosophy of mind and epistemology and philosophy of language and action theory and so on.

However, did anything privilege the initial selection of the "acquaintance" attitudes? Are there any fundamental facts about whether, say, testimony-based attitudes should or shouldn't be included? I am now no longer very sure about this. Maybe there are. Maybe there aren't. At any rate, my own interest for now is to explore the connectedness relations that hold "de re beliefs" together, rather than questions about the right starting class.

Jeshion once said (discussing Donnellan?) that the debate about de re attitudes threatened to just collapse into Frege's Problem. The way I'm now thinking of the issue, this is sounding more and more right.