Hyper-Evaluativity
Jim Pryor
NYU
Rough Draft 2 - 11 June 2010
Feedback very welcome.

Abstract: Predicates are "hyper-evaluative" when they depend not just on the semantic values (be they intensional or more fine-grained) of their individual arguments, but also on the way those arguments are "coordinated" or "wired." I examine motivations and semantic implementations for such predicates, drawing from linguistics and computer science.

Section 1
---------

It's hard to do epistemology while staying resolutely non-Fregean. Suppose John believes and accepts:

(1) Cicero praised Tully.

Why then does he refrain from inferring:

(2) Someone praised himself.

Can we explain John's restraint, without bringing in some kind of Fregean machinery? Can we explain how his restraint may be rational? Presumably we'll also want to distinguish between the epistemic situation John is in, and a situation that justifies him in accepting:

(3) Tully praised Cicero.

But it's not easy to do this if we count (1) and (3) as the same belief.

Let's not fuss right now with the semantics of attitude *reports*. My concern is with how fine-grained the representational systems we think with themselves have to be. Some philosophers, like Nathan Salmon, are resolutely non-Fregean about the former but have much in common with Fregeans about the latter. (There are also important disagreements.) Doesn't it look like *anyone* will have to go *somewhat* in that direction? Won't we have to be somewhat Fregean about the representational systems we think with, to explain why subjects who accept (1) don't immediately also accept (3) and (2)?

I'll argue that the machineries usually associated with "Fregeanism" are the wrong tools for prising (1) apart from (3) and (2). I won't get too specific about what counts as "Fregeanism." On some definitions, the machinery *I* want to sell you might itself be counted as a heterodox form of it. On the other hand, my machinery is one that self-labeled Russellians/Millians have invoked in good conscience. So whether it should be counted as Fregean or as Russellian is a job for subtle lexicography. I'm disposed to think it's neither.

To bring out the limits of orthodox Fregeanism with respect to (1)-(3), let's set aside for the moment what it's *rational* for John to believe. Instead think about what it could be *intelligible* from John's perspective to believe---even when the beliefs so had *may fail* to be rational. So, John is thinking about how talented various of his students are. He judges:

(4) Alice is smarter than Betty, and also smarter than ... and also smarter than Alice, and ...

Whoops, he lost track of who he was comparing there. So he made a mistake. Now I'm supposing he really did think something that, in part, involved Alice being smarter than Alice; it wasn't a case where he had Alyssa in mind but just said "Alice" by mistake. That can happen too, but I'm supposing the former is what happened here. In thinking (4) John may well be exhibiting a momentary cognitive defect. But it does seem he could think it. At the same time, he hasn't completely lost his senses. He's not exhibiting the kind of irrationality he'd need to judge:

(5) Alice is smarter than Betty, and also smarter than ... and also smarter than herself, and ...

I trust this characterization of John's thinking sounds intuitively natural, even if it's not clear yet what the difference could be between judging (4) and judging (5). We have here a difference in cognitive significance, and we'd like to know how to think about it.
It may be that a sober, fully rational version of John wouldn't hear either (4) or (5) as more informative than the other. But they differ in cognitive significance for John, *as he is now*---and in theorizing about the mind it's essential we think about and have good models for cases where we fall short of being fully rational, too. So, questions of John's *rationality* aside, how can we even allow for the *possibility* of the judgments (4) and (5) differing for him?

Fregeanism doesn't offer much help here. It doesn't matter how descriptively rich and specific John's way of thinking of Alice is: for he was deploying *one and the same* mode of presentation of her in each argument place when thinking (4). Well, perhaps we could make modes of presentation ephemeral, tied specifically to the token occurrent uses now being made of them. In that case the two thinkings of "Alice" could have different modes of presentation. But at the same time we'd be making these modes of presentation unrepeatable and effectively useless. They'd no longer help explain why the inference "Alice is smart, so Alice exists" is rationally sanctioned: the sense of the first "Alice" would then differ from that of the second, and it would be a cognitive risk to assume these two "Alice" thinkings coreferred.

You may want to say that in (4), John must be using "Alice" as two different names in his thinking. I don't encourage that, but I don't mind it either. However, I see no reason to think there's any *qualitative* difference between the ways he's thinking of Alice in the two cases; the only available differences are so tied to the token uses that they'd get in the way of modes of presentation doing the explanatory work they're supposed to do. So however many "Alice" names we count here, a Fregean diagnosis doesn't look too promising. For arguments in the same spirit, see Kit Fine 2007's "two Bruces" example (pp. 36-7 and 71), and his "higgledy-piggledy" Mates Puzzle (pp. 129-31).

Section 2
---------

Let R be a dyadic predicate, and h and p be two coreferential names. A familiar and plausible line of thought says the way we represent the world when we assert or think this:

(6) h Rs itself [e.g. h illuminates itself]

or this, using "^x" to represent lambda abstraction:

(7) (^x: Rxx) h

or this:

(8) exists x: x=h & Rxx

involves representing some self-Ring to be going on. That is, what we say or think in these cases *represents* the phenomenon of reflexivity. I'm gesturing at an idea here; I don't mean that any of these assert something *about* reflexivity.

Fine 2007 (pp. 39ff) draws an attractive distinction between representing objects *as* the same, as he says we do in:

(9) Hesperus = Hesperus.

and merely representing them *as being* the same, as we do in:

(10) Hesperus = Phosphorus.

In the first case, he says, no one who understands the claim "can sensibly raise" the question whether it's the same object that's involved. In the other case, this question *can* be sensibly raised, even if we know the answer. Sometimes I've found it helpful to say that in (6)-(9), the coreference is semantically *de jure*, whereas in claims of the form:

(11) Rhp

or even the identity claim (10), it's at best semantically *de facto*. It's natural to hear *some* kind of difference between these last claims and the earlier ones---even for philosophers with predominantly Russellian intuitions.

I do, however, want to register some caution about whether claims like (9), or more generally, anything of the form:

(12) Rhh

should always be understood reflexively.
This is disputed. And the John/Alice case from Section 1 seemed to exhibit a difference between (12) and (6). The view I'll eventually endorse says that (12) may be ambiguous between an understanding which is reflexive, and equivalent to (6), and an understanding which is not. When John has lost track of the fact that it's one and the same student he's then comparing, he's judging the unreflexive form of (12). On the other hand, when John thinks in a way that he recognizes as licensing the inference to (8), his thought is reflexive. When we assert or think: (11) Rhp generally it seems we *don't* represent any self-Ring or reflexivity, even if we're antecedently sure that h *is* p. Or at least, so there is some temptation to say. It's natural to understand even this: (13) h=p & Rhp as failing to represent reflexivity in the way that (6)-(8) do. But whether that's correct must await our settling the logic for these phenomena; as we'll see, it's not straightforward whether (13) should entail (7) or (8). Section 3 --------- The ideas just floated have been tempting to many philosophers who otherwise prefer their propositions rather coarse-grained: built up of bare objects and properties, with logical structure as the only glue or scaffolding. Putnam characterized phenomena of the sort we're considering as *part* of logical structure: "'Greek' and 'Hellenes' are synonymous. But 'All Greeks are Greek' and 'All Greeks are Hellenes' do not *feel* quite like synonyms. But what has changed? Did we not obtain the second sentence from the first by 'putting equals for equals'? The answer is that the *logical structure* has changed." (pp. 153-4 in Salmon & Soames, ed. 1988) [Putnam, Synonymy and the Analysis of Belief Sentences, Analysis 14 (1954), 114-22] So far, we've only heard suggestions of a distinctive logical structure, of *representing* the world in a distinctive way that manifests reflexivity. That doesn't yet mean there's a difference in anything's *truth-conditions*. But if there really are representational differences here, it's natural to suppose some things we assert or think could be sensitive to those differences. Are there any predicates R such that (6) might differ in *truth-value* from (11) and (14)? (6) h Rs itself (11) Rhp [as before, h and p are coreferential, and we may suppose, have the same individual semantic values] (14) exists x: x=p & Rhx If there are, I'll call such predicates "hyper-evaluative." This is a distinctive kind of hyper-intensionality. If there are such predicates, they depend on more than just the "values" of their individual arguments---be those values extensions, or intensions, or even something finer-grained. They even depend on more than just the cognitive associations or modes of presentation of their individual arguments, if there be such. They depend in particular on how those arguments are "coordinated" or "wired" together. If there are semantic features that track coordination relations in this way, they will be part of the language's semantics too. So why do I describe this as being "hyper-" a term's semantic value, rather than as being a hitherto unacknowledged aspect of semantic value? * It will help to first think about the Montague-inspired strategy of taking names to (at least sometimes) have generalized-quantifier meanings. That is, sometimes a name's semantic contribution is not just an entity from the domain (say, Alice), but rather a property of predicate semantic values (say, the property of containing Alice). 
Even on such views, it remains appropriate to call Alice the name's bearer or "referent." Montague didn't propose that where we thought we had been talking *about Alice*, we were really talking *about* properties of predicate values instead. Rather, his proposal was that the notions of semantic value and reference should come apart here. I think we must be open to that---and probably we should understand Russellianism or Millianism in a way that's compatible with it. What's central to those views shouldn't be the question whether semantic values are just referents (a Montagovian treatment of names says they aren't) but rather whether semantic values are *built up out of* more than just their referents: whether they have further extra-logical components (a Montagovian treatment of names can agree they don't). (Some ways of treating names as predicates would also be compatible with Russellianism, broadly understood.)

* However, the extra semantic features needed to track coordination relations shouldn't be assimilated into this same notion of semantic value. It will prove useful to have a notion of an expression's "value" that best fits the traditional, uncoordinated way of doing semantics, as well as having semantic features that additionally encode coordination between values. This is all orthogonal to the question whether a term's value is just an entity from the domain, or it's a property, or a generalized quantifier meaning, or what have you. Expect at least three levels of semantic features here: referents, semantic values (which may or may not be different from referents), and whatever encodes the reflexive or coordination phenomena we've been considering. I call these last aspects of meaning "hyper-evaluative" to oppose them to the middle notion---which computer scientists also call "values."

* In *some* way, reflexive or coordination phenomena have to do with how terms (or their occurrences) relate to each other, rather than with facts about the terms in isolation. Fine 2007 argues more specifically that it's essential to understand these phenomena as coming from a "relationist" semantics, rather than from semantic features that terms (or their occurrences) have intrinsically. This is a specific commitment of Fine's view, which we will see to be only one particular proposal about how to implement a hyper-evaluative semantics. I don't want it to be definitional of hyper-evaluativity that it must be explained with features that couldn't be possessed by terms (or their occurrences) taken individually. That will be hashed out differently by different implementations of these ideas.

Natural language predicates like "it's manifest/certain/patent that ..." and other epistemic terms are good candidates to be hyper-evaluative. That is, I think we can understand a sense in which each of the (a) claims is true, but the (b) claims false, even when we're sure that Hesperus is Phosphorus.

(15a) It's manifest that Hesperus = itself.
(15b) It's manifest that Hesperus = Phosphorus.

[see Fine 2007, pp. 48, 56 and 136n14 on "manifest consequence"]

(16a) Hesperus is indubitably as massive as itself.
(16b) Hesperus is indubitably as massive as Phosphorus.

In the literature, ordinary attitude verbs are often taken to be hyper-evaluative too. Consider Mark Richard's phone booth case from 1983. [Mark Richard, "Direct Reference and Ascriptions of Belief", JPL 12 (1983), 425-52] In that case, the speaker doesn't realize that the woman he sees out the window is the same woman he's addressing on the phone.
Among other things Richard uses this case to show, he claims it is natural to count the report (17) as true but neither (18) nor (19) as true: (17) I believe I can alert you to her danger. (18) I believe I can alert you to your danger. (19) I believe I can alert her to her danger. despite the fact that "you" and "her" in the speaker's mouth are coreferential. So Richard counts: (20) ... believes ...h...h... as reporting a reflexive attitude, and as sometimes differing in truth-value from: (21) ... believes ...y...h... even though 'y' and 'h' are directly referential, and in the context corefer. I presume Putnam would have agreed. Consider a variation of Richard's example, where a speaker is disposed to accept (19) but not (17). In the 1983 paper, Richard takes (19) to permit exportation, that is, to entail: (22) exists x: x=her & I believe I can alert x to her danger. and since the first 'her' occurs extensionally and corefers with 'you', that entails: (23) exists x: x=you & I believe I can alert x to her danger. Richard also grants that (23) entails (17). So we can go from (19) to (17), but not in the reverse direction: (19) I believe I can alert her to her danger. (17) I believe I can alert you to her danger. (That is, when (19) is true, (17) *is true*; it does not follow--and in the variant case I posited, it is not true that---the subject will be disposed to *accept* (17).) In Richard's 1990 book, however, his view is different. There he still allows (19) to export to (22) and so also to (23), but now he denies that we can always reimport from (23) to (17). (See p. 152-3.) That is, earlier he hadn't taken: (21) ... believes ...y...h... to report an *absence* of coordination in the subject's beliefs; but by 1990 he was taking it to (sometimes) report that. [Details: He says a "correlation" between the reporter's terms and the subject's terms *may* map y,h in a that-clause to h',h' in a sentence that the subject accepts. So when the subject would accept "I can alert her to her danger," (17) *may* be true, even though the subject wouldn't accept its complement clause. But it won't be true for every correlation. On the other hand, because a "correlation" is a function, it *must* map h,h in a that-clause to h',h' in an accepted sentence. So when the subject *wouldn't* accept 'I can alert her to her danger,' (22) may be true but (19) won't be. See pp.139ff, and pp216ff for subtleties about demonstratives vs names.] In the mid-80s, Nathan Salmon, Scott Soames, and David Kaplan were also wrestling with these questions. Kaplan introduced the device of "wired" structured propositions in talks he delivered at that time. [Forget/check whether this material was published.] King would go on to use this device in the 1990s. [citations] Salmon 1986 [Reflexivity] and Soames 1987a [Direct reference, propositional attitudes, and semantic content] and 1987b [Substitutivity] rejected the Richard/Putnam idea that: (24) ...h...h... by itself says anything reflexive, or that: (20) ... believes ...h...h... (25) ... believes ...x...x... attribute reflexive beliefs. It's only when some *binding element* occurs underneath the belief term, and binds the relevant argument places, as in: (26) ... believes (exists x: ...x...x... & x=h) or: (27) ... believes ((^x: ...x...x...) h) that a reflexive belief is attributed. If the argument places are only bound from the outside, as in: (28) exists x: ... believes ...x...x... they claim no reflexive belief is yet being attributed. [There are no formulas numbered 29 or 30.] 
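The functional character of Richard's 1990 "correlations" (from the bracketed details above) can be made vivid with a toy sketch. What follows is my own illustration in Python, not Richard's formalism: a that-clause is modeled as a tuple of the reporter's terms, a correlation as a dictionary (hence a function) from the reporter's terms to the subject's terms, and a report counts as true just in case some correlation maps the that-clause onto a tuple the subject accepts. The only point it is meant to capture is the functional one: distinct terms "you" and "her" may be sent to one and the same subject term, but a single recurring term "her" cannot be sent to two different ones.

    from itertools import product

    def report_true(that_clause, subject_terms, accepted):
        # True if some correlation maps the that-clause onto an accepted tuple.
        reporter_terms = sorted(set(that_clause))
        for images in product(subject_terms, repeat=len(reporter_terms)):
            corr = dict(zip(reporter_terms, images))     # a correlation is a function
            if tuple(corr[t] for t in that_clause) in accepted:
                return True
        return False

    subject_terms = ["you'", "her'"]

    # A subject who accepts only the uncoordinated "I can alert you' to her' danger":
    accepted = {("you'", "her'")}
    print(report_true(("you", "her"), subject_terms, accepted))  # True: the (17)-style report
    print(report_true(("her", "her"), subject_terms, accepted))  # False: "her" can't go two ways

    # A subject who accepts the coordinated "I can alert her' to her' danger":
    accepted = {("her'", "her'")}
    print(report_true(("you", "her"), subject_terms, accepted))  # True: y,h may map to h',h'
    print(report_true(("her", "her"), subject_terms, accepted))  # True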
Section 4
---------

We'll return to the history in a bit. Let's interpose some thoughts about the dispute we're seeing so far. My own sense of this dispute is that natural language does not uniformly support either side. I share Richard's intuitions about (17)-(19) in the case he describes, and also his intuition that this pair does not sound equivalent:

(31) exists planet x and planet y: x = y and Johannes said he saw x rise, then y rise, then x set, then y set.

(32) exists planet x and planet y: x = y and Johannes said he saw x rise, then y rise, then y set, then x set.

(Richard 1990, p. 153). At the same time, I think *sometimes* recurrence of the same term in an attitude report does not contribute towards the report being about a reflexive attitude. In "Variabilism," Sam Cumming builds a case with the characters from "Love's Labour's Lost." Biron is courting Rosaline, and is dancing with Katherine, who has successfully disguised herself as Rosaline. The third girl Maria reports:

(33) Biron thinks Katherine is Rosaline.

[Cumming, "Variabilism", Phil Review 117 (2008), 525-54; at p. 529.]

And in fact, continuing the dialogue, we might report the fact that all the girls have been successful in their disguises, as:

(34) Each girl fooled Biron, so he didn't know she was she.

This continuation makes trouble for what Cumming is attempting; we'll return to it later. But for present purposes note that despite the recurrence of "she", we're precisely not reporting the presence or absence of reflexive knowledge. Biron hasn't been fooled so far as to stop attributing self-identity to anyone.

Even when *reflexive pronouns* are used, I don't think this guarantees it's a report of a reflexive attitude. Every morning, I discover amazing small sculptures have been put together out of household junk while I slept. I'm keenly interested to meet the artist. Finally, one night I set up a hidden videocam. The next morning I discover *I* have been the one doing this, in fits of sleep-walking. Bemused, I report:

(35) All along I had been hoping that I'd meet *myself* (can you believe it?)

Here I'm *not* saying that all along I had a reflexive hope, a hope whose agent and quarry were conceived to be one and the same. So in some of these

(20*) ... believes/hopes/knows ...h...h...

cases I'm tempted to agree that a reflexive attitude is attributed; in other cases not. But neither do I agree with Salmon and Soames that it's only underneath a binding operator such as "exists x" or "^x" that we see the intuitive phenomena we're describing. One large worry, which I'll now develop, is that natural language scope barriers may constrain the presence of such operators more tightly than the intuitive phenomena would require. [Barker and Shan argue we can get binding in situ, without needing binding operators to move to any higher scope. That work is very exciting; but I will ignore it here.]

For example, offhand it doesn't seem that "Bob" should have any higher scope here than "Jane" does:

(36) Jane and Bob each gave Bob a present.

Yet, if we want to formalize this in a way that exhibits the only kind of reflexivity Salmon and Soames allow, we'd have to do:

(37) (^x: (^y: y gave x a present) each of Jane, x) Bob

So it looks like we'd have to choose between (i) what's said in (36) not *saying* that it's the same Bob in the two cases, at least no more than this does:

(38) Jane and Bob each gave Mr Smith a present.

where "Mr Smith" happens also to name Bob. Or (ii) (36)'s really having a funny logical form.
Or (iii) Salmon and Soames being wrong that two arguments can only be represented as coordinated when bound by a lambda or the like. Salmon and Soames push for (i). For now, I'll just say that's unsatisfying. There are ways for a belief to represent the world that underwrite a subject's willingness to existentially generalize on two argument places simultaneously; and ways that don't. Such willingness would ordinarily be present when accepting (36), as much as it would when accepting claims of the form: (7) (^x: Rxx) h As I've already said, we can't expect recurrence of the term "Bob" to *guarantee* the presence of this representational coordination (and neither can we expect difference of terms, as in (38), to guarantee its absence). But I'd expect this coordination to at least be *possible*, and indeed ordinary, for a belief one has in accepting (36). Salmon and Soames' strategy precludes that, unless (36) really does turn out to have the logical form (7). I want to make the unsatisfyingness of their strategy more vivid, by highlighting more limits on where we'd be able to employ it. Often it won't just be offhand scope judgments, like the thought that (36) doesn't have the logical form (37), that stand in the way. It will instead be widely-accepted linguistic generalizations. For example, binding terms like "every boy" cannot scope outside of the "there to be" construction in: (39) Sue wants there to be farmers at every boy's picnic, and hopes he will take pictures. There's no way to understand that sentence with "he" being bound by "every boy", as there would be if "every boy" could scope out over the conjunction. Of course we can explicitly express the latter judgment: (40) Every boy is such that Sue wants there to be farmers at his picnic, and hopes he will take pictures. But the surface form (39) doesn't permit a reading where the binding term assumes (or "moves to") the position it has explicitly in (40). So now suppose you want to attribute a belief of the following sort to Nathan: (50) [that] Sue wants there to be farmers at Jack's party, and hopes Jack will take pictures. and you want the attribution to be reflexive. You want to report Nathan as believing it's the same Jack that Sue's two attitudes concern. It's tempting to think you can do this just by reporting belief in (50). But the generalization we're considering tells us that (50) cannot be understood as: (51) (^x: Sue wants there to be farmers at x's party, and hopes x will take pictures) Jack. Why not? Because "Jack" in (50) is in the same position as "every boy" in (39), and our generalization tells us that's not a surface position whose occupant can assume the wide scope indicated in (40) and (51). The linguistic generalization we're relying on here may be disputable, but it isn't to be lightly shrugged off, either. Of course, Salmon and Soames *can* shrug off what I called "tempting" here. They'll just deny that (50) itself can be used to attribute a reflexive belief. A reporter may instead only *convey* that his subject *would accept some sentence* like (50). Those paths are well-worn and we won't follow them. I just observe that there is some intuitive cost to not being allowed to report a reflexive or coordinated belief with the surface form (50). As I said, it's tempting to think we can do that. 
Of course, in forms like (50) we *can* replace the second "Jack" with pronouns anaphoric on the first "Jack", and we can also do this when the first "Jack" is instead an indefinite:

(52) Sue wants there to be farmers at a donkey's party, and hopes it will take pictures.

But this observation is one linguists struggle to *reconcile with* the fact that no binding element at the surface position of "a donkey" is assuming the wide scope indicated in (40) and (51). That's what the whole business of explaining donkey-anaphora amounts to.

There are other accepted scope barriers which testify in the same direction. For example, binding elements seem unable to scope outside of determiner phrases. To see this, consider:

(53) Several politicians spy on someone from every city.

We're interested only in the readings where "every city" takes wider scope than "someone." The reading where there's a single group of spying politicians is available:

(54) several politicians p: (every city c: someone x from c: p spy on x)

The reading where in every city, there's a different focus of some spying also seems available:

(55) every city c: someone x from c: (several politicians p: p spy on x)

But, as the reader should confirm, the reading where every city has a few, perhaps unfocused, spyings taking place does not seem available:

(56) every city c: (several politicians p: (someone x from c, perhaps a different x for each p: p spy on x))

This generalization comes from Larson 1987 "Quantifying into NPs," and is widely accepted. The now predominant explanation is that "every city" cannot scope outside of the phrase "someone from every city." That phrase must as a whole either take scope over or under "several politicians."

But now consider the following, read so as to state that there's a single group of politicians:

(57) Several politicians spy on someone from each of Chicago and London, and get away with it because their privacy protections are inadequate.

If this states there's a single group of politicians, not different groups for each city or each spying, then "someone from each of Chicago and London" isn't taking scope over "several politicians." And so, by the rule we extracted above, no binding element introduced at the surface position of "Chicago and London" can be assuming wide scope over "several politicians" either. And so---though clearly *there is anaphora* in the final clause on "Chicago and London"---the logical form we're working with cannot be:

(58) (^c: (several politicians p: someone x from c: p spy on x, and get away with it because c's privacy protections are inadequate)) each of Chicago and London

The anaphora we see in the final clause of (57) is like the donkey-anaphora in (52). It's not to be explained by a wide-scope binding element like "^c" in (58). English's scope barriers speak against (57) being able to have the logical form of (58). As before, the linguistics here may be disputable, but I understand the view I'm setting out to be the mainstream, and any alternatives would need to be assessed carefully. [Will add more citations]

Relative clauses provide a third scope barrier. Here:

(59) Ralph knows that someone loves everyone.

there is a reading where "everyone" takes wider scope than "someone"---that is, it may be a different lover for each of the everyones. But here:

(60) Ralph knows someone who loves everyone.

that reading is no longer available. The predominant explanation is that "everyone" cannot scope outside of the relative phrase "who loves everyone."
That constraint will also govern the "when ..." clauses in the following: (61) The days when Jane criticizes every student are days when he's unhappy. This sentence invites the question "who's he?" because "he" can't be read as bound by "every student." "Every student" occurs in a surface position that can't be understood as moving to a wide enough scope to do that. We can explicitly express the missing reading: (62) Every student is such that the days when Jane criticizes him are days when he's unhappy. But (61) can't itself be understood that way. Now, as before, let's replace "every student" with a singular term: (63) The days when Jane criticizes Jack are days when he's unhappy. Here the "he" in the main clause *can* be anaphoric on "Jack," but that's a challenge theorists work to explain, because they assume it's *not* a case where any binding element at the surface position of "Jack" can be moving to wide enough scope to be doing all the needed binding. That is, just as (61) can't be understood as (62), (63) can't be understood as: (64) (^x: The days when Jane criticizes x are days when x is unhappy) Jack To all this, Salmon and Soames can shrug and say we need to live with there really being no reflexivity represented in: (36) Jane and Bob each gave Bob a present. (50) Sue wants there to be farmers at Jack's party, and hopes Jack will take pictures. (52) Sue wants there to be farmers at a donkey's party, and hopes it will take pictures. (57) Several politicians spy on someone from each of Chicago and London, and get away with it because their privacy protections are inadequate. (63) The days when Jane criticizes Jack are days when he's unhappy. And similarly, no reflexive attitudes reported in belief reports where those are the complement clauses. But as I said before, that is an intuitive cost. [Added: If one did want to accommodate reflexivity using only the resources Salmon and Soames allow, Fine 2007 makes another good complaint. Consider: (70) Cicero loves Tully and not: Tully loves Cicero. Which of these should we understand the reflexive logical form of (70) to be? (71) (^x: ^y: Lxy and not Lyx) Cicero Tully (72) (^y: ^x: Lxy and not Lyx) Tully Cicero There doesn't seem to be any motivated answer; but neither does it seem that there really should be *two* reflexive claims whose surface form is (70). Then again, Salmon and Soames are most naturally understood to be rejecting, or trying to explain away the intuitive phenomenon that Fine and I are trying to capture, not to explain it.] Section 5 --------- Perhaps the biggest disappointment with the Salmon and Soames strategy is that, once you start thinking about the kind of coordination between argument places we see in reflexive claims, it becomes very tempting to posit that between argument places in different propositions, too, as in: (74) Some days Jane criticizes Jack. Criticism makes him unhappy. (75) Alice is F. So Alice exists. Of course there can be cases where you consider the two *sentences* in (75) and are unsure whether it's the same term "Alice" recurring. Then it's not clear whether you're considering a valid argument. But I've never had sympathy for the worry that this shows we can never be sure about validity. (Perhaps we can't, but if so it's not for this reason.) I want to say: it's not *the sentences* I primarily judge to be valid. It's an argument *pattern*, and in the pattern I'm considering, it's *given* whether the same singular term occurs in each premise. 
The natural model for this way of thinking is one where argument places in different propositions can be de jure linked, in the way we've been thinking they can be coordinated in single reflexive propositions. But that's not something that any binding element like "^x" or "exists x" can accomplish.

Section 6
---------

I don't think what we've seen Richard say about coordination can be the whole story. As I've said, I think claims like:

(12) Rhh

can receive both reflexive and non-reflexive readings. Or: we should at least want a formal notation that can express two readings here. More on this later.

Also, I agree with Soames 1987b, which argues that the resources Richard offers block only *some* intuitively troubling Frege-problem cases. (We're considering here only those parts of Richard's view that have to do with reflexivity, not his whole theory from 1990. Also, his 1983 view applies to demonstratives and variables only, not to directly referential terms, like names, that are insensitive to contexts and assignments. I ignore that here.) For example, as Soames reports, these propositions still come out equivalent on a direct reference view, even one enhanced with the ability to distinguish reflexive from non-reflexive propositions:

(76) Superman is stronger than Clark Kent.

(77) Clark Kent is stronger than Superman.

(This was pointed out by Lewis in response to a lecture Kaplan gave. Compare our (1) and (3).) Also, if this is true ['h' and 'p' abbreviating 'Hesperus' and 'Phosphorus' throughout...]:

(78) The ancients said that: Fh & not Fp.

then although this won't follow:

(79) The ancients said that: Fh & not Fh.

this will:

(80) The ancients said that: Fh and also that: not-Fh.

And (80) seems intuitively nearly as troubling as (79). It helps little to be assured that, though it's true, (80) is a different proposition from:

(81) The ancients said that: Fh and also that: not-Fp.

(See also Soames's note 25: "It does not help to be told that the Ancients did not believe that Hesperus was not Hesperus, if it is granted that they did believe that Hesperus was not Phosphorus and that Phosphorus was Hesperus.")

Soames also argues that Venus could truly report what the ancients said in (78) with:

(82) The ancients said that: I am F & I am not F.

but if we accept that, it seems to entail:

(83) exists x: the ancients said that: x is F & x is not F.

which is just the kind of report that Richard's theory is meant to block. (See Soames' McX case at p. 117, also the extension of the same idea to Venus on pp. 118-9.)

I agree with Soames that (82) should be true relative to Venus' context, when understood so that the two occurrences of 'I' are uncoordinated. And I also accept the entailment to (83), but only when the coordination between the two occurrences of 'x' is broken. This is the same idea we saw in my continuation of Cumming's "Love's Labour's Lost" case before:

(34) Each girl fooled Biron, so he didn't know she was she.

How might we formally represent these broken coordinations? We'll discuss this in more detail later. For present purposes, one can think of it like this. Suppose G[y->x] is a complex expression where all free occurrences of y have been replaced with x, and those occurrences of x may be coordinated with other occurrences of x elsewhere in G.
Here is how to convert G[y->x] into an expression that breaks any such coordinations with the other occurrences of x (but leaves all the y-replacing occurrences of x still coordinated with each other): (84) exists y: y=x & G or this (which would be better if we want to work in a free logic and allow x to be non-referring): (85) every y: y=x horseshoe G For example, let G be 'x is indubitably as smart as y'; then G[y->x] will be: (86) x is indubitably as smart as x and understand the two occurrences of x in the latter to possibly be coordinated. We get a version with the coordination broken by using schema (85). In this case, that amounts to: (87) every y: y=x horseshoe (x is indubitably as smart as y). Even if the two occurrences of x in (87) may still be coordinated, we suppose predicates will only be sensitive to coordination among their own arguments. (Fine resists this, see later.) And nothing has been done here to introduce a coordination between any occurrence of y and those of x. So G's own arguments are now uncoordinated. That's a way to break coordination, and we can understand something like it to have gone on in (82), (83), and (34). Of course I don't think any of these have the underlying logical form of (85). Rather, we will later discuss how to do this with a semantically simple operation. But (85) gives us a way to think about what's going on in terms that are now more familiar. The phenomenon illustrated in examples like (34) is very interesting. I understand it to be a case where a variable is still referentially bound by a quantifier but its coordination with other terms also bound by the quantifier has been broken. So: referential binding does not entail coordination. (Though it may always *introduce* coordination.) This is another respect in which I oppose the Salmon-and-Soames-inspired strategy of trying to account for the intuitive phenomena only in terms of binding operators. Resuming the main thread: I agreed with Soames that there should be true (uncoordinated) readings of (82) and (83): (82) [Venus speaking] The ancients said that: I am F & I am not F. (83) exists x: the ancients said that: x is F & x is not F. but I also think there should be false readings. Or at least, even if there aren't false readings of these forms *in English*, we have an intuitive understanding of a representational difference here, and we should want a formalism that can express false (coordinated) claims of these forms, or of some forms in this neighborhood, as well. I think many (all?) of the troubling Frege-Problem cases Soames says Richard's theory leaves unhandled will be handleable if we develop the resources to have cross-propositional coordination. One of Soames' major points is that "reports of propositions asserted are not semantically required to preserve the logical structure or cognitive perspectives of the sentences used to assert them" (p 118). I have sympathy for this. If we don't use resources dedicated to reporting coordination---as natural language may not, or anyway, may always have disambiguations that do not---then: (88) ... said ...h...h... can be used to correctly report the assertion of a sentence whose logical structure was: (89) ...h...p... In that respect, I agree the Richard view is too inflexible. But my interest is to explore semantic resources which *are* dedicated to reporting coordination, and whose correlates of (88) could not be used to correctly report an assertion of (89). So my sympathies are very much with the spirit of Richard's approach. 
Section 7 --------- Despite the linguistic excursions of Section 4, I'm not going to try to settle which natural language phenomena exploit or are sensitive to coordinated argument places. My concern is instead with what we need to buy into to *get* this representational capacity into a language, even a formal language. I think our cognitive systems use such a capacity, even if natural language does not. (But natural language probably does too.) As a theorist, I'd like to at least have a well-understood formalism for *happily expressing* the kinds of differences exhibited in the John/Alice case from Section 1. We'll consider various ways to semantically implement this: some from philosophy, more from linguistics, even more from computer science. Chris Barker (NYU Linguistics) and I are teaching a seminar this fall on "What Philosophers and Linguists Can Learn from Computer Science But Didn't Know to Ask." Of course philosophical logicians will already know nearby work in CS; but we're aiming to show how ideas familiar in CS directly bear on more mainstream, somewhat-less-technical inquiries in our fields, too. (Ideas that may not be familiar to everyday programmers, but are familiar in theoretical CS, especially the areas of functional programming and type theory.) The present inquiry is one such example. We'll see next how certain complex expressions in some programming languages are hyper-evaluative. And the techniques for doing semantics for such expressions are better-explored over there. (No matter where one looks, though, it's a challenge to abstract away from the details of particular implementations, and get at what the fundamental semantic commitments are. We won't even ourselves overcome this, but it's something we'll aim for.) Section 8 --------- Let's talk through a primitive programming language, to see how hyper-evaluativity arises in that setting. We'll make up the syntax ourselves, for pedagogical efficiency. But everything we do here is straightforwardly expressible in Scheme or many other languages. We'll begin with a language fragment that's purely "declarative" or "functional." That is, it just consists of expressions like 1+2 and more complicated things of that sort. The language's *interpreter*---that is, the software that processes programs we write in this language---takes care of *evaluating* the complex expressions we give it, and delivering us the result. But the language doesn't yet itself contain anything that counts as an *imperative*. 1+2 is not an order or command. The only imperatives on the scene are external to the language. For instance, we do order the interpreter to evaluate our program. It delivers the result that the language's semantics determines that to be. But the only thing the interpreter needs to interpret are complex expressions like 1+2. There aren't any orders or commands in the language that need to be interpreted in the same way. I emphasize this because later we will introduce imperative elements into the language, and the differences this brings along will be important. Many people's folk conception of computation is already imperatival. But it's important to realize that's just one kind of element that a language may have or lack; and important to pay close attention when it arrives on the scene. This element is intimately connected to the phenomenon of hyper-evaluativity that we're trying to understand. The most familiar paradigm of a purely functional programming language is Church's untyped lambda calculus. 
(Typed lambda calculi are also purely functional, but less familiar.) In the lambda calculus, everything is either a function, or an inert unevaluable simple. That makes things much more fun. But we'll keep things simple and boring here, and allow ourselves primitives like the natural numbers, arithmetical functions, and so on. We'll also allow ourselves a primitive syntax for forming ordered tuples: (1, 2, 3) is a 3-tuple, and (1, 2, 1+2) evaluates to the same 3-tuple. One can build these up by hand in the untyped lambda calculus, but as I said, we'll keep this as boring and straightforward as possible.

Let's introduce variables into our programming language. Suppose I want to evaluate (1+2+10, 1+2+20) but I'm lazy and I don't want to rewrite the 1+2. So we'll give ourselves the ability to do this:

    let x be 1+2 in
    (x+10, x+20)

This will evaluate to the 2-tuple (13, 23), just as you'd expect. We can also do fancier things like this:

    let x be 1+2 in
    let y be 10 in
    (x+y, x+20)

which also evaluates to (13, 23). Now I'll mention one more subtlety before we move on. What if we do this:

    let x be 4 in
    let x be 1+2 in
    (x+10, x+20)

This will evaluate to (13, 23), too. That is, the innermost binding of x trumps any outer bindings. We'll call this *shadowing*. It's basically the same as happens when you say in predicate logic "every x: (Fx & exists x: Gx)", reusing the same variable x. It *looks* a bit like something else that's going to happen later, but what comes later will be importantly different.

What we're doing with these let-expressions is basically just supplying a value as an argument to a lambda abstract. That is, a claim like this:

    let x be EXPRESSION in
    BODY

is doing nothing more than this:

    (^x: BODY) (EXPRESSION)

It's just a syntax that's easier to think about---and extend.

The next thing we introduce will be definitions of our own complex functions:

    let f be (function x: x+3) in
    (f(10), f(19+1))

This will evaluate to (13, 23) too. When f is applied to the value 10, it binds f's parameter x to 10, and then returns the evaluation of x+3, which is 13. Then when f is applied to the value 19+1, it binds f's parameter x to *that* value. There are theoretically interesting issues about whether it first evaluates 19+1 to 20, and then binds x to 20, or whether it leaves that evaluation to do until later. We'll suppose it does the evaluation first. So now f returns the evaluation of x+3, which is 20+3, that is 23. So now our evaluation of (f(10), f(19+1)) has reduced to (13, 23), and we're done.

Functions can be arbitrarily complex. When appropriate, we'll break them up onto several lines, as here:

    let f be (function x:
        let y be x+3 in
        (x, y)
    ) in
    (f(10), f(19+1))

This will evaluate to the 2-tuple of 2-tuples ((10,13), (20,23)). Functions can take more than one variable:

    let f be (function x, y:
        x + y + 1
    ) in
    (f(10, 2), f(20, 2))

This will evaluate to (13, 23). (In functional programming, multiple arguments are standardly implemented by "Currying," but that won't matter to what we're doing here.)

We allow functions to see not just their own internal variables, but also any variables from their surrounding environment, too:

    let y be 3 in
    let f be (function x: x+y) in
    (f(10), f(20))

This will evaluate to (13, 23). If we shadow any variables (remember that?), then, just as before, the innermost binding wins.

    let y be 2 in
    let f be (function x:
        let y be 3 in
        x+y
    ) in
    (f(10), y, f(20))

evaluates to (13, 2, 23). Note that the y in the final line is evaluated as 2 not as 3.
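None of this functional fragment requires anything exotic; it can be transcribed directly into an ordinary language. Here is a sketch in Python (my choice of language for illustration; the toy syntax above isn't itself executable), with the let-constructions rendered as applications of lambda abstracts in just the way described:

    # "let x be 1+2 in (x+10, x+20)" is applying a lambda abstract to 1+2:
    print((lambda x: (x + 10, x + 20))(1 + 2))                 # (13, 23)

    # Nested lets:
    print((lambda x: (lambda y: (x + y, x + 20))(10))(1 + 2))  # (13, 23)

    # Shadowing: the innermost binding of x trumps the outer one:
    print((lambda x: (lambda x: (x + 10, x + 20))(1 + 2))(4))  # (13, 23)

    # Defining and applying our own function; the argument 19+1 is evaluated
    # to 20 before it is bound to the parameter x:
    f = lambda x: x + 3
    print((f(10), f(19 + 1)))                                  # (13, 23)

    # Functions can see variables from their surrounding environment:
    y = 3
    g = lambda x: x + y
    print((g(10), g(20)))                                      # (13, 23)

    # Shadowing again: the y inside h doesn't disturb the outer y:
    y = 2
    h = lambda x: (lambda y: x + y)(3)
    print((h(10), y, h(20)))                                   # (13, 2, 23)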
This is because that occurrence of y is not in the "scope" of the expression that rebinds y to 3. At that final line, the original binding to 2 is still in effect. This is basically the same as happens when you say in predicate logic: "every x: Fx & (exists x: Gx) & Hx." The "Hx" is evaluated with x bound again by the universal quantifier.

Now we're ready to introduce our first important novelty. This is what's known in CS as "mutation" or a "side-effect." It will look like this:

    let y be 2 in
    let f be (function x:
        change y to y+1 then
        x+y
    ) in
    (f(10), y, f(19))

What this means is that, in the process of evaluating f, we *rewrite* the value then assigned to y. This is not the same as the re-binding of y that happened in the previous program, where we merely shadowed y. In the shadowing case, outside of the function f, the rebinding of y was no longer in effect. But in the current case, when we rewrite the value assigned to y, the new value *sticks*. Until we change it again (or we leave the scope of y's original binding; but we won't do that).

What this last program evaluates to will depend on whether the f(10) or the f(19) gets evaluated first. Let's settle on evaluations always going left-to-right. So first we evaluate f(10). When the interpreter gets to the line "change y to y+1", it begins by evaluating the expression y+1. This now has the value 3. We then rewrite the part of memory where we're holding the contents of y. So now y will have the value 3 instead of 2. We then evaluate x+y, which is 10+3, and we return 13. So now f(10) has evaluated to 13. We continue on to evaluate y. Now y is still 3! So now we've partially evaluated our final line to (13, 3, f(19)) and we have to evaluate the last f(19). When the interpreter gets to the line "change y to y+1", it again begins by evaluating y+1. Now this is 4. We then rewrite y to be 4 instead of 3. We then evaluate x+y, which is 19+4, and we return 23. So now our program has been fully evaluated to (13, 3, 23).

Here we finally have introduced a fundamentally imperatival element into our language. It's not hard to *emulate* what's going on here while staying purely functional, but having the ability to change y like this as a native capacity of the language is a significant milestone. It's not entirely a good thing; nor is it entirely bad. But theoretically it makes for very important differences.

Now we're ready to introduce our second important novelty. Note the second-to-last line in this program:

    let y be 2 in
    let x be y in
    let w alias y in
    (y, x, w)

This program will evaluate to (2, 2, 2). So far, the "let w alias y" line seems to work much the same as the "let x be y" line. In each case, the newly-introduced variable ends up having the value 2, which is the same value y had. The difference between the two will only show up when we combine aliasing with mutation. Consider:

    let y be 2 in
    let x be y in
    let w alias y in
    change y to y+1 then
    (y, x, w)

This will evaluate to (3, 2, 3). The interpreter begins by setting up our y, x, and w variables just as in the previous program. When the interpreter gets to the "change y to y+1" line, y, x, and w all begin with the value 2. Now the interpreter evaluates y+1, which is 3. It rewrites the part of memory where it's holding the contents of y. So now y is 3. x is still 2. The only relation x had to y was that x was bound to a value that was a function of the value y then had. In this case, it was just the function y, but it could have also been another function, such as y+1.
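The two key programs just walked through (the mutation example and the aliasing-plus-mutation example) can be emulated in an ordinary language, which may help make the contrast concrete. The following sketch is mine, in Python: mutation is modeled with assignment to a module-level variable, and aliasing is modeled with an explicit one-slot cell that two names share (the class name Cell is just a label introduced for the sketch, not anything the toy language has).

    # The mutation example: f rewrites y's value, and the change sticks.
    y = 2

    def f(x):
        global y
        y = y + 1              # "change y to y+1"
        return x + y

    print((f(10), y, f(19)))   # (13, 3, 23): y is already 3 when read mid-tuple

    # The aliasing example, emulated with an explicit one-slot cell.
    class Cell:
        def __init__(self, contents):
            self.contents = contents

    y2 = Cell(2)               # let y be 2
    x2 = y2.contents           # let x be y: x merely copies the value y now has
    w2 = y2                    # let w alias y: w is another name for y's very cell
    y2.contents = y2.contents + 1           # change y to y+1
    print((y2.contents, x2, w2.contents))   # (3, 2, 3)

The difference between x2 and w2 in this sketch is the difference the next paragraphs go on to describe.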
But w on the other hand stands in a new, different relation to y. It doesn't just contain a *copy* of the value y had at some stage in the program's evaluation. Instead, we've introduced w to be an *alias* or synonym for y. So whatever the interpreter does to y will thereby also have been done to w. Since we've changed y's value to 3, w has also thereby been changed to 3, behind the scenes.

We couldn't have said something like:

    ... let w alias y+1 in ...

because the aliasing declaration needs to introduce a coordination between w and *another variable*, not just an expression that's a function of some variable. Well, we could allow that and let it mean something like this:

    ... let anonymous_variable be y+1 in
    let w alias anonymous_variable in ...

but if we have no independent way to use the anonymous_variable, this wouldn't differ in practice from just using:

    ... let w be y+1 in ...

The combination of aliasing and mutation that we're seeing here is ubiquitous in programming. You can do aliasing *without* mutation, but in practice there wouldn't be much point. Your aliasing declarations would have the same effects as your let bindings. (As we just said, there may just be more restrictions on what can appear on the right-hand side of an aliasing.)

The combination of aliasing and mutation also introduces hyper-evaluativity into our programming language. To see this, we need to introduce one final tweak; but now we're back to only a conceptually modest step. Notice that doing this:

    let f be (function y: BODY) in
    ... f(EXPRESSION) ...

is essentially just doing this:

    let y be EXPRESSION in
    ... BODY ...

Call those two programs Alpha and Beta. Now what we want is something that stands to the following:

    let w alias y in
    ... BODY ...

in the same way that Alpha stands to Beta. We'll write it like this:

    let f be (function alias w: BODY) in
    ... f(y) ...

What does this new syntax let us do? Consider the following:

    let f be (function alias w:
        change w to w + 1 then
        w + 2
    ) in
    let y be 1 in
    (f(y), y)

When the leftmost f(y) is evaluated, the interpreter enters the function f, and instead of just binding w to *a copy of* y's value, it makes w be a temporary alias for y. So now when we change w's value to w + 1, that is y + 1, that is 2, the value of y will also thereby be changed, behind the scenes. The function call then returns the value w + 2, which, at this stage in the program's evaluation, is 2 + 2. So now our final line has partially evaluated to (4, y). At this point y has the value 2 not the value 1 it started with. So our final line evaluates to (4, 2).

Now for the main event. Here our function will take multiple arguments, an option we mentioned a while back. This time both of the arguments are alias arguments.

    let h be 1 in
    let p be 1 in
    let f be (function alias x, alias y:
        change x to x + 1 then
        let z be x + y in
        change x to x - 1 then
        z
    ) in
    (f(h, p), f(h, h))

What happens? We begin by evaluating f(h, p). The interpreter makes x be a temporary alias for h and y be an alias for p; at this point all of these variables will have the same value 1. We then change x (and so also h) to 2. We then evaluate the expression x + y, which at this stage is 2 + 1, that is 3. We then evaluate x - 1, which is 1, and for the hell of it we change x (and so also h) back to 1. That doesn't change the value of z, which is still 3. We return that value. So now we've partially evaluated the final line and we have (3, f(h, h)). At this stage both h and p are again 1. We now go evaluate f(h, h).
This time we make x and y both be aliases for h. All three variables will have the same value 1. We then change x (and so also h, and so also y) to 2. We then evaluate x + y, which at this stage is 2 + 2, that is 4. We then change x (and so also h and y) back to 1. z remains at 4, and that's what we return. So now our final result is (3, 4). Notice what's happened. When we evaluated f(h, p) and f(h, h), h and p had the same value. They coreferred to the value 1. However, neither h nor p was "aliased" to the other. These variables were not semantically coordinated. They just *happened* to have the same value. (Well, it wasn't an accident, since the program is deterministic. But it's what I earlier called "semantically de facto" coreference.) And now the expressions f(h, p) and f(h, h) evaluate differently. That is, their extension is sensitive not merely to the values of their arguments, but also to what coordination exists between those arguments. If we had aliased y to h and then called f(h, y), it would have given us the same result as f(h, h). Here we see hyper-evaluativity in what I hope is a familiar shape. As I said, the machinery underwriting this is ubiquitous in programming. The standard semantics for what we've done here is what I'll discuss later as a "proxy semantics." The variable h isn't directly associated with the value 1, instead it's assigned an index into a heap of memory, and then that index is associated with the value 1 (by writing that value to that position in the memory heap). But even when that's what happens underneath the hood, the number 1 is still what we call "h's value." After all, if you evaluate h+1, the result is the number 2, not some position in a memory heap. With a proxy semantics, you will have enough structure to track hyper-evaluativity. You count two terms as coordinated or aliased when they don't just have the same value, but they're also associated with the same proxy. However, this is more structure *than we need* for hyper-evaluativity itself; and it opens us up to some philosophical doubts that we'll engage with later. Strictly speaking, you'd only need a proxy semantics to implement *mutation*. [I'm reading about other more subtle ways to implement mutation as well, with delimited continuations. But proxies are the standard way to do it.] And it's possible to have hyper-evaluativity *without* mutation---for instance, if the language had primitive hyper-evaluative functors or predicates. As we'll see in Section 13, we can give a semantics in that case without needing to bring proxies into the story. It's just that in practice, programming languages always do get their hyper-evaluativity from mutation. In some languages it's possible to inspect and manipulate not just h's value, but also the proxy h is assigned. Introducing this further capacity into a language has some pros and cons on the programming side. Philosophically, I think it just gets in the way. Philosophically, it's most useful to think about languages which are just expressive *enough* to introduce the representational capacity we're interested in, and no more. We especially want to avoid mixing up the representational capacity with its underlying implementation, if the details of that implementation aren't essential to the phenomenon. This is why I think it's useful to think about impoverished languages like the one I've presented. Sometimes the term "pointers" is used to talk about the aliasing/mutation machinery we've been looking at here. 
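Before getting more careful about that terminology, here is the main event gathered into a runnable sketch. Python has no native alias parameters, so this is again my own emulation: an explicit one-slot cell stands in for what the toy language does natively, and a parameter "passed by reference" simply receives the cell itself rather than a copy of its contents.

    class Cell:
        def __init__(self, contents):
            self.contents = contents

    # "function alias w: change w to w+1 then w+2", applied to y:
    def g(w):                        # w receives y's cell, not a copy of its value
        w.contents = w.contents + 1  # change w (and so y) to w+1
        return w.contents + 2

    y = Cell(1)
    print((g(y), y.contents))        # (4, 2)

    # The main event: both parameters are alias parameters.
    def f(x, y):
        x.contents = x.contents + 1      # change x to x+1
        z = x.contents + y.contents      # let z be x+y
        x.contents = x.contents - 1      # change x back
        return z

    h = Cell(1)
    p = Cell(1)                      # h and p merely happen to hold the same value
    print((f(h, p), f(h, h)))        # (3, 4): same values in, different results out

In effect the cells here are playing the role of the proxies just described: h's value is still the number 1 (that is what arithmetic sees), but h reaches it indirectly, through its cell, and f(h, p) and f(h, h) diverge because the second call wires both argument places to one and the same cell.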
But strictly understood, I think "pointers" are specifically (i) the aliasing/mutation machinery implemented via a proxy semantics, where (ii) the language also has the capacity to inspect and manipulate the underlying proxies. "References" is sometimes used as a more generic term, to mean the aliasing/mutation machinery, without necessarily including (ii), and perhaps also without commitment as to what the underlying implementation is. But for the most part, the usage seems to be pretty lax.

Passing arguments to functions in the way we do in:

    let f be (function alias w: BODY) in
    ... f(y) ...

is called "passing by reference"; and passing arguments to functions in the way we do in:

    let f be (function y: BODY) in
    ... f(EXPRESSION) ...

is called "passing by value." Orthodox uncoordinated semantics only considers predication where the arguments are passed by value. The values in question may be extensional, or they may be intensional. It's all still passing by value. Alias-like relationships between the arguments have no effect on the result. On the other hand, coordinated or hyper-evaluative semantics says some predication involves passing arguments by reference, not by value.

[Comment: Programming languages standardly have boolean predicates, which compare values for equality, being greater than, and so on. Languages that have native "pointers" or "references" standardly have (at least) two such equality predicates. One tests for whether two arguments have equal value. With such a predicate, this program:

    let x be 3 in
    equalvalue?(x, 1 + 2)

would evaluate to the truth-value true. The other equality predicate tests for (something like) whether its two arguments are coordinated or aliased. Let's call this predicate "hyperequal?" With such a predicate, this program:

    let y be 3 in
    let x be y in
    let w alias y in
    (hyperequal?(y, w), hyperequal?(y, x), hyperequal?(y, x+0))

would evaluate to the 3-tuple whose first member is true, whose second member is false, and whose third member either fails to evaluate (because a complex expression is syntactically incapable of being the target of an aliasing), or is false. "hyperequal?" should not be assumed to be a metalinguistic relation. It's true this relation isn't just a function of the values of its arguments. But neither does it require an ability to refer to, quantify over, or take as values the language's own expressions. As I suggested in Section 3, a reasonable natural-language expression of the "hyperequal?" predicate would be something like: "is indubitably the same as."

Confusingly, different real programming languages use the symbol "=" in different ways. Sometimes it's a binding operator, like our "let...be" in "let y be 3". Sometimes it's used as the "equalvalue?" predicate. Sometimes it's used as the "hyperequal?" predicate. Sometimes it's used as several of these, depending on context. Something else I've found confusing is that many languages have a contrast between "equivalence" and "pointer identity"; and this is related to, but *not the same as*, our contrast between equalvalue? and hyperequal?

Once a language is able to mutate variables, it's often able to express mutable values as well. For example, consider:

    let a be 1 in
    let b be 1 in
    let f be (function alias w:
        (function x: x + w)
    ) in
    let alpha be f(a) in
    let beta be f(b) in
    ...

Here alpha and beta are mutable "function closure" values. So long as a and b aren't mutated, alpha and beta will have the same extensions. So they're equivalent in a sense. (A runnable sketch of the equalvalue?/hyperequal? contrast follows; after it, we return to alpha and beta.)
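Here is the promised sketch of the two predicates, once more using the Cell emulation rather than anything native to Python: equalvalue? compares what two cells currently hold, while hyperequal? asks whether its arguments are one and the same cell. (Python's own == and is contrast runs along related, though not identical, lines.)

    class Cell:
        def __init__(self, contents):
            self.contents = contents

    def equalvalue(a, b):
        return a.contents == b.contents   # do the two cells hold the same value?

    def hyperequal(a, b):
        return a is b                     # are they one and the same cell (aliased)?

    y = Cell(3)
    x = Cell(y.contents)                  # "let x be y": x copies the value y now holds
    w = y                                 # "let w alias y": w just is y's cell

    print((hyperequal(y, w), hyperequal(y, x)))   # (True, False)
    print((equalvalue(y, w), equalvalue(y, x)))   # (True, True)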
However, this equivalence does not survive arbitrary mutations. If we say:

    ... change a to 2 then ...

but leave b alone, then alpha and beta will have different extensions. In such a case, the two values alpha and beta might be regarded as (sometimes) equivalent but not numerically identical or "pointer identical." (This isn't a standard example of this contrast; normally the notion of equivalence isn't defined for function closures. But it's an example that builds only on resources we've explained here.)

As the appearance of mutation and passing by reference in this example suggests, the contrast between equivalence and pointer identity is closely related to our contrast between equalvalue? and hyperequal? Consider the previous example extended like this:

    ... let gamma be alpha in
        let delta alias alpha in ...

Here gamma and alpha *are* pointer identical: mutations to a won't disrupt the extensional equivalence of gamma and alpha. However, gamma and alpha aren't aliased or hyperequal. If we go on to mutate one of *those variables*, it does not affect the other:

    ... change gamma to 0 then ...

On the other hand, delta and alpha are aliased or hyperequal (and so as a result are also pointer identical). End of comment]

Section 9
---------

TODO: Summarize and connect these issues where appropriate to Karttunen 1976, work deriving from Kamp and Heim, Landman's "pegs", Vermeulen's "stack semantics" for DPL, papers by Dekker, Aloni, de Bruijn, Haas-Spohn, Muskens. Other references welcome. Keep close track of when a device is needed, or used for, (i) donkey-like referential dependence; (ii) failed or confused reference (whether one-many or many-one); (iii) not just (i) and (ii) but also coordination/hyper-evaluativity.

Summarize and connect to Fiengo & May's 1994 and 2006 books.

Forbes, "The indispensability of Sinn", Phil Review 99 (1990), 535-63 proposes that the sense of a name is just "the subject of THIS mental dossier (the one hereby being employed)." Recanati's more attractive idea is that we think *with* dossiers, rather than by representing (even in sense) anything *about* them. I'm sympathetic with Fine's claim (pp. 67-8) that "mental files" are better understood as a book-keeping device for *tracking* facts about coordinated contents than as explanations of how coordination is achieved or what it consists in. Fine is discussed further below.

Re Cumming: I emphasized that what's going on when we have mutation is importantly different than mere binding, of the sort we have in lines like "let y be 3". Mutation might usefully be understood as shifting an assignment: in the way that happens, for example, in dynamic semantic treatments of indefinites or tense. I mentioned Cumming's paper "Variabilism" 2008 earlier. One main claim of that paper is that names are like variables in being bindable. Another main claim is that epistemic operators should be understood as assignment-shifting terms. Under the scope of "Biron believes", we should shift the assignments of names to match Biron's doxastic outlook. I think this is best understood as simultaneous shadowing/rebinding of many names at once, rather than as mutation, but I'm not certain. In any event, as I hinted at before, I don't think Cumming's assignment-shifting strategy suffices to explain the phenomena he's looking at. In the same way that Maria can truthfully report:

(33) Biron thinks Katherine is Rosaline.

I think others can truthfully report:

(34) Each girl fooled Biron, so he didn't know she was she.
and in this case, so long as both feminine pronouns are bound by "each girl", there doesn't seem to be any opportunity to interpret the complement clause of "didn't know" in a way that assigns the pronouns different values. We need some account of what's going on in (34), like a hyper-evaluative account, and I'm thinking the machinery that explains (34) can explain (33) too. Section 10 ---------- Let's survey different ways a semantics might encode or keep track of coordination information. The first strategy is the "proxy semantics" we already mentioned. This strategy is dominant in CS and in many linguistic accounts. The proxies are related many-one to entities in the domain of quantification; and instead of assigning entities directly to our variables, we interpose the proxies. Hence we track coordination via a kind of "indirect reference." (This is importantly different from what Fregeans mean by the same term.) Two variables are coordinated when they're associated with one and the same proxy; they're coreferential when the proxies they're associated with (whether identical or not) are associated with one and the same object from the domain. In some treatments, these proxies are the variables themselves, in others they are integers, or indices into an array, and so on. (See King < 2007, "pegs", and so on.) Various views floated in the philosophical literature have at least this much structure, and so can be seen as instantiating this implementation strategy. (Richard's 1990 "Russellian Annotated Matrices," Larson and Ludlow's 1993 "Interpreted Logical Forms," and so on.) One source of discomfort with those strategies is that the choice of indices can seem arbitrary. This can be alleviated in various ways, for example by working with equivalence classes, but the discomfort is there. Fine 2007, p.11 and 27 also complains that views of this sort make semantic values too *typographic*. I share that discomfort, but in the end I doubt his own view can retain the moral high ground about this. I think he's going to end up vulnerable to similar discomforts. More about this later. In fact, I regard the difficulty here as an instance of a more general problem, akin to (or perhaps a form of) Benacerraf's Problem. This first bothered me intensely when thinking about nodes in graph theory. A node is not the same thing as its label---sometimes we work with unlabeled graphs, or with graphs where numerically the same object labels multiple nodes. Rather a node is something whose whole nature intuitively should be exhausted by its role in organizing how different edges relate to each other (and similar things should be said about edges). We'd like to say that there's no more to the root node of this directed graph: * -----> * -----> * | v * than its numerical differences from the other nodes in the graph, and its position in the graph. Questions about its identity or difference to nodes in other graphs, or about whether it's red or prime, should just have no meaning---or perhaps it should be different from nodes in any other graph, and lack all properties like redness and primeness. Some of the intuitions here may be negotiable. For example, I'd like to say that one and the same graph depicted above *is* a subgraph of different larger graphs, not just that it's isomorphic to parts of the larger graphs. Maybe it's not possible to consistently say such things. But if it were possible, that's what I'd like to say. 
But now if we look at a standard presentation of graph theory, we'll be told that a graph is a tuple of a set of nodes, which may be numbers or apples or anything you like, it doesn't matter, together with an edge relation on those nodes with such-and-such properties, a labeling function, and so on. A construction where the nodes may be apples is very different from the intuitive conception. If we were to pursue a proxy-like semantics for a hyper-evaluative language, what we'd really want would be for the proxies to be something like our intuitive conception of nodes. What we get instead are implementations like the standard set-theoretic implementation of graph theory. That's one source of discomfort with a proxy-like semantics. I think much can be done to address this discomfort, and make the choice of proxies more natural. (For example, de Bruijn indexing of the sort we use in Section 13 helps with this.) Another source of discomfort is the thought: why should I understand these sentences: (90a) Alice is as smart as Alice (91a) Superman is stronger than Clark Kent as interpreted by these semantics, to be saying things directly about Alice and Superman, albeit in a coordination-sensitive way. Why shouldn't I understand them instead as saying: (90b) A is somehow related to something that's as smart as what A is so related to [where A is the "Alice"-proxy]. (91b) S is somehow related to something that's stronger than what CK is so related to [where S and CK are the relevant proxies]. I'm not saying the semantics conflates the (a) and (b) sentences. It would give those object-level sentences different truth-conditions. [In CS terminology, the proxies are "denoted values" but need not be "expressed values." (Compare our contrast in Section 3 between "semantic values" and "referents.") And when proxies are expressed values, as in (90b), that won't mean the same as sentences like (90a) in which they're merely denoted.] But these theories do make such worries about the meanings they describe quite salient. At root these may just be familiar indeterminacy worries---and perhaps it is an adequate response to them to say we *just have* an intuitive understanding of saying things about Alice in a coordinated way, and this is how we mathematically model the entailment relations and so on of the intuitively-understood meanings. Still, when we're trying to make intelligible a representational phenomenon that's not part of the canonical theoretical toolbox, it would be nice not to confront this sort of worry so vividly. [Jeff: your 2007 view improves on the < 2007 view by seeming less typographical. But downsides: 1. you no longer have any prospect of links across propositions, whereas the earlier view might have 2. is your approach subject to the same constraints as the Salmon/Soames view criticized in Section 4? it'd be nice to get some idea how you'd introduce linking when there's donkey anaphora 3. maybe you're best understood as having moved from one sort of proxy (the variables themselves) to another (whatever constitutes the nodes in the trees) 4. the semantics isn't specified enough yet to code it, and I'm not sure it's going to be straightforward how to finish the job. In what way exactly is the semantic value of "Fxy v Gy" a function of the semantic value of "Fxy", the semantic value of "Gy", and the pattern of variables "x,y,x" (or whatever the pattern is..."Fxy" need not be atomic)? Do the first two arguments play a genuine role? How does it work? 
I think you'll likely end up pursuing one of the other strategies described here. Not that that's a problem; I'm just trying to get clear about where you stand now. ] My own current preference is for a semantics that takes a second strategy. I'll call the second strategy a "grouped assignment function" strategy. I'll explain this strategy in detail in Section 13. It rests on two basic ideas: first, instead of thinking of semantic values as relative to assignment functions, we can equally think of them as sets of assignment functions (the ones on which a sentence is true). Second, we can work with a finer-grained sort of assignment function. Instead of merely mapping a variable to an object, an assignment function will also group that variable together with other variables that are mapped to the same object. So instead of looking like this: w --> Jack x --> Jack y --> Jack z --> Alice our assignment functions will instead look like this: {x} --> Jack {w, y} --> Jack {z} --> Alice Here we don't assign any intermediate proxies to our variables. Instead, the assignment function's groupings do similar work. Mathematically, this is not *all* that different from the first strategy. It will still be vulnerable to Fine's typographical complaint. (At least, if we construe functions in the standard way as sets of pairs. Perhaps we shouldn't do that.) Sometimes I think it does better on the arbitrary proxy worry. (Though other times it doesn't look much different to me than the view where variables are their own proxies.) It doesn't invite talk of "indirect reference"; so the worries about (90a) vs (90b) don't arise. This semantic strategy seems no less nor more vulnerable to indeterminacy worries than any ordinary semantics. A third strategy for a coordinated semantics is based on the "combinatorial" or "variable-free" semantics explored by Bealer, Jacobson, Szabolcsi, and others. The basic idea behind combinatorial logic is that variables disappear on analysis. Instead of "^x: Fx", we have just the predicate "F", whose semantic value may be a function from objects to truth-values. Instead of "^x: ^y: Fyx" we have just a predicate "C F" whose semantic value is the application of a function or "combinator" C to the function that's the value of "F". What C does is invert the order of the arguments supplied to the function value of "F". It's possible to do combinatorial logic with a very spare inventory of primitive combinators, and define others like C in terms of them. The standard primitive combinators are S and K. It's not important what these mean. I'll just observe that we should be able to turn any standard combinatorial semantics into a coordinated semantics by replacing S with different versions, some of which are understood as introducing coordination, and others not, and some as retaining existing coordinations, and others not. There never are any *variables* whose occurrences get coordinated, but we instead operate on coordinated and uncoordinated argument places in the same way that a standard combinatorial semantics does. But this is only an expectation, uninformed by any attempt to work it out. I find these approaches very interesting, but won't pursue them further here. (Fine discusses these approaches at 2007, pp. 18-21.) A fourth strategy is the "relational semantics" Fine sketches in 2007. We'll discuss this next. Section 11 ---------- I understand Fine 2007 to have three thematic components. 
The first component says things like:

* contents should *somehow* reflect coordination relations
* we should keep track of such relations across bodies of contents, too (pp. 55-6, 77-8)
* they should feed into some useful notion of consequence
* they can do much of the work that Fregean senses were supposed to do (and do it better)

There's little I disagree with about any of this. And I think other advocates of hyper-evaluative or coordinated semantics can, and will want to, subscribe to much of this too.

A second component is various ideas in the metaphysics and epistemology of semantics, such as his distinction between semantic requirements and semantic facts, what he says about transparency, his ideas about how intersubjective coordination should be understood. I'm sympathetic to much of this as well; I separate it into a second component because I think these claims may be more negotiable for an advocate of hyper-evaluative semantics.

The last component is the details of Fine's semantic implementation. This is what I understand his talk of "semantic relationism" to specifically mean. Fine argues that semantic features shouldn't be understood as built up out of any elements assigned to individual terms---not objects from the domains, nor proxy objects either (though his resistance to such approaches is left implicit). Or anything else of that sort. Instead, they should be understood as only being functions of *sequences or patterns* of terms.

Fine gives only a few hints about how this should go. He sketches semantics for a fragment of an extensional language on pp. 25-31. On pp. 53-7 he introduces some changes: (i) now names are in the language; and (ii) now the semantics generates structured contents rather than just extensions. The rest of the semantics is left an exercise for the reader. And in a way that's very satisfying. But at the same time, it leaves me unsure whether I'm understanding even such elementary matters as the binary connectives in the way he intends. Well, this should at least be *a* plausible way to understand him:

Orthodox semantics computes the semantic value of a complex formula "Fxy & Gy" in the following sort of way.

    [[ Fxy & Gy ]]
        depends in a certain way on each of
    [[ Fxy ]] and [[ Gy ]]    <-- recursive bases

As we recurse towards the base clauses, we work with smaller and smaller elements. We just have to apply base clauses to several such smaller elements. Fine's innovation is to do the recursion differently. Instead it will look like this:

    [[ Fxy & Gy ]] relative to @    <-- @ is a pattern that encodes which occurrences of the
                                        same variable are coordinated (quantificational binding
                                        has the effect that not all of them will be; but for
                                        present purposes, I'll suppress this)
        depends in a certain way on
    [[ Fxy, Gy ]]    <-- at this step we calculate the semantic value of a *sequence*; we're
                         not calculating the semantic values of several elements individually
        depends in a certain way on either of
    [[ Fxy, G, y ]] or [[ F, x, y, Gy ]]    <-- here we have a choice of which way to continue
                                                the computation, both of which deliver the same
                                                final result; contrast the orthodox semantics,
                                                where we required *each* of two computations
        which depend (in different ways) on
    [[ F, x, y, G, y ]]    <-- recursive base

At the recursive base, [[ F, x, y, G, y ]] will be a set of sequences of property extensions (let them be functions to truth-values) and objects. This set will be constrained by the coordination between the two occurrences of y: every sequence in the set must have the same object in the two positions.

An earlier step in the computation would look like this. Let's take the way that [[ Fxy, Gy ]] depends on [[ F, x, y, Gy ]]. That would come from a derived rule like this:

    [[ Fxy, Gy ]] = { <a, b> | exists f,a1,a2: a = f(a1,a2)
                               and <f, a1, a2, b> in [[ F, x, y, Gy ]] }

Note in such clauses we only need to supply *the objects* a1, a2 to the function f. This semantics is extensional. Differences in coordination are tracked in the semantic machinery, but no base predicate extensions depend on them.
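For concreteness, here is one toy rendering of that derived rule (my own modelling, not Fine's): a sequence-value is just a list of admissible tuples, and the base value [[ F, x, y, Gy ]] is stipulated by hand for a one-object domain.

    # [[ Fxy, Gy ]] = { <a, b> | exists f,a1,a2: a = f(a1,a2)
    #                            and <f, a1, a2, b> in [[ F, x, y, Gy ]] }
    def derive(base):
        # base holds tuples (f, a1, a2, b); the result holds tuples (f(a1,a2), b).
        # Only the objects a1, a2 are supplied to the extension f.
        return [(f(a1, a2), b) for (f, a1, a2, b) in base]

    cicero = "Cicero"
    praised = lambda a1, a2: (a1, a2) == (cicero, cicero)    # a toy extension for F
    # One admissible sequence for [[ F, x, y, Gy ]]: x and y both assigned Cicero,
    # and Gy already evaluated to True.
    base = [(praised, cicero, cicero, True)]
    print(derive(base))    # -> [(True, True)]

Nothing in this computation consults the coordination scheme; the extension praised sees only the objects it is given.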
An idea fundamental to Fine's whole book, though, is that this won't always be so. Some predicates like "believes" (or functors like "the proposition that") will be hyper-evaluative. They will need to differ in their extensions when supplied with differently-coordinated arguments.

It surprises me that Fine does so little to make this explicit. He does say on p. 57 that the coordination scheme now enters into semantic computations in a way it didn't before. But the discussion there makes it look like its only role is to determine what coordination links get into the structured proposition being generated. Also, he says: "There is no difference in what it takes for the sentences 'Cicero wrote about Cicero' and 'Cicero wrote about Tully' to be true, even though there is a difference in their coordinated content." (p. 59) without saying explicitly that for other sentences that is not the case. Some things we say are hyper-evaluative, and so their extension will be sensitive to coordination differences. (Acknowledging this will put some pressure on us to abandon talk of a sentence's "truth-conditions." In the general case, a predication's truth won't just depend on what its arguments are and what extra-semantic *conditions* those objects satisfy. It may depend also on how those arguments were coordinated in the predication.) Fine comes close to saying that "believes" is hyper-evaluative at p. 139n11. As I said, though, this is clearly fundamental to, and implicit in, most of his book.

[Notes:

1. Fine introduces an interesting innovation on pp. 116-17 without making explicit how it will fundamentally change the semantics hinted at earlier in the book. The idea is that 'believes[...]' can't be evaluated on its own, but "only in the context of other formulas with which it might be coordinated." That is, satisfying the open formula:

(92) x believes Fa ... x believes Ga

requires a subject to have *coordinated beliefs* about a's being F and a's also being G. What I think this demands of the semantics is that some of the semantic computations operate on multiple complex terms in a sequence (terms that aren't always adjacent) simultaneously. Though I find this idea interesting, on balance I'm disinclined to pursue it. Consider an attribution like:

(93) Each of Sam and Bess believes Fa, and Sam also believes Ga.

with the two occurrences of 'a' coordinated. Presumably the first belief-predicate "believes Fa" only occurs once, and if its semantic computation is coordinated with the computation of "believes Ga," then for Bess to satisfy the first predicate, the way she thinks of a will have to be intersubjectively coordinated with the way Sam (or she herself?) believes a is G. (Just as Fine says needs to be the case for Sam to satisfy the first predicate.) But I don't see why that must be so. Suppose Bess has a single name for a, and does not herself believe a is G.
Her name is also intersubjectively coordinated with only one of Sam's two names for a, call it a1. Sam believes a is F with both of the names she has for a, but only believes a is G with her a2 name. In this case, I'd still expect the report (93) to be true, but it looks like Fine's strategy would preclude that.

2. Even for belief reports taken singly, rather than as part of a sequence, Fine is inclined to think they should be sensitive to more than just their own arguments and coordination among them. They should also depend on coordination relations to other attitudes not part of the present conversation (pp. 120-1). I don't know whether I agree with this. But if it's true, I observe it's easier for me to imagine how it might be implemented on, for example, a proxy semantics, than on Fine's own relationist semantics.

3. On p. 139n13, Fine says approvingly that the CS notion of "pointers may...[be] essential to the correct representation of logical form." But as we mentioned, pointers are always implemented via proxies; so they look like an alternative to Fine's relationist semantics, not a vindication of it. Perhaps what he means here would be better put by saying that some ground shared between different proposals about the right semantics for hyper-evaluativity is what's essential. With this I wholeheartedly agree.

4. Trivia: For Fine, different occurrences of the same variable will always be coordinated if they're simultaneously free. I say not. Also I think Fine never has different variables being coordinated. I will do so. However, he never gave an account of how to handle lambda abstraction. If he did, he might well go the same way I do. Fine does allow typographically distinct expressions that aren't variables to be coordinated through anaphora ("John...he"); and different occurrences of a single name may or may not be coordinated.

End of notes]

Section 12
----------

Quibbles about some details aside, I don't really want to oppose Fine's relationist semantics. It's not my preferred machinery; but I'm not sure *which* of the different ways to implement hyper-evaluativity will in the end prove most satisfying. And as I said before, much of his book will appeal to *any* advocate of hyper-evaluativity. I only have two serious worries about Fine's discussion.

The first worry concerns Fine's arguments that no non-relationist semantics can properly account for his "antinomy of the variable." This is the challenge to explain how the semantic roles of x and y can be the same in "x>0" and "y>0", but different in "x>x" and "x>y". In these arguments Fine says things like this:

"The aim of [the orthodox] semantics...is to assign a semantic value to each (meaningful) expression of the language under consideration. Suppose that an expression E is syntactically derived from the simpler expressions E1,E2...En. Then the semantic value |E| of E is taken to be the appropriate function f(|E1|, |E2|...|En|) of the semantic values of the simpler expressions. Given semantic values for the lexical items of the language...the semantic value of each expression is then determined." (p. 25)

The expression "the appropriate function" here is doing a lot of work. For of course we don't want "Rab" to come out having the same semantic value as "Rba." It matters not only *which* items are being combined, but also *how* they are being combined. That will determine which function of those semantic values of "R" and "a" and "b" we work with.
And now it's not obvious to me what is allowed to count as a different way of combining and what is not. I don't take it as a given, for example, that the only options for "different ways of combining" are different permutations of those three semantic values to a single triadic function. Why should an orthodox semanticist be barred from saying that the way ">" combines with the semantic values of its arguments in "x>x" is different than the way it combines with the semantic values of its arguments in "x>y"? King endorses something like this in his 2007 p. 220n3 (with acknowledgment to discussions he and I had about it). What he says doesn't get developed, and maybe it's a move the orthodox semanticist shouldn't be allowed. But I'm not sure why that should be so. Or perhaps, when the orthodox semanticist goes down this path, what he ends up will just be something like Fine's view, perhaps in other clothing. I don't know. I'd like to know. My second worry has to do with Fine's account of the essential role coordination plays in learning. He writes: "We wish to explain how the hearer might be justified in inferring that Cicero is a Roman orator when he already knows that Cicero is Roman and is told 'Cicero is an orator,' though not when he is told 'Tully is an orator'... [I]n the first case, the proposition is not merely added to [the hearer's information] base but appropriately coordinated with the propositions in it---and, in particular, with the proposition that Cicero is Roman. But in the second case, the proposition is not coordinated with the other propositions in the base... It is evident that the inference to Cicero being a Roman orator will be justified in the first case, when the premises are coordinated, though not in the second case, when the premises remain uncoordinated." (p. 83) Now, consider: What *makes it* appropriate to coordinate incoming propositions with ones already in the base in one way rather than another? Is it up to the hearer to coordinate them however which way? Or is it a cognitive given that the propositions should be coordinated one way rather than another? What metaphysical picture should we have of this? I expect *there is* a difference between the cognitive experience of hearing incoming information coordinated one way with what you already believe, and hearing it coordinated another. But are we supposed to take that cognitive experience as a primitive explainer? I'd expect, instead, that we should have some picture of what *makes it correct* for the subject to hear information coordinated the way he does. Then on top of that we'd discuss justified mistakes, and so on. Maybe Fine has a picture of this in mind. What I'm expecting is a story that will look something like this: in addition to the set of coordinated beliefs in a subject's base, there are also some facts about the subject that amount to extra book-keeping machinery. For example, perhaps the sentences he originally acquired those beliefs by accepting (though so much detail won't be available in general). This extra book-keeping machinery might then ground the interpretation of an expanded sequence, consisting of sentences for his existing beliefs, together with the incoming sentence. And now how it's appropriate for his new belief to be coordinated with the old ones will derive from how the semantics interprets that expanded sequence. This very much fits the spirit of Fine's overall relationism. 
Alternatively, perhaps the book-keeping machinery will serve to ground facts about how the incoming sentence is inter-subjectively coordinated with attitudes had by the subject's interlocutors. In general, though, we want an account of learning that's available without other subjects. I'm not sure that extra "book-keeping machinery" will really be needed, but I expect it will. And if it is, it becomes a delicate question exactly what advantages Fine's approach still has over the "typographic" approaches he rejected. Won't the book-keeping machinery be a kind of mental typography? This is my second worry. It's just a worry. There are lots of choices to be made in developing what Fine says. But at any rate, this is what I meant earlier when I said I had doubts whether Fine's own view will retain the moral high ground about entanglements with typography. [Hovda, "Semantics as information about semantic values" mentions a worry like this as well (manuscript p. 8).] Section 13 ---------- Now I'll sketch my own preferred formalism and semantics. As I said earlier, I think it's quite open what is the philosophically most satisfying way to do this and I'm still studying the issue. (The semantics below are also rapidly evolving.) I'll think only about coordination of variables. For semantic purposes, I'm inclined to treat names as just variables which may (or may not) have extra constraints on their assignments. I think this formal strategy is neutral on questions about whether names have important semantic differences from variables, or whether names can be bound, or can have their assignments shifted by operators, as Cumming 2008 argues. I'll ignore context-sensitivity; it's orthogonal to the issues we're considering. A notational decision I've made is to make passing by reference be the default, and require a special symbol to express passing by value (and in so passing, breaking coordination). Let's use the prefix "$" for this. There's a choice where to put this symbol. Should we say: (lambda $x: BODY) y Or should we instead say: (lambda x: BODY) $y I've chosen to go the latter way. When an argument is supplied to a lambda abstract without $, its coordinations get carried along with it. For example, in: (94) Fx v (lambda z: Gz)x the argument G is supplied will still be coordinated with the argument F is supplied. This means that: (95) Fx v (lambda z: Hxz)x is not equivalent to: (96) Fx v (exists z: z=x & Hxz) If H is hyper-evaluative, it sees coordination between its arguments in (95), but not in (96). What (96) is equivalent to is: (97) Fx v (lambda z: Hxz)$x When a complex expression is supplied as an argument to a lambda abstract, the value passed in is not coordinated with any other terms, even ones happening to have the same value. We can also use the $ operation with atomic predications: F($x, $y, y) $x = y Although in many cases, such as the second, it will never make an extensional difference. I use "=" to express the familiar, fully evaluative relation of numerical identity. [Comments: 1. Where E is a formula with a single occurrence of y, which is both: (i) free, and (ii) wouldn't capture x, that is, it doesn't occur in any term of the form "lambda x: F", "exists x: F", or "every x: F", E[y->$x] should turn out equivalent to "exists y: y=x & E". When we see an expression of the form "...F($x,...)..." there will be a question about what "scope" to let the $x expand to: that is, there will be several Es such that that formula counts as an E[y->$x]. 
We adopt the convention of always interpreting that with the largest context E in which x occurs free. So:

(98) exists x: Fx v G$x

is:

(99) exists x: (Fx v Gy)[y->$x]

rather than:

(100) exists x: Fx v (Gy)[y->$x]

To write (100), we'd need to use instead:

(101) exists x: Fx v (lambda z: G$z)x

2. The $ notation isn't able by itself to express coordination-breaking operations in full generality. Using subscripts to represent coordination, we might sometimes want to say:

(102) exists x: F(x1,x1,x1,x2,x2)

Here all five variables are referentially bound by the same quantifier, but they've been broken up into two coordination groups. To do this with my notation, one needs to combine $ and lambdas, for example:

(103) exists x: (lambda z: F(x,x,x,z,z))$x

3. One doesn't want to evaluate (lambda y: ...)x substitutionally. For consider:

    (lambda y: Hxyz v (exists x: Rxy))x
               ----------------------

and suppose H or R may be hyper-evaluative. Let BODY be the underlined formula. We can't just evaluate BODY[substitute x for free ys], because then the second argument to R will be captured by the existential quantifier. We can't just evaluate BODY[substitute y for free x], assigning y the same value as x, because then we might lose track of coordinations between z and x. (The whole formula may be embedded inside another (lambda z: ...)x.) The best way to handle this is to evaluate BODY directly, but in a way that introduces a new coordination between x and BODY's ys.

end comments]

SYNTAX

* Atomic predicates F,G,H,... of adicity >0
* Variables w,x,y,z,...
* If F is a predicate of adicity n, and x1..xn are variables, then Fx1..xn is a sentence, that is, a formula with adicity 0.
* If x and y are variables, then (x = y) and (x hyperequal y) are sentences.
* Anywhere a variable x can appear, so too can $x. (Iterations are not allowed.)
* If E1 and E2 are sentences, then so too are (not E1) and (E1 or E2).
* If E is a sentence, and x1..xn are variables (free in E?), then (lambda x1..xn: E) is a predicate of adicity n where x1..xn are no longer free.
* If E is a sentence, and x a variable (free in E?), then (exists x: E) is a sentence where x is no longer free.

(E1 & E2), (E1 horseshoe E2), (every v: E), and so on can be defined in the usual way.

We'll introduce the semantics in stages. First, we'll give a semantics for an orthodox, uncoordinated and extensional language (lacking $ and hyperequal). Then we'll discuss how to extend it to a hyper-evaluative language.

STACKS

de Bruijn 1978 had the idea to eliminate variable names as follows. The "lexical depth" of an occurrence of a bound variable is how many scope levels that occurrence is away from the operator that binds it. He counts the most local scope as 0, the next closest outer scope as 1, and so on. In this example:

(104) ^y: (^x: y x) y
               1 0  0

the lexical depths are indicated below the variables. Note that y has a lexical depth of 0 in its second occurrence, but a depth of 1 in its first occurrence, because it's there more deeply embedded (inside the "^x:..." term). Using this technique, we could eliminate arbitrary variable names and just replace every variable-occurrence with an indication of its depth from the operator it's bound by:

(105) ^< >: (^< >: <1> <0>) <0>

This has some computational and metalogical advantages. We're going to draw from this technique. Doing so isn't *necessary* for a hyper-evaluative semantics, but it makes some things cleaner.
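As a quick illustration of de Bruijn's technique (my own sketch, with terms modelled as nested Python tuples; nothing in the formalism below depends on these details):

    # ('lam', v, body) is ^v: body; ('app', fun, arg) is application; a bare
    # string is a variable occurrence. Bound occurrences get replaced by their
    # lexical depth, counted as de Bruijn counts it: 0 for the most local scope.
    def debruijn(term, binders=()):
        if isinstance(term, str):                    # a variable occurrence
            return binders.index(term) if term in binders else term
        if term[0] == 'lam':
            _, v, body = term
            return ('lam', debruijn(body, (v,) + binders))   # the binder needs no name now
        _, fun, arg = term                           # an application
        return ('app', debruijn(fun, binders), debruijn(arg, binders))

    # (104)  ^y: (^x: y x) y
    ex104 = ('lam', 'y', ('app', ('lam', 'x', ('app', 'y', 'x')), 'y'))
    print(debruijn(ex104))
    # -> ('lam', ('app', ('lam', ('app', 1, 0)), 0)), i.e. (105)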
We won't require the object language to be written in form (105); instead we'll keep track in a "binding stack" of which variable symbols have been bound by which operators. We'll also have an "environment stack" that maps bound variables, specified by their lexical depth, into the domain. Variables that are never bound will be handled in the familiar way (these can be thought of as constants). These two stacks divide up the work of a standard assignment function into two components: first, mapping variable symbols to lexical depths; second, mapping lexical depths (and unbound variables) into the domain. When we move to doing hyper-evaluative semantics, this division of labor will be useful. We'll complicate the environment stack while leaving the binding stack the same.

Let's settle some general issues about our stacks. We'll understand a stack of length n into X to be a function from 0..n-1 into X. If s is such a stack, we'll let #s indicate the length of s. We'll let s[0] be the element of X that s maps 0 to, and so on. It will be useful to refer to stack indices backwards from the end; so where #s = n, we'll let s[-1] be s[n-1], s[-2] be s[n-2] and so on. It will be convenient for us to number lexical depths differently than de Bruijn does. We'll use -1 for the most local scope, -2 for the next one, and so on.

Since the environment stack needs to map into the domain not just the lexical depths of bound variables, but also free variables, we'll have the environment stack be a function from the natural numbers U the variables into the domain. I assume that no variables are natural numbers. (If they might be, then the domain of the environment stack should be a "tagged" or disjoint union of them, but I won't bother with that.) The length of the environment stack will still just be the number of integers it's defined on; the free variables don't affect its length.

If s is a stack of length n into X, and x is a member of X, then "s push x" will be the stack of length n+1 that maps #s to x and is otherwise just like s. So:

    (s push x)[-1]     = (s push x)[n]   = x
    (s push x)[-2]     = (s push x)[n-1] = s[n-1] = s[-1]
    ...
    (s push x)[-(n+1)] = (s push x)[0]   = s[0]   = s[-n]

And "s ? x" will be the first of the indices -1, -2, ..., -n such that s[s ? x] = x, or x if there's no such index. That is, s ? x searches backwards from the end of the stack to return the first (negative) index mapped to x. The push and ? operations are blind to whether a stack also maps variables or anything other than integer indices into X. Finally, if <i, j, v, ...> is a sequence of integer indices and/or variables, and e is an environment stack, we'll understand e@<i, j, v, ...> to be the sequence <e[i], e[j], e[v], ...>.
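Here is a small Python sketch of these stack operations (the modelling is mine: a stack is a list, the free variables live in a separate dict rather than in the same function, and lookup and at stand in for ? and @):

    def push(s, x):
        # "s push x": a stack one longer, with x on top.
        return s + [x]

    def lookup(s, x):
        # "s ? x": search backwards; return the first negative index mapped to x,
        # or x itself if there is none (i.e. x is free).
        for i in range(-1, -len(s) - 1, -1):
            if s[i] == x:
                return i
        return x

    def at(e, free, indices):
        # "e@<i, j, v, ...>": map each index (or free variable) into the domain.
        return tuple(e[i] if isinstance(i, int) else free[i] for i in indices)

    b = push(push([], 'x'), 'y')              # binding stack: x, then y on top
    e = push(push([], 'Alice'), 'Betty')      # environment stack, pushed in the same order
    print(lookup(b, 'y'), lookup(b, 'x'), lookup(b, 'z'))    # -> -1 -2 z
    print(at(e, {'z': 'Jack'}, [lookup(b, v) for v in ('x', 'y', 'z')]))
    # -> ('Alice', 'Betty', 'Jack')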
ORTHODOX SEMANTICS

A model M is a pair of a domain D and a lexicon L. The lexicon maps atomic predicates of arity n > 0 into sets of n-sequences of members of D. b will be a binding stack from integer indices into the set of variables. e will be an environment stack from integer indices U free variables into D.

1. Where F is an atomic predicate, [[ F ]] wrt M,b,e = L(F). The interpretation of "=" is the expected one.

2. Where F is an n-ary predicate, atomic or not, and x1..xn are variables, [[ Fx1..xn ]] wrt M,b,e will be:

    true if e@<b?x1, ..., b?xn> is in [[ F ]] wrt M,b,e
    else false

What's happening here is: we lookup each variable in the current binding stack b to see if it's recorded in the environment by its lexical depth (i.e., it's bound by some surrounding operator), or whether it's free. We map the environment onto the sequence of indices to get a sequence of objects from the domain, and check whether they are in the set which is the interpretation of the predicate.

3. Where E1 and E2 are sentences, [[ E1 v E2 ]] wrt M,b,e is:

    true if either [[ E1 ]] wrt M,b,e or [[ E2 ]] wrt M,b,e are true
    else false

Similarly for [[ not E ]].

4. Where E is a sentence, [[ exists x: E ]] wrt M,b,e is:

    true if some d in D is such that [[ E ]] wrt M, (b push 'x'), (e push d) is true
    else false

5. Where E is a sentence and x1..xn are variables (free in E?), [[ lambda x1..xn: E ]] wrt M,b,e is:

    { <d1, ..., dn> | d1, ..., dn in D and [[ E ]] wrt M, ((b push 'x1') ... push 'xn'), ((e push d1) ... push dn) is true }

This language is first-order; we don't permit any operations on n-ary predicates except supplying them with n arguments. So x1..xn are bound by a single compound lambda operator; expressions like "(lambda x: (lambda y: Fxy))" aren't well-formed. Nonetheless, we still count 'xn' as having lexical depth -1 and 'x1' as having lexical depth -n.

If E is a sentence and A is an assignment of E's free variables into D, then E counts as true on M and A just in case: [[ E ]] wrt M, a length 0 binding stack, and A taken as a length 0 environment stack is true.

HYPER-EVALUATIVE SEMANTICS

Now we let L map atomic predicates into sets of *grouped* n-sequences, and our environment will *group* the lexical depth indices and free variables that it maps into D. We'll have something like this:

    {-1}    --> Alice
    {-2, y} --> Jack
    {x}     --> Jack

A grouped function (GF) from set A into set B can be understood as a pair of a function f from A into B and an equivalence relation ~ on A, such that for any a1,a2 in A, a1~a2 implies f(a1)=f(a2). We'll call the set of As equivalent to a1 under ~ the GF's GROUPING of a1. If C is a GF from A into B and C- is a GF from A- into B-, we'll say that C- RESTRICTS C just in case (i) A-,B- are subsets of A,B respectively; (ii) for any a in A-, C- and C assign a the same element of B-; and (iii) for any a1,a2 in A-, a1 and a2 are grouped together by C- iff they're grouped together by C.

Our binding stack will be the same kind of ungrouped stack as before, and ? and push will work the same for it. With our new grouping environments, pushing operations raise the question of how the index of a newly pushed value should be grouped with existing elements in the stack's domain. We'll define "e push ungrouped x" to be the result of pushing x to the next index #e in e, such that #e isn't grouped together with any existing elements. We'll define "e push [i]" to be the result of pushing the existing value e[i] to the next index #e in e, such that #e is grouped together with i's existing group. And we'll define "e push *" to be the set of all possible pushings of an entity from D to the next index #e in e, with all legal groupings of #e together with existing groupings. That is, "e push *" is the set of all length #e+1 environment stacks that e restricts. Finally, we'll redefine e@<i, j, v, ...> so that it's now ((e push [i]) push [j]) push [v]...
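To fix ideas, here is one toy Python modelling of grouping environments and the first two pushing operations (the integer group labels and the names are mine; "e push *", which enumerates every legal grouping of the new index, is left out):

    # An environment is a pair: a list of values (indexable as before) and a
    # parallel list of group labels; two positions are grouped iff their labels match.
    def push_ungrouped(e, d):
        values, groups = e
        return values + [d], groups + [max(groups, default=0) + 1]   # a brand-new group

    def push_index(e, i):
        # "e push [i]": copy the existing value at i, grouped together with i's group.
        values, groups = e
        return values + [values[i]], groups + [groups[i]]

    def grouped(e, i, j):
        return e[1][i] == e[1][j]

    e = ([], [])
    e = push_ungrouped(e, 'Jack')     # position 0: Jack, in its own group
    e = push_index(e, 0)              # position 1: Jack, grouped with position 0
    e = push_ungrouped(e, 'Jack')     # position 2: Jack again, but in a new group
    print(grouped(e, 0, 1), grouped(e, 0, 2))    # -> True False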
Here's why we need to do this. Consider the formula:

    lambda y: Hxy

This is a 1-ary predicate. But since H may be hyper-evaluative, this predicate's interpretation needs to attend to whether what's being passed to y is or isn't grouped with x. For instance:

    (lambda y: hyperequal? x y)x

should be true, but:

    (lambda y: hyperequal? x y)z

should be false when z and x are ungrouped, even if z has the same value as x. This should also always be false:

    (lambda y: hyperequal? x y)$x

What this means is that we can't let the semantic value of:

    lambda y: Hxy

just be a set of 1-sequences of values for y. It has to be something more holistic. We're letting its semantic value instead be a set of grouping environments, defined over free variables like x as well as the lexical depth of y. In particular, the semantic value of:

    (lambda y: hyperequal? x y)

will be the set of environments that group x and -1.

Now our semantics goes as follows:

1. Where F is an atomic predicate, [[ F ]] wrt M,b,e = L(F), which will now be a set of grouping environments of length n into D. The interpretations of "=" and "hyperequal?" are the expected ones.

2. Where F is an n-ary predicate, atomic or not, and x1..xn are variables, [[ Fx1..xn ]] wrt M,b,e will be:

    true if e@<b?x1, ..., b?xn> is restricted by some member of [[ F ]] wrt M,b,e
    else false

If any of the xs is of the form $x, we use a variation. For example, suppose we have F$x1,x2,x3. Then the interpretation will be, not:

    true if ((e push [b?x1]) push [b?x2]) push [b?x3] is restricted by ...

but instead:

    true if ((e push ungrouped e[b?x1]) push [b?x2]) push [b?x3] is restricted by ...

3. Where E1 and E2 are sentences, [[ E1 v E2 ]] wrt M,b,e is:

    true if either [[ E1 ]] wrt M,b,e or [[ E2 ]] wrt M,b,e are true
    else false

Similarly for [[ not E ]].

4. Where E is a sentence, [[ exists x: E ]] wrt M,b,e is:

    true if some d in D is such that [[ E ]] wrt M, (b push 'x'), (e push ungrouped d) is true
    else false

There are three natural options for quantifiers in a hyper-evaluative language. A "substitutional" quantifier would count:

    (exists y: hyperequal? x y)

as true, because "hyperequal? x x" is true. A "purely evaluative" quantifier would validate existential generalizations only when a formula is satisfied by *every* way of referring to the same entity. With the quantifier understood in that way, the first comes out false but so too does:

    (exists y: x = y & not hyperequal? x y)

because "x = x & not hyperequal? x x" is false. A third "anonymous reference" quantifier also makes the first come out false, but makes the second come out true. It's enough to validate existential generalizations of this sort when a formula is satisfied by a new, ungrouped ("anonymous") term referring to a given object. I've chosen to go with the "anonymous reference" quantifiers. If one went instead for the substitutional quantifiers, the purely evaluative quantifiers could be defined in terms of them, as follows:

    exists-eval x: E  <==>  exists-subs y: (every-subs x: x=y horseshoe E)
                      <==>  exists-subs y: (lambda x: E) $y

5. Where E is a sentence and x1..xn are variables (free in E?), [[ lambda x1..xn: E ]] wrt M,b,e is:

    { e+ | e+ is a member of e push * ... push * [n times] and [[ E ]] wrt M, (b push 'x1' ... push 'xn'), e+ is true }
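Continuing that toy modelling (and repeating its two helpers so this sketch stands alone), here is how "=", "hyperequal?", and the "anonymous reference" existential might behave; this is my illustrative rendering, not the official clauses above:

    def push_ungrouped(e, d):
        values, groups = e
        return values + [d], groups + [max(groups, default=0) + 1]

    def grouped(e, i, j):              # hyperequal?: looks at grouping
        return e[1][i] == e[1][j]

    def equal(e, i, j):                # "=": looks only at values
        return e[0][i] == e[0][j]

    def exists_anon(e, D, body):
        # "Anonymous reference" existential: try each d in D as a fresh,
        # ungrouped witness, and evaluate the body in the extended environment.
        return any(body(push_ungrouped(e, d)) for d in D)

    D = {'Jack', 'Alice'}
    e = push_ungrouped(([], []), 'Jack')    # x sits at position -1 (the top)

    # (exists y: hyperequal? x y) comes out false on this reading:
    print(exists_anon(e, D, lambda e2: grouped(e2, -2, -1)))                            # -> False
    # (exists y: x = y & not hyperequal? x y) comes out true:
    print(exists_anon(e, D, lambda e2: equal(e2, -2, -1) and not grouped(e2, -2, -1)))  # -> True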
Section 14
----------

For a long time I've been an advocate of de re beliefs as a fundamental philosophical kind. (Though not of the satisfaction conditions for de re belief reports being such.) I've also long thought that there are interesting acquaintance requirements on these de re beliefs. I've also long had glimmerings of the ideas exposited above; but they were confused and partial. Now that I've gotten clearer in my thinking about hyper-evaluativity, I'm inclined to think it's the proper home for much of what I before wanted to hold about de re belief.

I now think like this: fix an initial class of attitudes, call them the "acquaintance" class. There will then be a natural class of other attitudes that are coordinated with the starting class. And I do think what holds this class together is philosophically fundamental; the unity of such classes will be important to philosophy of mind and epistemology and philosophy of language and action theory and so on.

However, did anything privilege the initial selection of the "acquaintance" attitudes? Are there any fundamental facts about whether, say, testimony-based attitudes should or shouldn't be included? I am now no longer very sure about this. Maybe there are. Maybe there aren't. At any rate, my own interest for now is to explore the connectedness relations that hold "de re beliefs" together, rather than questions about the right starting class.

Jeshion once said (discussing Donnellan?) that the debate about de re attitudes threatened to just collapse into Frege's Problem. The way I'm now thinking of the issue, this is sounding more and more right.