


BELIEF-DESIRE COHERENCE

by Stephen Petersen

A dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy (Philosophy) in The University of Michigan 2003

Doctoral Committee:
Associate Professor Eric Lormand, Chair
Associate Professor James Joyce
Assistant Professor Thad Polk
Assistant Professor Jessica Wilson

© Stephen Petersen 2003
All Rights Reserved


ACKNOWLEDGEMENTS

Tradition compels me to write dissertation acknowledgements that are long, effusive, and unprofessional. Fortunately for me, I heartily endorse that tradition.

First, of course, I want to thank my chair, Eric Lormand. If our marathon advising meetings (of up to six hours) hadn't guaranteed me many times the usual amount of mentoring a graduate student receives, then the sheer frequency of our sessions would have. I've learned an enormous amount of philosophy from him in that time, and though we disagree on some key points, it's against a wide background of views of his that I have simply absorbed. More importantly, I have also learned a great deal about how to do philosophy from Eric. In summary: over the years his influence on my philosophical development has run very deep. Besides that, he even came to my improv shows, and copied Muppet tapes for me.

Next I want to thank Jessica Wilson. She stepped in a year and a half ago as my second advisor, to advise me on a field that (she claims) is not her specialty. Her help was just what I needed, and I don't think I could have finished without her. She is a natural mentor—she has a talent for mixing, in ideal proportions, genuine encouragement and enthusiasm with careful and constructive criticism. Jim Joyce agreed late in the game to be my third committee member, but already by then he had helped me a great deal both philosophically and professionally, just in his role as general department mensch. And Thad Polk, my cognate committee member from cognitive psychology, rose above the call of what's too often a mere duty of formality.

He cheerfully tolerated my philosopher's take on cognitive science, and even pointed me toward some of the real stuff.

Many other professors expended a great deal of time and effort on my education; just about each member of the department has helped me at one point or another. I'd especially like to thank Ian Proops for a wonderful working relationship during my pre-candidacy, and continued moral support throughout my years here. David Hills and Jason Stanley also gave me much of their time and wisdom in my pre-candidacy days. Rich Thomason educated me about artificial intelligence where he could, and I loved debating philosophy with Dick Alexander over in the biology department. Louis Loeb and David Velleman served as model philosophy teachers for me. The philosophy department staff were always there for me, and I'm especially grateful to Sue London, Linda Shultes, and Michele Bushaw Smyk. Among other things, they all helped me get the full benefit of the fantastic funding package that the philosophy department provided. I was always proud to tell the graduate student union here how generous the philosophy department is.

On the next rung of the mentoring ladder are my fellow graduate students. I owe a special debt to my seniors in the program, notably Jim Bell, Karen Bennett, Jeanine Diller, Jeff Kasser, Evan Kirchhoff, Katie McShane, Angela Napili, Gerhard Nuffer, Greg Sax, Laura Bugge Schroeter, Nishi Shah, and Kevin Toh. And though of course playing at friendship and philosophical apprenticeship with Robert Mabrito was just a ruse toward foiling his diabolical plans, it turned out especially edifying. My cohort had great influence on me in both senses of 'great'—especially Stephen Martin and Blain Neufeld. Among my "juniors" I have had friends and philosophical comrades like Steve Daskal, Remy Debes, David Dick, Alexa Forrester, Soraya Gollop, Charles Goodman, Liz Goodnick, Robert Gressis, Alex Hughes, Rob Kar, Hanna Kim, Matt Pugsley, Justin Shubow, Paul Sludds, Tim Sundell, and Gabriel Zamosc-Regueros.


The last link of the university food chain is my many students, from whom I always learned a lot, and to whom I'm grateful.

Then there are just plain friends, who deserve more credit for my completing this dissertation than they likely know. Tilt, the comic improv group I founded and "artistically direct",1 was especially crucial to my survival; no matter how down I got, they made me laugh and got me to play, week after week. I grew especially close to Robert Gressis, Steve Kime, and Jenni Pickett through the years—thanks for those crucial buckets of fish. Another hobby was Project X. (All its members have been named, so no need to repeat them here, but extra thanks are in order.) Hillary Holloway, one of my best friends for fifteen years now, had the good grace to share five of those in Ann Arbor with me while she got her own PhD. She was especially supportive during those worst of times, in 2001 and 2002. So were, alternately, Jenifer Clark and Hanna Kim. There are many other friends who made Ann Arbor a better place; I'll put them in a smaller-type footnote so that I don't exceed an embarrassing three pages.2 Then there are the many friends from other times, or walks of life, or parts of the world.3 I'm sure I've tragically forgotten some. It's okay; I suspect that, to paraphrase the oracle, they know who they are.

Thanks also, of course, to my family. And that reminds me: thanks also to my therapists. The discovery of therapy was an unintended side-effect of the dissertation process. Finally, I'd like to thank the various telemarketers I've sued. After I ran out of departmental support, damages from their illegal activities funded my final year.
1. The scare quotes are important.
2. Tim Athan, Amy Burke, Anna Chen, Susan Chimonas, Jon Colman, Chris Consilvio, Alex DesForges, Eric Dirnbach, Anne Duroe, Rob Gray, Ben Hansen, Carmen Higginbotham, Jessica Hughes, Hana Ishizuka, Mark Krasberg, Kimberly Labut, Ji-Young Lee, Kevin Pimentel, Alix Schwartz, Kerry Sheldon, Melanie Sonnenborn, Becca Treptow, Mary Wagner, Mary Ann Wiehe, Wild Swan Theater, and Mina Yoo.
3. Most notably: John Ackermann, Andy Brownstein, Evan Cohen, Heather Cross, Bill Egbert, Brad Elliott, Alison Gima, Steve Meyers, Linda Perlstein, Eiko Sakai, Andy Sernovitz, Sharon Sidlow, Elijah Siegler, Aaron Thieme, and Al Weiss.


TABLE OF CONTENTS

ACKNOWLEDGEMENTS

LIST OF FIGURES

CHAPTER

I. INTERNAL PRAGMATIC EPISTEMOLOGY
    1.1 Creatures and functions
    1.2 Functions and internal epistemology
        1.2.1 Internal functions
        1.2.2 Epistemic norms and guidance
        1.2.3 Guidance and epistemic agency
        1.2.4 Guidance and internal functions
        1.2.5 A concrete example
    1.3 Internal epistemology and learning
        1.3.1 Learning and intelligence
        1.3.2 Learning and induction
    1.4 The normative status of the standard

II. WISHFUL THINKING AND COHERENCE EPISTEMOLOGY
    2.1 Wishful thinking
        2.1.1 Epistemic relativism and "anything goes" epistemology
        2.1.2 The wishful thinking objection
        2.1.3 Wishful thinking as rational
        2.1.4 Wishful thinking as irrational again
        2.1.5 Thinkful wishing
        2.1.6 A way out: spontaneous thoughts
        2.1.7 External constraints
    2.2 Coherence
        2.2.1 Rational desires
        2.2.2 Intelligence and rich aims
        2.2.3 Defaults and foundherence
        2.2.4 Coherence and humans

III. COMPUTATIONAL EPISTEMOLOGY
    3.1 Belief-desire coherence (BDC)
        3.1.1 Specifications
        3.1.2 Machine learning
        3.1.3 Formal coherence
        3.1.4 The belief-desire coherence algorithm
        3.1.5 The architecture, and a toy example
        3.1.6 Next steps
    3.2 Advantages and further issues
        3.2.1 DECO, ECHO, and BDC
        3.2.2 Explanations, contradictions, and BDC
        3.2.3 Folk psychology and BDC
        3.2.4 Emotions and BDC
        3.2.5 Ethics and BDC
        3.2.6 Conclusion

IV. TRUTH AND INTERNAL EPISTEMOLOGY
    4.1 Can we aim at truth?
        4.1.1 A warm-up argument
        4.1.2 BDC, a priori beliefs, and the truth aim
        4.1.3 Evidentialism
        4.1.4 More general problems: content
        4.1.5 More general problems: means to truth
        4.1.6 Lormand's responsible searching
        4.1.7 Dogmatism and truth
    4.2 Should we aim at truth?
        4.2.1 Skepticism and differences that matter
        4.2.2 BDC and truth's value
        4.2.3 The intrinsic value of truth
        4.2.4 Truth conditions and truth's value
        4.2.5 The instrumental value of truth
        4.2.6 Meta-pragmatism
    4.3 Concessions to truth
        4.3.1 A priori beliefs and external aims
        4.3.2 Pragmatic evidentialism

LIST OF FIGURES

Figure
3.1 A somewhat sophisticated, but unintelligent, creature.
3.2 A toy example of the BDCA.

CHAPTER I

INTERNAL PRAGMATIC EPISTEMOLOGY

Here are two great reasons for wanting to investigate intelligence: first, because we would like to build machines that are intelligent; second, because we would like to be more intelligent ourselves. Those who focus on the first reason tend to be in artificial intelligence, and those who focus on the second tend to be in normative epistemology. But people in both fields are interested in ways to make things smarter, and people in both spend at least part of their time trying to figure out just what it is to be smarter, anyway. In the next three chapters I will develop my positive answer to this question: a computational, internal, pragmatic, coherence epistemology. I call the theory, somewhat inaccurately and incompletely, belief-desire coherence (or BDC for short).1 I save the final chapter for those reluctant to give up a truth-based approach to normative epistemology. Now I do not know exactly what constitutes intelligence, but in this chapter I do think I can give some interesting necessary conditions. For something to possess intelligence I claim first that it must be what I will call a creature. And second, for at least an interesting level of intelligence, that creature must be capable of learning in a certain way.
1. Inaccurate because the coherence is not restricted to belief-level cognitions and desire-level conations, as we will see; incomplete because the name fails to include the view's computational nature, foundational impurities, and roots in an internal, learning approach to epistemology.

Greater capacity for learning—on a measure to come later—correlates well with the intelligence we are independently willing to attribute. Notably intelligent creatures (like pigeons, for example) can learn, adjusting their cognitive mechanisms toward improvement at some task. Especially intelligent creatures, like humans, can even learn how to think better: they can learn new ways to form beliefs and desires. Learning of any kind requires, I will argue, an internally available standard of correctness. In particular, learning to think better requires an internal standard of correctness for thought. I will argue that this internal standard for learning must take the form of matching cognitions to conations—something like what we might call subjective pragmatic success. As we will see later, there is a natural choice for an algorithm to compute according to this standard (given basic cognitions and conations), and a natural architecture to implement this algorithm that, as it happens, is like our own. Thus the standard and its potential implementation should be of interest both to those in cognitive science and to those in normative epistemology.

1.1 Creatures and functions
Only some things are even in the running for the label "intelligent". Animals can be intelligent, and maybe someday machines could be, and maybe some sophisticated machines are now to some degree intelligent. But couches, rocks, and solar systems definitely are not; they are just not the kind of thing we would even consider potentially smart. I think that is because for a thing to count as intelligent, that thing must first be a creature. Of course this is not a sufficient condition; there are many things we would naturally call "creatures" that are not intelligent. Single-celled organisms, for example, do not seem to be intelligent in any recognizable sense, yet they are clearly creatures. I also would consider plants to be unintelligent creatures, though I am not sure whether that is squarely within ordinary usage.

At any rate here is a rough characterization of creatures that at least approximates the ordinary notion: to be a creature is to have at least one function that can be performed autonomously—that is, solely through the performance of sub-functions internal to the creature. I emphasize that this is rough, and in particular I am not sure about the "autonomous" clause. But I do not think this will matter in what follows; the only aspect I lean on philosophically is the functional approach. The intuition is that creatures are things that have what we can loosely call wants—they have preferences for the world to be a certain way, and they do things in the world to achieve that way. Plants want their leaves aimed at the sun; robins want to stay warm in the winter; humans want better local theater. The first two can loosely be called "wants" because of functions those creatures possess and are capable of bringing about without assistance. By contrast the earth causes things nearby to fall toward it, but that is not even loosely a want of the earth's, for it is not a function of the earth to cause the world to be this way. And though knives have the function of cutting things, this is not even loosely a want for knives, because they have no sub-functions designed to bring about achievement of this function autonomously. To attribute creaturehood to something, we have to see it as having a function, and we have to see it as able to perform that function on its own through sub-functions. It is this latter requirement that allows us to see the thing as "trying" to achieve the basic function, and thus as having a want. Since 'want' is not really the right word, though, I will often call such functions of creatures aims. Note that this definition does not require creatures to be members of some biological natural kind, and I take this to be an advantage. Suppose with impressive new technology we managed to reverse-engineer a nightingale, and then build a robot that mimicked its functional behavior in all important details. By this characterization, that robot can be just as creaturely as its prototype. To say otherwise is, to my mind, to be biologically chauvinistic.

You might object that the robot bird cannot autonomously bring about its basic functions, since it required us to build it in the first place. But the biological nightingale had to be designed as well, by natural selection. (I will use 'design' in this inclusive sense throughout.) Things cannot be required to bring about their own existence, or set their own fundamental aims, in order to count as creatures. Or you might instead object that by this standard, my thermostat is a creature. It has the function of keeping the room a certain temperature, which it performs autonomously via an internal function to correlate a simple thermometer (usually a coil of thin metal) and a switch for the heater. Here I bite the bullet: yes, my thermostat is a very primitive creature, according to my characterization. There is a reason that Daniel Dennett, for example, picks the thermostat for "as degenerate a case of an intentional system as could conceivably hold our attention for more than a moment."2 We can attribute primitive intentional states to it that we cannot for, say, a radio. I think the line I have drawn for creatures explains why. It provides a lower limit for things toward which we can take what Dennett calls the "intentional stance". For similar reasons we can attribute primitive intentional states to the new vacuum cleaners that scoot about the floor on their own, and maybe for my email program's Bayesian spam filter, and so on.

1.2 Functions and internal epistemology
The simple notion of a creature already gives us purchase on internal approaches to epistemology. Before I show how, I should emphasize that taking an internal approach to epistemology need not make one an internalist; there may be several worthwhile epistemic norms, some internal and some external.3
2. Dennett (1981) p235.
3. See for example Alston (1993) or DePaul (2001) for pluralistic views of epistemology. The core of DePaul's argument is short and sweet enough to present here: if knowledge is better than true belief, then there is some epistemic norm other than truth!

I only wish to argue that one clear reason to examine epistemic norms demands an approach like the one that has traditionally been called internal. So my proposal opposes externalism to the extent that externalism denies the existence of any interesting internal epistemic standard, but does not oppose external epistemic projects in general.

1.2.1 Internal functions

Entities gain functions by design, which is (roughly) a process of giving feedback toward a goal. The goal can be explicitly represented in the designers, as when primitive humans made knives out of rocks by chipping them. In effect they gave the shape of the rock feedback by encouraging some shapes and discouraging others, toward their goal of sharper rocks. The resulting knife thereby has the function of cutting things. Such notions of design depend on intentionality, but there are naturalistic approaches to functions that do not require the explicitly goal-directed feedback of an intelligent designer. The goal can also be implicit in the feedback, as for example when the environment gives feedback to genome types through natural selection. More precisely (but still roughly): we properly say the effect of a thing is a function when the explanation for the thing's existence depends in the right way on successful achievement of that effect by it or others of its ancestral line.4 Achieving the effect gives positive feedback by encouraging the existence of that thing or its replicated progeny. If feedback designs a functionally organized entity with enough sophistication, that entity can develop its own internal functions, and we have the makings of a creature.
4. This etiological-selectional account of functions is based on the more precise work of Ruth Garrett Millikan, as in Millikan (1984). I think, though, that my account might also be amenable to the "homeostatic" characterization of functions, and perhaps to the "instrumental" characterization as well. For references see the helpful Manning (1998).

For example, I have mentioned the thermostat has an internal function to correlate a crude thermometer and a heater switch. Optimizing a function completely internal to the thermostat allows it to optimize the external function of regulating room temperature for which it was designed. Similarly, evolution designed hermit crabs with an internal function that correlates food-like splotches in their visual field with trajectories of motor-nerve stimulation to their pincers. This allows them to optimize their external function of getting food (which in turn allows them to optimize the function of passing on genes).5 Similarly again, we humans may have been designed by evolution to have functions for gathering (mostly) true beliefs. These functions may be optimized via an internal function that correlates with true beliefs—such as, oh, say, maximizing some kind of belief-desire coherence. In general let's say that "internal" functions are such that you would have to change the thing's intrinsic properties to change its success in achieving that purposed effect; "external" functions are such that you could alter success in achieving the relevant effect just by changing some relational properties of the thing. In each of the examples above, there is only a contingent connection between the success of the internal function and the success of the external one. Someone might be holding a lighter under the thermostat's thermometer, or it could be wrapped in an ice pack. The switch could be connected to a broken heater, or not connected to a heater at all. If so the thermostat could still optimize its internal function—correlating the thermometer with a switch—and yet not optimize its external function of keeping the room some temperature. The hermit crab, too, could be sitting before an image of food that leads it to optimize a trajectory of its pincer, only to clunk into a TV screen. And, I suggest, we humans could be deceived by an evil demon, maximizing internal coherence while holding mostly false beliefs. In all these cases, the internal function can work properly and still bring about effects that diverge from the purpose for which that internal function was designed.
5. I borrow the example loosely from Churchland (1995) pp93–95.
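To make the internal/external contrast concrete in code, here is a minimal sketch in Python (all names and numbers are invented for illustration; nothing here is drawn from the text itself). The thermostat's internal function—keeping its sensor reading correlated with its switch—runs identically in both simulations below; whether the external function of warming the room also succeeds depends entirely on relational facts the device cannot register, such as a lighter under the sensor or a broken heater.

```python
class Thermostat:
    """A toy creature: its internal function is only to keep the
    sensor reading and the heater switch correlated."""

    def __init__(self, setpoint):
        self.setpoint = setpoint
        self.switch_on = False

    def step(self, sensor_reading):
        # Internal function: correlate the reading with the switch.
        # Success here depends only on the device's intrinsic state.
        self.switch_on = sensor_reading < self.setpoint
        return self.switch_on


def simulate(room_temp, sensor_offset, heater_works, steps=5):
    """External function: keep the *room* near the setpoint.  Its
    success depends on relational facts (a lighter under the sensor,
    a disconnected heater) that the thermostat cannot register."""
    t = Thermostat(setpoint=20.0)
    for _ in range(steps):
        reading = room_temp + sensor_offset   # possibly "deceived"
        heating = t.step(reading)             # internal function works fine
        if heating and heater_works:          # external success is contingent
            room_temp += 1.0
        else:
            room_temp -= 0.5
    return room_temp


# Same internal behavior in both runs; only the relational facts differ.
print(simulate(room_temp=15.0, sensor_offset=0.0, heater_works=True))
print(simulate(room_temp=15.0, sensor_offset=30.0, heater_works=False))
```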

1.2.2 Epistemic norms and guidance

Now let me explain how I think this notion of an internal function can clarify issues in normative epistemology. The job of the normative epistemologist, as I see it, is to provide standards of evaluation for epistemic states (like thoughts and thinking processes). What is important about this job? One crucial answer is that such norms can give us direction for improving our thinking. That is, a key reason to determine what makes a better or worse epistemic state is that we would then like to attain the better ones. Put more simply, we want to be better at figuring things out; we want to be more intelligent. An important task for the normative epistemologist, then, is to figure out just what is involved in getting better at figuring things out. If I am interested in improving my own thinking, then first I will only be happy with epistemic norms that can guide me toward this thinking. I do not want a norm that has no effect on me. Of course a logic teacher can have a norm for thinking, and guide me toward it using arguments, grades, and such. Or it may be that I fall under the influence of an astrologer instead, who guides me toward those norms for thinking. I might be guided by any number of such norms. But to guide myself toward better thinking (as I would see it) I need an internal standard. This demand, I think, is crucial to all epistemic projects that we would intuitively call internal:

the guidance intuition: internal epistemic norms can guide thinkers from within toward achievement of that norm.

Although intuitive, the guidance intuition is not totally helpful, because this talk of "guidance" and "from within" is, so far, unclear. To clarify some, contrast a classic standard of evaluation for mental states such as knowledge. This standard endorses beliefs that are true, as a necessary component.

But according to another common intuition from internal epistemology, the imperative to "believe what is true" cannot guide a thinker in the way we would like.6 Chapter IV will assess this intuition, and the possibility of truth as an internal norm toward which we are guided by, say, internally available evidence. For now, let's just run with the intuition that the truth norm cannot guide us, and see where it leads us. Why couldn't such a norm guide us? Again intuitively: because to be guided by a norm from within, I need to be able to judge how I am doing with respect to it. I cannot adjust my thinking toward the direction of achieving this norm when I have no measure of where or how my thinking diverges from it. I do not have access to which of my beliefs are true and which false. If I somehow did, I could weed out the false ones. But as it stands the norm to "believe what is true" is not advice I can use, because it is not advice that could alter my mental behavior in the right way. The norm may actually be acting on me, through reliable mechanisms I have had from birth. But in order for a norm to guide me from within, I need at least a measure of error with respect to this norm that is "accessible" to me. This, too, is a common intuition from epistemology:

the accessibility intuition: error with respect to internal epistemic norms is accessible to the thinker.

Again, on its own this is intuitive but not all that helpful, because it is not clear what it means for the error to be "accessible". The guidance intuition can help us get a grip on this, however. Put it this way: a standard skeptical hypothesis has it that the same psychological state of a thinker could have mostly true beliefs in one world, and drastically false beliefs in another world (a world with deceitful demons, say).7
6. I assume that for a belief to be true involves a relation like correspondence with what that belief represents.
7. Here and throughout I mean these psychological states to be "narrowly" construed, preferably as functional states determined only by proximate inputs and outputs. Also I will use, following common but somewhat strange usage, 'mental state' for individual states such as a belief or feeling, and 'psychological state' for the total suite of mental states in a thinker.

If we acknowledge this hypothesis to be coherent, then we acknowledge that the world can differ in ways relevant to achieving the proposed epistemic standard of knowledge, without some corresponding difference in the psychological state. So the mental system does not always have an internal difference to correlate with achievement of the proposed standard. My resources for improvement are the same in both cases, though my error with respect to the standard diverges greatly. In fact, any proposed epistemic standard that demands some contingent relation obtain between the thinker's psychological state and something external to it will fail the guidance intuition. For since the standard is contingently relational, all the intrinsic properties of the psychological state could stay the same while its success in obtaining the relation to the things external could change.8 Therefore, again, error with respect to this standard cannot be cognitively accessible to the thinker. The word 'accessible' gets tossed around and debated in the literature on internal epistemology, but this first gloss is simple and innocent: for error with respect to a standard to be cognitively accessible, it must have at least some impact on the psychological state of the thinker. Since contingent relational properties have been ruled out, we are left concluding that only properties intrinsic to a thinker's psychological state can be candidates for epistemic norms that follow the guidance intuition—that is, candidates for epistemic norms we are intuitively calling "internal". Difference in success at attaining those norms, at any rate, will be guaranteed to have some cognitive impact on the thinker. Or, put another way, there will be no difference in psychological states' success relative to such norms without some difference in the mental states themselves. Or put still another way, we arrive at this other common intuition of internal epistemology:
8. As Jessica Wilson points out, this talk of "intrinsic" and "contingently relational" properties is best understood as heuristic, here; what is really at issue, as we will see, is whether the standard supervenes on the psychological state.

the supervenience intuition: success relative to internal epistemic norms supervenes on the (narrow) psychological state of the thinker.9

So far, then, with an intuitive approach to the criterion that internal epistemic norms must "guide" thinking, we have explained two other common intuitions about the division between internal and external epistemic projects. This is a good sign that we are onto something telling about internal epistemology. The supervenience intuition has independent intuitive force, I think, arising from an epistemic intuition I think still more primitive:

the demonic intuition: two thinkers forming the exact same thoughts based on the exact same experiences (narrowly construed) are surely, according to some standard, doing exactly as good an epistemic job—even if one is getting its experiences from a deceitful demon, and the other from ordinary objects.

The thinker experiencing ordinary objects would presumably have much more knowledge than the other, though, so it seems a standard of actual knowledge acquisition cannot satisfy the demonic intuition. Though the demonic intuition probably lies beneath the supervenience intuition, the latter is too weak to capture the full intuition behind the former—and the reason is telling. Suppose, for example, that a thinker did somehow manage to have psychological states guaranteed to change with the environment. Maybe, for example, that thinker magically has a tiny patch of constant visual information that would be a slightly different color in each other possible world. Then though any epistemic evaluation of this thinker would satisfy the supervenience intuition (trivially), it would not seem to be enough to guarantee that the norm guide the thinker, as required by the guidance intuition.
9. James Pryor has a similar supervenience formulation of internalism in Pryor (2001) p104.

Suppose, for example, the psychological state of a thinker deceived by a demon differed from a counterpart in having a tawny hazel patch where the other has a dusty rose. How could that guide the deceived thinker toward the truth, for example? Intuitively it cannot; and intuitively, therefore, that difference would not be enough to ground a different evaluative stance toward them. We would be left thinking, as in the demonic intuition, that we should evaluate both thinkers the same; the difference between them seems epistemically irrelevant.10

1.2.3 Guidance and epistemic agency

So what kind of difference, intuitively, is epistemically "relevant", or significant enough to warrant different evaluative stances? I think the most common and natural response to the demonic intuition is that even if some thinker is doing a horrible job getting knowledge in the demon world, we must hold that thinker blameless for this failure. We could not reasonably expect the thinker to do any better. Such intuitions about "blame" can incline one toward a deontological account of internal epistemology, and indeed it is probably the most prevalent approach.11 But here we must remember that an epistemologist might approach epistemic evaluation in several ways. It could be done deontologically, by asking whether epistemic duties have been fulfilled in thinking. Or it could be done as in virtue ethics, by asking whether the thinker exhibits the right epistemic virtues. Or, it is rarely remembered these days, it could be done consequentially—by asking how the resultant epistemic states fare relative to other possible mental states on some epistemic good. Perhaps there are other alternatives. Most internal epistemologists have gone the first route, and several are leaning more toward the second.
10. This "color patch" point is inspired by an argument of David Hills' (about personal identity in Hume).
11. Alvin Goldman makes a laundry list of deontic internal epistemologies at Goldman (1999) p116 that I will not rehearse. Earl Conee and Richard Feldman, in their own effort to sever deontology from internal epistemology, say that they "suspect that deontological arguments for internalism are more the work of internalism's critics than its supporters" (Conee and Feldman (2001) p239 n20), though they too concede that there surely are at least some.

But I think there is an important reason to keep the third in view: namely, it does not rely on a problematic notion of epistemic agency. What is problematic about epistemic agency? First, it is often left mysterious. By a "mysterious" epistemic agent, I mean a thinker or part of a thinker that has an unexplained (and apparently unexplainable) ability to "decide" on one thought over another when presented with alternatives. An epistemic deontology that relied on a libertarian notion of agency would presumably involve such mystery, for example. The language of deontic epistemology commonly evokes the image of a homunculus inside the head, surveying the potential justifiers that lie "before" it. The homunculus then mysteriously either pays attention to these justifiers and thus thinks in a permissible and responsible way, or else neglects these justifiers in a way that makes the thinker morally liable. If left mysterious, such homunculi would of course be anathema to any naturalistic approach to epistemology. We can be confident, for example, that work in artificial intelligence will not ultimately rely on them. So in order not to foreclose on the possibility of AI, we should see if we can do without them. More generally, to the extent that we have such naturalization as our goal, we should try to explain how decisions get made in physical terms. Naturally it may be that an unmysterious epistemic agency is compatible with deontic epistemology. But there is reason to worry here too; if we were to reduce the mystery by explaining the causal processes behind a decision, then our inclination to blame may reduce with it. Deontic approaches to epistemology seem to rely on a doctrine of doxastic voluntarism, and doxastic voluntarism seems more problematic the more the "voluntary" mechanism gets reduced. Even a doxastic form of compatibilism becomes less palatable when applied to the causal mechanisms underlying the choices themselves.12
12. The locus classicus on doxastic voluntarism as an objection to deontic epistemology is Alston (1988). For a recent overview of such issues, see Steup (2001).

Finally, epistemic agency causes problems for the accessibility constraints that internal versions of deontic epistemology require, as Goldman (1999) deftly argues. (I will not rehearse the arguments here.) So although I have just gestured at them, it seems there are enough problems with agential approaches to internal epistemology to warrant looking elsewhere.

1.2.4 Guidance and internal functions

Indeed, Goldman allows that a rationale other than "the guidance-deontological conception" (as he calls it) might help internal epistemology. He also points out that the deontic and guidance conceptions of epistemology come apart. He considers deontic epistemology without the guidance conception, but dismisses it (on behalf of his opponent) because it would not be sufficient to motivate internal epistemology.13 But he does not consider whether a guidance conception alone might lead to internalism, and forget the deontology. This of course is just the route I would like to consider.14 One attractive feature of epistemic agency is that it provides a clear sense of a bad choice relative to some norm, which in turn makes good intuitive sense of the norm's guidance. But the functional notion of guidance and accessibility sketched so far in section 1.2.2 does not ultimately rely on this intuitive talk of choices. It does not even need talk of an "agent" that "knows" what facts obtain, as in typical accessibility constraints.15 Instead, as we will see, we can think of accessible guidance in terms of internal functions. We explain doing well or doing poorly relative to this norm not in terms of duty violations, but in terms of effective performance of an internal function. Clearly this version of guidance does not rely on mysterious epistemic agency, since it could apply as well to thermostats and hermit crabs as it does to us.
14 13

On the reasonable hypothesis that our intelligence is different only in degree of sophistication from such simple functional systems, this is a good result.16 Remember that the idea linking the guidance intuition and the supervenience intuition was that error with respect to a standard needs to have some impact on the psychological state of a thinker in order for the standard to guide that thinker into correcting the error. This provides an alternate explanation for why the supervenience intuition was too weak. Where the deontic epistemologist might say that a mere difference in a tiny color patch is intuitively not enough to ground differences in epistemic blame, we can say instead that a difference in color patch does not by itself constitute an internal measure of error. That is because it does not have the right causal effect, we suppose, on the rest of the cognitive system. It does not provide feedback toward performing some function. I propose to understand epistemic guidance as a feedback mechanism. It is hard to see how different color patches could provide guidance in this feedback sense, and systems that differ only with respect to these patches are functioning equally well. We have seen that some creatures are such that they possess mechanisms designed to adjust their behavior in one direction or another, in order to improve their performance at some or other of their functions. The thermostat and hermit crab are two such. They possess internal functions. Furthermore, some creatures—roughly, the intelligent ones—have mechanisms that can adjust their behavioral dispositions, by adjusting their own mental computations. These mind-altering mechanisms are themselves part of the mind, and are themselves functions. Furthermore they too are internal functions; since the "behavior" of these functions takes place entirely within the cognitive system, their effects are internal to the creature. To alter their success or failure would require altering the creature itself, rather than merely its surroundings.
16. In particular, I suppose, on the computational theory of mind hypothesis.

Of course there is also external guidance, from the external feedback of external functions. After all, people may in fact be guided toward (mostly) true beliefs by the external facts and some reliable mechanisms that they possess, just as thermostats in normal circumstances are in fact guided toward keeping the room at a comfortable temperature by their reliable mechanisms. (Poorly-designed thermostats will not sell, and so get negative feedback from quality assurance or marketing.) So a functional construal of "guidance" need not exclude what we have called external functions. Indeed external epistemology has greatly benefited from functional approaches.17 But remember the intuitions that drove us to investigate internal epistemology in the first place. Evaluating a thinker according to its performance of external functions would satisfy neither the supervenience intuition nor the demonic intuition. The thermostat could perform its external function better or worse without altering its intrinsic properties at all. Similarly, a thinker badly deceived could be gaining much less knowledge with no change in its intrinsic properties. External guidance would not even satisfy the guidance intuition, since its motivation was from self-improvement and not environment-improvement. Of course internal functions may serve external functions, as we have seen. For the individual to produce some effect may help the individual-plus-external to create some other effect. But when the individual asks what it can do, to answer that some other entity (the individual-plus-external) should adhere to this-and-such norm is not yet an answer. Internal epistemology demands an answer at the level of what the individual should do, which is (I have argued) to demand a goal that can provide internal feedback.
17. I would understand "reliable mechanism" accounts like Goldman (1986) to be functional approaches; so of course are more explicitly functional external accounts like Plantinga (1993).

1.2.5 A concrete example

Here is an example to illustrate how such a standard of correctness for thought might actually be implemented in some machine. Paul Thagard, who is also interested in the area intersecting philosophy and artificial intelligence, has developed a computational model of explanatory coherence called ECHO. It takes inputs of percepts ("data"), explanatory relations, analogical relations, and contradictory propositions, and then uses these to calculate the most coherent explanation of the data available.18 The result is an assignment of a degree of belief (or disbelief) to each of the propositions being considered. This is impressive work, but I think the most fascinating aspect is the list of preset parameters that ECHO uses to calculate its conclusions. For example, the model has a preset "simplicity impact" parameter. ECHO diminishes the explanatory coherence of conclusions in proportion to the number of propositions required for the explanation, but leaves it up to this parameter how much to reduce the weights for greater complexity. There is a similar "analogy impact" parameter. ECHO also has a preset "skepticism" parameter that automatically decays each degree of belief by a certain small amount each cycle, a "data excitation" parameter for the default degree of belief assigned to the percepts, and a "tolerance" parameter that essentially adjusts the system's aversion to contradictions against its willingness to consider competing hypotheses. (At the extreme, a high-tolerance system can endorse contradictory hypotheses. Conversely, in a low-tolerance system, a hypothesis with even a slight edge in degree of belief will quickly cause disbelief in any competing hypothesis.) The correct value for each of these parameters is just the kind of thing tremendously interesting to epistemologists, of course. How much should we weigh simplicity in our theory preference—a great deal, or maybe none at all?
18. Thagard (1992) chapter 4.

If the theory exhibits analogy to other theories, is this a positive point in its favor, or a happy coincidence? How skeptical should we be about our conclusions—more like Sextus Empiricus, or more like G. E. Moore? Do our percepts warrant default, foundational justification? If so, how much? Just how bad is it to lend some credence to two contradictory propositions at once? ECHO cannot itself give answers to these questions. Thagard makes a provocative comment on the topic, though:

In a full simulation of a scientist's cognitive processes, we could imagine better values being learned. Many connectionist models do not take weights as given, but instead adjust them as the result of experience . . . it should be possible to extend the program to allow training from examples.19

That is, we would like ECHO (or, more properly, a system that incorporates ECHO) to adjust these parameters itself, learning to get better at its coherence calculations. The problem is, what is meant by "better"? By what standard will we compare the results of one array of parameter settings against another? One obvious answer is: adjust the parameters until ECHO endorses the true hypotheses. As programmers, we have an idea of what ECHO's conclusions should be given its inputs. We can tweak and fine-tune the parameters until its conclusions match ours. In the end, we might suspect ECHO roughly reflects our own parameter settings, and this could give some insight into our cognition. But of course finding these settings for ECHO would not settle any major questions of philosophy, for it is a presumption of normative epistemology that our own settings for such parameters may not be right. Put another way, we take ourselves to have the ability to learn better epistemic "parameter settings". And we would like to be able to model this ability in a machine that would incorporate ECHO. This is simply the guidance intuition.
19. Thagard (1992) p81.

How could ECHO learn its own parameter settings? We could build on top of it a mechanism that could read its output, compare it to an optimal state, and adjust its parameter settings accordingly. Of course that requires specifying what the optimal state is. In other words, a system incorporating ECHO would need an internally accessible standard for correcting its parameters. Only with such a background goal in mind could it learn on its own to adjust them one way or the other. Similarly, the presumption that we can improve our thinking via internal reflection involves a presumption that we can adjust our thinking toward some internally accessible goal state. I hope this example has made such a claim plausible; now I will try to argue for a more precise version, through the notion of learning.
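As a rough sketch of the kind of mechanism gestured at here—emphatically not Thagard's actual ECHO implementation—the toy code below stands in a placeholder coherence calculation for ECHO and wraps it in a loop that nudges the preset parameters in whatever direction improves a score the system can compute over its own states. Every name, the hill-climbing rule, and the scoring function are invented for illustration; the only point is that the adjustment is driven by an internally accessible standard rather than by the programmer's answer key.

```python
import random

# Stand-in for an ECHO-style coherence calculation: given parameter
# settings, return degrees of belief for some hypotheses.  A real
# model would run a constraint-satisfaction network; here the result
# is just a simple function of the parameters.
def run_coherence(params, inputs):
    skepticism = params["skepticism"]
    simplicity = params["simplicity_impact"]
    return {h: max(0.0, act - skepticism - simplicity * cost)
            for h, (act, cost) in inputs.items()}

# The crucial ingredient: an *internally accessible* standard the
# system can evaluate over its own output—here a toy score rewarding
# confident, unmuddled conclusions.
def internal_score(beliefs):
    confidence = sum(beliefs.values())
    muddle = sum(1 for b in beliefs.values() if 0.2 < b < 0.5)
    return confidence - muddle

def learn_parameters(inputs, steps=200):
    params = {"skepticism": 0.1, "simplicity_impact": 0.5}
    best = internal_score(run_coherence(params, inputs))
    for _ in range(steps):
        # Propose a small random tweak to one parameter (hill climbing)...
        candidate = dict(params)
        key = random.choice(list(candidate))
        candidate[key] = max(0.0, candidate[key] + random.uniform(-0.05, 0.05))
        score = internal_score(run_coherence(candidate, inputs))
        # ...and keep it only if the internal standard endorses it.
        if score > best:
            params, best = candidate, score
    return params

# Toy "hypotheses": (raw activation, explanatory cost).
inputs = {"H1": (0.9, 1), "H2": (0.4, 3), "H3": (0.7, 2)}
print(learn_parameters(inputs))
```

Nothing hangs on this particular score; the point is only that whatever plays its role must be computable from the system's own states if the parameters are to be learned rather than handed down from outside.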

1.3 Internal epistemology and learning
Only some creatures possess the kinds of mechanisms required to adjust their own cognitive dispositions. These are, I will claim, the creatures capable of learning. And this capacity to learn, I claim, is necessary for a thing to possess any interesting degree of intelligence. Let's take the last claim first: intelligence requires a capacity to learn. Think again of the thermostat. Not to put too fine a point on it, that is intuitively a pretty dumb creature. As Dennett points out for thermostats, they do not have a notion of temperature, or the heater, or anything of the kind. They just keep a thermometer and a switch in sync, as they were designed to do. They are oblivious to whether they are measuring room temperature or outdoor temperature, and oblivious to whether they are activating a heater or a coffee maker. They are unintelligent creatures, like algae or bacteria.

But imagine giving a thermostat, as Dennett suggests, further ways to detect temperature in a room, and further ways it can bring about a change in that temperature—say that it also measures air density, and it can open and close windows. We start to think of it as "smarter", then. It does not rely so much on its external environment being a certain specific way (the boiler working, the internal thermometer reflecting actual room temperature) for success at achieving its basic function. It is more adaptable. And it is that greater adaptability, I think, that gives it a greater degree of intelligence than its simpler cousin. Here I take it for granted that intelligence is a matter of degree. Dolphins and chimps are smarter than octopuses, and those are smarter than squirrels, and those are smarter than flatworms, and those (it turns out) are smarter than plants. I suggest that roughly the degree of adaptability in fulfilling creaturely aims is what explains the degree of intelligence we attribute. Naturally the fancy thermostat is still at the bottom of this spectrum; maybe it is at about the level of a very simple multicellular organism, or a decent plant. To make something truly smart—like the exalted flatworm—requires that the creature have a subfunction specifically designed to take cues from the local environment in order to improve its performance of some other aim. More specifically, the creature should be able to learn how to perform its aim better given that environment.20 Creatures that learn are creatures that meet our intuitive standard of adaptability for intelligence. And to possess the ability to learn, I claim, is to have a cognitive mechanism that provides feedback to other cognitive mechanisms, altering them toward the achievement of the creature's aims. Creatures that can learn, in other words, are creatures that have internal epistemic norms (as they are construed in section 1.2.1). Take the case of a mouse learning to run a maze. The mouse has an aim to eat—to get the food (the quicker the better!). We have found that a mouse can improve its performance at this aim over time. And importantly, this improvement is not accidental.
20. So it is no coincidence that John Pollock's recipe for building a person (in say Pollock and Cruz (1999) p179) involves adding machinery for learning—"inductive mechanisms"—at the very first step.
We can tell a causal story about how its relative ability to achieve its aim causes it to revise its performance in the next attempt; attempts that result in distance from the goal are inhibited, and attempts that result in proximity are encouraged. Something internal to the mouse is adjusting its probability of turning left at this corner, right at the next, and so on. The mouse apparently is able to change its dispositions to react to similar circumstances. There is some kind of feedback mechanism shaping its behavior. Again we might be tempted to think of the cheese reward as the feedback, and thus look at the learning as a matter of external functions. Other classic cases of operant conditioning spring to mind, where the feedback takes the form of shocks, or morphine, or what have you. But, naturally, for the environment to affect a creature's behavior is not sufficient for learning. The environment affects creatures like plants, too—creatures incapable of learning. Where the sun is affects how the sunflower positions itself, but we do not think that is a case of learning. The sunflower has one standing disposition to react to the position of the sun. Of course the environment does provide feedback to unintelligent biological creatures through natural selection. The genes of the creature, perhaps, improve their function according to this feedback by creating creatures more capable of passing them on. But no one token plant or genome improves its functioning at some task, and thus none learns in this way. That is why they are not so adaptable.21 And we can well imagine a creature very like our mouse but that never learns better ways to get the cheese, despite identical environmental feedback. The mouse does indeed improve at this external function, but only thanks to an internal one shaping its dispositions. So for a creature to learn a task, it must change its own dispositions to behave (construed broadly to include cognitive behavior) in the direction of improvement at the task.
Compare Dennett’s “tower of generate and test” in e.g. Dennett (1994). Plants and the like are merely what he calls “Darwinian” creatures, while those capable of some learning count as “Skinnerian” creatures and above.
21

The internal measure of error required for this process amounts to an attempt to assess the difference between the creature's current state and some state the creature would "like" to be in—a state the creature has a function to reach. That is, for the creature to have a mechanism to change its own dispositions for the better fulfillment of its aims requires the creature to be built to evaluate its own success relative to a goal, and to use its relative success as feedback for altering its dispositions. The evaluation of success relative to a goal, finally, consists of two things: an implicit representation of the goal towards which the creature is trying to adjust, and an implicit representation of the current level of success with respect to that goal. Put another way, creatures that learn have at least primitive conations and cognitions. The former are, roughly, representations of how the creature wants things to be (like desires and wishes in humans); the latter are roughly representations of how the creature takes things to be (like beliefs and suppositions in humans). Creatures that learn have representations of both the world-mind and mind-world directions of fit.22 In fact the ability to learn may explain where representations start to split off into these two directions of fit. Ruth Garrett Millikan points out that

Simple animal signals are invariably both indicative and imperative. Even the dance of the honey bee, which is certainly no simple signal, is both indicative and imperative. It tells the worker bees where the nectar is; equally, it tells them where to go. The step from these primitive representations to human beliefs is an enormous one, for it involves the separation of indicative from imperative functions of the representational system.23

The gap between bee dances and human beliefs certainly is large, but there are intermediate steps along the way.
22. For discussions of direction of fit see Humberstone (1992) or Searle (1983); I base my thoughts largely on Velleman (2000). Incidentally I occasionally use cognates of 'cognition' to refer to the thinking process as a whole, rather than just the non-conative—as for example in "cognitive system". I hope this causes no confusion.
23. Millikan (1989) p99.

On my account, animals that can learn to perform their tasks better can have non-propositional conations and cognitions in virtue of their internal feedback mechanisms. Learning is standardly conceived as "the acquisition of some true belief or skill through experience."24 But to explain learning in terms of gaining belief gets things backwards, I think, since clearly there are creatures with only proto-beliefs (nonpropositional cognitions) that are capable of genuine learning. I think instead the notion of learning can help explain the notion of primitive cognitions, which when propositional content is added can help explain the notion of beliefs. But surely, you might object, the mouse did not have even an implicit goal to run the maze when it began to learn, and yet it learned to run the maze just fine. Quite right—but at least at first, running the maze is not what the mouse was learning to do. The mouse was learning to get food, and this goal it does internally represent. It also represents how it is doing with respect to that goal. When it has food, it reinforces the behavior that led to the achievement of this goal in chemically recognizable ways. Or more accurately, when it thinks it has food, it reinforces what led to the apparent achievement of this goal. The mouse's behavior would presumably change just the same way if we could somehow directly stimulate its "I've-eaten" brain centers when it arrives at the target spot. Later the mouse may actually have an implicit goal to run the maze; it has learned this new capacity or function in the service of one of its basic aims. And we would be surprised if, without any food or shocks, the mouse happened to learn just the path through the maze we wished for it. We would put this down to accident, not learning, since it is hard to see how it could be in any sense "deliberate" on the part of the mouse. To count as learning the new ability must serve some previous, internally representable aim of the creature. After all, it is not coincidence that operant conditioning always involves rewards and punishments like food and shocks.
24. Gallistel and Glymour (1998).
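A bare-bones operant-conditioning loop can illustrate the feedback story just told about the mouse: an internally registered "food achieved" signal reinforces whatever turning dispositions produced it. The maze, the learning rate, and all names below are invented for this sketch; it is a toy illustration of the point, not a model of real rodent learning.

```python
import random

# A tiny corridor maze: at each junction the mouse turns "L" or "R";
# one fixed sequence of turns leads to the food.
FOOD_PATH = ["L", "R", "R"]

def run_trial(preferences):
    """The mouse's dispositions are probabilities of turning left at
    each junction.  Return the turns taken and whether food resulted."""
    turns = ["L" if random.random() < p else "R" for p in preferences]
    return turns, turns == FOOD_PATH

def learn(trials=2000, rate=0.1):
    preferences = [0.5, 0.5, 0.5]   # initially indifferent at each junction
    for _ in range(trials):
        turns, fed = run_trial(preferences)
        if not fed:
            continue
        # Internal feedback: the (apparent) satisfaction of the aim to
        # eat reinforces whichever dispositions produced this trial.
        for i, turn in enumerate(turns):
            target = 1.0 if turn == "L" else 0.0
            preferences[i] += rate * (target - preferences[i])
    return preferences

print(learn())   # drifts toward [high, low, low]: left, right, right
```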

The upshot is this: creatures are things that we can reasonably say have their own goals, and we rank creatures on a scale of intelligence according to how adaptable they are in achieving their goals. An especially useful tool in adapting, and so a sign of intelligence, is an ability to learn. As construed here, such learning is always done in the service of some internally representable goal of the creature's; or rather, the ability to learn in part constitutes having an internally representable goal. To learn new behavior φ is to gain the function of performing that behavior through internal feedback that provides guidance toward the performance of some higher aim or sub-aim ψ of the creature's. To learn to φ is also therefore to learn a new way to ψ. The mouse learns a new way to get food, for example, by learning to run the maze. (And this new way to get food is in turn a new way to survive, and a new way to pass on genes.) So when we humans want in particular to learn to think better, we want to learn new cognitive behavior, and to gain new or different cognitive functions. This learning too must be guided by internal feedback; it must be guided by a background and internally accessible goal state. Such a goal state must be either another sub-aim (that perhaps itself can be improved through learning according to the feedback of an aim higher up) or else a basic aim (that cannot be so modified). In humans these basic aims might properly be called "intrinsic ends". For example, perhaps it is a basic aim of ours to get food, or avoid pain.25 And the point for normative epistemology is that learning to think better, like learning to do anything, requires improving according to the internal feedback of matches between cognitions and conations. In other words, the ultimate internally available standard for better thinking—the standard that would permit us to learn better thinking—is the better (apparent) satisfaction of our ends.

25. As I will argue in chapter II, though, any one such aim can be overruled by sufficiently many others, according to a comprehensive calculation of coherence.
1.3.1 Learning and intelligence

Let's look briefly at a series of fanciful examples. On the one hand, the examples should demonstrate that our natural inclinations for evaluating thinkers incline us to prefer the ones that learn in the sense outlined. On the other hand, the examples will help illustrate an objection to my construal of learning: namely, that if we construe learning as a process of induction instead, then learning does not require background aims. I disagree, and will argue in section 1.3.2 that the inductive process too depends on a creature's internally representable aims. But first, the examples.

Consider two brains in vats, Amanda and Bill. They were both captured by aliens shortly after birth. In the vats their afferent electrical and chemical impulses are adjusted magnificently by the clever aliens to simulate for them a life very like the one they would have led had they not been captured. Amanda does what we would ordinarily consider a good epistemic job. Of course her earnest cognitive work is not getting her many truths—but we are likely to think that, at least in some sense, she is no worse a thinker for that. Bill, meanwhile, is a pathological wishful thinker. Whenever he desires that p, he comes to believe that p—or at least he attempts to form and sustain that belief as far as possible. Bill is doing no worse than Amanda when it comes to getting truths, we can suppose, but we are still likely to think that he is the worse thinker of the two.26 Why? What is so wrong with wishful thinking that we are still tempted to fault Bill, even though presumably no epistemic strategy will get him (empirical) truths? Is wishful thinking simply an irreducible epistemic sin? I do not think so. For consider two more brains in vats, Carl and Denise. These two were lucky enough to be captured (at birth) by
26 The example is reminiscent of one in Pryor (2001), though he points out that internal epistemologists have appealed to such brain-in-a-vat examples since at least Cohen (1984).

relatively beneficent aliens, who attempt to make up for their atrocity by replicating for their abductees what they think must be an ideally pleasant environment. The aliens go about this by determining Carl's and Denise's desires, and then creating for them the right inputs to their brains for them to believe that their desires are satisfied. The aliens are very good at this. As a result Carl has developed a wishful thinking mechanism, forming the belief that p whenever he desires that p. Can we blame him? It seems he has good reason to think that he lives in a world where all his desires are satisfied. Every time he has wanted something to be the case, he has soon come to believe it is the case. He hopes there is a ferris wheel behind him, and when he turns around, sure enough he has such an experience. He hopes there will be peace on earth, and comes to read in the paper that there is. And so for each of his desires, until eventually he does not have to turn around or read the paper—he just makes a reasonable assumption. He also wants there to be undetectable pixies, and comes to believe in them too. He does not give himself explicit inductive reasons to believe in the pixies, though; instead, it is simply a result of his wishful thinking mechanism. Denise, on the other hand, will not form the corresponding belief when she notices a desire. She waits until she has the required experience—which she inevitably has, except in cases like her desire for undetectable pixies. She does not have experience of those, of course, and so she does not believe in them.

Which brain is smarter? I think Carl's. It has adapted better to a (simulated) environment radically different from ours. Both Carl and Denise have good (subjective) reason to believe there are undetectable pixies, and only Carl is savvy to it. Put it this way: suppose instead of being brains in vats, they were both actually in a world where desires that p thereby brought it about that p, just as a law of nature. Denise would be ignorant of this law, while Carl would have learned it. But we can suppose that world is experientially

identical to what they get in the vats, and so they should come to think the same way. In either situation Denise should notice the potential success of a wishful thinking mechanism, just as Bill should notice its failure. Just what is meant by "success" and "failure" here? Well of course that is the question. It clearly does not have to do with procuring truth. I have already suggested it has to do instead with how well their thinking manages to serve their further goals—or rather, how well their internal learning functions manage on the standard of matching the cognitions to the conations.27 This explains both why Amanda is a better thinker than Bill, and why Carl is a better thinker than Denise. It also explains why wishful thinking is wrong both in Bill's simulated world, and in our own. Wishful thinking is unintelligent if a better adaptation to the (apparent) local environment would do without it, where a "better adaptation" is understood in terms of apparent instrumental success.

1.3.2 Learning and induction

But an alternative explanation for these brain-in-vat intuitions is to say, for example, that Amanda and Carl are simply better at predicting experience. They have formed inductive inferences (implicitly or explicitly) that Bill and Denise have not.28 Amanda predicts her experiences reasonably well, while Bill predicts his favorite curry for dinner and often gets pasta instead. Carl predicts there will be a ferris wheel behind him, while Denise fails to. Understanding learning this way would also account for the greater degree of adaptability, and thus greater felt intelligence in Amanda and Carl. And their skill at predicting
27 Wait a minute! Isn't Denise doing exactly as well as Carl when it comes to this standard? No. For example, Denise does not get to believe in pixies. Closer to home, she wants there to be money in her bank account, but wastes time actually checking to see—time she could have spent better. Denise also wants people about whom she has no information to be happy, but does not believe they are. And so on.
28 In the background I am picturing formal versions of inductive learning—such as updating degrees of belief according to Bayesian rules, or a "logic of inquiry" approach like that explored in Kelly (1996). These approaches are of interest to me since they have notable potential for artificial intelligence. But I think my points would apply equally to other versions of inductive learning.

experience is a measure independent of background goals, we might think. Either they are correctly predicting experience or they are not, whether or not they want to. And, we might think, they are smarter to the extent they can catch on to the environment around them, and predict it—whatever their goals are. On this version of the inductive learning proposal, predicting experience is a primitive epistemic good.

But consider two more brains in vats (why not). Erin's and Fred's brains are the victims of a renegade, demonic alien who likes toying with them. In particular the alien has it rigged so that whenever it detects a prediction of experience on the brains' parts, it provides an experience to contradict it. As a result Erin's experience-prediction mechanisms have long ago withered away as merely frustrating. She has found other ways to cope, such as "predicting" experiences that she does not want. Meanwhile Fred continues to predict his experiences, and gets them wrong every time. Again, one of these brains is failing to learn, and I think that brain—Fred's—is the less intelligent one. Of course this lack of intelligence may simply be from hardwiring; perhaps Fred cannot help but attempt to predict his experiences, because he is just built that way. He is built, that is, not to be able to adapt to a circumstance where predictions do not serve him. Of course we can construe Fred's failure to learn as another failure to induct, this time a failure to induct from his failure to predict. And I agree that learning can often, or perhaps always, usefully be understood as an inductive process. What I do not agree with is that through induction we can understand learning independently of a creature's aims. Put it this way: what is Erin correctly inducting? In what sense did Erin learn that her experience-predictors fail? To learn that her prediction mechanism fails her, she would need a mechanism overlooking it and adjusting it according to its performance. Suppose what is monitoring the experience-predictor spits out a signal like 1 when an experience matches a prediction, and spits out 0 for a mismatch. Why should Erin's brain be set to

encourage the experience predictor when it reads a lot of 1's, and inhibit it upon reading 0's? Why not the other way around? Presumably this is because of Erin's ends. She tries to predict for a reason; prediction is meant to serve higher aims, and when it does not she is adaptable enough to learn not to use it. If some creature in our world adjusted such a mechanism toward getting more mismatches, we intuitively would not call this learning or intelligence, even though it is induction. That is because in this world matching predictions serve aims, and mismatching ones do not. Put in terms of induction, there have to be at some point preset categories that Erin simply, primitively inducts upon—maybe ones like "pain-producing" or "food-obtaining". The categories on which she is primitively designed to induct set the standards of success and failure for her derivative inductive mechanisms.

Another way to look at it is through the impetus to inquiry. Suppose we can correctly describe all learning as induction, and suppose also that creatures do not start biased with a small number of primitive, preset categories. Instead suppose there is a huge multitude of categories for inducting, such as "stimulation type 39 of optic nerve 781", and suppose the induction starts with an indifferent distribution of probabilities for these categories. Then the creature's probability distribution will have huge entropy (as they call it)—that creature will be completely uncertain about everything, and has no way to pick one hypothesis over another. And the question then becomes: what drives the creature to change those probabilities, investigating some hypotheses rather than others, in an effort to reduce this entropy? Presumably it will be built to inquire only into certain matters of interest to it.

In summary, then: intuitions from brain-in-a-vat examples lend credence to the idea that an ability to learn by adjusting to the apparent environment is a mark of better thinking. And interpreting this learning as an inductive process does not excuse us from the conclusion that the internal standard of learning is a matching of cognitions to conations.
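The entropy remark above can be given a quick gloss (my own illustration; the formula is standard and not the dissertation's own notation). For an indifferent distribution over N candidate categories, each with probability 1/N, the Shannon entropy is

\[
H(P) \;=\; -\sum_{i=1}^{N} p_i \log p_i \;=\; -\sum_{i=1}^{N} \frac{1}{N}\log\frac{1}{N} \;=\; \log N ,
\]

which is the maximum possible for N categories and grows without bound as N does. With a huge multitude of categories and no antecedent interests, the creature starts at maximal uncertainty, and nothing in the distribution itself singles out any hypothesis as worth investigating first.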


1.4 The normative status of the standard
The argument so far takes this skeletal form: only creatures—things in possession of the right kind of functional makeup—can be intelligent. Our intuitive attributions of intelligence track the adaptability of the creatures to their environment, and the capacity to learn is especially powerful in providing such adaptability. The capacity to learn, furthermore, is the possession of an internal function to adjust behavioral dispositions (including cognitive ones) through feedback in the direction of the creature's aims. This internal feedback gives the creature, implicitly, primitive conations and cognitions—representations of how the creature wants the world to be, and how the creature thinks the world is. In particular creatures with the capacity to learn to think better (like humans) must do so according to some internal feedback mechanism. The resulting picture satisfies major intuitions of internal epistemology without obviously relying on any mysterious version of epistemic agency. So the proposed internal standard of correctness for thought is that of apparent goal satisfaction. According to this proposal, internal epistemology is pragmatic epistemology.29 In the next chapters, we will see how external constraints on goal satisfaction, both from the conative and cognitive side, restrict the kinds of thinking that would best bring it about. (For example, wishful thinking is not an effective way to think, even by this standard.) I will also show how such a pragmatic epistemology can do without a naive, instrumental version of practical reason. The result will be a "foundherentist" approach to mental computation. The approach has a natural algorithm to make it more precise,
29 Crass and shocking, I know. But strangely, many epistemologists who wish to connect up with artificial intelligence are pragmatists. Thagard for example defends a pragmatic approach to scientific realism in Thagard (1988). Patricia and Paul Churchland are pragmatists. And hidden away in Pollock and Cruz (1999) are claims like "the ultimate objective [when evaluating cognitive architectures] is not truth, but practical success" (p175). There could be a merely sociological explanation for this tendency, though I think I have given a good philosophical one.

and that algorithm has a natural architecture to instantiate it. The result will help patch normative epistemology into cognitive science, which I think is all that can be asked of "naturalized" epistemology.

Meanwhile, what is the normative status of this proposal? For one thing, I am attempting to describe in more detail the thinking humans do in virtue of being intelligent. That involves describing the standard according to which I think humans must, ultimately, adjust even their thought-forming mechanisms. If they must so adjust, where is the normativity? In a system's relative success or failure at achieving this standard. Such failure or success, for example, can explain our intuitive attributions of rationality and irrationality in the brain-in-vat cases—intuitions fundamental to internal epistemologists. In attempting to describe intelligence, I am engaged in a normative endeavor, since calling something intelligent is at least partially evaluative. For another thing—or, for another way to view the same thing—I am making a recommendation about how to go about building smart robots. Finally, I am indirectly endorsing a philosophical program, advocating a more hearty acceptance of epistemic pragmatism (properly understood). I have claimed essentially that given the kind of creatures we are, and given what it means for such creatures to learn, we cannot help but be epistemic pragmatists. This is no horrible thing; rather, it should be embraced, for it means the furthering of all our rational goals. And philosophical problems that ignore pragmatic implications—or look for a standard beyond them—should be reexamined for their point.

CHAPTER II

WISHFUL THINKING AND COHERENCE EPISTEMOLOGY

So far I have argued for an internal, pragmatic epistemology that at least has potential for being naturalized. In this chapter I argue that achieving such an internal standard must involve achieving a coherence among all thoughts, both cognitive and conative. First, though, I look at a completely natural objection to make at this point against any internal pragmatic epistemology: namely, that it seems to endorse wishful thinking. My response to this objection leads smoothly to the topic of coherence—and thus to a more specific form of internal pragmatic epistemology that will bring us still closer to a computational model.

2.1 Wishful thinking
A pragmatic epistemology holds that thoughts are better according to whether they contribute to desire satisfaction for the thinker. (Contrast, incidentally, a pragmatic metaphysics, which holds that truth—rather than something like justification—depends in some way on instrumental success. I will often speak loosely of "pragmatism" when I mean the epistemological variant.) An internal pragmatic epistemology (like mine) construes desire satisfaction internally, so that a thinker's desire is "satisfied" when the thinker believes


that the desire has been attained, whether or not it actually has been. Thoughts are better when they contribute to apparent desire satisfaction, or what we might call "subjective happiness". But this just sounds like a solipsistic and tender-minded incitement to epistemic irresponsibility, amounting to advice along the lines of "believe what you want, and evidence be damned." Since I have a desire to be an adventurous pirate, any beliefs I might form to the effect that I am one appear by internal pragmatism to be justified. In other words, the position seems to countenance and even encourage that worst of fallacies, wishful thinking. But any normative epistemology that endorses such an obvious fallacy hardly deserves the name. Whims would count as reasons; if you want there to be pixies, a belief in pixies is thereby justified, and if the next moment you wish there were elves instead, a belief in elves is now best. If two people or cultures have contradictory desires, then the contradictory beliefs are both justified; for that matter if within one person can exist both a desire that p and that not-p, then the associated contradictory beliefs are both justified.1 Pragmatism of this sort seems to entail that "anything goes" when it comes to evaluating thoughts; it seems, in effect, to give up on normative epistemology altogether.

1 Here and throughout I am using 'justified' in what is intended to be an ecumenical way, as a kind of shorthand for "good according to some epistemic norm."

2.1.1 Epistemic relativism and "anything goes" epistemology

In fact wishful thinking seems to come part and parcel with a general permissive relativism that gets associated with pragmatism. But it is easy to conflate the following three objections to an epistemology:

1. It condones wishful thinking.
2. It is relativist.
3. It is such that "anything goes".

Stephen Stich is sensitive to similar objections in his argument for an external pragmatic epistemology.2 Though wishful thinking is not such a concern for his position, he carefully separates the other two points. Then he grants that his position leaves epistemic evaluation highly relative to circumstances and goals, and does not consider it a problem for his view. He points out that even truth-based, consequentialist approaches will involve similar relativism.3 But then, according to the "anything goes" objection, for pragmatism to commit to relativism "simply gives up on the project of distinguishing good cognition from bad."4 Wishful thinking gets implicated because it appears to be the inevitable result of such epistemic promiscuity—no belief is intrinsically better than any other, so when faced with the task of choosing one, believe what you want. Stich's response:

Pragmatism does not give up on the project of assessing cognitive processes. Quite the contrary. Epistemic pragmatism offers an account of cognitive evaluation that is both demanding and designed to produce assessments that people will care about. . . . It will near enough never be the case that pragmatism ranks all contenders on par.5

This passage applies to internal pragmatists, too—for only some beliefs will lead to apparent desire satisfaction! At least, then, I have reasons not to believe some things, such as things that I do not want to believe. Therefore not anything goes. Still, this hardly rules out wishful thinking. The epistemic pragmatist (Stich included) does claim that no belief is intrinsically better than any other. He does not conclude that you should simply believe what you want; he does conclude
2 Stich (1990).
3 For one thing, what cognitive processes garner truth will be relative to whether or not you are in, for example, a perverse demon world. For a more grounded example, it seems that whether to be an intellectual maverick or a conservative in your academic field might well depend, in attempting to maximize truth, on the current relative percentage of mavericks and conservatives in your community. See Stich (1990) pp136–140.
4 Stich (1990) p141.
5 Stich (1990) p141.

that you should therefore believe what "would be most likely to achieve those things that are intrinsically valued," usually by the believer.6 When the internal pragmatist replaces 'achieve' with 'apparently achieve', wishful thinking seems to follow.

2.1.2 The wishful thinking objection

It is not hard to see how. Compare two belief-forming mechanisms: one forms beliefs according to ordinary practices of evidence evaluation and the like, and the other always wishfully thinks. That is, it automatically forms the belief that p for any desire that p, and gives the belief weight that trumps any other belief formed by any other mechanism should there be competition. We are to evaluate these systems according to a standard of apparent desire satisfaction, which is roughly to evaluate the ratio of matches between desires that p and beliefs that p. It seems clear that the wishful thinker would have a serious advantage in this evaluation.7

6 Stich (1990) p131.
7 Contrast Bill, the wishful thinker in a vat from chapter I—we suppose in that case that at some point he is not able to maintain his fanciful beliefs when faced with blatant evidence to the contrary. He is not a good learner because he does not notice that his beliefs are always being thwarted in this way.
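The evaluation just described—the ratio of matches between desires that p and beliefs that p—can be made vivid with a small toy calculation. This is purely my own illustration with invented contents, not anything from the text, but it shows why a wishful-thinking policy looks like it wins by that crude measure.

desires = {"I have eaten", "I am safe", "I am a pirate", "There is peace on earth"}

evidence_based_beliefs = {"I have eaten", "I am safe",
                          "I am a philosophy student"}   # formed by ordinary evidence evaluation
wishful_beliefs = set(desires)                           # wishful thinking: believe whatever is desired

def apparent_satisfaction(desires, beliefs):
    # Fraction of desires whose content is also believed.
    return len(desires & beliefs) / len(desires)

print(apparent_satisfaction(desires, evidence_based_beliefs))  # 0.5
print(apparent_satisfaction(desires, wishful_beliefs))         # 1.0 -- the wishful thinker "wins"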
Naturally such a pure wishful thinker would never live very long—at least, a human one wouldn't. When hungry he will simply form the belief that he has eaten rather than get food, and when thirsty he will simply form the belief that his thirst has been quenched. He would never move, and be dead within days (faster if there are tigers). This is little defense against wishful thinking for internal pragmatism, however. First, the wishful thinker may not desire a long life, and our evaluation of his cognitive system can only be done relative to such desires (and perhaps we shouldn't blame him, if he can truly live his short days in such ecstasy!). Even if he did desire a long life, he need merely believe that he will enjoy one, according to an internal pragmatic standard. Luckily for him, if he does have the desire to live a long life, he also believes that he is living one. He thus is remarkably effective in gaining apparent desire satisfaction, and so by internal pragmatic standards his cognitive system is top-notch. True, he may not be able to contain all this epistemic goodness for very long. That is, he may not be able to garner very much of it during his short lifetime. But first, it is not clear that some quantity of desire-satisfaction is the appropriate goal to judge cognitive systems by; a cognitive system that brought about tiny bits of desire satisfaction every year or so over the period of eons is not obviously preferable to a high concentration of less total subjective happiness over a shorter amount of time. And anyway, perhaps a somewhat less pure wishful thinker would last longer at the expense of mere 98% desire satisfaction, say—and perhaps somewhere on that spectrum is a maximum of subjective happiness. Still, even if pure wishful thinking is not the ideal strategy for subjective happiness, it looks like wishful thinking will be a heavy factor in the race for the pragmatic trophy.8

2.1.3 Wishful thinking as rational

One possible response for the internal pragmatist is to include the wishful thinking fallacy with the other purported "fallacies" that epistemic pragmatists such as Stich and Gilbert Harman claim are not, in the big pragmatist picture, irrational practices after all. Take for example the documented phenomenon of "belief perseverance", where undermining the evidence that led to a belief fails to undermine the subject's conviction in that belief. This strikes us as an irrational thinking strategy that could land one in real, pragmatic trouble. But this tendency may simply be an unfortunate side effect of a more general cognitive strategy that turns out to be on balance pragmatically wise, given limitations like the size of our brains and the time required to remember things: namely, the strategy to
8 Here, in the wishful thinking section, is a good place for a disclaimer: although many pragmatists use the view to defend religious faith, I have nothing like that up my sleeve. In fact I am an atheist.

retain a conclusion and discard the reasoning behind it as (typically) a waste of effort. We can all think of cases where we recall the conclusion of a paper or editorial or theorem without recalling offhand how that conclusion was reached. Instead we keep a kind of placeholder: "there were reasons for this belief I found convincing at the time." And we might on reflection endorse such economization. The belief perseverance phenomenon is a foreseeable negative side effect of this strategy—a negative side effect that simply gets outweighed by its pragmatic benefit. Thus it turns out in the grander scheme of things belief perseverance can be seen as part of a rational belief strategy.9

A paper by Dion Scott-Kakures might help the internal pragmatist in this way with wishful thinking.10 Scott-Kakures draws on contemporary psychological studies to form a general account of "motivated believing"—a category meant to include both wishful thinking of the type we are examining, and "unwelcome" believing such as hypochondria, where feared things are believed. A unified and psychologically respectable account of these, Scott-Kakures claims, is one according to which motivated believing is in the service of the realization of the subject's goals and values, and according to which there is no bright line between motivated and (what we're apt to regard as) unmotivated or accuracy-driven cognition.11

The putative justification of wishful thinking, as in the case of belief perseverance, is fairly simple. We have to make evaluations and decisions under conditions of uncertainty and limited resources, using probabilistic reasoning. A standard and perfectly reasonable-seeming practice in such evaluation is to set confidence thresholds that need to be exceeded
9 See Stich (1990) pp152–3, which relies in turn on Harman (1986) pp37–42. Here and throughout I use 'rational' ecumenically, to mean "good thinking according to some epistemic standard."
10 Scott-Kakures (2000).
11 Scott-Kakures (2000) p350.

before an evaluation can be considered complete. Now let us take on board, as the pragmatist would, a "pragmatic hypothesis testing account", according to which

how intensively and in what manner a subject tests a hypothesis will reflect her values and interests. Cognition, on this view, is suited for the securing of rewards or benefits and the avoidance of what is undesired.12

As a descriptive claim about rational agents, it is surely uncontroversial that for example we will spend more time evaluating the truth of beliefs that are genuinely important to us than, say, truths about Sebastian Cabot. And it seems rational we might do so. But from this principle of rational belief evaluation—take costs and benefits of the investigation into account—seem to follow methods that are not different in kind from extreme cases of wishful thinking. Some clear costs to the investigation are the time and effort spent. But crucially, also among the costs of the investigation are costs for error—getting a false positive or false negative.

Of course, [the cost of error] is not surprising, since most of us are apt to agree that believing truths typically confers some benefit. But what is distinctive about such accounts is the claim that, with respect to many questions, the cost of false positives, on the one hand, and the cost of false negatives, on the other, will not be identical; such error costs are asymmetric. The costliest error, the "primary" error, is the error the subject is preponderantly motivated to avoid. This error is fixed by the aims, values, and interests of the cognizer. Thus, in so far as a subject's hypothesis testing is sensitive to the avoidance of the costliest error, it serves to bring about what that subject values.13
12 Scott-Kakures (2000) p362. In a footnote to this passage Scott-Kakures quotes Stich: "the pragmatic account urges that we assess cognitive systems by the likelihood of leading to what their users value" (Stich (1990) p136).
13 Scott-Kakures (2000) p363. He cites some psychological studies to back up the claim about asymmetric error costs.

The putative defense of wishful thinking as rational, then, goes like this. Since error costs are asymmetric, it is thus (pragmatically) rational to place confidence thresholds for coming to believe p or not-p asymmetrically. Someone dying of thirst should have a low confidence threshold for believing the drink offered him is not poisonous; a false positive for the "it's poisoned" hypothesis would be too costly. The queen with a treacherous court, on the other hand, has worse costs associated with a false negative for the hypothesis, and thus should rationally require a lower confidence threshold for believing her drink is poisoned. In the case of wishful thinking: suppose you desire that p. To see if your desire is satisfied, you evaluate the hypothesis that p. But like all hypotheses, you undertake this evaluation probabilistically and under limited resources. Since you desire that p, you will be disappointed, disheartened, or otherwise saddened if you believe not-p. As Scott-Kakures puts it, "there is an intrinsic cost built into the rejection of a desired hypothesis: negative affect."14 Thus all other error costs being equal, the false negative (concluding that not-p when in fact p) has a higher cost for you than a false positive (concluding that p when in fact not-p). Set your confidence thresholds accordingly, and lo and behold, you have rationally made it more likely you will conclude p, just as a result of desiring that p.15

14 Scott-Kakures (2000) p366.
15 It is possible to read some of this view back into William James, though I leave the real exegesis to the reader. Here are two of the most ripe quotations:

The question next arises: Are there not somewhere forced options in our speculative questions, and can we (as men who may be interested at least as much in positively gaining truth as in merely escaping dupery) always wait with impunity till the coercive evidence shall have arrived? It seems a priori improbable that the truth should be so nicely adjusted to our needs and powers as that. In the great boarding house of nature, the cakes and the butter and the syrup seldom come out so even and leave the plates so clean. Indeed, we should view them with scientific suspicion if they did (James (1896) p201).

Though this next passage is aimed toward justifying religious views in particular, the general pragmatist point is again like Scott-Kakures's (the emphasis is as in the original):

Scepticism, then, is not avoidance of option; it is option of a certain particular kind of risk. Better risk loss of truth than chance of error—that is your faith vetoer's exact position. . . . To preach scepticism to us as a duty until "sufficient evidence" for religion be found, is tantamount therefore to telling us, when in presence of the religious hypothesis, that to yield to our fear of its being error is wiser and better than to yield to our hope that it may be true (James (1896) p205).
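The error-cost reasoning can be put in a standard decision-theoretic form (my own gloss; neither the notation nor the derivation is Scott-Kakures's or the dissertation's). Writing C_FP for the cost of a false positive (believing p when not-p) and C_FN for the cost of a false negative (rejecting p when p), accepting the hypothesis p minimizes expected cost exactly when

\[
(1-\Pr(p))\,C_{FP} \;<\; \Pr(p)\,C_{FN}
\quad\Longleftrightarrow\quad
\Pr(p) \;>\; \frac{C_{FP}}{C_{FP}+C_{FN}} ,
\]

so the confidence threshold for accepting p drops as the cost of the false negative (here, the negative affect of rejecting a desired hypothesis) grows relative to the cost of the false positive—which is just the thirst, poison, and wishful thinking pattern described above.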
A critic might object that these cases are not truly ones of wishful thinking. In these cases the believer does test the hypothesis, even if the results are somewhat biased (and perhaps, rationally so). The believer is showing at least some interest in "how things really are." In genuine wishful thinking of the type that should worry pragmatists, the critic might say, a believer does not test hypotheses at all, but directly forms the beliefs that conform to his desires. A pragmatist could respond to this critic that the "genuine" wishful thinking cases differ only in degree, and not kind, from the "rational" ones. Demanded confidence thresholds can get lower and lower for the preferred result, until any degree of confidence is virtually guaranteed to pass the threshold. The cases of genuine wishful thinking may just be an extreme of what is a generally rational approach to hypothesis testing.

Such a response hardly shows wishful thinking to be rational, however. First of all, it is not clear that pragmatic hypothesis testing as a general method is rational. Scott-Kakures's cases include what would normally be considered paranoia ("unwanted", not wishful, believing) and unreasoned optimism—as for example the study he cites in which "the correlation between an individual's judgment of his own physical attractiveness and others' judgments of the same individual's attractiveness is .24."16 It is hard to call such cases rational. But even if there is a good case for the rationality of pragmatic hypothesis testing, and even if genuine wishful thinking is different only in degree from acceptable cases, that might show only that the border of irrationality is fuzzy—not that there isn't one. Setting a margin of error so wide that the hypothesis is guaranteed to pass is surely irrational by anyone's standard.

Perhaps though, like the belief perseverance cases, genuine wishful thinking is an irrational result of a generally rational mechanism that has occasional unfortunate but
16 Scott-Kakures (2000) p348.

outweighed negatives. Just as we do not consider it rational to hold onto a belief after its evidence has been undermined, but do endorse the mechanism that sometimes brings about such results, so we might consider it irrational to think wishfully, but endorse the mechanism that occasionally leads to it. Such an argument may be tempting, but there is an important disanalogy between the wishful thinking and belief perseverance phenomena. Incidents of genuine wishful thinking do not seem to be a necessary potential cost of adopting pragmatic hypothesis testing; rather, they look more like a hijacking for ulterior motives of an otherwise rational mechanism. Pragmatic hypothesis testing says: adjust your confidence intervals based on the cost of mistakes (and these intervals may be asymmetric if the costs are). This procedure is rational as long as actual costs are reflected in the calculation. It is not rational for me to have a minimally low confidence threshold for tests of whether I am a pirate, for example, no matter how much I want to be a pirate. It is rational to have the acceptable margin of error somewhat skewed in favor of the pirate hypothesis, but only by a minimal amount geared to the proper disappointment should I discover I am not one.

The internal pragmatist who wishes to defend wishful thinking as rational may have recourse to one other response. Talk of actual costs is too external. Perhaps cases of genuine wishful thinking appear irrational because we do not share the epistemic perspective of the believer; instead we take the perspective of an outsider who knows the truth and sees it sadly neglected by a deluded and biased belief. As Scott-Kakures says,

from an external perspective, possessed of the truth, it is easy to survey the situation of the motivated believer and to characterize her reasons for believing as bad reasons. But the engaged-believer has no such guarantee, and she has plenty to lose.17

17 Scott-Kakures (2000) p370.

That is, we do not share the desires and epistemic pressures of the agent apparently thinking badly, and therefore cannot sympathize with her plight. But Scott-Kakures notes that the agent herself—even when she later shares our third-party perspective—might look back on such decisions and still, remembering the pressures involved, endorse the testing process she actually underwent. Perceived costs can be higher or lower than real costs, of course, and in the moment all the believer can do is act according to the perceived costs.

2.1.4 Wishful thinking as irrational again

Scott-Kakures concludes "a policy that weighs the costs of errors so as to bias the subject against the costliest mistake is far from irrational."18 But still, there will be genuine wishful thinking cases that even Scott-Kakures feels compelled to call irrational. Suppose I believe I am a pirate because my perceived cost of not being a pirate is enormous, the end of the world; to discover I am a mere philosophy student would be a fate worse than death. No cost could be higher for me, then; if I am not a pirate all is hopeless anyway, so there is only negligible further cost to believing falsely that I am. Given my perceived asymmetric costs, it may seem Scott-Kakures is committed to saying it is rational for me to set the minimally low threshold for belief that I am a pirate. He does not want to bite this bullet, however—and who can blame him? In such cases he wants to say instead that thinkers like me are being irrational by "paying scant attention to errors about which they really do care."19 But notice this response is prima facie a bit weird: that I am paying scant attention to errors about which I actually care strongly suggests that though maybe I believe I desire being a pirate above all else, in some important sense I actually don't desire it as much as I believe—not enough to warrant the bizarrely skewed confidence margins. Perhaps the

18 Scott-Kakures (2000) p370.
19 Scott-Kakures (2000) p370.

claim here is that I have inaccurate beliefs about the costs, and these inaccurate beliefs are skewing my hypothesis testing inappropriately. Perhaps I believe (falsely) that being a pirate is the only way to make a living these days, and I desire vehemently to make some kind of living for myself. Naturally I conclude that what I most desire is to be a pirate. But in this case my desire is a simple result of mistaken means-end reasoning. I just desire a decent job of some kind, and this does not (in today's economy, from what I hear) amount to a desire to be a pirate. In this sense perhaps I do not really desire to be a pirate after all. For one thing, though, this response is not easily available to the internal pragmatist. More importantly, the mistake in my reasoning might be that straightforward, but then again it might not be—it seems to me I might really desire very strongly to be a pirate, and my belief about this desire might not be mistaken. Scott-Kakures seems strangely confident that no one could have such desires, since he thinks anyone abusing pragmatic hypothesis testing in this way is "paying scant attention to errors about which they really do care." Perhaps this confidence comes from the view that no rational agent could care that strongly about such things, as a fact about rational agents; in other words, perhaps Scott-Kakures feels it is not my belief about the means to my desire, but rather my desire itself, that is mistaken or irrational.

This need for evaluation of desires is crucial in responding to the wishful thinking objection. Realistic desires, intuitively, are not the type that require wishful thinking in the first place. But irrational desires are not only awkward for internal pragmatism; any pragmatism that straightforwardly evaluates cognitive processes by their success at bringing about desire satisfaction will have these difficulties. Consider an external pragmatism like Stich's, and suppose an agent intrinsically values ignorance and error. Such an agent would have "justified" beliefs that are systematically bad by any intuitive standard. More to the immediate point, invoking irrational desires as a response to genuine

wishful thinking is awkward for Scott-Kakures's account. If we can call my pirately desires irrational, we can speak similarly of anyone's perceived costs from our third-person perspective. Perhaps wishful thinking would be rational along Scott-Kakures's lines if we could always count on having appropriately rational desire structures. But meanwhile Scott-Kakures cannot have it both ways. If my thinking I am a pirate is irrational on grounds of mistaken desires, then why not say the same for more innocent cases, like those inclined to think themselves better-looking than they are? Perhaps it is also irrational to want so much to be good-looking that you are willing to bias your hypothesis testing. Perhaps in general it is irrational to want something that much. Or perhaps the rational thing to do in belief formation is always to weigh the evidence without taking your (possibly irrational) desires into account at all. This is certainly a common intuition, and we are led back to it merely from the innocent idea that some desires—such as wanting to be a pirate—are irrational.

In summary, Scott-Kakures's paper looked like it provided the best shot for defending epistemic pragmatism against wishful thinking charges. But it does not seem to work. His argument only shows that it can be marginally more justified to form the belief that p given such a desire. First of all, this conclusion may not be right in the specific way needed to justify some wishful thinking. Though costs for error can be asymmetric, and confidence margins rationally set appropriately, intuition suggests the particular cost of negative affect from disappointment should not bias against a false negative at all, let alone enough to guarantee belief. But even if Scott-Kakures is right, he only shows there are some cases where some bias from wishful thinking is permissible; in the end Scott-Kakures agrees, too, that genuine wishful thinking is irrational.
2.1.5 Thinkful wishing

If internal pragmatism leads inevitably to the conclusion that it is rational for me to believe I'm a pirate, given such a desire, then this can only reduce internal pragmatism to absurdity. But from what has been said so far, wishful thinking still appears to be a way to form justified beliefs according to the internal pragmatist. In fact there is still worse news for the internal pragmatist. Even with an answer to the wishful thinking problem, there will be another problem in the opposite direction. For here is another cognitive method that appears to garner "justified" beliefs by pragmatist lights: suppose you have a belief that p. Then form the desire that p. For example, suppose you believe armageddon is approaching. Then simply desire that armageddon be approaching. Voilà, your beliefs have led to desire satisfaction, and thus are justified according to internal pragmatic epistemology. Call such an approach to cognition thinkful wishing.20 Note that this problem is not unique to the internal pragmatist. Instead of matching desires to beliefs, someone could thinkfully wish in the external sense by desiring things to be just as they are, and voilà, that person's beliefs would be justified. External epistemic pragmatists like Stich would presumably need an answer to this concern just as much as the internal epistemic pragmatist does.

True, thinkful wishing is not a belief-forming mechanism (since it leaves the beliefs alone and fiddles with the desires instead), but it is a thought-forming mechanism that will bring justification to beliefs according to the internal pragmatist. Ruling out thinkful wishing on the grounds that pragmatism looks for good belief-forming mechanisms seems unfairly ad hoc. Beliefs are in the service of desire-satisfaction, to a pragmatist, and beliefs affecting desires in this way are apparently doing a great job at bringing about desire

20 Thanks particularly here to Eric Lormand for discussion of thinkful wishing; we independently found interest in this symmetry, but the phrase is his.





satisfaction. Perhaps an internal pragmatist could claim that beliefs must bring about the satisfaction of particular, antecedent desires to be justified—not just bring about any old instance of desire satisfaction. But by this understanding of apparent desire satisfaction, it seems, desires must be immutable; beliefs cannot rationally bring about change in desires. For if they did, they would presumably fail to lead to the satisfaction of the old, particular desires that were revised, and thus be unjustified.21 If we allow that beliefs can (rationally) influence desires, and that such beliefs can be justified, then chronological priority of desires over a belief will not generally be required for the justification of that belief.

For example, I desire to be a pirate. But I have contravening beliefs such as that this is the 21st century, or that those ships are outdated and would never survive the coast guard, or that the pirate life is not in reality so glamorous after all, or that parrots are too expensive. Intuition suggests these contravening beliefs should alter my desire in some way; they should show that my desire is not so realistic. But a pragmatist interested in satisfying particular, antecedent desires would have to say instead that those contravening beliefs are relatively unjustified, because they frustrate forming the belief that I am a pirate. Of course a pragmatist would want to say, intuitively, that these beliefs are justified in part because they lead me to concentrate on desires that are more likely to be fulfilled. In other words they contribute to general desire satisfaction. But if general desire satisfaction is the pragmatic goal—if the beliefs as a whole are to be measured in terms of their contributions toward the desires as a whole—then thinkful wishing will be as much of a problem as wishful thinking.

21 This argument applies to the adjusting of intrinsic desires as well as to intermediate, means-end ones—if people can reflectively change their intrinsic desires over time. The same might not apply to a view that involves antecedently fixed aims of the thinker. A pragmatism grounded in these may fare better against the thinkful wishing charge, and indeed my own view takes something like this route.
2.1.6 A way out: spontaneous thoughts

The symmetric structure to these problems suggests a way out. Why not believe p given the desire that p, or desire that p given the belief that p? This is only an issue when there isn't already a match between the content of what is desired and the content of what is believed. So suppose some thinker has a desire that p and a belief that not-p. This is, all else being equal, an unfortunate state of affairs for a thinker—especially according to an internal pragmatist chiefly concerned with apparent desire satisfaction. To alleviate the situation requires either modifying the belief or modifying the desire. Only after deciding which should be modified can the thinker apply methods of modification. Such methods naturally include the "short-cuts" of wishful thinking and thinkful wishing. But I suggest that the method for deciding which to modify will contain within it suggestions for how to modify the chosen thought that do not rely on such shortcuts.

When confronted with a tie between giving up the belief and giving up the desire, the thinker should look for tiebreakers. These tiebreakers could, and in practice do, vary case-by-case with the content of the thoughts in question. In the case of the pirate, content-wise nearby beliefs and desires include those about ships, robbery on the high seas, swords, parrots, and the like. On plausible views about mental content, my belief that I am not a pirate is in part determined by its place in the inferential network relative to these other beliefs; to believe I am not a pirate might just be something like the tendency to activate some combination of such other beliefs and images. Though nearby thoughts may not in this way come constitutively with the sparring propositional attitudes, at any rate the thinker should not have to look far for such tiebreakers. In some cases consultation of the nearby mental states will suggest that the desire should be modified or given up, and in some cases it will suggest that the belief should be modified or given up. In still other cases, I claim, examination of nearby states will result in the conclusion that neither should be
given up—the thinker should remain in apparent dissatisfaction. For example: suppose I desire that I have some grapes, and I believe that I don't have some grapes. The content of these propositional attitudes is likely to trigger nearby thoughts about how grapes are found in stores or on vines; this in turn is likely to activate thoughts about how stores are places where you can obtain things using money, and about nearby stores in the area, the lack of vines in the area, and about what money I have. These nearby beliefs suggest my belief that I don't have grapes is easily modifiable; there is a way to modify this belief that I don't have grapes through intention formation and action, rather than direct manipulation. On the other hand where the mismatched p is "I'm a pirate", nearby beliefs include the one that pirates of the type I wish to be existed in the 17th and 18th century, which in turn is likely to trigger a belief about which century this is, a belief about the feasibility of going back in time, and so on. In this case it looks like the desire is what should be modified. And finally, suppose I am starving on a desert island (perhaps as a result of a failed pirate life), and there is in fact no food to be obtained anywhere. In this case it seems I am at an impasse; I should neither revise the desire to eat, nor the belief that there is nothing to eat. Instead, I should just remain in relative belief-desire incoherence. Examining nearby content seems like a natural method for determining how to settle any given tug-of-war between belief and desire—it also happens to be roughly what we actually, typically, do. Section 2.2 and chapter III will give further details of this method, and show how to begin modeling it computationally.
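Although the computational details are deferred to section 2.2 and chapter III, the informal tiebreaker method just described can be caricatured in a short sketch. This is my own toy rendering with invented flags, not the dissertation's model; it only fixes the three outcomes illustrated by the grapes, pirate, and desert-island cases.

def settle_mismatch(desired_content, nearby):
    # 'nearby' is a hypothetical summary of content-wise nearby beliefs and desires.
    if nearby.get("feasible_action_route"):
        # e.g. grapes: stores, money -- modify the belief via intention and action
        return "keep the desire; form an intention, act, and let the belief update"
    if nearby.get("conflicts_with_firm_background_beliefs"):
        # e.g. piracy: wrong century, no time travel -- modify or give up the desire
        return "modify or give up the desire"
    # e.g. desert island: no route to satisfaction and no reason to drop the desire
    return "remain in belief-desire incoherence for now"

print(settle_mismatch("I have some grapes", {"feasible_action_route": True}))
print(settle_mismatch("I am a pirate", {"conflicts_with_firm_background_beliefs": True}))
print(settle_mismatch("I have eaten", {}))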
For now, we might still wonder: why not, once it's decided which of the belief and desire has to go, use one of the shortcut methods for changing it? And if it sometimes works, why not use it every time? For example, intention-formation and action may seem like a long way around for belief modification when compared to wishful thinking. But it may turn out to be a preferable method given stubbornness in spontaneous beliefs about having grapes. This might also be a plausible explanation for why we don't typically do pure wishful thinking. Though one way beliefs arise is through inferential belief-forming mechanisms under something like conscious control, another way is spontaneous and not under our control; beliefs seem simply to filter up from our perceptions, from automatic (perhaps innate) inferential practices, and the like. These spontaneous beliefs—such as "I don't have grapes"—may persistently surface, causing lasting belief-desire tensions despite the wishfully believed "I have grapes." A cognitive mechanism capable of noticing this stubbornness in spontaneous beliefs may come quickly to prefer that in the case of something so easily attained through action, the intention-formation route is easier than the wishful thinking one at reducing long-term belief-desire tension. Thinkful wishing has similar problems. There are spontaneously-formed desires, too, and for humans a thinkfully wished desire not to eat will have a hard time in the face of spontaneous desires for food.

2.1.7 External constraints

But do we have reason to think that intelligent creatures will have, in general, such stubborn thoughts—ones that render wishful thinking and thinkful wishing ineffective? Yes. It is in virtue of being intelligent creatures at all that they will have such stubborn thoughts. Remember from chapter I that "creatures" are designed to perform functions of some kind. This design of the creature, it seems, will inevitably constrain their conations and cognitions; not every thought-forming mechanism is available to them.22 Suppose I am right, for example, that feedback mechanisms for learning are responsible for cognitive-conative pairs in the first place.
22

These feedback mechanisms are then in the service of these creaturely functions. For a creature to develop a pure wishful thinking habit would be to short-circuit this feedback before it had a chance to influence cognitive dispositions toward its external aims, thereby ignoring the very reason the feedback mechanism was created in the first place. At least this part of the creature, essentially, would not be functioning well. Wishful thinking and thinkful wishing look like objections to the internal pragmatic standard when we see apparent desire satisfaction as a creature's intrinsic cognitive aim, rather than merely instrumental in achieving external functions.

Suppose, for comparison, I gave elaborate financial advice aimed toward the standard of "getting money". Someone might object: I could succeed at that standard just by making up my own currency and printing it in the basement. This looks like an objection to the proposed standard, intuitively, because printing my own currency would not do me the good that money is supposed to do me. But of course it is not really an objection, because the advice presumes in the first place a system where getting money is of further instrumental use.23 Similarly, desires and beliefs are cognitive tools for the creature to perform its external function(s). The range of responses to mismatches in beliefs and desires is limited to what would serve those external functions—at least, in a creature that is functioning well.

This is not to say that wishful thinking is impossible for intelligent creatures. Of course we humans have amply demonstrated its possibility. But it is easy to forget that our own ability to do it is seriously constrained—we cannot simply believe that we're on a beach with a beer, no matter how much we might desire it. And there is reason to think that if we do manage to wishfully think pervasively, it is because we are not functioning well. Schizophrenics, for example, seem to be particularly adept wishful (or motivated) thinkers, to the point of hallucinations. Implicated in this mental malfunctioning is excessive

dopamine, which seems to provide a kind of feedback to the brain for its own modulation.24 Too much dopamine could, perhaps, "short-circuit" this feedback mechanism. So the point is not that we couldn't possibly wishfully think; the point is that in virtue of being well-functioning intelligent creatures, our ability to wishfully think will be severely constrained, since it is practically guaranteed not to help fulfill the aims for which our thinking was designed. Make no mistake that the fulfillment of these aims is an external standard—it is an evaluation based on the performance of an external function, with which the thinker's own functioning can only contingently be correlated. I employ this external standard in defense of the internal standard of apparent desire satisfaction. Intuitions demand a reason to believe that wishful thinking (and thinkful wishing) will not in general be a rational (or "successful") way to think. But the brain-in-a-vat examples of chapter I make it plausible that wishful thinking is not always irrational. If the laws of nature were somehow such that a thinker's desires automatically brought about the desired state of affairs, wishful thinking would be a rational, successful way to think. The argument that wishful thinking is irrational must rely, therefore, at least in part on external factors like the way the world is, since it is only irrational in worlds like the one we figure to inhabit. Meanwhile the fact that humans do engage in wishful thinking a great deal—and in thinkful wishing, for that matter—suggests that our brains do struggle to match cognitions to conations in whatever way possible, within the confines allowed them. A well-functioning brain in a "normal" environment will learn that wishful thinking does not work—which is why we think Bill from chapter I isn't so smart. (Indeed, Freudian theory has it that we start with a wishful thinking "pleasure principle" as infants, and only
Dopamine seems to be crucially involved in reinforcement learning, for example; see e.g. O’Reilly and Munakata (2000) pp193–195.
24

(Indeed, Freudian theory has it that we start with a wishful thinking "pleasure principle" as infants, and only eventually learn the "reality principle" through hard experience.25)

So my response to the wishful thinking (and thinkful wishing) objection on behalf of the internal pragmatist, in the end, is partly that these mechanisms simply are not available to us, given reasonable assumptions about how the world is, and how we were designed by it. But that's not all there is to the story. Suppose these fallacious forms of reasoning were more generally available to us, say through some clever new form of neurosurgery that would make us ideal wishful thinkers (or thinkful wishers). Should we undergo the procedure, according to the internal pragmatist? No, of course not. But now the reason should be plainer. We shouldn't because we only aim to believe our desires are satisfied insofar as we aim to have our desires satisfied. The only way we have to judge whether they are or not is through our beliefs, of course. But our aim to have matches between our beliefs and desires only makes sense when we have reason to think that this aim would help bring about our external aims. I do not (in the first instance) want to believe that p—I just want that p. And as an intelligent creature I have internal mechanisms capable of learning in the service of this want, requiring (I have argued) internal representations of it and my relative success with respect to it. So I try in virtue of this want to bring my belief with regard to p in line with my desire that p. Our internal feedback functions, remember, were designed with the service of our external functions in mind. These functions, again, will provide external constraints on the kinds of thinking available to us. So the more complete version of the response, in summary, is that wishful thinking is not generally available to a creature that is functioning well, given facts about our environment and design.

25 In many ways, Freud's theory of brain functioning is instructively comparable to my own—at least as I understand his theory, through articles like Hopkins (1998) (especially §7) and Glymour (1991). Whether this is a bad sign or a good sign I leave up to you.


2.2 Coherence
Wishful thinking and thinkful wishing looked like objections to internal pragmatic epistemology because they looked like ideal ways to achieve the proposed internal standard, while at the same time being intuitively repulsive methods for thinking—thereby reducing the standard to absurdity. But we have seen that they will not actually be good ways to achieve the proposed epistemic standard in an intelligent creature inhabiting an environment like ours. So, then, we should look for other cognitive mechanisms to reconcile mismatched beliefs and desires. According to the suggestion just hinted at so far, a creature that believes that not-p and desires that p can look to other "nearby" beliefs and desires in order to determine which should be modified, and how. Each of those belief-desire pairs, in turn, can be decided similarly. Such decisions must play off each other simultaneously. It looks like the system is doing better, then (according to the internal pragmatic standard), the better it balances all these belief-desire conflicts into a coherent state. And only some coherent states will be possible, given the external constraints on any intelligent creature. In this section I will provide further motivation for, and further elaboration of, this coherence approach.

2.2.1 Rational desires

The symmetry between wishful thinking and thinkful wishing reminds us that an intelligent creature, intuitively, should sometimes revise its goals as well as its beliefs. Typical epistemic pragmatists (like William Lycan, Nicholas Rescher, or Stephen Stich) seem to forget this intuition.26 It's one thing for a pragmatist to say that given a thinker's goals, that thinker should be evaluated with respect to the achievement of those goals; it's quite another to say that the thinker should be evaluated on whether it comes up with good goals

26 See e.g. Lycan (1988), Rescher (2001), and Stich (1990).

that then get achieved. Only the latter is consistent with the idea that part of good thinking involves choosing the right goals. It is no vice of a thinker's belief-forming mechanisms that it fails to achieve unattainable, crazy goals like becoming a pirate. And remember the stoic thinkful wisher, who desires things to be just as they are—even when, say, nuclear armageddon is moments away. This stoic shows it is also no virtue of belief-forming mechanisms to "achieve" easily attained goals. The core intuition is that desires can be rational or irrational, just as beliefs can. Intuitively a thinker is doing pragmatically well only if it comes up with rational goals, which it then (on the whole) achieves. So the evaluation of a creature's belief-forming mechanisms will depend in part on the success of the desire-forming mechanisms. But the evaluation of those will depend, in turn, on the belief-forming mechanisms, and so on.

Consider, for example, the formation of sub-goals. An intelligent creature must pick good goals in part by evaluating their feasibility. This will be determined in part by the estimated potential success of the sub-goals that it forms to achieve this goal. So goals will be picked in part by the strength of their sub-goals, but at the same time the sub-goals will be picked or discarded for their potential success at attaining the goal. When things are not working, it could be because the original goal is unrealistic, or because the sub-goal is ineffective. These conflicts, in turn, will depend on conflicts between cognitions and conations. That, I have argued, is the basic impetus for change in cognitive disposition. The system would not need to worry about how to play off its conations against each other if all the correlating cognitions were matching. So take any cognition-conation pair resulting from a sub-function. For any mismatch between them, the system can continue to attempt improvement at the sub-aim, which would mean continuing to attempt to bring the internal representation of its success toward the goal state. Or, the system could decide that the

sub-aim is unfeasible, and demote it in favor of something else. The resulting picture is a mess of small constraints, no one of which is decisive, and each of which the system struggles to satisfy in one way or another. This struggle to match cognitions and conations, again, is simply part of being a sufficiently intelligent creature—and the more intelligence, it seems, the more complex these struggles become. Struggling to make one match may cause more mismatches elsewhere, or it may cause fewer. The challenge for the intelligent system is to fit all these together somehow, and come to an all-things-considered position both with respect to what it "believes" and with respect to what it "desires". The thinker generally cannot at any time satisfy all these mental constraints, so instead it is faced with a "multiple soft constraint" problem for playing them off each other, trying to violate as few as possible. Thagard has argued, in turn, that such a constraint satisfaction problem is just a more formal version of what we have always, in epistemology, called a problem of coherence.27

The general notion should be familiar to epistemologists, but of course the proposed coherence is not typical belief coherence.28 The proposal is that in virtue of being intelligent (in the way described) a creature will seek a comprehensive coherence, a coherence among all thoughts, cognitive and conative. Now it may be that the best way to arrive at such comprehensive coherence is through separate attempts at purely cognitive coherence—and purely conative coherence, too. The creature is likely to attain more of its fixed aims given a coherent theory of the way the world is, rather than an inconsistent or disjointed theory. The more flexible and rich these aims, the more general and detailed the theory must be. And a coherent set of plans for altering the world to accord with the creature's conations also seems like a natural way to serve the comprehensive coherence at issue. (In chapter III we will see, in fact,

27 See Thagard (2000) chapter 2.
28 Of the type discussed by Laurence BonJour, say; BonJour (1985).

how specific proposals for calculating these local coherencies, like Thagard's ECHO and Thagard's and Elijah Millgram's DECO, might be integrated into such a comprehensive framework.)

The traditional alternative to justification from coherence, of course, is justification from foundations. In the case of internal pragmatic epistemology, it could be that there are foundational cognitions that support further cognitions inferred in the right way, and foundational conations that provide support for the conations derived from them. When it comes to specific conflicts between thoughts cognitive or conative, then whichever has the better foundational support will win (in a well-functioning thinker). And we have already seen, in discussing wishful thinking, that intelligent creatures will generally be severely constrained by their external functions in the mental mechanisms available to them. These external constraints may provide just the mental foundations required.

In fact, I think something like that foundational view is also right, though I will argue the externally imposed constraints provide default thoughts rather than foundational ones, and the ultimate calculation is still one of dynamic coherence rather than simple derivation. The view is thus actually a hybrid of coherentism and foundationalism—though closer to the coherentist side. The external constraints will make for only default (rather than incorrigible) thoughts, I will argue, due to the form these external constraints must take.

2.2.2 Intelligence and rich aims

Let's see how the externally imposed constraints are likely to work for an intelligent creature by imagining a detailed example. Suppose we want to build a creature that has the basic aim of keeping a public park clean. We outfit it with all sorts of perceptual tools that look, sniff, listen, feel, propriocept, and the like—and maybe a few tools capable of doing more direct chemical analyses of materials. We also outfit it with many kinds of grabbing,

moving, climbing, and digging tools, and whatever other potentially handy capacities we can think of. Once we have managed to design those pretty well, the problem becomes one of how to convert the inputs from the perceptual devices into appropriate outputs to the motor devices in a way that accomplishes its purposed aim. That is, we need to engineer its mental system.

As a first try, we could simply hardwire the motor outputs without any need to consult the perceptions. For example: "move five meters north by northeast, extend the arm this far, contract the pincers, move the pincers to the bag unit, release pincers, move twenty meters in new direction . . . " Naturally this will only work (the creature will only achieve its aim) insofar as the environment cooperates by having litter at the specified locations. Such a creature is unintelligent, for it is unable to adapt to an environment other than the one it is hardwired to expect. We can describe the hardwired aim not only as "keep the park clean", but also more simply as "move in these ways". This more specific way to characterize the aim tracks the lack of adaptability in trying to reach the more general aim.

So instead we take advantage of its perceptual capacities. Move around and look around, we tell it. And whenever you receive this exact combination of inputs, behave in this precise way. Naturally this would take a lot of programming on our part; we would have to teach it how any candy wrapper or aluminum can would look in any state of crumpledness or from any angle. Suppose we could manage, though. Suppose also we program this creature to keep a log of where and when it finds trash, and have it use this log to adjust its route accordingly. It now can learn to get more trash faster, keeping the park cleaner. Still, we could say its aim is "pick up stuff that's perceived in just this way" rather than "keep the park clean". For when new kinds of trash come along—a new candy bar, say—our creature will not pick it up. It cannot adapt to that kind of change in environment, and so is more likely to fail at the wider construal of its aim.

It would be still better if we could manage to provide for it a very broad understanding of litter. We would need to use abstract concepts like "artificial" and "left by a visitor". These concepts would have to be applied to determine what's litter according to some loose weighting, so that grass clippings left by a visitor would still count, and so would trash that had blown in on a wind. With concepts like "visitor" and "artificial", the creature could then even learn methods to prevent littering in the first place—it discovers, by accident, that wagging its claw at a visitor about to perform a certain type of action prevents the action, for example. With a rich enough conceptual library, the robot might learn to achieve its objective most effectively through initiating public service advertising campaigns, and so on.29

I have already noted that intelligence correlates well with adaptability, and that greater adaptability requires greater flexibility in learning. The point here is that such flexibility in learning requires the creature to have what we might call "rich" aims. Roughly speaking, rich aims will have a greater capacity for sub-aims according to various fine-grained environmental differences.

There is a similar theme running through work on naturalizing mental content. Think again of the simple thermostat from chapter I. As Dennett points out, its aim is better described as "keep the switch off only when the coil is so-long" than as "keep the room warm." His more complicated thermostat versions, however—the ones with multiple ways to detect room temperature, and multiple ways to regulate it—are more and more accurately describable as having the aim of keeping the room warm.

29 As will soon be obvious, this example was inspired by the complicated thermostat of Dennett (1981).

. . . as systems become perceptually richer and behaviorally more versatile, it becomes harder and harder to make substitutions in the actual links of the system to the world without changing the organization of the system itself. If you change its environment, it will notice, in effect, and make a change in its internal state in response.30

That sounds a lot like adaptability; that also, Dennett suggests, is what is required for a creature truly to represent its environment. Fred Dretske has a similar hunch that there is an important tie between adaptability and rich mental content. He suggests that an approach like Dennett's will not work for naturalizing meaning, though, because it cannot handle the notorious disjunction problem for intuitive cases of misrepresentation:

No matter how versatile a detection system we might design, no matter how many routes of informational access we might give an organism, the possibility will always exist of describing its function (and therefore the meaning of its various states) as the detection of some highly disjunctive property of the proximal input. At least, this will always be possible if we have a determinate set of disjuncts to which we can retreat.31

Dretske follows this passage with a suggestion for fixing this problem, a suggestion especially interesting in the context of my own view:

Suppose, however, that we have a system capable of some form of associative learning. . . . We now have a cognitive mechanism that not only transforms a variety of different sensory inputs . . . into one output-determining state . . . , but is capable of modifying the character of this many-one mapping over time. . . . A system at this level of complexity, having not only multiple channels of access to what it needs to know about, but the resources for expanding

30 Dennett (1981) p235.
31 Dretske (1986) p338; 'meaning' here is roughly functional meaning along teleosemantic lines. Dretske is not explicitly responding to Dennett here.

its information-gathering resources, possesses, I submit, a genuine power of misrepresentation.32

If this is right, then to have truly rich content—content capable of genuine misrepresentation, for example—requires at least primitive learning mechanisms. The rich content of the aims of an intelligent system consists, in part, of the wide variety of sub-aims the system can pursue in achieving the aim.

2.2.3 Defaults and foundherence

Imagine, then, our park-cleaner has clean park as its basic aim—an aim ultimately responsible, in some way, for its design (in this case, through our own intentions to have a robot to clean the park). This conceptually rich goal is constituted, we have seen, by many smaller sub-aims it is hardwired to seek. "Scour the park, look over the whole thing carefully! Pick up stuff that meets these trash criteria! Keep yourself in working order! Prevent stuff that meets these littering criteria!" Let's call the aims that came hardwired for attaining the basic aim fixed aims.33 These then each have sub-aims of their own, internal feedback mechanisms redesigning the creature's own cognitive system toward the matching of its cognitions to its higher conations. "Go south! Go north! Practice grabbing! Test the hypothesis that sample s is trash! Plan a new route! Scold those litterbugs! Fix your left rotor!" A genuinely intelligent creature will teem with such potential sub-aims. Each of these either constitutes a little feedback mechanism in turn, with its own implicit conation-cognition pair, or else bottoms out in a basic function (such as a motor impulse).

Unlike the fixed aims, though, the sub-aim conations are not immutable. We were clever enough to design the robot so that they can be learned and adjusted, under the

32 Dretske (1986) p338.
33 Though these are only default aims, as we will see, they are "fixed" insofar as they cannot be undefaulted, so to speak. In terms of the computational model to come later, at least some of their activating constraints cannot be adjusted by the system (namely, the constraints with the special element *).

guidance of the basic aim, as constituted by the fixed aims. As suggested before, there are two ways these conations might change: first, the creature may learn that even when it succeeds (subjectively) in matching the cognition to the conation, the higher function in whose service it was formed fares no better. On the other hand, the creature may learn that no amount of learning, and no sub-sub-aims, will bring about that sub-aim's success. The cognition and the conation simply refuse to match. In both cases the sub-aim does not ultimately contribute to the higher aim, and in both cases a smart system will as a result inhibit the functioning of this sub-aim.

If such conflicts always had some easy resolution according to feedback from the aims above, there would be no difficulty in calculating what to do. Similarly, an instrumentalist about practical reason has little difficulty explaining which goals are good or bad. Irrational desires are those deriving from irrational beliefs about what means would facilitate the higher ends. The highest ends, then, are taken as intrinsic, immutable, and rational—as foundational, in other words. In our example, the creature need merely determine through feedback which of several possibilities best achieves the aim of clean park. But the matter is not so simple, because it seems any creature with rich aims will have a variety of fixed aims that may themselves lead to conflicts. For example: constituting its basic aim of a clean park, our park cleaner has fixed aims both to take care of itself, and to pick up trash that it sees. Reaching way over a cliff to retrieve some garbage will be a good means to one, but not the other. The creature must somehow balance these aims against each other, and since they are fixed, there is no higher internal aim to which it can appeal in deciding. With several such potentially conflicting fixed aims for any potential decision, the situation becomes that much more complicated. Not all of even the fixed aims can be served. Some will have to be violated for the sake of the others. In other words, the problem is one of maximizing soft constraint satisfaction—one of coherence.
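
As a minimal toy illustration of such soft constraint trading-off (the aims, weights, and options below are invented for the example, and nothing here anticipates the formal machinery of chapter III), one might score each candidate action by the total weight of fixed aims it violates, and prefer the action with the lowest such cost:

```python
# Toy soft-constraint balancing for the park-cleaning robot.
# Fixed aims carry weights; each candidate action violates some of them;
# the "most coherent" action violates the least total weight.
# All aims, weights, and options are invented for illustration.

fixed_aims = {
    "pick up visible trash": 1.0,
    "preserve self": 2.0,
    "keep to the patrol schedule": 0.5,
}

options = {
    "reach over the cliff for the wrapper": {"preserve self"},
    "leave the wrapper and continue patrol": {"pick up visible trash"},
    "fetch a grabbing tool first": {"keep to the patrol schedule"},
}

def violation_cost(violated):
    """Total weight of the fixed aims an option violates."""
    return sum(fixed_aims[aim] for aim in violated)

for option, violated in options.items():
    print(f"{option}: cost {violation_cost(violated)}")

best = min(options, key=lambda o: violation_cost(options[o]))
print("chosen:", best)
```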

Again this coherence is not free to settle into any old configuration, as we have seen; the designed aim of the creature imposes external constraints in the form of these fixed functions. In effect it will have default conations hardwired, constituting its rich aim as an intelligent creature. And it will be hardwired to represent its success at these conations in default ways—the creature will simply receive basic signals from its sensory periphery, for example, and those will be thrown in the coherence hopper as default cognitions. Though any one default conation or cognition might be overridden (as our robot's overriding the default "get all trash" directive in favor of the "preserve self" directive), it would only be in order to satisfy enough of the other default cognitions and conations. Not all of them could be overridden, because (I claim) it is in the nature of the intelligent system to satisfy as many as possible in pursuit of its external function(s).34

I should emphasize that the result is not a pure coherentist position. Some thoughts—the default cognitions and conations—are more important to satisfy than others. Put in terms of traditional epistemology, these thoughts contribute more toward overall justification when accepted by the system. The external constraints on the coherence thus provide some justificational foundation, if you like, for the rest of the thoughts. But of course the position is not a pure foundationalist one, either. The derivative thoughts have no direct inferential foundation, for one thing, but rather result from a balance of the others. And any of these default thoughts is in effect defeasible, rather than truly foundational. The proposal is therefore a hybrid position of the type Susan Haack calls "foundherentist".35 It may seem to be a hopelessly wishy-washy position, to have "default" thoughts that are sorta foundational. But as we will see, Thagard has shown how there is a natural, perfectly respectable algorithm for computing such foundherentist problems—an algorithm that for

34 Here and elsewhere to "satisfy" a conation should be read in the subjective sense, as getting a matching cognition.
35 Haack (1993).

example our own brains seem eminently capable of computing.36 My proposal builds on this computational construal of foundherence.

2.2.4 Coherence and humans

This is all a fine story, perhaps, about abstract intelligent creatures. But is it at all plausible for, say, the intelligent creatures most familiar to us—namely, us? In the case of humans, we were designed by natural selection to pass on our genetic material. In the service of this aim, it seems we were given a mass of fixed aims—probably ones like avoid pain, get food, have sex, gain affection, give affection, and so on. We are so remarkably flexible at attaining these fixed aims that it is often at the expense of the evolutionary aim for which they were initially designed. We can keep seeking food beyond what our health (and thus genetic fitness) can sustain. We can have sex without having kids. And so on. Similarly, the balance of our park-cleaning robot's fixed aims could end up violating the main aim for which it was designed. It may find, for example, that the best way to keep the park clean is to keep all visitors out of the park—by any horrific, Terminator-like means necessary. Oops, we forgot to give the robot a rich enough notion of "park" to include allowing visitors. Well, no designer is perfect, after all—not even Mother Nature.

And, again, humanly fixed aims like getting food can lose out to a complex of other fixed aims, as for example Mahatma Gandhi demonstrated. Of course I do not claim that "freeing India" was a competing fixed aim for Gandhi. But it did arise from a complex of other such fixed aims in Gandhi, such as perhaps caring for other people. It must have—from where else could it have come? (Incidentally I don't suggest that Gandhi was being selfish in starving. Probably some of our fixed aims are altruistic; or if not, it is probably rational to learn altruistic aims in their service.) The point is: we are not slaves to each of our fixed aims, but we are not completely free of them either. Our humanly desires are not

36 See especially Thagard (2000).

sui generis—they are either fixed aims or derived somehow from a balance of them. Of course a human may come to have irrational sub-aims, such as serial killing. Intuitively, those humans are sick or malfunctioning in some way; they have failed to reach the right conclusions about what sub-aims to form. They are in bad belief-desire incoherence; they are also, we think, desperately unhappy.

Naturally we humans also have fixed cognitions, and they too are default constraints that may not be met. We just receive a lot of sense-data. Any packet of sense-data can be overridden; we can think ourselves out of optical illusions, for example. A pilot banking in the clouds will eventually become convinced that she is flying level—at least until she learns to trust the plane's artificial horizon monitor, telling her otherwise. But we are not free to ignore all these cognitions, either. We have to take most of them at face value. A common starting point of internal epistemology is "trust, ceteris paribus, your senses."37 Fine: but why? I would say we cannot help but do this. It is part of being an intelligent creature that we will form cognitive representations of the world through mechanisms that are simply given to us. They and our basic conations form the original substance of our thoughts. The more interesting question is when and why we shouldn't, and how and on what grounds we make such decisions. I have argued that the ultimate standard for such decisions is a general coherence between conations and cognitions, within the bounds of externally imposed constraints—making for a more specific, and more easily naturalized, version of internal pragmatic epistemology.

37 See e.g. Pollock and Cruz (1999), or Wedgwood (2002).

CHAPTER III

COMPUTATIONAL EPISTEMOLOGY

As I have emphasized throughout, one of the best features of this account is that it has real potential for a computational model. In fact the epistemic considerations thus far amount to a set of specifications from which such a computational model of intelligence might arise. In this chapter I will first detail these specifications. Then I will provide an algorithm schema for meeting them, and a realistic example of an architecture for computing that type of algorithm. This is the last piece of the "computational internal pragmatic coherence epistemology" that I promised. With the whole view finally in place, I will then take some time to examine its advantages, its implications, and its suggestions for further research.

3.1 Belief-desire coherence (BDC)
This section will finally attempt to make the promised link from philosophy to cognitive science, by providing an algorithm and architecture to help realize the epistemology of the previous sections. Really, though, I should say that the order of explanation here is a bit artificial. In some ways the existence of the algorithm argues for the philosophy rather than vice-versa. And in some ways the existence of my chosen architecture argues for the specifications for the


algorithm, rather than vice-versa. And in general, I see the various aspects of the view as supporting each other mutually, in a neat coherence, rather than following one upon the other with only unidirectional support.

3.1.1 Specifications

At any rate, to summarize the foregoing philosophy, we now have reason to think that the mental system of a sufficiently intelligent creature must have the following kind of computational capacity:

• It must be able to learn better performance at a task based on an internally available measure (thereby possessing conations and cognitions);
• That measure should be one of a certain kind of coherence among all the conations and cognitions the creature thus formed;
• That coherence should allow for default thoughts (both conative and cognitive) that have a kind of defeasible precedence.

To this list I will add two more desiderata that we have not yet explicitly discussed.

• The system should allow for degrees of acceptance (and rejection) of the conations and cognitions;
• The system should be physically realistic, able to compute the coherence problem in a reasonable amount of space and time.

The latter should be uncontroversial for anyone looking to naturalize intelligence by showing how it could arise in an ordinary physical creature. The former has two, independent motivations: first, suspicion on philosophical grounds that beliefs, desires and such are

not, in the final analysis, all-or-nothing affairs. I will not defend or elaborate this suspicion, but trust that enough others share it, or anyway are willing to tolerate it. Second, degrees of acceptance are important for systems that learn through feedback mechanisms. As Randall O'Reilly and Yuko Munakata put it:

. . . graded changes allow the system to try out various new ideas (ways of processing things), and get some kind of graded, proportional indication of how these changes affect processing. By exploring lots of little changes, the system can evaluate and strengthen those that improve performance, while abandoning those that do not.1

In other words, it is important when trying different approaches to improving at a task to be able to make fine-grained adjustments in behavior, both cognitive and external. (The flip side of this coin is graceful degradation: a system based on degrees of acceptance that fails or malfunctions in some way can do so partially, without disastrous consequences cascading throughout the system.)

Perhaps talk of the system "trying" new ideas and the like, though, sounds too strange at this point. How does a system of wires or neurons "try" different methods, and "decide" to prefer these mechanisms over those? It may seem to require a cranial homunculus, testing various techniques and picking its favorite. If so, it is unclear what work the standard is doing for either artificial intelligence or normative epistemology—since designing such homunculi for machines only pushes the problem back a step, and since we presumably have no such homunculi ourselves. How can the brain of a park-cleaning robot or a human being make its own functioning better (according to the standard of comprehensive coherence) without a homunculus? We turn now to cognitive science for the beginning of an answer.

1 O'Reilly and Munakata (2000) p18.

3.1.2 Machine learning

Modeling learning in a machine is certainly a tricky problem for any cognitive theorist, but there has been progress along several different tracks: for example, the number-theoretic track based on seminal work by Hilary Putnam and others. Closely related to the number-theoretic track is the Bayesian track, perhaps most actively pursued now by Clark Glymour. Or there is the less unified track examining prospects for goal-driven learning.2 For reasons that will soon become clear, I will look briefly at learning from a connectionist, or artificial neural network, standpoint. The point for now is merely to provide an example of how a machine might learn.

Patricia Churchland and Terrence Sejnowski provide a taxonomy of learning algorithms for neural networks in Churchland and Sejnowski (1992). Networks, they say, can learn through some combination of external and internal feedback; in the former case something outside the system provides information about the quality of the answer, while in the latter case the system can test itself. For clarity Churchland and Sejnowski call nets with external feedback mechanisms "supervised", and nets with internal feedback systems "monitored." (The former term has caught on, and usually systems with only internal feedback are called "unsupervised".)

The notion of internal feedback for a network may seem mysterious to some—and since it is crucial to the project, it is worth some explanation. Consider, for example, a net required to learn to predict the next input. Assume it gets no external feedback, but it does use its previous inputs to make its predictions. When the next input enters, the net may be able to use the discrepancy between the predicted input and the actual input to get a measure of error, which it can then use to improve its next prediction.

2 For the number-theoretic version see Jain et al. (1999) and Kelly (1996); for a Bayesian version see Spirtes et al. (2001); for goal-driven versions see Ram and Leake (1995).
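
A minimal sketch of this sort of purely internal feedback (the input stream and learning rate below are invented, and no claim is made that real networks learn this simply): the system keeps a running prediction of the next input, and the only training signal is its own prediction error.

```python
# Toy "monitored but unsupervised" learner: it predicts the next input,
# compares the prediction with what actually arrives (an internally
# available error signal), and nudges its estimate to shrink that error.
# The input stream and learning rate are invented for illustration.

inputs = [2.0, 2.1, 1.9, 2.0, 5.0, 5.1, 4.9, 5.0]  # hypothetical sensory stream
estimate = 0.0        # current prediction of the next input
learning_rate = 0.3

for x in inputs:
    error = x - estimate                 # predicted vs. actual: internal feedback
    estimate += learning_rate * error    # adjust to reduce future error
    print(f"input={x:.1f}  error={error:+.2f}  new estimate={estimate:.2f}")
```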

This is an example of a system with internal but no external feedback; the learning is "unsupervised" but "monitored". "More generally," Churchland and Sejnowski say, "there may be internal measures of consistency or coherence that can also be internally monitored and used in improving the internal representation."3

Networks other than the most simple ones are good at extracting what Churchland and Sejnowski call "higher-order information" from large inputs. (For example, the visual cortex extracts information like surface boundaries from retinal cell impulses.) There are generally two tasks in extracting such "higher-order information": first, to see what kinds of patterns and correlations are in the information; second, to sort through and represent only the patterns of interest out of all the possible ones. "The information for this last task," they say,

cannot be garnered from inside the net itself, but must be provided from the outside. The division of labor in a net with hidden units looks like this: unsupervised learning is very good at finding combinations but cannot know which subset to "care" about; supervised learning can be given criteria to segregate a "useful" subset of patterns, but it is less efficient in searching out the basic combinations.4

For many systems, then, a combination of internal and external feedback is ideal. Plausibly the human brain is both a supervised and monitored non-artificial neural network. Evolution provided the design of the human brain with sloppy but (over vast time) effective external feedback. It is thanks to evolution that we care about some higher-order information—like surface boundaries, or who loves us—and not about other information, such as infrared spectra or who has detached earlobes. In this sense evolution provided us

3 Churchland and Sejnowski (1992) p97, emphasis in the last quotation is of course mine.
4 Churchland and Sejnowski (1992) p99.

with basic things to care about, understood as fixed conations.

The problem for a net with an internal feedback mechanism is to adjust the weights between the nodes of the network in a way that best facilitates minimizing error. This is a difficult problem.

Finding a suitable weight-change rule looks really tough, because not only are the units hidden, but they may be nonlinear, so trial and error is hopeless, and no decision procedure, apart from exhaustive search, exists for solving this problem in general.5

Still, for particular kinds of nets, even quite complicated ones, weight-adjusting rules exist:

It is now clear that there are many possible solutions to the weight-adjusting problem in a net with hidden (and possibly nonlinear) units, and other solutions may draw on nets with a different architecture and with different dynamics. Thus nets may have continuous valued units, the output function for a unit may have complex nonlinearities, connections between units need not be symmetric, and the network may have more interesting dynamics, such as limit cycles and constrained trajectories.6

Of course there has been much development in this direction since this quotation was first published, some ten years ago. Neural networks are steadily becoming capable of solving more and more complex weight-adjustment problems, which for our purposes are simply learning problems (as we will see).

5 Churchland and Sejnowski (1992) p100.
6 Churchland and Sejnowski (1992) p102.
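
For a concrete (and deliberately simple) instance of a weight-adjusting rule, with one linear unit, no hidden layers, and invented training data, here is a sketch of the classic delta rule, in which each weight changes in proportion to the unit's error:

```python
# Delta-rule sketch: a single linear unit adjusts its weights in
# proportion to its error on each example. With hidden, nonlinear units
# the rules get more complicated (e.g. backpropagation), but the shape
# of the solution--error-driven weight change--is the same.
# Training data and learning rate are invented for illustration.

examples = [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0), ([1.0, 1.0], 1.0)]
weights = [0.0, 0.0]
rate = 0.1

for epoch in range(100):
    for inputs, target in examples:
        output = sum(w * x for w, x in zip(weights, inputs))
        error = target - output
        weights = [w + rate * error * x for w, x in zip(weights, inputs)]

print("learned weights:", [round(w, 2) for w in weights])
```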

3.1.3 Formal coherence

Now to meet the specifications from section 3.1.1. Let's start with a formal construal of coherence problems generally. I want to start here partly because it is easy: Paul Thagard has already done this work for us. In Thagard (2000) he argues persuasively that philosophical coherence is best understood as a constraint satisfaction problem (what I have also heard called a "multiple soft constraint" problem). Here is a formal characterization based on Thagard's.7

• Let E be the set of elements e_i that must be brought into coherence.
• Let C be a constraint function over pairs of elements, with C(e_i, e_j) a real number for each pair. Typically C is such that C(e_i, e_j) = C(e_j, e_i). These are the constraints that must be satisfied among the e_i.
• Let a be an acceptance function mapping each element into the interval [-1, 1]. This represents the degree of acceptance (for positive numbers) or rejection (for negative numbers) of any particular element.
• Let coherence(a, C) be a measure of how well the acceptance function a satisfies the constraints C, representing the coherence of the system. A positive constraint between elements will contribute to coherence when their acceptance values are near; a negative constraint between two elements will contribute to coherence when their acceptance values are far.8

Then the coherence problem is to find the acceptance function(s) a that maximize the coherence, given the constraint function C. Furthermore, I propose that there is an associated learning problem for systems that can learn: namely how to gain better potential coherence through adjustments to the constraint function C.

7 Thagard's own formal characterization does not allow for degrees of acceptance and rejection until he puts it in a connectionist framework. But as the specifications make clear, I think the matter of degree should be understood as part of the abstract problem type. It is surely a desideratum of the general algorithm, independent of architectural implementation. (But listing this as a desired feature ahead of time does make a connectionist architecture all the more appealing.) For Thagard's own all-or-nothing characterization of coherence problems see Thagard (2000) p18.
8 The standard measure of what is called "harmony" in a constraint satisfaction problem would in this case be the sum, over pairs of elements, of C(e_i, e_j) · a(e_i) · a(e_j) (see O'Reilly and Munakata (2000) p107); I ignore the one-half term for the double-summing. Incidentally, contrast this harmony measure with one that prefers acceptance functions the better they are able to bring together the values of positively-constrained elements, and drive apart the values of negatively-constrained elements. The harmony measure will give additional preference to systems that give positively-constrained elements the same sign, and negatively-constrained elements opposite sign. It also prefers systems with low entropy—systems with fewer acceptance values near zero.
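
To fix ideas, here is a minimal sketch of the characterization above in code (element names and constraint values are invented; the scoring follows the harmony-style measure of footnote 8, and actually maximizing it is left to whatever method one likes; the connectionist method appears in section 3.1.5):

```python
# Coherence as soft constraint satisfaction: elements E, a symmetric
# constraint function C over pairs of elements, and an acceptance
# function a mapping each element into [-1, 1]. The score below is the
# harmony-style sum of footnote 8: positive constraints reward agreeing
# acceptance values, negative constraints reward disagreeing ones.
# Elements and constraint values are invented for illustration.

E = ["e1", "e2", "e3"]
C = {("e1", "e2"): +1.0,   # e1 and e2 cohere
     ("e2", "e3"): -1.0}   # e2 and e3 conflict

def coherence(a, C):
    """Harmony of acceptance function a under constraint function C."""
    return sum(w * a[i] * a[j] for (i, j), w in C.items())

# Two candidate acceptance functions; the coherence problem is to find
# the one(s) that maximize the measure.
a1 = {"e1": 1.0, "e2": 1.0, "e3": -1.0}
a2 = {"e1": 1.0, "e2": -1.0, "e3": -1.0}
print(coherence(a1, C))   # 2.0
print(coherence(a2, C))   # -2.0
```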

The system's learning consists, then, in its ability to change the coherence problem it computes. An intelligent creature is functioning well to the extent it has reached coherence between its cognitions and its conations, and is learning well to the extent that it has changed its functional nature in order to improve its potential for such overall coherence. (Learning well, then, is part of functioning well.)

Now that we have a computationally respectable understanding of coherence, let's specify the specific coherence problem involved in learning for intelligent creatures.9

3.1.4 The belief-desire coherence algorithm

We will start with an algorithm specified at the level of propositional attitudes. (Later we will see how it might be extended to a sub-propositional level.) Call this the belief-desire coherence algorithm, or BDCA for short. I should emphasize this is not strictly an algorithm so much as an algorithm schema; the details of the algorithm would depend on the creature.

First, consider all the propositionally expressed internal aims of the creature—that is, aims that arise from a conative-cognitive representation pair about some proposition p. There will be a desire with respect to p, or D(p), and a belief with respect to p, or B(p).10 Let all these thoughts, conative and cognitive, be the set of elements E to be brought into coherence. Let C be a commutative function where C(D(p), B(p)) is positive for each p. With such constraints, the system will be in greater coherence when more aims are matched.

9 Computationally "respectable" is not to be read as synonymous with "tractable"; actually, Karsten Verbeurgt has shown in Thagard and Verbeurgt (1998) that coherence problems are NP-hard and therefore overwhelmingly likely to be intractable, even before countenancing degrees of acceptance as in my version. Still, as Thagard puts it, there are "several effective approximation algorithms" (Thagard (2000) p15), most notably a connectionist one.
10 Strictly speaking, D(p) is not necessarily a desire, but only some conation; similarly B(p) is really some cognition that need not be a belief. (See section 3.2.3 for a discussion of non-belief cognitive propositional attitudes and the like.) To call them all "beliefs and desires" here is largely for convenience.

That is, coherence will be greater the more the acceptance function assigns both parts of each belief-desire pair either strong acceptance or strong rejection. These constraints are fixed.

Add to the set E a special element, *. This element represents the external constraints imposed on the system by the actual world. Let the coherence problem range only over acceptance functions that map this special element to 1. This special element determines the default thoughts. It is always highly accepted, and positive constraints with some chosen beliefs and desires make it likely, but not guaranteed, that those will be accepted, too. (Similarly a negative constraint with * makes a thought default rejected.) If there is enough conflict in the rest of the system, though, any default thought can be overruled—as per our specifications. For each default accepted thought d let C(*, d) be positive, and for each default rejected thought d let C(*, d) be negative. These constraints, too, are fixed.

Finally, specify constraints between the thoughts according to propositional content. For example, contradictory beliefs should have a strong negative constraint between them. A belief with an explanatory or evidential relation to another belief should have a positive constraint between them. A desire that if satisfied would be a means to a further desire should have a strong positive constraint between them. A belief that if true would make some desire impossible should have a negative constraint between them. And so on.11 These non-default constraints can vary to some degree as the system learns (how freely depends on the basic design of the system). The default cognitions should be propositionally "basic" ones—something like early Wittgensteinian atoms, or Russellian sense-data. The default conations for intelligent creatures, though, should be the opposite—highly abstract and complex, or what section 2.2.2 called "rich".

Admittedly, that last step is very hand-wavy. It is an unfortunate byproduct of leaving the specification at the level of folk psychology and propositions, neither of which is very well understood when it comes to physical implementation.

11 As we will see in section 3.2.1, I have in mind connections of the type made in Thagard's DECO and ECHO models; again see Thagard (2000) for an overview. DECO is Thagard's model of deliberative coherence, developed jointly with Elijah Millgram; see e.g. Millgram and Thagard (1996).
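
To make the schema slightly more concrete, here is a minimal sketch of how such a BDCA constraint set might be assembled (the propositions, the particular weights, and the bookkeeping around the special element are all invented for illustration, and nothing here addresses the hard question of how constraints should track propositional content):

```python
# Sketch of a BDCA-style constraint set. For each proposition p there is
# a belief element B(p) and a desire element D(p), joined by a fixed
# positive constraint (the aim-matching constraints). A special element
# "*" is clamped to full acceptance and carries fixed constraints that
# make some thoughts default-accepted (positive weight) or
# default-rejected (negative weight). Propositions and weights invented.

propositions = ["the park is clean", "there is a wrapper at bench 3"]

elements = ["*"]
constraints = {}          # symmetric constraint function, keyed by sorted pairs

def add_constraint(x, y, w):
    constraints[tuple(sorted((x, y)))] = w

for p in propositions:
    b, d = f"B({p})", f"D({p})"
    elements += [b, d]
    add_constraint(b, d, +2.0)     # fixed belief-desire matching constraint

# Fixed default constraints with the special element:
add_constraint("*", "B(there is a wrapper at bench 3)", +1.0)  # default cognition
add_constraint("*", "D(the park is clean)", +1.5)              # default (rich) conation

clamped = {"*": 1.0}   # acceptance functions must map "*" to 1

print(elements)
for pair, w in constraints.items():
    print(pair, w)
```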

I will suggest a step toward remedying this in a moment; for now let's review the intuition behind the BDCA. It is first a subjective measure of the system's success at balancing its default desires according to what is reported via its default beliefs. But second, it is also a force toward which the rest of the system can learn. Fixing the constraints between belief-desire pairs at a positive value demands that when the system comes to reassign the other constraints, it will be towards greater belief-desire coherence. Some of these intermediate constraints the system may not be able to adjust, but others it can. To the extent it is able to adjust these other constraints, I claim, the system is capable of learning.

3.1.5 The architecture, and a toy example

Now let's look at the "physically realistic" specification for our algorithm. The question here is whether BDCA can be computed by a physical system using reasonable resources. Well, constraint satisfaction problems are the bread-and-butter of artificial neural networks, which are efficient estimators of solutions to such problems.12 There is also, as we have seen, a great deal of work on how such connectionist networks can learn to adjust their weights according to internally-detectable error functions. And mapping a coherence problem onto a neural network is easy: let the nodes of the network represent the elements, let the constraints of the problem determine the weights between the nodes, and let the network calculate the activation function a by iterating the network through different activation states according to a standard updating rule.13 We also know that some kind of parallel distributed processing roughly along connectionist lines is just the kind of thing that, for example, the real neural networks of the human brain manage to do all the time.

12 Especially in the "degree" formulation of coherence problems, as noted; but see chapter 2 of Thagard (2000) for more of an argument to this effect.
13 The simplest of which is a weighted sum of the neighboring activations; often this sum is fed into a sigmoidal function that limits the values asymptotically to an interval like our [-1, 1].
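
The mapping just described can itself be sketched as a toy settling procedure (not a serious network model: the weights are invented, the update is the simple weighted sum of footnote 13 squashed into (-1, 1), and the special element is clamped throughout):

```python
import math

# Toy connectionist settling for a small coherence problem: nodes are
# elements, weights are constraints, and the network repeatedly updates
# each unclamped unit toward the squashed weighted sum of its
# neighbours. The special element "*" stays clamped at full acceptance.
# Elements and weights are invented for illustration.

weights = {("*", "B"): 1.0, ("*", "D"): 1.5, ("B", "D"): 2.0, ("B", "E"): -1.0}
activation = {"*": 1.0, "B": 0.0, "D": 0.0, "E": 0.0}
clamped = {"*"}

def neighbours(node):
    for (i, j), w in weights.items():
        if i == node:
            yield j, w
        elif j == node:
            yield i, w

for step in range(30):
    updated = dict(activation)
    for node in activation:
        if node in clamped:
            continue
        net_input = sum(w * activation[m] for m, w in neighbours(node))
        updated[node] = math.tanh(net_input)   # keeps values in (-1, 1)
    activation = updated

print({node: round(value, 2) for node, value in activation.items()})
```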

Thus our algorithm seems like a cognitively realistic computation for a physical system to approximate, and within reasonable time constraints.14

With a connectionist picture in the background, we can also begin to see how this feedback mechanism might integrate into the rest of a mental system, without needing to rely on the mysterious primitive connections forged via propositional content. In a very simple, unintelligent creature, inputs (in the form of sensory stimulations) get mapped directly to outputs (in the form of motor stimulations). In the case of more complex but still unintelligent creatures, the inputs are sorted in a pre-wired way for patterns, or higher-order information—such as sorting visual stimuli for surface boundaries, and then sorting those for boundaries typical of predators. (Again, artificial neural networks model such extraction of higher-order information very well.) These patterns of stimuli will directly cause patterns of output, such as the pattern of motor stimulation required for moving away from the predator (given the position of the predator and such). In a complex creature capable of learning, coaching from internal feedback can shift the way the higher-level information is extracted from the senses, and shift the way low-level motor coordination results from high-level desires. In a network model, "credit" and "blame" for internally-detected successes and failures can filter down recursively through the layers of representational patterns. In this way the internal standard of aim-matching can provide feedback throughout the mental system.

To get a sense of how this works, consider a toy example. Imagine a creature with a mental system like this: first, it has a layer of sensory input nodes. Let these feed a second layer to extract patterns from these inputs, and let a third layer extract yet higher-order patterns from those patterns. Think of the periphery layer as raw data, the second as somewhat theory-tainted observations, and the third as abstract beliefs and theories about how things are. These are the proto-cognitive layers.15 Similarly let the creature have three proto-conative layers—one to represent, roughly, the abstract desires resulting from fixed aims, the layer below it for plans to bring them about, and the bottom layer for motor output. (Of course each of these layers is likely to represent many many levels of computation in any actual creature.)

Now imagine rigging these two systems together as in figure 3.1, hooking up the high-level proto-cognitions to the high-level proto-conations, so that the latter can take advantage of the former in fixed ways. (We assume that they will mesh well; that is, the kinds of patterns that the last cognitive layer sorts out are the kind that the high conative layer is interested in.) The result is a somewhat sophisticated creature, able to react to various patterns in its environment in fairly complex ways—but still rather unintelligent by our standard, since its computations are hardwired.

[Figure 3.1: A somewhat sophisticated, but unintelligent, creature. The proto-cognitive layers (sensory inputs, "observations", "beliefs") feed in fixed ways into the proto-conative layers ("desires", "plans", motor outputs).]

14 There is also active, empirically-backed speculation about how the constraint satisfaction form of learning takes place in humans. For example, O'Reilly and Munakata (2000) p420 mentions that the anterior cingulate cortex seems to have much to do with learning, acting somehow like the "adaptive critic" feedback mechanism for reinforcement modeling in networks. Significantly, the anterior cingulate cortex has also been implicated in processing emotions, which I am inclined to read as various types of belief-desire incoherence; see section 3.2.4.
15 I call them "proto-cognitive" because in a creature that cannot learn I suspect there is no principled distinction between cognitions and conations (as the figure illustrates); see section 1.3.

[Figure 3.2: A toy example of the BDCA. The same layers—sensory inputs (fixed), "observations", "beliefs", "desires" (fixed), "plans", motor outputs—are now joined at the belief-desire junction by a belief-desire coherence measure.]

So finally, to give it the ability to learn, there must be ways to change the weights between nodes in the network. That allows the creature to compute different functions from input to output; different patterns can emerge from the lower-level cognitions, and different kinds of actions can result from the higher-level conations. To change its computational dispositions for the better requires an internal measure of how the creature is doing at matching its aims. So we place fixed weights at the juncture between its cognitions and conations, demanding matches. Error from mismatches can propagate back recursively down through the layers, and the weights of the network that are flexible will shift in its favor. The toy result would look something like figure 3.2.

In such a coherence and learning problem, the low-level sensory input will receive the default activation on the cognitive side. On the conative side, it is the high-level conations that would receive the stubborn, default activations. These conative defaults artificially represent the wiring bias the system would have, in virtue of being a creature at all, toward reacting with certain types of actions given certain cognitive circumstances (like running in the presence of a predator). Together the cognitive and conative defaults represent the role the external world has to play in its thinking—in both the inputs the world gives the creature, and the fixed aims the world has designed the creature to have.
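
As a very rough counterpart of figure 3.2 (invented numbers, a single adjustable cognitive layer, and a crude perturb-and-keep learning step standing in for any serious credit-assignment scheme), the sketch below uses nothing but the internal belief-desire mismatch as its training signal:

```python
import math, random

# Toy counterpart of figure 3.2: a fixed (default) "desire" value, a
# small adjustable cognitive pathway from sensory input to a "belief"
# value, and the belief-desire mismatch used as the creature's only,
# internal training signal. Learning here is a crude perturb-and-keep
# rule; a real network would assign credit more cleverly.
# All numbers are invented for illustration.

random.seed(0)
sensory_input = [0.4, 0.9]
weights = [0.1, -0.2]          # adjustable cognitive weights
desire = 1.0                   # fixed default conation

def belief(ws):
    return math.tanh(sum(w * x for w, x in zip(ws, sensory_input)))

def mismatch(ws):
    return (desire - belief(ws)) ** 2      # internal belief-desire error

for step in range(300):
    i = random.randrange(len(weights))
    trial = list(weights)
    trial[i] += random.uniform(-0.1, 0.1)        # a graded change
    if mismatch(trial) < mismatch(weights):      # keep changes that improve matching
        weights = trial

print("belief:", round(belief(weights), 2), " mismatch:", round(mismatch(weights), 4))
```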

3.1.6 Next steps

Naturally this characterization, too, is approximate and artificial. For one thing, there may not be any sharp boundary between cognitions and conations even in creatures that can learn. It may be more like a matter of degree. When surface boundaries are extracted from visual input, that already represents some mix of the creature's interests with its cognitive representations, since the creature has reason from higher up to pick out these boundaries. And when a conation involving patterns of movement (like "flee") translates to motor impulses, that represents in part a means-end cognition about how to flee. To reflect this, the model of the aim-matching feedback mechanism would have to be adjusted accordingly. On this conception of the learning process, there is no one black box monitoring the creature's highest-level thoughts. Instead we would have a network of many layers between input and output, with no sharp boundary between cognitive and conative layers, and with lots of local and global monitoring doing the work of the one big black box. In this model conations of the creature are perhaps best represented simply by fixed bounds within which the constraint function C can vary. And wherever there is a monitor for computational improvement (weight adjustment), there we can say exists an implicit cognition-conation pair—the monitoring mechanisms adjust the creature's computation toward achieving activation states that the creature wants. Or I should rather say: the creature wants those activation states in virtue of its function to adjust the computation toward their achievement.

For another thing, when it comes to modeling intelligence on the level of human brains, there are many factors to consider that are not easily reflected in an artificial neural

network. Perhaps most obviously, the sheer size of the required network is intimidating, if the human brain is any indication. The 18 nodes in our toy creature (plus however many in the feedback mechanism) are pretty paltry compared to our own 100 billion neurons. Also the network of an intelligent creature will require a great many recurrent (backwards-reaching) paths in order to allow the creature to model sensory and motor processes through time, and to disambiguate degraded input.16 The effect is another kind of feedback to the system, and its interaction with the proposed BDC feedback is unclear. Finally, this model neglects important computational aspects of biological creatures. For example, it replaces the discrete spiking function of neurons with a continuous-valued output. And perhaps most importantly, it ignores (or assumes) the effects of neurotransmitters and other chemicals. Dopamine, for example, appears to be instrumental in adjusting synaptic weighting between neurons, and thus (many suspect) may be crucial to the learning process.17

3.2 Advantages and further issues
Still, as a step toward integrating abstract epistemology and practical cognitive science, BDC shows promise. Already we have seen the theory integrate crucial intuitions of internal epistemology, intelligence, learning, coherence, and the computational theory of mind. But, as I hope to show in this section, its potential does not stop there. 3.2.1 DECO, ECHO, and BDC For one thing, BDC can integrate smoothly with some of the work Thagard and others have started in computational epistemology. For example, he and Elijah Millgram have come up with a computational model called “DECO”—based on Thagard’s earlier work
16. See Churchland (1995) chapter 5 for a fun, accessible discussion.
17. See p50 and the reference in footnote 24 on the role of dopamine; for more on the general topic of chemicals and mental computation see e.g. Thagard (2002).

For example, he and Elijah Millgram have come up with a computational model called "DECO"—based on Thagard's earlier work modeling explanatory coherence—to show how decisions might be made through a kind of deliberative coherence. They propose deliberative coherence as a constraint satisfaction problem where:

• the elements are deliberative "factors" such as goals, subgoals, and actions (among which DECO does not distinguish);
• the constraints consist of positive, symmetric constraints between factors that facilitate the accomplishment of other factors, and negative constraints between factors that are difficult to realize together;
• the interpretation of acceptance in the model is a decision on the part of the agent to adopt a coherent set of factors—that is, to form a holistic plan of action.18

18. See Thagard and Millgram (1995); Millgram and Thagard (1996); Thagard (2000, 2001).

DECO has several advantages. First, it captures an approach to instrumental reasoning. Second, it allows for reconsideration of ends; they can be outweighed if there are enough negative constraints with other deliberative factors that are highly activated. Thus while incorporating instrumental reasoning, it allows that sometimes agents deliberate about ends, too. Perhaps most obviously, it provides the beginning of a computational model for problems in practical reasoning.

But DECO takes its constraints, such as the facilitation relations, as primitive. As a result, it is incapable of learning on its own. Plugging DECO as a starting set of constraints into a BDC system would allow the system to learn better plan-formation over time. Also, DECO's authors recognize that beliefs have a heavy part to play in deliberative coherence. At the very least, a thinker's beliefs should affect what kind of facilitation relations hold between conative states. Therefore the authors provide a "principle of judgment", stating that "facilitation and competition relations can depend on coherence with judgments about the acceptability of factual beliefs."19 Just how this dependence works is left out of DECO, but it seems a crucial issue. BDC could potentially explain the formation of these facilitation constraints as indirectly conducive to satisfying its own constraints, and could also explain why facilitation relations are formed based upon antecedent cognitions. Basically, it works to do so—where belief-desire coherence provides the final internal standard for what "works".

19. See any of Thagard and Millgram (1995) p442, Millgram and Thagard (1996) p73, Thagard (2000) p128 for this quotation.

Here is another problem DECO has on its own, but that might be solved by integrating it into a BDC system. Thagard and Millgram see their proposal as an alternative to classical decision theory. Among the advantages touted, DECO explains the preferences that decision theory leaves primitive, allows for the backwards formation of goals from subgoals (such as deciding to adopt the goal of running a marathon from a desire to run every day), may model and explain irrationality with respect to the axioms of decision theory, and against a naive instrumentalist it allows for deliberation about the preferences involved.20 Still, DECO cannot diverge too far from utility theory and remain plausible. For suppose DECO endorses plans that are highly "coherent" in terms of a thick web of reinforced facilitation constraints, but that bring about less of what the agents want. Then DECO would presumably lose much of its normative force, for why should an agent prefer the coherent plan to the fruitful one? In some sense the more coherent plan—the plan that best balances deep spontaneous desires against possibilities for achieving them—must be the one with more utility to it. Such an outcome would be much better assured if the constraints were adjusted toward exactly such a purpose. When subjective utility is construed as belief-desire coherence, and when deliberative coherence constraints are formed in the service of this belief-desire coherence, the result is one neat package.

20. See Thagard and Millgram (1995) pp449–452 for more details.
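To fix ideas, here is a minimal sketch of deliberative coherence as constraint satisfaction in the spirit of the formulation above. It is not Thagard and Millgram's own code; the factors, weights, and brute-force settling are invented for illustration.

# Sketch of coherence as constraint satisfaction, DECO-style.  Factors,
# constraint weights, and the exhaustive search are illustrative only.
from itertools import product

# Deliberative "factors" (goals, subgoals, actions -- DECO does not distinguish).
factors = ["run_marathon", "run_daily", "sleep_in", "buy_shoes"]

# Symmetric weighted constraints: positive = facilitation, negative = interference.
positive = {("run_daily", "run_marathon"): 1.0, ("buy_shoes", "run_daily"): 0.5}
negative = {("run_daily", "sleep_in"): 0.8}

def coherence(accepted):
    # A positive constraint is satisfied when both factors get the same verdict;
    # a negative constraint is satisfied when they get opposite verdicts.
    score = sum(w for (a, b), w in positive.items()
                if (a in accepted) == (b in accepted))
    score += sum(w for (a, b), w in negative.items()
                 if (a in accepted) != (b in accepted))
    return score

# "Acceptance" is adopting a maximally coherent set of factors -- a plan.
best = max((frozenset(f for f, keep in zip(factors, choice) if keep)
            for choice in product([True, False], repeat=len(factors))),
           key=coherence)
print(sorted(best), coherence(best))

In the implementations Thagard describes, a connectionist settling procedure rather than this brute-force search does the work of finding a highly coherent assignment.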

The proper functioning of DECO under a BDC learning mechanism will depend in part, of course, on how well the system as a whole forms beliefs, so we should take a look at how to form those. Again, Thagard's work is helpful; we have already seen in section 1.2.5 how his ECHO system models explanatory coherence. With the computational take on coherence under our belt, we can look at ECHO in more detail. It, too, treats coherence as a constraint satisfaction problem, where:

• the elements are cognitions;
• the constraints consist of positive, symmetric constraints between propositions where explanatory relations hold or analogical mappings exist, and negative constraints between contradictory propositions;
• the interpretation of acceptance in the model is belief in the proposition (where this can be a matter of degree).21

21. Thagard (1992, 2000).

ECHO is a powerful model for capturing our scientific reasoning. Again, though, it can become still more useful when integrated into a system where the explanatory coherence serves belief-desire coherence. As I have suggested, BDC can facilitate the link between beliefs activated through ECHO and the facilitation relations needed by DECO. Sometimes means-end beliefs need to be revised ("oh, it is possible for machines to fly!"), and sometimes it is the goals that need to be revised instead ("oh, I can't be a pirate!"). BDC can serve as the arbiter of such matters by testing to see which mechanisms for determining revisions work better. Also, as we saw in section 1.2.5, ECHO must leave as primitive certain key parameters to compute explanatory coherence problems: parameters for "simplicity impact", "analogy impact", "skepticism", "data excitation", and "tolerance" (see p16 for more details).

ECHO has these parameters set somewhat arbitrarily, but their correct value is of great interest to epistemologists. ECHO is not on its own capable of learning better values. Adjusting them for the better requires an internal measure of error; these parameters should all be fine-tuned toward some kind of cognitive success. The degree of belief-desire coherence, of course, could provide exactly such an internal measure of error.

The ideal degree of tolerance, in particular, is related to another problem for ECHO that BDC might be able to help solve. It is somewhat frustrating that ECHO must take the explanatory and contradictory constraints as primitive. Input to ECHO consists of lines like this:

(EXPLAIN (H1) E1)
(EXPLAIN (H2) E1)
(CONTRADICT H1 H2)22

(These can be adjusted with degrees of explanation, giving the weight of the constraint.) Just like the formation of the facilitation relations in DECO, forming these constraints themselves requires a great deal of cognitive work that remains mysterious. What determines which two nodes stand in an explanatory relation, or in a contradictory relation? Why should the weights between cognitive nodes be formed in that way, rather than some other? Presumably the answer comes from the intelligence embedded in the design of the network. Constraints form that way because it works for them to do so; they helped meet some higher standard of cognitive success. Either these constraints were learned in the service of an internal standard such as belief-desire coherence, or else they came pre-wired with the design of the system toward achieving an external function.23
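As a rough illustration of how such input might be compiled into a constraint network, here is a sketch of my own. It is not Thagard's actual ECHO code; the parameter names only loosely mirror those just listed, and their values are arbitrary placeholders.

# Rough sketch: compiling ECHO-style input into a constraint network and
# letting it settle.  Not Thagard's implementation; parameter values are
# arbitrary placeholders for the "primitive" parameters discussed above.
from collections import defaultdict

statements = [("EXPLAIN", ("H1",), "E1"),
              ("EXPLAIN", ("H2",), "E1"),
              ("CONTRADICT", "H1", "H2")]

EXCITATION = 0.04       # base weight of explanatory links
INHIBITION = -0.06      # weight of contradiction links (cf. "skepticism")
DATA_EXCITATION = 0.05  # link from a special evidence unit to data
DECAY = 0.05            # decay toward zero (cf. "tolerance")

weights = defaultdict(float)
units = set()

def link(a, b, w):
    units.update([a, b])
    weights[(a, b)] += w
    weights[(b, a)] += w  # constraints are symmetric

for s in statements:
    if s[0] == "EXPLAIN":
        hyps, evidence = s[1], s[2]
        w = EXCITATION / len(hyps)   # "simplicity": split among co-hypotheses
        for h in hyps:
            link(h, evidence, w)
    elif s[0] == "CONTRADICT":
        link(s[1], s[2], INHIBITION)

link("SPECIAL", "E1", DATA_EXCITATION)   # evidence gets data excitation
activation = {u: 0.01 for u in units}
activation["SPECIAL"] = 1.0

for _ in range(200):   # settle the network
    new = {}
    for u in units:
        if u == "SPECIAL":
            new[u] = 1.0
            continue
        net = sum(weights[(u, v)] * activation[v] for v in units if v != u)
        a = activation[u] * (1 - DECAY)
        a += net * (1 - a) if net > 0 else net * (a + 1)
        new[u] = max(-1.0, min(1.0, a))
    activation = new

print(activation)   # H1 and H2 draw support from E1 but inhibit each other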
22. Chosen from Thagard (1992) p75.
23. Thagard also has coherence models for deductive, analogical, perceptual, conceptual, and emotional reasoning; again see Thagard (2000) for an overview. All these can similarly be subsumed by BDC, I believe, except for the emotional coherence—on this, see section 3.2.4.

3.2.2 Explanations, contradictions, and BDC

So constraints consisting of explanatory or contradictory relations could simply be hardwired into an intelligent system, and our own human brains might be an example of such. On the other hand, it is also plausible that explanatory and even contradictory relations are learned. How might these relations, primitive to ECHO, be derived from a cognitive system's more basic conations and cognitions through seeking belief-desire coherence?

The formation of explanatory constraints is the more complicated example, I think, but not completely mysterious. For example, suppose it is correct that causal relations are basic to explanations. Learning causal relations—forming constraints that track causality—is a significantly simpler problem (still tough, of course). It may even be that something like Hume's associationist picture is right; perhaps firing of some nodes closely conjoined with the firing of other nodes reinforces the connection between nodes. This sounds very like a chestnut weight adjustment technique in artificial neural networks—the "Hebbian rule"—and it is known that at least some real neurons learn this way. Of course causality is unlikely to be quite that simple, and it is also unlikely to be easy to generalize causal constraints into explanatory ones. Still, it seems an avenue for fruitful investigation.24

24. Thagard briefly suggests that explanatory relations may be causal at base in Thagard (2000) pp68–69; his own longer defense is Thagard (1999) chapter 7.
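For concreteness, here is the Hebbian idea in a few lines of Python. It is an illustrative toy rather than a model of real neurons; actual Hebbian variants add normalization or decay to keep the weights bounded.

# Toy Hebbian weight adjustment: the connection between two units grows
# when they are active together.  Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n_units = 4
weights = np.zeros((n_units, n_units))
LEARNING_RATE = 0.1

for _ in range(100):
    # Units 0 and 1 tend to fire together (a stand-in for constantly
    # conjoined events); units 2 and 3 fire independently.
    a = float(rng.random() < 0.5)
    activity = np.array([a, a, float(rng.random() < 0.5), float(rng.random() < 0.5)])
    weights += LEARNING_RATE * np.outer(activity, activity)  # co-activation
    np.fill_diagonal(weights, 0.0)                           # no self-connections

print(np.round(weights, 1))   # the 0-1 connection ends up strongest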

Explaining the learned formation of negative constraints between contradictory cognitions should be easier, but it is plenty complicated. Why might a system looking to maximize belief-desire coherence form these further constraints? To answer this question we must figure out what it means, in terms of the BDC model, to hold contradictory cognitions in the first place. Perhaps a basic contradiction in this model would be to both accept and reject a cognitive node that means that φ. Of course on the assumption that these nodes have real physical realization of some sort, no one node can have a net activation and a net inhibition at the same time.25 So maybe instead there could be two cognitive nodes that mean that φ, one accepted and the other rejected.

25. The use of terms like 'node' and 'activation' emphasizes the connectionist architecture I take for granted in the background, and some of what follows depends on such realizations. In symbolic systems, a contradiction would probably be realized as having symbols for both φ and ¬φ in the "belief box". Some of what follows applies just as well to this case.

But first, such a situation might not make conceptual sense—can a cognitive system have two token cognitions that mean that φ? Even if possible, such a circumstance would be guaranteed to cause incoherence among beliefs and desires if there is any conation to the effect that φ around, rejected or accepted. It may furthermore be that for any cognition there is always some corresponding conative attitude toward it. (My proposal that cognitions arise in pairs with conations from learning mechanisms surely suggests so.) If so there will always be incoherence resulting from holding this kind of contradiction, and thus contradictions will always be unjustified in the current sense. If not—if sometimes there is no correlated conation either way, and not even the potential for one—then it is hard to see any possible harm to that specific contradictory belief state, and so it is hard to see the normative force behind any admonition against it. And anyway, if it is indeed possible to have two cognitions that mean that φ with opposite activations, and no potential conative attitude toward φ, then the circumstance is surely rare enough to justify a general constraint-forming rule to the effect that any two cognitions that mean that φ should have roughly the same activation level.

Suppose, then, that either conceptually or because of pressures like those above, there can only be one cognitive attitude toward φ at a time. Another way to understand contradictory beliefs in terms of the model is to have one cognitive node for φ that is accepted at some times and rejected at others, back and forth. This approach better captures the phenomena, since people accused of contradictory beliefs do not actually accept and reject the same proposition in the same breath. Instead, they assert φ while concentrating on one set of issues, and then reject φ while concentrating on another set. In terms of the model, the activation level of φ may vary to satisfy different local belief-desire coherencies. Such conflict between local coherencies automatically suggests a lack of global coherence, and so according to the theory, the cognitive system is relatively unjustified. Settling on a globally satisfactory activation level for φ would represent both coming to a state of more coherence (and thus a more justified epistemic state) and at the same time represent giving up the contradictory belief. That is, holding contradictions in this sense is unjustified according to the theory of belief-desire coherence.
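The point about conflicting local demands can be put in miniature. In the invented three-node example below (scored in the same constraint-satisfaction style as earlier), no verdict on the contested node satisfies all of its constraints, so the best available global coherence is capped below what a conflict-free system would reach.

# Miniature illustration: node "phi" is positively constrained with A and
# negatively constrained with B, while the rest of the system firmly
# accepts both A and B.  Nodes and constraints are invented.
positive = [("phi", "A")]
negative = [("phi", "B")]

def coherence(accepted):
    sat = sum(1 for (x, y) in positive if (x in accepted) == (y in accepted))
    sat += sum(1 for (x, y) in negative if (x in accepted) != (y in accepted))
    return sat

for phi_accepted in (True, False):
    accepted = {"A", "B"} | ({"phi"} if phi_accepted else set())
    print("phi accepted:" if phi_accepted else "phi rejected:", coherence(accepted))

# Either verdict on phi satisfies one constraint and violates the other
# (score 1 of a possible 2), and flipping back and forth over time does
# no better: the conflict caps global coherence.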


Still another way to understand contradictory beliefs in this model is to accept (or reject) simultaneously different propositions that are together inconsistent; the simplest case is to accept the cognition φ and the cognition ¬φ. Perhaps the nature of negation is such that to accept ¬φ just is to reject cognitive node φ. Even if so, one need merely think of slightly more complicated examples of contradictory propositions, such as the set {φ, φ → ψ, ¬ψ}. In other words, we would like to know how belief-desire coherence might be a force toward the deductive consistency of beliefs. This is the trickiest version of the question, and I can only gesture at an answer.26

26. Yes, Thagard does have something to say in the area too; in what he claims is the tradition of Bertrand Russell, he suggests that deductive relations are also a matter of coherence. (See Thagard (2000) section 3.3; he cites passages from Russell (1907) as evidence of Russell's view.) Deductive coherence problems have propositions for elements, deductive and contradictory relations as positive and negative constraints, and initial acceptance of propositions with "intuitive priority". It is often pointed out that deductions are really just the discoveries of inconsistent sets—after that, there is still the matter of which propositions to accept and reject. Thagard's deductive coherence seems aimed at that point; it still leaves deductive and contradictory relations as primitive, and the formation of those is the current interest.

At bottom is the difficulty of explaining how constraints could form between nodes with different contents. To make the right connections the system would somehow have to "know" which propositions are semantically relevant to which others. Thus a satisfying answer to this problem would require a good, naturalistic story about what makes for mental content in the first place—naturalistic because no other kind is presumably available to the unintelligent constraint-forming mechanism.

For example, if contents of complex propositional attitudes consist of their place in a network of such constraints—as the conceptual role semanticist might have it—then this job is easier. There is an inhibitory weight from the node for φ to the node for ¬φ.27 Though this is some gesture towards an answer, it still leaves open the question: why did a constraint form between these two nodes at all? The at-bottom answer to this at-bottom question, I think, can only be: the external feedback that designed the creature demanded it be so, in order to perform its external functions. Not all of our cognitive dispositions can be learned from scratch. For an intelligent creature to form basic constraints between different propositions—or for it to have basic propositional content at all—must at some point have b

27. …causally disposed to be, or functionally supposed to be, or …), they would not properly be…
