An Economic Argument for a Mathematical Conclusion

27 04 2008

How valuable is an income stream that pays $1000 a year in perpetuity? Naively, one might suspect that since this stream will eventually pay out arbitrarily large amounts of money, it should be worth infinitely much. But of course, this is clearly not true - for a variety of reasons, future money is not as valuable as present money. (One reason economists focus on is the fact that present money can be invested and thus become a larger amount of future money. Another reason is that one may die at any point, and thus one may not live to be able to use the future money. Yet another reason is that one’s interests and desires gradually change, so one naturally cares less about one’s future self’s purchasing power than about one’s current purchasing power.) Thus, there must be some sort of discount rate. For now, let’s make the simplifying assumption that the discount rate is constant over future years, so that money in any year from now into the future is worth 1.01 times the same amount of money a year later.

Then we can calculate mathematically that the present value of an income stream of $1000 a year in perpetuity is given by the sum \frac{1000}{1.01}+\frac{1000}{1.01^2}+\frac{1000}{1.01^3}\dots. Going through the work of summing this geometric series, we find that the present value is \frac{1000/1.01}{1-1/1.01}=\frac{1000}{1.01-1}=100,000. However, there is an easier way to calculate this present value that is purely economic. The argument is not mathematically rigorous, but there are probably economic assumptions that could be used to make it so. We know that physical intuition can often suggest mathematical calculations that can later be worked out in full rigor (consider things like the Kepler conjecture on sphere packing, or the work that led to Witten’s Fields Medal) but I’m suggesting here that the same can be true for economic intuition (though of course the mathematical calculation I’m after is much simpler).
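To see the series converging numerically, here is a minimal sketch that compares partial sums with the closed form. The names PAYMENT, DISCOUNT and present_value are just illustrative choices, not anything from a standard library.

```python
# Partial sums of 1000/1.01 + 1000/1.01**2 + ... compared with the closed form.
PAYMENT = 1000.0
DISCOUNT = 1.01  # money now is worth 1.01 times the same amount a year later


def present_value(years: int) -> float:
    """Present value of the first `years` payments of the stream."""
    return sum(PAYMENT / DISCOUNT ** k for k in range(1, years + 1))


closed_form = PAYMENT / (DISCOUNT - 1)  # 1000 / (1.01 - 1)

for years in (10, 100, 1000, 5000):
    print(years, round(present_value(years), 2))
print("closed form:", round(closed_form, 2))
```

The partial sums grow towards the closed-form value, and the later payments contribute less and less because of the discounting.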

The economic argument goes as follows. If money in any year is worth 1.01 times money in the next year, then in an efficient market, there would be investments one could make that pay an interest of 1% in each year. Investing $100,000 permanently in this and taking out the interest each year gives rise to this income stream, and thus one can fairly trade $100,000 to receive this perpetual income stream, so they must be equal in value. We don’t need to sum the series at all.
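One way to check the trade is fair at every horizon is the sketch below. It assumes (this is my addition, not part of the argument above) that the investor could also withdraw the $100,000 after any number of years, and it values the interest payments plus the returned principal at the same 1% discount rate.

```python
PRINCIPAL = 100_000.0
RATE = 0.01  # 1% interest per year, matching the 1.01 discount factor


def value_of_holding(years: int) -> float:
    """Present value of `years` interest payments plus the principal returned at the end."""
    interest = PRINCIPAL * RATE  # $1000 paid out each year
    payments = sum(interest / (1 + RATE) ** k for k in range(1, years + 1))
    principal_back = PRINCIPAL / (1 + RATE) ** years
    return payments + principal_back


for years in (1, 10, 100, 1000):
    print(years, round(value_of_holding(years), 2))
```

Whatever horizon is chosen, the total comes out at the original principal (up to floating-point rounding). As the horizon grows, the discounted value of the returned principal shrinks towards nothing, so the value is carried more and more by the income stream alone.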

Now perhaps there is a sense in which the mathematical argument and the economic argument given above can be translated into one another, but it’s far from clear to me. Thus, it looks like at least sometimes, economic intuition can solve mathematical problems. People often talk about the “unreasonable effectiveness of mathematics in the sciences”, but here I think I have another example of the unreasonable effectiveness of the sciences in mathematics.





Probabilistic Causation in Hungary

20 12 2007

Budapest is a very nice city, and this sounds like an interesting program - I’m just not yet sure whether I can plan anything for that time period, or else I would certainly apply.

Course Dates: JULY 21 - AUGUST 1, 2008
Location: Central European University (CEU), Budapest, Hungary
Detailed course description: http://www.sun.ceu.hu/causality

Faculty: Miklos Redei, Department of Philosophy, Logic and Scientific Method, London School of Economics, UK; Nancy Cartwright, London School of Economics and Political Science, UK; Damian Fennell, London School of Economics and Political Science, UK; Gabor Hofer-Szabo, King Sigismund College, Budapest, Hungary; Ferenc Huoranszki, Central European University, Budapest, Hungary; Laszlo E. Szabo, Eötvös University, Budapest, Hungary; Richard E. Neapolitan, Northeastern Illinois University

Target group: advanced graduate students, postdoctoral fellows, junior faculty and researchers in philosophy, physics, economics and computer science
Language of instruction: English
Tuition fee: EUR 500, financial aid is available.
The application deadline: February 14, 2008 (for scholarship places), April 30, 2008 (for fee-paying applications)
Online application: http://www.sun.ceu.hu/apply (attachments to be sent by email to causality@ceu.hu).

For further information queries can be directed to the SUN office by email (summeru@ceu.hu), via skype (ceu-sun) or telephone (00-36-1-327-3811).





Probability and Bayesian Epistemology

10 12 2007

From the last Carnival of Philosophy, I found a post by another Kenny about the relation between Bayesian epistemology and probability! He puts forward three views of what this relation might be:

Here are brief definitions of each view, and how each one relates subjective degrees of rational confidence to probabilities (I will explain in more depth later).

* (P) takes subjective degrees of rational confidence as primitive. There is no state space for degrees of rational confidence, because they aren’t probabilities.
* (KPW) takes subjective degrees of rational confidence to be actual probabilities over the state space of all epistemically possible worlds, where the epistemically possible worlds are formal constructions that may or may not be objectively possible.
* (LPW) takes subjective degrees of rational confidence to be actual probabilities over the state space of the subset of the really possible worlds which are epistemically accessible.

However, he seems to be focused on a very particular understanding of the word “probability” that might not quite be what I would mean by it. The very fact that he talks about a relation between rational degrees of confidence and probabilities suggests that he’s understanding the word differently from how I am.

My understanding of the word is that “probability” refers to any function from a Boolean algebra to the real numbers satisfying the following three properties: (1) it is never negative; (2) the tautology is assigned value 1; (3) finite additivity (that is, given two elements whose conjunction is the contradiction, the probability of their disjunction is the sum of their probabilities). I’d also be willing to apply the term “probability” in cases where instead of a Boolean algebra in the strict mathematical sense, one uses any structure where the terms “tautology”, “conjunction”, “contradiction”, and “disjunction” have a natural interpretation.
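As a small illustration of the three conditions, here is a sketch that checks them for a function defined on all subsets of a finite set. Using an algebra of sets is only a convenience for the example (the definition above does not require it), and the names powerset, is_probability and the three-state weights are hypothetical.

```python
from fractions import Fraction
from itertools import chain, combinations


def powerset(states):
    """All subsets of `states`, as frozensets (a finite Boolean algebra of sets)."""
    items = list(states)
    subsets = chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))
    return [frozenset(s) for s in subsets]


def is_probability(p, states):
    """Check the three conditions for a function `p` defined on every subset of `states`."""
    events = powerset(states)
    top = frozenset(states)  # plays the role of the tautology
    non_negative = all(p[e] >= 0 for e in events)
    normalized = p[top] == 1
    additive = all(
        p[a | b] == p[a] + p[b]
        for a in events
        for b in events
        if not (a & b)  # the conjunction is the contradiction (the empty set)
    )
    return non_negative and normalized and additive


# Hypothetical example: a three-state space with weights 1/2, 1/3, 1/6.
weights = {"s1": Fraction(1, 2), "s2": Fraction(1, 3), "s3": Fraction(1, 6)}
p = {e: sum((weights[s] for s in e), Fraction(0)) for e in powerset(weights)}
print(is_probability(p, weights))
```

Exact fractions are used so that the additivity check is not disturbed by floating-point rounding.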

It seems that Kenny Pearce, by contrast, understands the term to require that the algebra be an algebra of sets over some state space, and that there be some objective fact about the probability values. If this interpretation is right, then I don’t think I’d quite take any of the positions he mentions. At any rate, I think I support something more like (KPW) than the others, where “actual probabilities” isn’t taken in any objective sense. In explaining this position, I think I can give answers to three questions he raises:

1. Why should we suppose that we can use the math of probability theory in dealing with degrees of rational confidence?
2. The math of probability theory is generally interpreted in terms of sets called state spaces, but, ex hypothesi, degrees of rational confidence, not being probabilities, have no state spaces. What, then, does the math mean?
3. Why should we suppose that when an occurrence has a well defined objective probability, our subjective degree of rational confidence should be assigned a value equal to its probability?

In response to the first question, the standard answer would be to refer to something like a Dutch book argument - degrees of rational confidence can be described by the mathematics of probability theory because if they could not be so described, then the agent would be subject to a guaranteed loss from a set of bets they would be willing to take, and therefore would be irrational. (There’s some slipperiness here with generating the bets from the confidences, and concluding irrationality based on a collection of bets the agent may take individually, but I think this can be cleaned up.) There’s also a host of other arguments for something like this same conclusion (though Alan Hájek raises issues for them in his (forthcoming?) “Arguments For - Or Against? - Probabilism”). As Kenny Pearce notes, nothing about these arguments requires there to be a state space, so they don’t end up being probabilities in his sense (due to Kolmogorov), but they do seem to be probabilities in the sense I use (and Popper, and Borel, and others).
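A toy version of the Dutch book idea can be sketched as follows. It assumes, as the standard argument does, that an agent with credence c in a proposition regards c times the stake as a fair price for a bet that pays the stake if the proposition is true. The specific credences and the function name net_payoff are hypothetical.

```python
from fractions import Fraction


def net_payoff(credences, a_is_true, stake=Fraction(1)):
    """Agent's net gain after buying, at her own prices, a bet on A and a bet on not-A."""
    outcome = {"A": a_is_true, "not A": not a_is_true}
    total = Fraction(0)
    for proposition, credence in credences.items():
        total -= credence * stake  # the price the agent regards as fair
        if outcome[proposition]:
            total += stake  # the bet pays out if the proposition is true
    return total


# Credences in A and not-A that sum to more than 1 violate additivity.
incoherent = {"A": Fraction(3, 5), "not A": Fraction(3, 5)}
coherent = {"A": Fraction(3, 5), "not A": Fraction(2, 5)}

for name, credences in (("incoherent", incoherent), ("coherent", coherent)):
    print(name, net_payoff(credences, True), net_payoff(credences, False))
```

With the incoherent credences the agent loses the same amount whether or not A is true, while the coherent credences leave her even in both cases. The slipperiness noted above (moving from credences to bets, and from individual bets to a package) is simply built into this sketch as an assumption.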

As for the second question, I think that there actually is a state space that is relevant for degrees of rational confidence, which is why I lean more towards something like what Kenny Pearce calls (KPW) rather than (P). The state space here would be the set of epistemic possibilities (whatever those are - I don’t really have a good theory of them, do you?). Despite my lack of an account of them, I think they do need to play a role. I think we can’t make very good sense of the notion of a degree of confidence in p, supposing q, without a set of possibilities that we can restrict to the q-possibilities. Also, these epistemic possibilities seem to play an important role in other aspects of epistemic modality, not just degree of belief. And most importantly, I think there’s a rational difference between having a rational confidence of 0 in p and actually being certain that p will not happen. When measuring the speed of light, there’s a difference between my attitude towards it being exactly 2.9980000000000001 x 10^8 m/s, and my attitude towards it being 3 m/s - I consider the former possible given what I know, and the latter not. However, since there is some interval around 2.998 x 10^8 m/s that I can’t rule out, and there are infinitely many such values that I am indifferent between, I can’t give any of them a positive value without either violating additivity or assigning values larger than 1 to certain disjunctions. So I propose that the state space contains infinitely many epistemic possibilities, and that my degree of confidence in certain sets of these possibilities is 0, even though the set is non-empty. (Of course, for the empty set, I trivially have confidence 0 in that set of possibilities.)
So I think this aspect of the math actually applies quite well to degrees of confidence, though I’m willing to concede that many people will want to challenge this point, and I don’t think it’s as important as the point that degrees of confidence must be probabilities in something like the general sense I outlined earlier.
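The step from indifference among infinitely many values to confidence 0 can be spelled out in a couple of lines. This is a sketch under the assumption that the values v_1, v_2, ... are mutually exclusive and that indifference means each gets the same value, which I write as \varepsilon.

```latex
% Assumption: infinitely many mutually exclusive values v_1, v_2, \dots,
% each assigned the same degree of confidence \varepsilon.
\[
  P(v_1 \vee \dots \vee v_n) \;=\; \sum_{i=1}^{n} P(v_i) \;=\; n\varepsilon
  \qquad \text{(finite additivity)}
\]
\[
  \varepsilon > 0 \ \text{and}\ n > 1/\varepsilon
  \;\Longrightarrow\; n\varepsilon > 1 .
\]
% So with \varepsilon > 0, either additivity fails or some disjunction
% gets a value above 1. The only uniform value that avoids this is 0.
```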

However, I don’t think such a state space comes with objectively correct probabilities to assign - after all, it’s infinite, and Bertrand’s Paradox shows how all sorts of troubles arise when we think that symmetries of an infinite space constrain probability assignments.

As for the third question, I’m not sure I agree with its premise. I’m not totally convinced that when there is a well-defined objective probability, we should match it with our degrees of confidence. Consider a fair coin that has just been tossed. There is some sense in which it had an objective probability of 1/2 of coming up heads, so this principle would suggest having degree of belief 1/2 in heads. But if I also know that this coin was one of 10 fair coins flipped at that point, 9 of which happened to come up heads, then (in the frequency sense, as opposed to the chance sense) there is also an objective probability of 9/10 of that coin being heads up, so this principle would suggest the contradictory degree of belief of .9. Maybe in this situation one of the two principles wins out (my guess would be the latter), but I don’t really know under what circumstances something like this should be the case. Of course, I also don’t really know what sorts of objective things count as “well-defined objective probabilities” - is it chances, frequencies, or something else? There are many well-defined objective things that obey the mathematics of probability, but it’s an interesting question which (if any) should be tracked by our degrees of confidence.
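The two competing numbers in the coin example can be computed exactly by brute force. This is only one way of modelling the "frequency sense": I condition on exactly 9 of the 10 coins landing heads and ask about one particular coin (the first), which is an assumption of the sketch.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=10))  # all 2**10 equally likely sequences

# Chance reading: how often does the first coin land heads, over all sequences?
chance = Fraction(sum(o[0] == "H" for o in outcomes), len(outcomes))

# Frequency reading: restrict to sequences in which exactly 9 of the 10 coins are heads.
nine_heads = [o for o in outcomes if o.count("H") == 9]
frequency = Fraction(sum(o[0] == "H" for o in nine_heads), len(nine_heads))

print(chance, frequency)
```

The computation just reproduces the 1/2 and 9/10 from the text. It does nothing to settle which of the two our degree of belief should track.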

Kenny Pearce suggests that on the (KPW) theory of degrees of confidence, it’s the fact that “the worlds … divide more or less evenly” that makes us assign 1/6 to each of the propositions about the way the die might land up. I don’t think there’s such a thing as an objective measure over this infinite state space, so we can’t even make sense of the worlds dividing more or less evenly. Thus, if there is some objective reason for the degrees of belief we assign, I don’t know what it is yet, but I don’t think it could be anything like what Kenny Pearce suggests in either (KPW) or (LPW). (Also, I don’t think (LPW) is even a viable candidate, because this is supposed to be a theory of degree of rational belief, and actual possibilities have almost nothing to do with rational epistemic possibilities - one could try to make a modified 2-dimensionalist version of this strategy, as Frank Jackson does, but I’m not convinced that this will work.)

I think that these degrees of confidence exist, and are actually often much more precise than we realize (there’s no reason we should have transparent access to exactly what our degrees of belief are), but they’re not constitutively tied to any sort of objective probability in the sense that Kenny Pearce was expecting for a relation between Bayesian epistemology and probability. These degrees of belief are themselves probabilities, just in a different interpretation than Kenny Pearce was specifically considering.





Jaynes on the Indifference Principle

24 09 2007

I’ve started reading Jaynes’ book on probability theory, to get a better sense of how objective Bayesians think about things. One thing I found interesting (and a bit frustrating) was his argument for the “indifference principle”, stating that, conditional on background information that says nothing about possibility A without also saying it about possibility B, A and B must have the same probability.

The argument for this principle is quite interesting. He starts with the premise that a rational agent (or “robot”, as he often calls it) must assign probabilities to outcomes just based on the information about them, and that the probabilities should be the same in situations with identical information. Thus, if there are two propositions about which the information says nothing different, we can interchange them and end up in an identical situation to how we started, so the probabilities assigned must be the same. It’s a nice little argument, but I think it relies on a missing premise, which states that given any background information, there is a set of probabilities that it is uniquely right to assign - if many probability assignments are all allowed (as most subjective Bayesians will say), then this argument won’t entail that they all have to obey the indifference principle, as long as every permissible assignment with one asymmetry has a corresponding permissible assignment with the other asymmetry.

What makes me more suspicious about this indifference principle is how Jaynes actually goes on to use it. He says that using it requires the background information about the different propositions to actually be identical, but his very first use of it violates this condition!

Consider the traditional ‘Bernoulli urn’ of probability theory; ours is known to contain ten balls of identical size and weight, labeled {1,2,…,10}. Three balls (numbers 4, 6, 7) are black, the other seven are white. We are to shake the urn and draw one ball blindfolded. The background information … consists of the statements in the last two sentences. What is the probability that we draw a black one? (p. 42)

Of course, he goes on to say that the probability is 3/10 (which is obviously the “right” answer in some sense), because “the background information is indifferent to these ten possibilities”, so each ball has probability 1/10 of being drawn, and we can add the three chances for a black ball, since the background information entails that they are mutually exclusive events.

However, it looks to me like this is a mis-application of the principle, as he has stated it. The background information is explicitly not indifferent to the ten possibilities - it says that three of the balls are black and seven are white. A strict use of the indifference principle will say that balls 1,2,3,5,8,9,10 are all equally likely, and balls 4,6,7 are equally likely, but there’s no obvious way to apply the indifference principle to compare possibilities from one set and possibilities from the other. To see why this is the case, consider the following example, which is identical in terms of information content, but gives rise to an intuition other than 3/10:

our cabinet is known to contain ten balls of identical size and weight, labeled {1,2,…,10}. Three balls (numbers 4, 6, 7) are in the black drawer, the other seven are in the white drawer. We are to spin the cabinet and draw one ball blindfolded. The background information consists of the statements in the last two sentences. What is the probability that we draw one from the black drawer?

Unless our information includes something about how drawers and urns and paint and the like behave physically, there is no distinguishing between these two set-ups. However, it would seem quite odd to assign probability 3/10 in the latter set-up of drawing a ball from the black drawer - a better answer (if there is a right answer) seems like 1/2. But Jaynes seems to explicitly state that there is no information about drawers and urns in the background, since he says “the background information consists of the statements in the last two sentences”. (Something like this is exactly what changes between classical and quantum statistics of particle arrangements, so this is a relevant worry if we want to apply this objective Bayesianism to physics.)
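The two intuitions correspond to applying indifference over two different partitions of the same possibilities, which the following sketch makes explicit. The function name indifference and the choice of partitions are my own illustration, not Jaynes’ formalism.

```python
from fractions import Fraction

BLACK = {4, 6, 7}
BALLS = set(range(1, 11))


def indifference(partition):
    """Assign equal probability to each cell of a partition of the possibilities."""
    return {cell: Fraction(1, len(partition)) for cell in partition}


# Reading 1: the ten individual balls are the interchangeable possibilities.
by_ball = indifference([frozenset({b}) for b in sorted(BALLS)])
p_black_by_ball = sum(p for cell, p in by_ball.items() if cell <= BLACK)

# Reading 2: the two groups (black and white) are the interchangeable possibilities.
by_group = indifference([frozenset(BLACK), frozenset(BALLS - BLACK)])
p_black_by_group = by_group[frozenset(BLACK)]

print(p_black_by_ball, p_black_by_group)
```

The first reading gives the urn answer and the second gives the cabinet answer. The stated background information alone does not say which partition is the one to be indifferent over.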

Another way to get his answer would be to first consider the set-up where we’re told there are ten balls in the urn, and told their numbers, but not told anything about their colors. Now we see that the probability of drawing one of the balls 4,6,7 is 3/10, so when we learn that these three are black and the others are white, we conclude that the probability of getting a black ball is 3/10.

But this relies on the supposition that telling us the color of the balls has no effect on our rational degree of belief that any ball is drawn. Intuitively this seems right, but that’s only because we know how color behaves in the physical world - if it had been the size or shape, this would have been less clear, and properties about location or stickiness or solidity or whatever should clearly have changed the probabilities. Without this background information explicitly included, this update can’t work.

Additionally, there’s another way to reach this set-up from a slightly smaller set of background information. If we first just say that there is an urn with some balls in it, some of which are black and some of which are white, then the indifference principle would entail that the rational degree of belief in either black or white should be 1/2. But upon learning precisely which balls are black or white, we should somehow update our probabilities in a way that changes things - but how precisely to do this is left unspecified by the indifference principle.

So Jaynes must be implicitly appealing to some extra principles here in order to get the intuitive answers, unless he thinks the problems he set up implicitly contain more information than he has stated. If so, then he won’t be able to apply this objectively in actual physical scenarios where this background information isn’t known (which is why the experiment is being performed). This is no problem for a subjective Bayesian, because she doesn’t claim that an agent has no further information, or that there is a unique probability value that every rational agent must assign in this situation. It’s also no problem for someone who takes either a frequency or chance view of probability, since in any actual physical set-up we can assume that those numbers are well-defined, even though the agent has no access to them. (This causes problems for using frequency or chance as the sole basis of statistical inference, but that’s a different worry.) The situation seems uniquely troubling for the objective Bayesian.





Betting Odds and Credences

17 08 2007

I was just reading the interesting paper When Betting Odds and Credences Come Apart, by Darren Bradley and Hannes Leitgeb, at least in part because of some issues that are coming up in my dissertation about the relations between bets and credences. Their paper is a response to a paper by Chris Hitchcock arguing for the 1/3 answer in the Sleeping Beauty problem, where he shows that if Beauty bets as if her credences were anything other than 1/3, then she is susceptible to a Dutch book.

They end up agreeing that she should bet as if her credences were 1/3, but they argue that this doesn’t mean that her credences should actually be 1/3, because of some similarities this case has to other cases where betting odds and credences come apart. I know at least Darren supports (or has supported) the 1/2 answer in the Sleeping Beauty case, so he’s got a reason to argue for this position.

I think in the end though, their paper has convinced me of the opposite - the correct thing to do in this situation is to bet as if one’s credence is 1/2, even though one’s credence should actually be 1/3! I get the 1/3 credence argument from a bunch of sources (especially Mike Titelbaum’s work on the topic). But for the betting as if one’s credence is 1/2, I might be using the term “bet” in a somewhat non-standard way. However, I think my usage is inspired by my attempt to resist some of the claims of Bradley and Leitgeb.

They give some examples of other cases in which it might look as if one should bet at different odds than one’s credences. For instance, if one is offered a bet on a coin coming up heads, but knows that this bet will only be offered if the coin has actually come up tails, then it looks as if one should bet at odds different from one’s credences. However, they agree that in this case one’s credences change as soon as the bet is offered, and one should bet at odds equal to the new credences.

Their next example is very similar, but without the shift in credences. One is offered a bet on a coin coming up heads, but knows that if the coin actually came up heads then the bet is carried out with fake money (indistinguishably replacing the real money in your and the bookie’s pockets) and is real if the coin actually came up tails. In this case, it looks like one should bet at odds different from one’s credences, which should still be 1/2.

However, I think that in this case what’s going on is that one isn’t really being offered a proper bet on heads at odds of 1/2. Functionally speaking, the money transfer involved will be like a bet on heads at odds of 1. It might be described as a bet at different odds, but I think bets should be individuated in some sort of functionalist way here, rather than according to their description in this sense. Thus, since one’s credence in heads is less than 1, one shouldn’t accept this bet.

Bradley and Leitgeb then say that what goes on in Hitchcock’s set-up of the Sleeping Beauty bets is similar. The bet will be made twice if the coin comes up tails (because Beauty and the bookie both forget the Monday bet), and thus this is a situation like the one with the bet that might turn out to be with pretend money, but in the opposite direction. Thus, this bet ends up being one that costs the agent $20 if the coin comes up heads, and wins her $20 if it comes up tails, so it’s functionally a bet at odds of 1/2. I think this is the set of bets she should be willing to accept, but that her credence in heads should be 1/3, so her betting odds and credences should come apart.
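The arithmetic behind the two sets of odds can be sketched as follows. The per-awakening stakes (lose $20 on heads, win $10 on tails) are a hypothetical choice of mine that reproduces the $20 totals mentioned above; the paper’s actual stakes may differ.

```python
from fractions import Fraction

# Assumed per-awakening bet (hypothetical stakes, chosen to match the $20 totals):
# Beauty loses $20 if the coin landed heads and wins $10 if it landed tails.
LOSS_IF_HEADS = 20
WIN_IF_TAILS = 10
AWAKENINGS = {"heads": 1, "tails": 2}  # the bet is made once per awakening


def total_payoff(coin):
    """Beauty's total winnings over the whole experiment for a given coin result."""
    per_bet = -LOSS_IF_HEADS if coin == "heads" else WIN_IF_TAILS
    return per_bet * AWAKENINGS[coin]


# Credence in heads at which a single such bet has zero expected value.
break_even_single = Fraction(WIN_IF_TAILS, WIN_IF_TAILS + LOSS_IF_HEADS)

# Credence in heads at which the whole package of bets has zero expected value.
gain, loss = total_payoff("tails"), -total_payoff("heads")
break_even_package = Fraction(gain, gain + loss)

print(total_payoff("heads"), total_payoff("tails"))
print(break_even_single, break_even_package)
```

Each individual bet is described at odds matching a credence of 1/3 in heads, while the package of bets functions like a single bet at odds of 1/2, which is the gap between description and function that the individuation question turns on.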

Of course, there may be a slight difference between the situations. In this version of the Sleeping Beauty bets, the bet gets made twice if the coin comes up tails, rather than paying off double. Perhaps the fact that it’s agreed to multiple times doesn’t make the same difference that having money replaced by something twice as valuable would. If so, then this bet really was properly described as a bet at odds of 1/3, so that I would no longer think that this is an example where betting odds and credences should come apart.

So I think I don’t really accept the particular claims that Bradley and Leitgeb make in this paper, but it’s only because I’m trying to do something subtle about how to individuate bets in functional terms. I’m sure there are good examples out there on which betting odds and credences could rationally come apart, but I’m not convinced that the Sleeping Beauty case is one of them.





The Principal Principle

3 08 2007

A very plausible normative principle relating subjective degree of belief to objective chance is David Lewis’ “Principal Principle”. In a simplified version, this principle says that if you know the objective chance of some inherently chancy outcome, then your degree of belief in that outcome should equal the chance. Thus, if you know that the coin is fair, then you should have degree of belief 1/2 that it will come up heads.

This has some added bite because the chance information overrules a lot of other information - if you know the coin is fair, then it doesn’t matter how it happened to come up on the last 1000 flips, you should still believe in heads to degree 1/2, even if the last 1000 flips were all tails. This is one idea of what’s fallacious about the gambler’s fallacy (or inverse gambler’s fallacy).

Of course, some sorts of information can overrule the chance information - if a very accurate fortuneteller has told you that the coin will come up heads, then maybe you should believe to a degree higher than 1/2, even though you still believe the coin is fair. This sort of information is what Lewis called “inadmissible” information. The question for the Principal Principle then is just what counts as inadmissible information?
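The simplified principle, together with the admissibility clause, is usually written in one line. In this sketch Cr is the agent’s credence function, ch(A) is the objective chance of A, and E is any further evidence; the symbols are notational assumptions rather than anything fixed above.

```latex
% Cr: rational credence; ch(A): objective chance of A;
% E: any admissible evidence compatible with ch(A) = x.
\[
  Cr\bigl(A \mid ch(A) = x \,\wedge\, E\bigr) \;=\; x
\]
% Fair coin: Cr(heads | ch(heads) = 1/2 and E) = 1/2 when E is the record
% of the last 1000 flips. The fortuneteller's report is not admissible,
% so the equation is not required to hold when E includes it.
```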

To answer this, I think we need to consider just what chance really is. On one notion of chance, it requires that the world be objectively indeterministic, so that there is no fact of the matter about future chancy events. On this account, the idea of an accurate fortuneteller for chancy events doesn’t even make sense. This might be a natural view of chance that arises from the many-worlds interpretation of quantum mechanics. On this view, the chance of an event could potentially depend on anything for which there is a fact of the matter - but this only includes facts about the past and present. But since you’d need to know all this information (or the relevant parts anyway) to know the chances, there will trivially be no possibility of inadmissible evidence, so the Principal Principle stands (if at all) in a very simple form!

But there are other notions of chance I’ve heard people talk about. One is supposed to be compatible with strict determinism. I don’t know too many of the details, but I suspect that the idea is that there’s some natural class of “nearby worlds”, and chance is just some sort of probability measure on those worlds. This can definitely give rise to non-extreme values for chances, even though there is no possibility other than necessity. However, on this interpretation of chance, I don’t see why anything like the Principal Principle would have any normative force at all. I suppose if you can somehow narrow things down enough to know what the chances are, but can’t eliminate any of the worlds in the class that defines the chances, then it would make sense. But it’s far from clear to me why this situation would be at all common.

Then of course there’s Lewis’ own characterization of chance. I believe his idea is that one can read off the natural laws of a world by seeing what best systematizes the entire history of it. If there are certain types of events that have no interesting pattern to them at all except for a certain limiting frequency, then the best way to systematize these will be with chancy laws. In this setting it’s not clear how one would justify the Principal Principle, or how one would claim to have knowledge about the chances.

At any rate, the Principal Principle seems to say different things on these different interpretations of chance, and it gives rise to either different justifications or different accounts of what should count as “inadmissible evidence”.





Back from Australia

5 07 2007

I’m back from spending three weeks in Australia again - as usual, it was a very productive trip. It was also nice to get to attend the workshops on Norms and Analysis and Probability that went on last week. There were a lot of interesting talks there, so I won’t go through very many of them. Overall, I think the most interesting was Peter Railton’s talk in the first workshop, where he seemed to be supporting a framework for metaethics and reasons that is broadly compatible with the framework of decision theory. However, he brought in lots of empirical work in psychology to show that for both degree of belief and degree of desire, there seem to be two distinct systems at work - one more immediately regulating behavior, while the other is more responsive to feedback and generally regulates the first. It reminded me somewhat of what Daniel Kahneman was talking about in a lecture here at Berkeley a few months ago. But not being an expert in any of this stuff, I can’t say too much more than that.

Another particularly thought-provoking talk was Roy Sorensen’s in the Norms and Analysis workshop. He presented a situation in which you are the detective in a library. You just saw Tom steal a book, so you know that he’s guilty. However, before you punish him, the defense presents an envelope that may either contain nothing, or may contain exculpatory evidence (something like, “Tom has an identical twin brother in town”, or “The librarians have done a count and it seems that no books are missing”, which would make you give up your belief that Tom was guilty). Given that you know Tom is guilty, should you open the envelope or not? On the one hand, it seems you should, because you should make maximally informed decisions. On the other hand, it seems you shouldn’t, because either the envelope contains nothing, or it contains information you know is misleading, and in either case it’s no good.

Sorensen was arguing that you shouldn’t open the envelope, but I don’t think he succeeded in convincing any of the audience. But I think the puzzle sheds interesting light on what it takes to know that evidence is misleading, and how apparent evidence or the lack thereof really plays out when you know other background facts about where the evidence is coming from.





Probabilistic Inference Barrier

21 01 2007

Using the methods of Russell and Restall’s paper on inference barriers, I will show that one can’t derive an “is” from a “probably”. That is, no consistent set of statements expressing only relations among the probabilities of statements expressible in a non-probabilistic object language can entail anything about the actual truth-values of such object language statements, unless the statements are either tautologies or contradictions.

Let S be a (consistent) set of probabilistic statements, and let O be the (non-tautological) object language statement that is said to follow from S. To show that O does not follow from S, start with a probability space (a set of “states”, together with an algebra of subsets of this set called “events”, a real-valued function on this algebra satisfying the probability axioms, and a specification of which state is “actual”) satisfying all of S. Call this space P. Now create a space P’ by adding a single state X in which O is false to the state space. A subset of this space will be an event iff either it doesn’t contain X and was an event in P, or it does contain X and removing X gives an event in P. Define the probability function on these events by assigning the same probability to any event in P’ that either it, or it without X, had in P. Let X be the “actual” state in the new space.

By construction, every probabilistic statement will have the same truth-value in the two spaces, because every proposition has exactly the same probability (effectively, all we did was add a single state with 0 probability). In particular, all of S is true in P’. Also, by construction, O is false in P’. Thus, S does not entail O, QED.
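The construction can be checked mechanically on a small finite case. The following is only a sketch: the two-state space, the state names, and the choice to use the full power set as the algebra of events are illustrative assumptions, not part of the argument itself.

```python
from fractions import Fraction
from itertools import combinations


def all_events(states):
    """Every subset of a finite state set (the full power-set algebra)."""
    items = sorted(states)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def prob(weights, event):
    """Probability of an event as the sum of its states' weights."""
    return sum((weights[s] for s in event), Fraction(0))


# Hypothetical starting space P: two equally likely states, with O true in
# both, so P(O) = 1 and O is true at the actual state of P.
p_weights = {"w1": Fraction(1, 2), "w2": Fraction(1, 2)}
p_actual = "w1"
o_true_in = frozenset({"w1", "w2"})  # X is left out on purpose: O is false at X

# Build P' by adding the single state X with probability 0 and making it actual.
X = "X"
p_prime_weights = {**p_weights, X: Fraction(0)}
p_prime_actual = X

# Every event of P' has the probability that it (with X removed) had in P.
same_probabilities = all(
    prob(p_prime_weights, e) == prob(p_weights, e - {X})
    for e in all_events(p_prime_weights)
)

o_prob_in_p_prime = prob(p_prime_weights, o_true_in)
o_true_at_actual = p_prime_actual in o_true_in
```

Here O keeps probability 1 in P’ even though it is false at the actual state, which is exactly the gap between probability 1 and truth that the argument exploits.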

One way to block this argument would be to require that every non-empty event have non-zero probability, but this would block a lot of interesting probability spaces (in particular, any space with uncountably many mutually incompatible events). However, a very similar argument would go through if one allowed some small tolerance of epsilon in the probabilistic statements of S (assuming none of the statements are conditional probability statements, whose value can in fact deviate by much more than epsilon due to the addition of a single state of probability epsilon).
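To see why conditional probabilities need the caveat, here is a small numerical sketch. The particular numbers, the state names, and the decision to rescale the old states by (1 - epsilon) to make room for the new one are all assumptions made for illustration.

```python
from fractions import Fraction


def prob(weights, event):
    """Probability of an event as the sum of its states' weights."""
    return sum((weights[s] for s in event), Fraction(0))


def cond(weights, b, a):
    """Conditional probability P(b | a), assuming P(a) > 0."""
    return prob(weights, a & b) / prob(weights, a)


eps = Fraction(1, 1000)    # probability given to the added state
delta = Fraction(1, 1000)  # hypothetical small probability of A beforehand

# Before: state "a" is in both A and B; state "b" is in neither.
before = {"a": delta, "b": 1 - delta}
A_before = frozenset({"a"})
B = frozenset({"a"})

# After: add X, which is in A but not in B, and rescale the old weights
# (an assumed way of keeping the total probability equal to 1).
after = {s: w * (1 - eps) for s, w in before.items()}
after["X"] = eps
A_after = frozenset({"a", "X"})

shift_in_A = abs(prob(after, A_after) - prob(before, A_before))
shift_in_B = abs(prob(after, B) - prob(before, B))
cond_before = cond(before, B, A_before)
cond_after = cond(after, B, A_after)
```

The unconditional probabilities of A and B each move by less than epsilon, while the probability of B given A falls from 1 to just under 1/2.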

But in some sense, this shows the weakness of these “inference barrier” results, which Gillian Russell points out should really be called “implication barriers”. Under certain conditions, it’s certainly rational to infer that it will rain, given that there’s a 99% chance of rain. This result merely shows that no amount of probabilistic evidence will ever entail anything with certainty, even though it might entail it with probability 1. The distinction between probability 1 and certainty is something I’m thinking about right now for my dissertation.





Chance Expressivism

1 07 2006

I had a long conversation with Mike Titelbaum yesterday, largely about Hilary Greaves’ manuscript, “Probability in the Everett Interpretation”. I think it’s a very interesting paper, trying to use a deterministic interpretation of quantum mechanics (essentially a many-worlds interpretation) and ordinary principles of decision theory to show that a rational agent will always act as if her credences matched the probabilities recommended by the Born rule, so that there’s no need for objective chance. In talking about this, I was trying to explain to Mike why I’ve always been suspicious of objective chances, because they imply some sort of fishy metaphysics, like that of the Copenhagen interpretation of quantum mechanics.

Mike’s been reading Lewis’ “A Subjectivist’s Guide to Objective Chance”, where Lewis suggests that the Principal Principle (basically, the rule that you should proportion your credences according to the chances) gives him almost his complete understanding of chance. So Mike suggested that theories that postulate chances might instead be interpreted as just directly specifying credences for an agent to have. A very simple such theory tells me to always have credence 1/2 in heads when I’m flipping an ordinary coin. Although such a theory isn’t as good as a deterministic theory in always telling me to believe the truth, it does at least guarantee that I can’t get Dutch booked (because I use a probability function) and in addition that my credences tend to match the long-run frequencies, at least with probability 1.

But, as I pointed out, this “probability 1” is only the probability assigned by the theory itself, which is exactly what we’re trying to justify here. So it’s unclear exactly what makes these credences good ones to have (this is basically what we were trying to puzzle out from Hilary Greaves’ paper, which I haven’t fully read). And I pointed out that there’s another weirdness here in these theories.

Most scientific theories affect our beliefs just by telling us what is true. On this view, a theory with chances affects our beliefs by telling us how strongly to believe things, without saying anything about what’s true. You can say that it says “the coin has 1/2 chance of heads” is true, but this is a purely theoretical statement since it involves chances - just as a delta function fills the place of a function but isn’t one, “chance” fills the place of a noun, but doesn’t refer to anything. Instead, we know how to calculate integrals involving a delta function, and how to adjust our beliefs when “believing” a sentence with “chance” in it. In this sense, a theory involving chances doesn’t make metaphysical claims the way ordinary scientific theories do - they try to tell us what’s true (and thereby what to believe) while the chance-based theory just tells us what to believe without going for the intermediary of truth.

In a sense, this is like some of what happens in the Stalnaker framework for conversations. Ordinarily, a conversational context is taken to be the set of worlds that is theoretically open for speakers in the conversation (either one of the participants might actually know enough to rule out some of these worlds, but this set represents the “common ground” between them). Whenever someone asserts something, the proposition expressed by the asserted sentence defines some set of possible worlds, and the context set is then intersected with this set to produce the new conversational context. However, some people have proposed that certain types of sentences, like epistemic modals involving “might”, work differently. They have a context change potential just like ordinary sentences, but it’s just directly a function from contexts to contexts, rather than a set that is intersected. Thus, these sentences tell you how to update the context, but not by telling you what’s true. Instead, they just do it directly.
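The contrast between the two kinds of update can be sketched with contexts as sets of worlds. The world names are invented, and the treatment of “might” as a test (leave the context alone if the proposition is still open, otherwise collapse it) is just one proposal of this kind, assumed here for illustration.

```python
def assert_update(context, proposition):
    """Ordinary assertion: intersect the context with the proposition."""
    return context & proposition


def might_update(context, proposition):
    """Assumed test-style update for 'might': keep the context unchanged if
    some world in it satisfies the proposition, otherwise return the empty
    (absurd) context."""
    return context if context & proposition else frozenset()


# Hypothetical worlds, named after tomorrow's weather.
worlds = frozenset({"rain", "snow", "sun"})
rain = frozenset({"rain"})
not_sun = frozenset({"rain", "snow"})
not_rain = frozenset({"snow", "sun"})

after_not_sun = assert_update(worlds, not_sun)       # rules out "sun"
after_might_rain = might_update(after_not_sun, rain)  # rain still open: no change
after_clash = might_update(assert_update(worlds, not_rain), rain)  # empty

# No single set S makes might_update behave like intersection with S:
# the first result would require "sun" to be in S, the second that it is not.
keeps_sun = might_update(frozenset({"rain", "sun"}), rain)
drops_sun = might_update(frozenset({"sun"}), rain)
```

The last two lines are the point of the sketch: the “might” update is a function from contexts to contexts that cannot be recovered as intersection with any fixed set of worlds.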

On the view described above, scientific theories involving chances do something similar. If such theories are accepted, it’s a blow for scientific realism, because we’ll have a theory that doesn’t say what’s true. But it might be the best we can do, if we can make sense of just how such a theory might be good. But as Mike points out, this might just mean solving the problem of induction, because it’s exactly the sense in which I believe with credence 1/2 that the coin will come up heads, because approximately half of the flips in the past have come up heads.

(I’m not sure if this post by Cosma Shalizi is hinting at something similar or not.)





Non-Factive Knowledge

28 04 2006

Bayesians have suggested that belief is not an all-or-nothing notion, but rather one that comes in degrees from 0 to 1 (which happen to obey the Kolmogorov axioms of probability, at least for rational agents). I’ve lately been wondering whether we can do the same with knowledge - on something like a justified true belief account, we can obviously grade knowledge based on the strength of the belief involved, but it could also be graded based on the level of justification. If instead we can figure out a way to grade knowledge directly, maybe we can get a more sophisticated account of knowledge out of this, rather than seeking the “fourth condition”. The most natural attempts would probably involve something like the tracking account given by Sherri Roush in Tracking Truth (which I embarrassingly still haven’t read yet).

Anyway, thinking about this, I was struck by this account of knowledge given by Roger Shuy, over at Language Log:

1. One believes it to be true.
2. One has good reason to believe it to be true.
3. There is a substantial probability that it is true.

It seems quite parallel to the JTB account (or, BJT in this ordering), except that he seems to have weakened the truth condition quite a bit! I’ve sometimes thought that slightly relaxing the factivity condition on knowledge could make it fit much better with ordinary linguistic usage, but everyone tells me I’m crazy when I suggest this.

If we consider the correctness of knowledge attributions, rather than the obtaining of the actual state of having knowledge, then maybe this makes more sense - an agent A can judge an assertion that S knows P to be correct to degree D, where D=J*T*B, and J is S’s degree of justification for P, T is A’s subjective probability that P is true, and B is S’s subjective probability that P is true. Perhaps J and B should actually be modified to be A’s estimate of S’s degree of justification and subjective probability, rather than being the actual degree of justification and subjective probability.
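As a toy illustration of the product D=J*T*B, here is a minimal sketch. The function name, the sample numbers, and the assumption that degree of justification is measured on a 0 to 1 scale like the two probabilities are all mine, introduced only for the example.

```python
def attribution_degree(j, t, b):
    """Degree D to which A judges 'S knows P' correct, computed as J * T * B.

    j: S's degree of justification for P (assumed here to lie in [0, 1])
    t: A's subjective probability that P is true
    b: S's subjective probability that P is true
    """
    for name, value in (("j", j), ("t", t), ("b", b)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1, got {value}")
    return j * t * b


# Hypothetical case: S is well justified and confident, A is fairly sure P is true.
strong_case = attribution_degree(j=0.9, t=0.8, b=0.95)

# If A is certain that P is false, the attribution gets degree 0 no matter
# how justified or confident S is.
false_case = attribution_degree(j=0.9, t=0.0, b=0.95)
```

Because the three factors are multiplied, a low value on any one of them drags the whole attribution down, so the truth condition is graded rather than dropped.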

Of course, this still doesn’t account for Gettier cases unless we understand “degree of justification” in some very strong way, and this might be totally crazy, but it’s just something I’m playing around with.