A Stronger Two-Envelope Paradox

24 05 2009

Consider the standard two-envelope paradox – there are two envelopes in front of you, and all you know is that one of them has twice as much money as the other. It seems that you should be indifferent to which envelope you choose to take. However, once you’ve taken an envelope and opened it, you’ll see some amount x of money there, and you can reason that the other envelope either has 2x or x/2, which gives it an expected value of 1.25x, so you’ll wish you had taken the other.

Of course, this reasoning doesn’t really work, because of course you always know more than just that one envelope has twice as much money as the other, especially if you know who’s giving you the money. (If the amount I see is large enough, I’ll be fairly certain that this one is the larger of the two envelopes, and therefore be happy that I chose it rather than the other.)

But it’s also well-known that the paradox can be fixed up and made more paradoxical again by imposing a probability distribution on the amounts of money in the two envelopes. Let’s say that one envelope has 2^n dollars in it, and the other has 2^{n+1} dollars, where n is chosen by counting the number of tries it takes a biased coin to come up heads, where the probability of tails on any given flip is 2/3. Now, if you see that your envelope has 2 dollars, then you know the other one has 4 dollars, so you’d rather switch than stay. And if you see that your envelope has any amount other than 2, then (after working out the math) the expected value of the other one will be 11/10 of the amount in your current envelope, so again you’d rather switch than stay.
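Since the distribution is fully specified, the conditional expectation can be computed exactly. Here is a quick sketch in Python (the function names are mine, and I'm taking the number of flips n to have P(n = m) = (1/3)(2/3)^(m-1) for m ≥ 1, as described above):

```python
from fractions import Fraction

def p_n(m):
    """P(n = m): chance the biased coin takes m flips to land heads (P(heads) = 1/3)."""
    return Fraction(1, 3) * Fraction(2, 3) ** (m - 1)

def expected_switch_ratio(k):
    """E[other envelope] / (amount seen), given you see 2**k dollars, k >= 2.
    Either n = k (yours is the smaller envelope, the other holds 2**(k+1))
    or n = k - 1 (yours is the larger, the other holds 2**(k-1))."""
    w_small = p_n(k)       # weight for: yours is the smaller envelope
    w_large = p_n(k - 1)   # weight for: yours is the larger envelope
    p_small = w_small / (w_small + w_large)
    p_large = w_large / (w_small + w_large)
    return p_small * 2 + p_large * Fraction(1, 2)

print(expected_switch_ratio(5))   # 11/10, the same for every k >= 2
```

So the expected gain from switching is 10% of whatever you see, for any observed amount above 2 dollars.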

This is all fair enough – Dave Chalmers pointed out in his “The St. Petersburg Two-Envelope Problem” that there are other cases where A can be preferred to B conditional on the value of B, and B preferred to A conditional on the value of A, while unconditionally one should be indifferent between A and B. This just means that we shouldn’t accept Savage’s “sure-thing principle”, which says that if there is a partition of possibilities, and you prefer A to B conditional on any element of the partition, then you should prefer A to B unconditionally. Of course, restricted versions of this principle hold, either when the partition is finite, or the payouts of the two actions are bounded, or one of the unconditional expected values is finite, or when the partition is fine enough that there is no uncertainty conditional on the partition (that is, when you’re talking about strict dominance rather than the sure-thing principle).

What I just noticed is that it’s trivial to come up with an example where we have the same pattern of conditional preferences, but there should be a strict unconditional preference for A over B. To see this, just consider this same example, where you know that the two envelopes are filled with the same pattern as above, but that 5% of the money has been taken out of envelope B. It seems clear that unconditionally one should prefer A to B, since it has the same probabilities of the same pre-tax amounts, and no tax. But once you know how much is in A, you should prefer B, because the 5% loss is smaller than the 10% expected gain from switching. And of course, the previous reasoning shows why, once you know how much is in B, you should prefer A.
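The taxed case can be checked with the same exact arithmetic. Here 11/10 is the conditional switching factor this distribution yields for the untaxed envelopes (a hypothetical sketch; the 5% tax on B is as described above):

```python
from fractions import Fraction

TAX = Fraction(95, 100)     # envelope B keeps 95% of its pre-tax contents
SWITCH = Fraction(11, 10)   # E[other pre-tax envelope] = 11/10 of the one seen

# Seeing A = a: B's pre-tax expectation is (11/10) a, so E[B] = 0.95 * 1.1 * a.
print(TAX * SWITCH)     # 209/200 > 1, so prefer B once you see A

# Seeing B = b: B's pre-tax contents were b / 0.95, so E[A] = 1.1 * b / 0.95.
print(SWITCH / TAX)     # 22/19 > 1, so prefer A once you see B
```

Both conditional comparisons favor switching, even though state by state A always holds more than B, so unconditionally A should be strictly preferred.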

Violations of the sure-thing principle definitely feel weird, but I guess we just have to live with them if we’re going to allow decision theory to deal with infinitary cases.


Betting Odds and Credences

17 08 2007

I was just reading the interesting paper When Betting Odds and Credences Come Apart, by Darren Bradley and Hannes Leitgeb, at least in part because of some issues that are coming up in my dissertation about the relations between bets and credences. Their paper is a response to a paper by Chris Hitchcock arguing for the 1/3 answer in the Sleeping Beauty problem, where he shows that if Beauty bets as if her credences were anything other than 1/3, then she is susceptible to a Dutch book.

They end up agreeing that she should bet as if her credences were 1/3, but they argue that this doesn’t mean that her credences should actually be 1/3, because of some similarities this case has to other cases where betting odds and credences come apart. I know at least Darren supports (or has supported) the 1/2 answer in the Sleeping Beauty case, so he’s got a reason to argue for this position.

I think in the end though, their paper has convinced me of the opposite – the correct thing to do in this situation is to bet as if one’s credence is 1/2, even though one’s credence should actually be 1/3! I get the 1/3 credence argument from a bunch of sources (especially Mike Titelbaum’s work on the topic). But for the betting as if one’s credence is 1/2, I might be using the term “bet” in a somewhat non-standard way. However, I think my usage is inspired by my attempt to resist some of the claims of Bradley and Leitgeb.

They give some examples of other cases in which it might look as if one should bet at different odds than one’s credences. For instance, if one is offered a bet on a coin coming up heads, but knows that this bet will only be offered if the coin has actually come up tails, then it looks as if one should bet at odds different from one’s credences. However, they agree that in this case one’s credences change as soon as the bet is offered, and one should bet at odds equal to the new credences.

Their next example is very similar, but without the shift in credences. One is offered a bet on a coin coming up heads, but knows that if the coin actually came up heads then the bet is carried out with fake money (indistinguishably replacing the real money in your and the bookie’s pockets) and is real if the coin actually came up tails. In this case, it looks like one should bet at odds different from one’s credences, which should still be 1/2.

However, I think that in this case what’s going on is that one isn’t really being offered a proper bet on heads at odds of 1/2. Functionally speaking, the money transfer involved will be like a bet on heads at odds of 1. It might be described as a bet at different odds, but I think bets should be individuated in some sort of functionalist way here, rather than according to their description in this sense. Thus, since one’s credence in heads is less than 1, one shouldn’t accept this bet.

Bradley and Leitgeb then say that what goes on in Hitchcock’s set-up of the Sleeping Beauty bets is similar. The bet will be repeated twice if the coin comes up tails (because Beauty and the bookie both forget the Monday bet), and thus this is a situation like the one with the bet that might turn out to be with pretend money, but in the opposite direction. Thus, this bet ends up being one that costs the agent $20 if the coin comes up heads, and wins her $20 if it comes up tails, so it’s functionally a bet at odds of 1/2. I think this is the set of bets she should be willing to accept, but that her credence in heads should be 1/3, so her betting odds and credences should come apart.
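The aggregation effect can be seen with a little arithmetic. This is just a sketch of the doubled-bet structure, with a hypothetical $30 per-awakening stake on heads, priced at the given betting odds:

```python
from fractions import Fraction

def expected_value(p_heads_betting):
    """Expected net payoff, per coin flip, of Beauty accepting at every
    awakening a $30-stake bet on heads priced at the given betting odds.
    Heads -> one awakening (one bet); tails -> two awakenings (two bets)."""
    stake = 30
    win  = stake * (1 - p_heads_betting)   # profit if heads, per bet
    lose = stake * p_heads_betting         # loss if tails, per bet
    return Fraction(1, 2) * win + Fraction(1, 2) * (-2 * lose)

print(expected_value(Fraction(1, 3)))  # 0: per-awakening odds of 1/3 break even
print(expected_value(Fraction(1, 2)))  # -15/2: per-awakening odds of 1/2 lose money
```

At per-awakening odds of 1/3, the net transfer per flip is plus or minus $20 depending on the coin, which is exactly the functionally even-money bet described above.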

Of course, there may be a slight difference between the situations. In this version of the Sleeping Beauty bets, the bet gets made twice if the coin comes up tails, rather than paying off double. Perhaps the fact that it’s agreed to multiple times doesn’t make the same difference that having money replaced by something twice as valuable would. If so, then this bet really was properly described as a bet at odds of 1/3, so that I would no longer think that this is an example where betting odds and credences should come apart.

So I think I don’t really accept the particular claims that Bradley and Leitgeb make in this paper, but it’s only because I’m trying to do something subtle about how to individuate bets in functional terms. I’m sure there are good examples out there on which betting odds and credences could rationally come apart, but I’m not convinced whether the Sleeping Beauty case is one of them.

Back from Australia

5 07 2007

I’m back from spending three weeks in Australia again – as usual, it was a very productive trip. It was also nice to get to attend the workshops on Norms and Analysis and Probability that went on last week. There were a lot of interesting talks there, so I won’t go through very many of them. Overall, I think the most interesting was Peter Railton’s talk in the first workshop, where he seemed to be supporting a framework for metaethics and reasons that is broadly compatible with the framework of decision theory. However, he brought in lots of empirical work in psychology to show that for both degree of belief and degree of desire, there seem to be two distinct systems at work – one more immediately regulating behavior, the other more responsive to feedback and generally regulating the first. It reminded me somewhat of what Daniel Kahneman was talking about in a lecture here at Berkeley a few months ago. But not being an expert in any of this stuff, I can’t say too much more than that.

Another particularly thought-provoking talk was Roy Sorensen’s in the Norms and Analysis workshop. He presented a situation in which you are the detective in a library. You just saw Tom steal a book, so you know that he’s guilty. However, before you punish him, the defense presents an envelope that may either contain nothing, or may contain exculpatory evidence (something like, “Tom has an identical twin brother in town”, or “The librarians have done a count and it seems that no books are missing”, which would make you give up your belief that Tom was guilty). Given that you know Tom is guilty, should you open the envelope or not? On the one hand, it seems you should, because you should make maximally informed decisions. On the other hand, it seems you shouldn’t, because either the envelope contains nothing, or it contains information you know is misleading, and in either case it’s no good.

Sorensen was arguing that you shouldn’t open the envelope, but I don’t think he succeeded in convincing any of the audience. But I think the puzzle sheds interesting light on what it takes to know that evidence is misleading, and how apparent evidence or the lack thereof really plays out when you know other background facts about where the evidence is coming from.

The Las Vegas Paradox

8 01 2007

It seems safe to say that money (and basically any other good) has a generally diminishing marginal value. This is perhaps one of the biggest justifications for redistributive taxation, in which we take a bunch of money unequally from people and give it to people in some much more even distribution, as with social security and some other government programs. (Of course, most programs redistribute things unequally, but still often in a more equal way than the money was originally distributed.)

However, another sort of redistribution sometimes seems justified, and it suggests that the marginal value of money can’t be strictly decreasing. If we took a penny from everyone in the country, and gave the resulting $3,000,000 to one person at random, it seems that it would make that one person tremendously happy at basically no cost to anyone else. And sure enough, people voluntarily engage in this sort of activity all the time, in raffles, and (notably, especially since I just spent a week and a half visiting my parents at their new place in Las Vegas) in slot machines. In fact, in both cases, people willingly take part despite the fact that some of the money is siphoned off either for charity or to the rich people that own the casinos.

Now, perhaps this behavior is just irrational (so that we shouldn’t derive any moral about the marginal utility of money from it). Or perhaps people get some other benefit from the transaction (like the feeling of doing good for a charity in the case of a raffle, or the excitement one gets from occasionally getting small payoffs that one promptly loses again in the slot machine). But at some level, the original game of taking a penny from everyone to give the entire amount to someone chosen at random just intuitively (at least to me) seems reasonable.

However, there may be some sort of argument that it isn’t reasonable. After all, if it was an improvement to overall utility to do that, then some sort of principle of additivity (which I’m willing to question for other reasons however) would suggest that it would be good to do it multiple times. It’s unclear at what point it could go from being good to bad (maybe there’s a sorites in here?) so if it’s good to do it once, then it would be good to do it 300,000,000 times. But at that point, it seems that it could have a severe negative effect on total utility (if some people ended up losing $3,000,000 overall while some other lucky ones won it two or more times – a loss of $3,000,000 is clearly much worse to almost everyone than a gain of that much is good to anyone) or at best a neutral effect (if everyone wins exactly once). So either it was never good to begin with (which just seems implausible to me), or it switches from being good to being bad at some point (though it seems very hard to say where), or else we have to give up some sort of additivity for gambles, though it’s unclear just how.
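The repeated-lottery worry can be illustrated with a small simulation. This is a scaled-down sketch (10,000 players rather than 300,000,000, with as many rounds as players; the setup is my own toy version of the argument above):

```python
import random

def penny_lottery(people, rounds, seed=0):
    """Each round, everyone pays in 1 cent and one random person wins the
    whole pot, so person i's net balance is wins_i * people - rounds cents."""
    rng = random.Random(seed)
    wins = [0] * people
    for _ in range(rounds):
        wins[rng.randrange(people)] += 1
    return [w * people - rounds for w in wins]

balances = penny_lottery(people=10_000, rounds=10_000)
never_won = sum(1 for b in balances if b == -10_000)
print(f"{never_won / 10_000:.0%} of players lost their entire stake")
```

Run this way, roughly a 1/e (about 37%) fraction of players never win at all and so lose their full stake, while a few win two or more jackpots, which is exactly the lopsided outcome that makes the many-times-repeated version look bad.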

Texas Decision Theory

24 10 2006

I was in Austin a couple weeks ago for the second Texas Decision Theory Workshop, which was a lot of fun. It was a fairly small group, and some interesting topics I didn’t know much about were discussed. In particular, there was a lot of discussion (primarily by Sahotra Sarkar and Carl Wagner) about decision making with imprecise probabilities. There was also a lot of discussion of multiple-criteria decision making. My friend Alex Moffett discussed the impossibility theorems of Arrow, and Gibbard and Satterthwaite – he mentioned an analogy between multiple-agent decision making (as in the traditional presentations of these theorems) and multiple-criteria decision making, suggesting that in this context at least, the “independence of irrelevant alternatives” criterion really is important. And Mike Titelbaum presented some of his work on generalized versions of conditionalization as constraints on rational agents even under forgetting.

I talked about some of the things I discussed earlier this summer, but with more of a worked-out formalism for describing the decision apparatus (and constructing stronger decision theories out of weaker ones). I was surprised to see that some of this formalism that I developed for infinitary cases seems to resemble some of the formalisms for imprecise probabilities! I’ll have to look into this more to see how they really connect.

Dominance and Decisions

21 07 2006

I’ve finally posted slides from the talk that I gave at Stanford and a couple times in Australia in the last couple months. (Each of the talks was somewhat shorter than this whole set of slides – I’ve combined them all here.) I discuss some ideas about putting decision theory on new foundations in order to better deal with some problematic cases due to Alan Hájek and others, and in the process get a slightly more unified account of the Two-Envelope Paradox and some others. Of course, my theory’s not fully worked out yet, so comments and criticism are certainly welcome!

Melbourne Visit

18 06 2006

Like Richard before me (and myself last year), I had a nice visit in Melbourne. Unfortunately, it was fairly short because the tickets were more expensive at other times. It’s amazing how helpful it can be to explain your ideas to someone who isn’t working immediately in the same field – I got some useful ideas from my conversations with Greg and Zach that I spent some time writing up yesterday. In some sense they’re just points about how to present some of the ideas, but the right way to present and link ideas is certainly an extremely large part of the advances in most good work (if not 90% of the progress).

Anyway, so that this post has some slight amount of content itself, here is a link my boyfriend sent me to a talk by psychologist Daniel Gilbert on decision theory, and how people are often bad at estimating both probabilities and utilities. I find it particularly interesting because I’m talking on Tuesday about decision theory here in Canberra (I’ll be repeating it at the AAP in a couple weeks, and I gave a version a few weeks ago at Stanford as well). But also, it’s interesting that someone could be talking about this stuff to a general audience at South by Southwest (which apparently is much more than just a music festival).

Discontinuities in States and Actions

13 06 2006

(The ideas in this post originate in several conversations with Alan Hájek and Aidan Lyon.)

In chapter 15 of his big book on probability, E. T. Jaynes said “Not only in probability theory, but in all mathematics, it is the careless use of infinite sets, and of infinite and infinitesimal quantities, that generates most paradoxes.” (By “paradox”, he means “something which is absurd or logically contradictory, but which appears at first glance to be the result of sound reasoning. … A paradox is simply an error out of control; i.e. one that has trapped so many unwary minds that it has gone public, become institutionalized in our literature, and taught as truth.” His position is somewhat unorthodox, hinting that in some sense all of infinite set theory (and many classic examples in probability theory) is made up of this sort of paradox. But I think a lot of what he says in the chapter is useful, and I intend to study it more to see what it says about the particular infinities and zeroes that I’ve been worrying about in probability theory.

His recommendation of what to do is as follows:

Apply the ordinary processes of arithmetic and analysis only to expressions with a finite number n of terms. Then after the calculation is done, observe how the resulting finite expressions behave as the parameter n increases indefinitely.

Put more succinctly, passage to a limit should always be the last operation, not the first.

One suggestion to take from this idea might be that in phrasing any well-defined decision problem, the payoffs should in some sense be a continuous function of the states. (I should point out that I got this suggestion from Aidan Lyon.) For instance, consider the game in the St. Petersburg paradox (interestingly, the argument in section 3 there about boundedness of utilities seems to miss a possibility – Martin considers the case of bounded utilities where the maximum is achievable, and unbounded ones where the maximum is not achievable, but not bounded ones where the maximum is unachievable, which seems to largely invalidate the argument). An objection you might be able to make to this game is that payoffs haven’t been specified for every possibility – although the probability of a fair coin repeatedly flipped coming up heads every time is zero, that doesn’t mean that it’s impossible. So we must specify a payoff for this state. But of course, we’d like to not have to wait forever to start giving you your payout, since then you’ll effectively get no payout. So we have to be giving a sequence of approximations at each step. Which basically suggests that we should make the payoff for the limit state be the limit of the payoffs of the finite states. Which is just as Jaynes would like – we shouldn’t do something (specify a payout) after taking a limit. Instead, we should specify payouts, and then take a limit, making the payout of the limit stage be the limit of the finite payouts. Which in this case means that there’s actually a chance of an infinite payout for the St. Petersburg game! (Even if that chance has probability zero.) So perhaps it’s no longer so problematic a game – the expectation is no longer higher than every payout.
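Jaynes’s recipe can be followed literally here: compute the expectation of the game truncated at n flips, then let n grow. A quick sketch (neutral about which side of the coin ends the game; each of the n possible terminating flips contributes exactly one dollar):

```python
from fractions import Fraction

def truncated_expectation(n):
    """St. Petersburg stopped after at most n flips: the game ending on
    flip k (probability 2**-k) pays 2**k dollars, so each of the n terms
    contributes exactly 1 to the expectation."""
    return sum(Fraction(2 ** k) * Fraction(1, 2 ** k) for k in range(1, n + 1))

for n in (1, 5, 50):
    print(n, truncated_expectation(n))   # the n-flip expectation is just n
```

The truncated expectations grow without bound, and the limit of the finite payoffs 2^k is itself infinite; assigning that limit payoff to the probability-zero never-ending run is what makes the expectation no longer exceed every payout.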

Note the sort of continuity I’m considering here. In some sense the payouts are discontinuous (obviously, they jump with each coin toss). But in the natural topology on the space (where the open sets are exactly the pieces of evidence we could have at some point – in this case that the game will take at least n flips) it is continuous. Which leads us to a distinction between two games that classically look the same – in one game I flip a fair coin and give you $1 if it comes up heads and nothing if it comes up tails; in the other game I throw an infinitely thin dart randomly at a dartboard and give you $1 if it hits the left half and nothing if it hits the right half (stipulate that the upper half of the center line counts as left, the lower half counts as right, and the center point doesn’t exist). The difference is that in the former case, it’s always easy to tell which state has occurred, so we can calculate the payoff. In the latter case though, if the dart hits exactly on the middle line, then we can’t tell which payoff you should get unless we can measure the location of the dart with infinite accuracy. If we can only tell to within 1 mm where the dart has hit, then any dart that hits within 1 mm of the center line will be impossible to pay on. If we can refine our observations, then we can pay up for most of these points, but even closer ones will still cause trouble. And no matter how much we refine them, a dart that hits the line exactly (this has probability zero, but it seems that it still might happen, since it’ll hit some line) will be one that we can never know which payoff is right. So you’ll be stuck waiting for your payoff rather than actually getting one or the other. So the game is bad again, although the analogous coin-flipping game is good.

So once we’ve found the right topology for the state space, it seems that we may want to require that the payoffs for any well-defined game be continuous on it. (For other reasons, like representing our limited capacity to know about the world, we might want to require that any isolated points (where the payoff can jump) must have positive probability, like the finite numbers of coin flips in the St. Petersburg case, but not the infinite one.)

In conversation today, Alan Hájek pointed out another sort of discontinuity that can arise in decision theory, namely one where payoffs are discontinuous on one’s actions. In the cases above, what’s discontinuous are the payoffs within some action I might agree to perform (say, playing the St. Petersburg game, or agreeing to a payoff schedule for a dart throw). But we can run into problems when we’re faced with multiple possible actions. The one that Alan mentioned was where you’re at the gates of heaven, and god offers you the possibility to stay as many days as you like, provided that you give him the (finite) number ahead of time on a piece of paper (a really large one, that you can compress as much text on as you want). So you start writing down a large number, say by writing 9999…. No matter how long you keep writing, it’s in your interest to write another 9. However, Alan points out that the worst possible outcome here is that you keep writing 9’s forever and never actually make it into heaven!

To model this in the framework of decision theory, there are a bunch of actions that you can choose from. Action n results in you being in heaven for exactly n days, and then there is one more action, that’s somehow the limit of all these previous actions, in which you never make it to heaven. No state space or anything else is relevant (in this particular case). But on your preferences, action n is preferred to action m iff n>m, except that the limit of these actions doesn’t get the limit of the payoffs. That is, staying there writing 9’s forever doesn’t give you infinitely many days in heaven – in fact, it gives you none! So this decision problem might also be counted as somehow bad on Jaynes’ account, because it involves an essential discontinuity in the payoffs, though this time it is with respect to the space of actions rather than the space of states. (The topology on the space of actions will have to be defined with open sets being possible partial actions, or something of the sort, just as the topology on the space of states is defined with open sets being the possible partial observations, or something.)

Maybe this is a problem for Jaynes? This game that god has given you doesn’t seem to be too paradoxical – somehow it doesn’t seem as bad as St. Petersburg, even though it puts decision theory almost in a worse place (no decision is correct, rather than the correct decision being to pay any finite amount for a single shot at a game).

Anyway, I thought there was an interesting distinction here between these two types of discontinuities. I don’t know if one is more problematic than the other, but it’s something to think about. Also, I should point out that a decision problem like this last one seems to have first been introduced by Cliff Landesman, in “When to Terminate a Charitable Trust?”, which came out in Analysis some time in the mid-’90s, I think.

Motivation for Expected Utility

28 05 2006

FEW just ended, and it was just as exciting as ever. I think it was a bit more international, a bit more interdisciplinary, and substantially larger (in terms of audience) than either of the past two years. Anyway, as seems to happen when I’m at conferences, I’ve got some more ideas to blog about. This one was something I thought of when writing my comments for Katie Steele’s paper on decision theory and the Allais paradox. To start out, here is a presentation of the Allais paradox: In the first decision, one has a 90% chance of $1,000,000 and is choosing whether the other chances should be a 1% chance of $0 and a 9% chance of $5,000,000, or a 10% chance of $1,000,000. The second decision is just the same, but with a 90% chance of $0. The principle of independence suggests that whether the 90% chance is of $1,000,000 or of $0 should make no difference when choosing between the different 10% options – yet many seemingly-rational people choose the flat $1,000,000 in the first case and the 9% chance of $5,000,000 in the second.
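The independence point can be checked directly: under expected utility theory, the shared 90% branch cancels out of the comparison, whatever utility function one assigns to money. A small sketch (the three sample utility functions are arbitrary illustrations, not anything from Katie’s paper):

```python
import math

# Each option is a list of (probability, dollar prize) pairs.
def eu(option, u):
    return sum(p * u(x) for p, x in option)

M = 1_000_000
# Decision 1: a shared 90% chance of $1,000,000, then either a flat extra
# 10% of $1,000,000 (A1) or 1% of $0 plus 9% of $5,000,000 (B1).
A1 = [(0.90, M), (0.10, M)]
B1 = [(0.90, M), (0.01, 0), (0.09, 5 * M)]
# Decision 2: the shared 90% chance is of $0 instead.
A2 = [(0.90, 0), (0.10, M)]
B2 = [(0.90, 0), (0.01, 0), (0.09, 5 * M)]

# The shared branch contributes the same term to both sides, so the
# B-minus-A difference is identical in the two decisions for every u.
for u in (lambda x: x, math.sqrt, lambda x: math.log(1 + x)):
    d1 = eu(B1, u) - eu(A1, u)
    d2 = eu(B2, u) - eu(A2, u)
    print(math.isclose(d1, d2, abs_tol=1e-6))   # True each time
```

So an expected-utility maximizer must make the same choice in both decisions, which is exactly what the common pattern of responses violates.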

There are several principles of decision theory, each fairly intuitive on its own, that I have abstracted from the general notion of “independence”, and together they seem to lead directly to expected utility theory:

  • The value of a gamble is a real number.
  • Nothing besides probabilities and utilities of outcomes is relevant for the value of a gamble.
  • The value of a gamble is a weighted sum of several contributions.
  • Each contribution is associated with a particular possible outcome.
  • The weight of each contribution is proportional to the probability of the corresponding outcome.
  • The value of each contribution is proportional to the utility of that outcome.

From these principles, it is easy to see that (wherever possible) the value of a gamble is equal to its expected utility (modulo some scalar multiple that applies equally to all gambles).

These principles seem fairly plausible, but of course there are reasons one might question each. The first principle underlies most traditional decision theory of any sort, even though there are traditional examples (the St. Petersburg game) that seem to contradict it, and one can also come up with actions where the intuitive preference relation between them can’t possibly be represented by real numbers.

The second principle can be questioned in cases where one already has a package of gambles, and one wants to amortize risk. That is, if purchasing insurance is to be considered rational, it will be because we care not just about the probabilities and payoffs of the insurance gamble, but also about the fact that we get positive payoffs when something bad happens, and negative payoffs when good things happen.

The third principle is probably fairly easy to question, but I don’t know of any natural way to reject it.

The fourth principle can be rejected in one natural way to deal with risk-aversion. For instance, in addition to the “local” factors associated with each outcome, one can add a “global” factor associated with the variance or standard deviation of the payoffs of the gamble. We might need to be careful when adding this factor to make sure that we don’t violate more fundamental constraints (like the principle of dominance – that if gamble A always has a better payoff than gamble B in every state, then one should prefer A to B). In introducing such a factor, we’ll have to figure out just what extra factors might be relevant, and how to weight them, which is at least one reason why this option is much less attractive than standard expected utility theory, though it has obvious appeal for dealing with risk-aversion.
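The worry about dominance is easy to make concrete. Here is a toy sketch with exact arithmetic (the gambles, the equally-likely-states setup, and the penalty weight of 3/100 are all hypothetical choices of mine):

```python
from fractions import Fraction

def mv_value(payoffs, penalty):
    """Mean-variance value over equally likely states:
    expected payoff minus penalty times the (population) variance."""
    n = len(payoffs)
    m = Fraction(sum(payoffs), n)
    var = sum((Fraction(x) - m) ** 2 for x in payoffs) / n
    return m - penalty * var

A = [1, 101]   # A pays strictly more than B in each of two equally likely states
B = [0, 1]

print(mv_value(A, Fraction(3, 100)))   # 51 - (3/100) * 2500 = -24
print(mv_value(B, Fraction(3, 100)))   # 1/2 - (3/100) * (1/4) = 197/400
```

A dominates B state by state, yet the variance penalty ranks B above A, so any such global factor has to be constrained carefully.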

I don’t know if there’s any reason one might rationally reject the fifth principle, though presumably it will have to be relaxed somehow if one is to rationally prefer gamble A to gamble B if they are identical, save for a much higher payoff for A than B on a state of probability 0.

The most natural way to relax the sixth may be in conjunction with relaxing the second. Some other factor beyond utility of an outcome may be considered. Another way to relax it would be to make the contribution of an outcome depend not only on its utility, but also on how things could have turned out otherwise on the same gamble. In an example by Amartya Sen cited by Katie, cracking open a bottle of champagne when receiving nothing in the mail may make quite a different contribution to an overall gamble when one could have received a serious traffic summons in the mail than when one could have received a large check in the mail.

To make it more clear that this doesn’t only arise when the experience of the event is different, we can consider a version of the Allais paradox with memory erasure. That is, the payoffs are just the same, but in addition to the cash, one has one’s memory erased and replaced with the memory of making a gamble with a sure outcome of whatever it is one received. Thus, there is nothing worse about the $0 when one could have had a guaranteed $1,000,000 than about the $0 when one only had a chance of making money. Since we seem to make the same decisions anyway, it seems that a counterfactual factor of what could have happened otherwise (rather than an emotional factor) must be factored into the value of the gamble.

These methods of relaxing the fourth and sixth principle seem to do different violations to the notion of independence (the sixth maintaining a kind of locality, and the fourth adding a global factor), but it turns out that the procedures in each can model exactly the same decisions as the other. I think the only way to decide between them will be by finding replacements for these principles and seeing which has more natural restrictions.

I hadn’t thought much about violations of independence before reading Katie’s paper, but I think they might be quite plausible. However, it’s interesting to see how some very strong versions of principles like independence lead directly to expected utility, in a way that avoids standard representation theorems and the laws of large numbers.