## Discontinuities in States and Actions

13 06 2006

(The ideas in this post originate in several conversations with Alan Hájek and Aidan Lyon.)

In chapter 15 of his big book on probability, E. T. Jaynes said “Not only in probability theory, but in all mathematics, it is the careless use of infinite sets, and of infinite and infinitesimal quantities, that generates most paradoxes.” (By “paradox”, he means “something which is absurd or logically contradictory, but which appears at first glance to be the result of sound reasoning. … A paradox is simply an error out of control; i.e. one that has trapped so many unwary minds that it has gone public, become institutionalized in our literature, and taught as truth.”) His position is somewhat unorthodox, hinting that in some sense all of infinite set theory (and many classic examples in probability theory) is made up of this sort of paradox. But I think a lot of what he says in the chapter is useful, and I intend to study it more to see what it says about the particular infinities and zeroes that I’ve been worrying about in probability theory.

His recommendation of what to do is as follows:

> Apply the ordinary processes of arithmetic and analysis only to expressions with a finite number n of terms. Then after the calculation is done, observe how the resulting finite expressions behave as the parameter n increases indefinitely.
>
> Put more succinctly, passage to a limit should always be the last operation, not the first.

One suggestion to take from this idea might be that in phrasing any well-defined decision problem, the payoffs should in some sense be a continuous function of the states. (I should point out that I got this suggestion from Aidan Lyon.) For instance, consider the game in the St. Petersburg paradox (interestingly, the argument in section 3 there about boundedness of utilities seems to miss a possibility – Martin considers the case of bounded utilities where the maximum is achievable, and unbounded ones where the maximum is not achievable, but not bounded ones where the maximum is unachievable, which seems to largely invalidate the argument). An objection you might be able to make to this game is that payoffs haven’t been specified for every possibility – although the probability of a fair coin repeatedly flipped coming up heads every time is zero, that doesn’t mean that it’s impossible. So we must specify a payoff for this state. But of course, we’d like to not have to wait forever to start giving you your payout, since then you’ll effectively get no payout. So we have to be giving a sequence of approximations at each step, which basically suggests that we should make the payoff for the limit state be the limit of the payoffs of the finite states. This is just as Jaynes would like – we shouldn’t do something (specify a payout) after taking a limit. Instead, we should specify payouts, and then take a limit, making the payout of the limit stage be the limit of the finite payouts. In this case, that means there’s actually a chance of an infinite payout for the St. Petersburg game! (Even if that chance has probability zero.) So perhaps it’s no longer so problematic a game – the expectation is no longer higher than every payout.
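Jaynes’s recipe can be sketched numerically for the St. Petersburg game. The snippet below is only an illustration under assumptions that the discussion above does not fix: the game ends at the first tails, a game ending on flip k pays 2^k dollars, and the names `payoff` and `partial_expectation` are made up for this sketch. It does the arithmetic for a finite number n of flips first, and only afterwards looks at how the result behaves as n grows.

```python
from fractions import Fraction


def payoff(k: int) -> int:
    """Assumed payoff when the first tails appears on flip k (2**k dollars)."""
    return 2 ** k


def partial_expectation(n: int) -> Fraction:
    """Expected payoff contributed by games that end within the first n flips."""
    terms = (Fraction(1, 2 ** k) * payoff(k) for k in range(1, n + 1))
    return sum(terms, Fraction(0))


if __name__ == "__main__":
    for n in (1, 10, 100):
        # Each finite stage contributes exactly 1, so the partial sum equals n,
        # while the largest payoff considered so far is 2**n.
        print(n, partial_expectation(n), payoff(n))
```

Under these assumptions both the partial expectation and the finite payoffs grow without bound, which is the sense in which the limit state inherits the limit of the finite payoffs.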

Note the sort of continuity I’m considering here. In some sense the payouts are discontinuous (obviously, they jump with each coin toss). But in the natural topology on the space (where the open sets are exactly the pieces of evidence we could have at some point – in this case that the game will take at least n flips) it is continuous. This leads us to a distinction between two games that classically look the same – in one game I flip a fair coin and give you \$1 if it comes up heads and nothing if it comes up tails; in the other game I throw an infinitely thin dart randomly at a dartboard and give you \$1 if it hits the left half and nothing if it hits the right half (stipulate that the upper half of the center line counts as left, the lower half counts as right, and the center point doesn’t exist). The difference is that in the former case, it’s always easy to tell which state has occurred, so we can calculate the payoff. In the latter case though, if the dart hits exactly on the middle line, then we can’t tell which payoff you should get unless we can measure the location of the dart with infinite accuracy. If we can only tell to within 1 mm where the dart has hit, then any dart that hits within 1 mm of the center line will be impossible to pay on. If we can refine our observations, then we can pay up for most of these points, but even closer ones will still cause trouble. And no matter how much we refine them, a dart that hits the line exactly (this has probability zero, but it seems that it still might happen, since it’ll hit some line) will be one for which we can never know which payoff is right. So you’ll be stuck waiting for your payoff rather than actually getting one or the other. So the game is bad again, although the analogous coin-flipping game is good.
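The dart case can be sketched as a toy model. Everything here is an assumption made for illustration: the dart’s horizontal position is a number x with the center line at 0, a measurement of resolution eps is treated (in a simplified way) as settling the side exactly when |x| exceeds eps, and the names `settle_payoff` and `REFINEMENTS` are hypothetical.

```python
from fractions import Fraction
from typing import Iterable, Optional

# Hypothetical schedule of ever finer measurement resolutions.
REFINEMENTS = [Fraction(1, 10 ** k) for k in range(1, 6)]


def settle_payoff(x: Fraction, resolutions: Iterable[Fraction]) -> Optional[int]:
    """Return 1 (left) or 0 (right) once some measurement settles the side.

    Simplified model: a measurement of resolution eps settles the side
    exactly when |x| > eps. Returns None if no measurement in the
    sequence settles it, so the payoff is still pending.
    """
    for eps in resolutions:
        if abs(x) > eps:
            return 1 if x < 0 else 0
    return None


if __name__ == "__main__":
    print(settle_payoff(Fraction(-1, 3), REFINEMENTS))    # settled at the first measurement
    print(settle_payoff(Fraction(1, 1000), REFINEMENTS))  # settled only after refining
    print(settle_payoff(Fraction(0), REFINEMENTS))        # never settled
```

In this model a dart exactly on the line stays unsettled however long the list of refinements is made, whereas the coin game needs only a single observation.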

So once we’ve found the right topology for the state space, it seems that we may want to require that the payoffs for any well-defined game be continuous on it. (For other reasons, like representing our limited capacity to know about the world, we might want to require that any isolated points (where the payoff can jump) must have positive probability, like the finite numbers of coin flips in the St. Petersburg case, but not the infinite one.)

In conversation today, Alan Hájek pointed out another sort of discontinuity that can arise in decision theory, namely one where payoffs are discontinuous on one’s actions. In the cases above, what’s discontinuous are the payoffs within some action I might agree to perform (say, playing the St. Petersburg game, or agreeing to a payoff schedule for a dart throw). But we can run into problems when we’re faced with multiple possible actions. The one that Alan mentioned was where you’re at the gates of heaven, and god offers you the possibility to stay as many days as you like, provided that you give him the (finite) number ahead of time on a piece of paper (a really large one, that you can compress as much text on as you want). So you start writing down a large number, say by writing 9999…. No matter how long you keep writing, it’s in your interest to write another 9. However, Alan points out that the worst possible outcome here is that you keep writing 9’s forever and never actually make it into heaven!

To model this in the framework of decision theory, there are a bunch of actions that you can choose from. Action n results in you being in heaven for exactly n days, and then there is one more action, that’s somehow the limit of all these previous actions, in which you never make it to heaven. No state space or anything else is relevant (in this particular case). But on your preferences, action n is preferred to action m iff n>m, except that the limit of these actions doesn’t get the limit of the payoffs. That is, staying there writing 9’s forever doesn’t give you infinitely many days in heaven – in fact, it gives you none! So this decision problem might also be counted as somehow bad on Jaynes’ account, because it involves an essential discontinuity in the payoffs, though this time it is with respect to the space of actions rather than the space of states. (The topology on the space of actions will have to be defined with open sets being possible partial actions, or something of the sort, just as the topology on the space of states is defined with open sets being the possible partial observations, or something.)
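A small sketch of this action space, with hypothetical names (`FOREVER`, `days_in_heaven`, `append_nine`) chosen only for illustration: finite actions are the numbers you might write down, and a separate label stands in for the limit action of never stopping.

```python
from typing import Union

FOREVER = "forever"  # hypothetical label for the limit action of never stopping


def days_in_heaven(action: Union[int, str]) -> int:
    """Payoff of an action: writing the number n yields n days; never stopping yields 0."""
    if action == FOREVER:
        return 0
    return int(action)


def append_nine(n: int) -> int:
    """Writing one more 9 after the digits of n gives a strictly larger number."""
    return 10 * n + 9


if __name__ == "__main__":
    n = 9
    for _ in range(5):
        better = append_nine(n)
        # Every finite action is beaten by the next one, so no finite action is best.
        print(n, "->", better, days_in_heaven(better) > days_in_heaven(n))
        n = better
    # The payoffs of the finite actions grow without bound, yet the limit action pays nothing.
    print(FOREVER, days_in_heaven(FOREVER))
```

The discontinuity shows up in the last line: the payoffs along the sequence of actions increase without bound, but the payoff of their limit is 0.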

Maybe this is a problem for Jaynes? This game that god has given you doesn’t seem to be too paradoxical – somehow it doesn’t seem as bad as St. Petersburg, even though it puts decision theory almost in a worse place (no decision is correct, rather than the correct decision being to pay any finite amount for a single shot at a game).

Anyway, I thought there was an interesting distinction here between these two types of discontinuities. I don’t know if one is more problematic than the other, but it’s something to think about. Also, I should point out that a decision problem like this last one seems to have first been introduced by Cliff Landesman, in “When to Terminate a Charitable Trust?”, which came out in Analysis some time in the mid-’90s I think.

## Motivation for Expected Utility

28 05 2006

FEW just ended, and it was just as exciting as ever. I think it was a bit more international, a bit more interdisciplinary, and substantially larger (in terms of audience) than either of the past two years. Anyway, as seems to happen when I’m at conferences, I’ve got some more ideas to blog about. This one was something I thought of when writing my comments for Katie Steele’s paper on decision theory and the Allais paradox. To start out, here is a presentation of the Allais paradox: In the first decision, one has a 90% chance of \$1,000,000 and is choosing whether the other chances should be a 1% chance of \$0 and a 9% chance of \$5,000,000, or a 10% chance of \$1,000,000. The second decision is just the same, but with a 90% chance of \$0. The principle of independence suggests that whether the 90% chance is of \$1,000,000 or of \$0 should make no difference when choosing between the different 10% options – yet many seemingly-rational people choose the flat \$1,000,000 in the first case and the 9% chance of \$5,000,000 in the second.
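The independence point can be checked with a little arithmetic. The sketch below uses the probabilities and dollar amounts from the presentation above; the function names and the particular utility numbers are assumptions made for illustration. It shows that, for an expected-utility maximizer, the gap between the risky and the safe option is the same in both decisions, because the shared 90% branch cancels.

```python
from fractions import Fraction


def expected_utility(gamble, utility):
    """Sum of probability times utility over (probability, outcome) pairs."""
    return sum(p * utility[outcome] for p, outcome in gamble)


def allais_options(common_outcome):
    """Return (risky, safe) gambles sharing a 90% chance of common_outcome."""
    common = (Fraction(90, 100), common_outcome)
    risky = [common, (Fraction(1, 100), 0), (Fraction(9, 100), 5_000_000)]
    safe = [common, (Fraction(10, 100), 1_000_000)]
    return risky, safe


def risky_minus_safe(common_outcome, utility):
    """Expected-utility advantage of the risky option over the safe one."""
    risky, safe = allais_options(common_outcome)
    return expected_utility(risky, utility) - expected_utility(safe, utility)


if __name__ == "__main__":
    # Hypothetical utilities; any assignment gives equal gaps in the two decisions.
    utility = {0: Fraction(0), 1_000_000: Fraction(1), 5_000_000: Fraction(2)}
    print(risky_minus_safe(1_000_000, utility))  # first decision
    print(risky_minus_safe(0, utility))          # second decision
```

Since the two gaps always agree, no assignment of utilities to the three dollar amounts makes expected utility favor the safe option in the first decision and the risky option in the second.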

There are several fairly intuitive principles of decision theory that I have abstracted from the general notion of “independence”, and they seem to lead directly to expected utility theory:

• The value of a gamble is a real number.
• Nothing besides probabilities and utilities of outcomes is relevant for the value of a gamble.
• The value of a gamble is a weighted sum of several contributions.
• Each contribution is associated with a particular possible outcome.
• The weight of each contribution is proportional to the probability of the corresponding outcome.
• The value of each contribution is proportional to the utility of that outcome.

From these principles, it is easy to see that (wherever possible) the value of a gamble is equal to its expected utility (modulo some scalar multiple that applies equally to all gambles).

These principles seem fairly plausible, but of course there are reasons one might question each. The first principle underlies most traditional decision theory of any sort, even though there are traditional examples (the St. Petersburg game) that seem to contradict it, and one can also come up with actions where the intuitive preference relation between them can’t possibly be represented by real numbers.

The second principle can be questioned in cases where one already has a package of gambles, and one wants to amortize risk. That is, if purchasing insurance is to be considered rational, it will be because we care not just about the probabilities and payoffs of the insurance gamble, but also about the fact that we get positive payoffs when something bad happens, and negative payoffs when good things happen.

The third principle is probably fairly easy to question, but I don’t know of any natural way to reject it.

The fourth principle can be rejected in one natural way to deal with risk-aversion. For instance, in addition to the “local” factors associated with each outcome, one can add a “global” factor associated with the variance or standard deviation of the payoffs of the gamble. We might need to be careful when adding this factor to make sure that we don’t violate more fundamental constraints (like the principle of dominance – that if gamble A always has a better payoff than gamble B in every state, then one should prefer A to B). In introducing such a factor, we’ll have to figure out just what extra factors might be relevant, and how to weight them, which is at least one reason why this option is much less attractive than standard expected utility theory, though it has obvious appeal for dealing with risk-aversion.
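The worry about dominance can be made concrete. In the sketch below, the rule “expected payoff minus a multiple of the standard deviation” is just one assumed way of adding a global factor, the payoffs are treated as utilities, and the names `risk_adjusted_value`, `dominates`, and `risk_weight` are made up for this illustration.

```python
import math

# A gamble is a list of (probability, payoff) pairs, one per state,
# with the states listed in the same order for every gamble.


def mean(gamble):
    return sum(p * x for p, x in gamble)


def std_dev(gamble):
    m = mean(gamble)
    return math.sqrt(sum(p * (x - m) ** 2 for p, x in gamble))


def risk_adjusted_value(gamble, risk_weight):
    """Assumed rule: expected payoff minus risk_weight times the standard deviation."""
    return mean(gamble) - risk_weight * std_dev(gamble)


def dominates(a, b):
    """True if gamble a pays strictly more than gamble b in every state."""
    return all(xa > xb for (_, xa), (_, xb) in zip(a, b))


if __name__ == "__main__":
    a = [(0.5, 2.0), (0.5, 10.0)]  # pays 2 or 10
    b = [(0.5, 1.0), (0.5, 1.0)]   # pays 1 for sure
    print(dominates(a, b))
    for risk_weight in (0.5, 2.0):
        print(risk_weight, risk_adjusted_value(a, risk_weight), risk_adjusted_value(b, risk_weight))
```

With a small `risk_weight` the rule agrees with dominance, but with a large one it ranks the sure payoff of 1 above a gamble that pays more in every state, which is the kind of violation the global factor has to be constrained to avoid.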

I don’t know if there’s any reason one might rationally reject the fifth principle, though presumably it will have to be relaxed somehow if one is to rationally prefer gamble A to gamble B if they are identical, save for a much higher payoff for A than B on a state of probability 0.

The most natural way to relax the sixth may be in conjunction with relaxing the second. Some other factor beyond utility of an outcome may be considered. Another way to relax it would be to make the contribution of an outcome depend not only on its utility, but also on how things could have turned out otherwise on the same gamble. In an example by Amartya Sen cited by Katie, cracking open a bottle of champagne when receiving nothing in the mail may make quite a different contribution to an overall gamble when one could have received a serious traffic summons in the mail than when one could have received a large check in the mail.

To make it more clear that this doesn’t only arise when the experience of the event is different, we can consider a version of the Allais paradox with memory erasure. That is, the payoffs are just the same, but in addition to the cash, one has one’s memory erased and replaced with the memory of making a gamble with a sure outcome of whatever it is one received. Thus, there is nothing worse about the \$0 when one could have had a guaranteed \$1,000,000 than about the \$0 when one only had a chance of making money. Since we seem to make the same decisions anyway, it seems that a counterfactual factor of what could have happened otherwise (rather than an emotional factor) must be factored into the value of the gamble.

These methods of relaxing the fourth and sixth principles seem to violate the notion of independence in different ways (the sixth maintaining a kind of locality, and the fourth adding a global factor), but it turns out that the procedures in each can model exactly the same decisions as the other. I think the only way to decide between them will be by finding replacements for these principles and seeing which has more natural restrictions.

I hadn’t thought much about violations of independence before reading Katie’s paper, but I think they might be quite plausible. However, it’s interesting to see how very strong versions of principles like independence lead directly to expected utility, in a way that avoids standard representation theorems and the laws of large numbers.