Guessing the result of infinitely many coin tosses

26 09 2008

I’ve stolen the title of a very interesting post by Andrew Bacon over at Possibly Philosophy. He considers a situation where there will be infinitely many coin flips occurring between noon and 1 pm, but instead of getting faster and faster, they’re getting slower and slower. In particular, one coin flip occurs at 1/n hours past noon for each positive integer n. He shows that there exists a strategy for guessing, where each guess may use information about how previous coin flips have come up (which sounds like it should be irrelevant, because they’re all independent 50/50 flips), such that one is guaranteed to get all but finitely many of the guesses correct. In particular, this means that although there is no first flip, there is a first flip on which one guesses incorrectly, so up to that point one has successfully guessed the result of infinitely many independent fair coin tosses.

Check out his post to see the description of the strategy, or work it out yourself, before going on.

Anyway, I managed to figure this out, only because it’s formally identical to a game with colored hats that I had discussed earlier with math friends. But I think Andrew’s version with the coin flips is somehow even more surprising and shocking!

He begins with a discussion of a very nice paper by Tim Williamson showing that allowing infinitesimal probabilities doesn’t get us around the requirement that the probability of an infinite sequence of heads be 0. (Coincidentally, I’m currently working on a paper extending this to suggest that infinitesimal probabilities really can’t do much of anything useful for epistemological purposes.) Andrew concludes with a bit more discussion of these probabilistic issues:

Now this raises some further puzzles. For example, suppose that it’s half past twelve, and you know the representative sequence predicts that the next toss will be heads. What should your credence that it will land heads be? On the one hand it should be 1/2 since you know it’s a fair coin. But on the other hand, you know that the chance that this is one of the few times in the hour that you guess incorrectly is very small. In fact, in this scenario it is “infinitely” smaller in comparison, so that your credence in heads should be 1. So it seems this kind of reasoning violates the Principal Principle.

I was going to leave the following series of remarks as a comment, but it got bigger than I expected, and I liked this puzzle so much that I figured I should point other people to it in my own post anyway. So I don’t think that this puzzle leads to a violation of the Principal Principle. (For those who aren’t familiar, this principle comes from David Lewis’ paper “A Subjectivist’s Guide to Objective Chance” and states the seeming triviality that when one knows the objective chance of some outcome, and doesn’t know anything “inadmissible” specifically about the actual result of the process, then regardless of what other historical information one knows, one’s degree of belief in the outcome should be exactly equal to the chances. The only reason this principle is normally in dispute is in terms of figuring out what “inadmissible” means – if we can’t make sense of that concept then we can’t use the principle.)

The set of representatives involved in picking the strategy is a non-measurable set. To see this, note that the construction is basically a version of Vitali’s construction of a non-measurable set of reals. The only difference is that in the case of the reals we make use of the fact that Lebesgue measure is translation invariant – here we make use of the fact that the measure is invariant under transformations that interchange heads and tails on any set of flips. Instead of the set of rational translations, we consider the set of finite interchanges of heads and tails. (This argument is basically outlined in the paper by Kadane, Schervish, and Seidenfeld, “Statistical Implications of Finitely Additive Probability”, which I’ve been working through more carefully recently for another paper I’m working on.)
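Spelled out in my own notation, the analogy with Vitali’s construction runs roughly as follows (a sketch, not a careful proof):

```latex
% Flip sequences and the finite-difference equivalence relation
Let $\Omega = \{H,T\}^{\mathbb{N}}$ carry the fair-coin product measure $\mu$, and define
\[
  x \sim y \quad\iff\quad \{n : x_n \neq y_n\} \text{ is finite.}
\]
By the Axiom of Choice, let $R \subseteq \Omega$ contain exactly one representative of each
$\sim$-class. For each finite $F \subseteq \mathbb{N}$, let $\sigma_F$ be the map that
interchanges heads and tails on the flips in $F$; each $\sigma_F$ preserves $\mu$. The sets
$\sigma_F(R)$ are pairwise disjoint (representatives are unique) and cover $\Omega$, so if
$R$ were measurable with $\mu(R) = c$, countable additivity would demand
\[
  1 = \mu(\Omega) = \sum_{F \text{ finite}} \mu\bigl(\sigma_F(R)\bigr)
    = \sum_{F \text{ finite}} c ,
\]
which fails whether $c = 0$ or $c > 0$.
```

The finite interchanges $\sigma_F$ play exactly the role that rational translations play in Vitali’s argument.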

This suggests to me that the set of situations in which you make the wrong guess on flip n may well be unmeasurable as well. (I haven’t worked out a proof of this claim though. [UPDATE: I have shown that it can sometimes be measurable depending on the strategy, and I conjecture that whenever it is measurable, it will have probability 1/2. See the comments for the argument.]) So from the outset (before the game starts) it looks like you can’t (or perhaps better, don’t) assign a credence to “I will make the wrong choice on flip n”. However, once it’s actually 1/n past twelve, you’ve got a lot more information – in particular, you know the outcomes of all the previous flips. I suspect that conditional on this information, the probability of making the wrong choice on flip n will just be 1/2, as we expect. (Of course, we have to make sense of what the right conditional probability to use here is, because we’re explicitly conditionalizing on a set of probability 0, as per Williamson.) Thus, there’s no violation of the Principal Principle when the time actually comes.
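As a sanity check on that last claim, here is a quick simulation (my own toy example, not from Andrew’s post): any strategy that guesses as some fixed function of the previously observed flips matches the next independent fair flip about half the time. I use a “majority of the history” rule, but any measurable rule would behave the same way.

```python
import random

def guess_from_history(history):
    # Hypothetical strategy: guess the majority outcome of the flips seen so far
    # (1 = heads, 0 = tails). Any measurable function of the history would do.
    return 1 if 2 * sum(history) >= len(history) else 0

def simulate(trials, history_len, seed=0):
    # Empirical frequency with which the strategy's guess matches the next
    # independent fair flip.
    rng = random.Random(seed)
    correct = 0
    for _ in range(trials):
        history = [rng.randint(0, 1) for _ in range(history_len)]
        next_flip = rng.randint(0, 1)
        correct += guess_from_history(history) == next_flip
    return correct / trials

print(simulate(100_000, 20))  # close to 0.5
```

Of course, this only illustrates the measurable case; the Axiom-of-Choice strategy is precisely the kind of thing no simulation can instantiate.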

However, there does appear to be a violation of a principle called “conglomerability”, which is what my interest in the Kadane, Schervish, and Seidenfeld paper is about. One version of this principle states that if there is a partition of the probability space, and an event whose probability conditional on every event in this partition is in some interval, then the unconditional probability of that event must also be in this interval. In this case we’ve got a partition (into all the sequences of flips down to the nth) and the probability of guessing wrong conditional on each event in that partition is 1/2, and yet I’ve said that the unconditional probability is undefined, rather than 1/2.
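Stated formally (again in my notation), the version of the principle at issue is:

```latex
% Conglomerability in a partition
If $\{B_i\}_{i \in I}$ is a partition of the probability space and
\[
  P(A \mid B_i) \in [a, b] \quad \text{for every } i \in I,
\]
then the unconditional probability must satisfy $P(A) \in [a, b]$.
```

Here the $B_i$ are the possible outcomes of the flips down to the nth, the conditional probability of guessing wrong is 1/2 in each cell, and yet the unconditional probability is undefined rather than 1/2.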

I think I have a defense here though, which is part of my general defense of conglomerability against several other apparent counterexamples (including some in this paper and others by Seidenfeld, Schervish, and Kadane, as well as a paper by Arntzenius, Elga, and Hawthorne). The problem here is that the event whose unconditional probability we are interested in is unmeasurable, so it doesn’t even make sense to talk about the unconditional probability. In the case of the conditional probabilities, I’d say that strictly speaking it doesn’t make sense to talk about them either. However, conditional on the outcome of the flips down to the nth, this event is coextensive with something that is measurable, namely “the coin will come up heads” (or else perhaps, “the coin will come up tails”), because the sequence of flips down to that point determines which outcome I will guess for that flip. Thus these events have well-defined probabilities.

The only reason this situation of an apparent violation of conglomerability emerges is because we’re dealing explicitly with unmeasurable sets. However, I’m only interested in the application of probability theory to agents with a certain type of restriction on their potential mental contents. I don’t want to insist that they can only grasp finite amounts of information, because in some sense we actually do have access to infinitely many pieces of information at any time (at least, to the extent that general Chomskyan competence/performance distinctions suggest that we have the competence to deal with infinitely many propositions about things, even if in practice we never deal with extremely large sentences or numbers or whatever). However, I think it is reasonable to insist that any set that an agent can grasp not essentially depend on the Axiom of Choice for the proof that it exists. Although we can grasp infinitely much information, it’s only in cases where it’s nicely structured that we can grasp it. Thus, the strategy that gives rise to this problem is something that can’t be grasped by any agent, and therefore this problem isn’t really a problem.


The Role of Existence Proofs

13 09 2008

When I was an undergraduate, I remember being very struck by some of the early results in the class I was taking on abstract algebra. Of course, I was eventually very struck by the results from Galois theory when we got there, but in the early parts of the class I was struck by the results proving the existence of the algebraic closure of a field, and proving the existence of a field of fractions for every integral domain. In particular, these seemed to me to validate the use of the complex numbers (once the reals were given) and the rational numbers (once the integers were given). I was still vaguely dissatisfied that we hadn’t yet had a proof of the existence of the integers, but I became happier when I saw the definition of the natural numbers as the smallest set containing 0 and closed under the successor operation, especially because this let proof by induction be a theorem rather than an axiom.

However, earlier this week (in conversation with Zach Weber, while visiting Sydney), I started realizing what I should have realized long ago, which is that these theorems really can’t be doing as much work in justifying our use of the various number concepts as I had thought when I was younger. Of course, these theorems are quite useful when talking about abstract fields or rings, but when we’re talking about the familiar complex, real, rational, and integer numbers, it’s no longer clear to me that these theorems add anything whatsoever. After all, what these theorems show is just that, by using some fancy set-theoretic machinery of ordered pairs and equivalence classes, we can create a structure that has all the properties of a structure that we already basically understood. Perhaps in the case of the complex numbers this mathematical assurance is useful (though even there we already had the simple assurance in the form of thinking of complex numbers as ordered pairs of reals, rather than as the polynomial ring R[x] modulo the ideal (x²+1)), but for the rationals and negative numbers, our understanding of them as integers with fractional remainder, or as formal inverses of positive numbers, is already sophisticated enough to see that they’re perfectly well-defined structures, even before we get the construction as equivalence classes of ordered pairs of integers or naturals.
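For reference, the construction at issue (the standard field-of-fractions construction, applied to the integers) runs as follows:

```latex
% Field of fractions: the rationals from the integers
On pairs $(a, b)$ with $a, b \in \mathbb{Z}$ and $b \neq 0$, define
\[
  (a, b) \sim (c, d) \quad\iff\quad ad = bc ,
\]
and on the equivalence classes define
\[
  [(a,b)] + [(c,d)] = [(ad + bc,\ bd)], \qquad
  [(a,b)] \cdot [(c,d)] = [(ac,\ bd)].
\]
The class $[(a,b)]$ plays the role of the fraction $a/b$; one checks that the operations
are well defined and yield a field containing a copy of $\mathbb{Z}$ via $a \mapsto [(a,1)]$.
```

The point in the text is just that we already understood fractions well enough for this machinery to add little.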

But this is all a sort of prelude to thinking about the two more famous set-theoretic reductions, that of the reals to Dedekind cuts (or Cauchy sequences) of rationals, and that of the naturals to the finite von Neumann ordinals. Unlike the others, I think the Cauchy and Dedekind constructions of the reals are quite useful – before these constructions, I think the notion of real number was quite vague. We knew that every continuous function that achieves positive and negative values should have a zero, but it wasn’t quite clear why this should be so. Also, I think intuitively there remained worries about whether there could be a distinct real number named by “.99999…” as opposed to the one named by “1”, not to mention the worries about whether certain non-convergent series could be summed, like 1-1+1-1+1….
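That first worry, at least, dissolves once the reals are rigorously constructed, since on any construction the two names provably denote the same number:

```latex
0.\overline{9} \;=\; \sum_{k=1}^{\infty} \frac{9}{10^{k}}
  \;=\; \lim_{N \to \infty} \bigl(1 - 10^{-N}\bigr) \;=\; 1 .
```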

But for the reduction of the naturals to the von Neumann ordinals, I think it’s clear that this should do no work in explicating the notion at hand. To prove that enough von Neumann ordinals exist to do this work, you already need a decent amount of set theory. (John Burgess’ excellent book Fixing Frege does a good job investigating just how much set theory is needed for this and various other reductions.) And while some of the notions involved are basic, like membership and union, I think the concept of sets of mixed rank (for instance, sets that have as members both sets and sets of sets) already strains our concept of set much more than any of this can help clarify basic notions like successor, addition, and multiplication. One might even be able to make a case that to understand the relevant formal set theory one must already have the concept of an ordered string of symbols, which requires the concept of finite ordering, which is basically already the concept of natural numbers!
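For concreteness, the reduction identifies each natural number with the set of its predecessors:

```latex
% Finite von Neumann ordinals
\[
  0 = \varnothing, \qquad n + 1 = n \cup \{n\},
\]
so that, for instance,
\[
  3 = \{0, 1, 2\}
    = \bigl\{\varnothing,\ \{\varnothing\},\ \{\varnothing, \{\varnothing\}\}\bigr\},
\]
with $m < n$ defined as $m \in n$.
```

Note that already 2 = {∅, {∅}} is a set of mixed rank in the sense just mentioned, having both ∅ and a set of sets as members.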

In some sense, this was one project that Frege was engaged in, and his greatest failure (the fact that his set theory was inconsistent) shows in a sense just how unnecessary this project was. At least to some extent, Frege’s set theory was motivated by an attempt to show the consistency of Peano arithmetic, and to clarify the concept of natural number. However, when his explanation failed, this didn’t undermine our confidence in the correctness of Peano arithmetic. The same thing would be the case if someone today were to discover that ZFC was inconsistent – most of the mathematics that we today justify by appeal to ZFC would still stand. We wouldn’t abandon Peano arithmetic, and I think we wouldn’t even abandon most abstract algebra, geometry, analysis, and the like, except perhaps in some cases where we make strong appeals to the Axiom of Choice and strange set-theoretic constructions.

Of course, Frege’s actual attempted reduction of the number concepts to set theory would have been a very nice one, and could help explain what we mean by number, because he reduced each number to the collection of all sets with that many elements. However, modern set theory suggests that no such collections exist (except in the case of the number 0), and the proposed reductions are much less illuminating.

So I wonder what role these proofs play, the ones demonstrating the existence of structures that behave like the familiar natural numbers, integers, rationals, reals, and complex numbers. I’ve suggested that in the case of the reals the construction may actually do important work, but I’m starting to be skeptical about most of the other cases.