Probabilistic Proofs of Undecidable Sentences?

22 04 2009

As I suggested (though perhaps not made as clear as I would have liked) in my forthcoming paper “Probabilistic Proofs and Transferability”, I think that probabilistic proofs can be used to provide knowledge of mathematical facts, even though there are reasons why we might think they should remain unacceptable as part of mathematical practice. (Basically, even though mathematicians gain knowledge through testimony all the time, there may be reasons for a methodological insistence that all mathematical arguments not rely essentially on testimony, the way probabilistic proofs (just like experiments in the natural sciences) do.) Given that this allows for a non-deductive means of acquiring mathematical knowledge, a natural question would be whether probabilistic proofs can be used to settle questions that are undecided by standard axiom sets. Unfortunately, I think the answer is probably “no”.

The sort of probabilistic proofs that I endorse aren’t just any old argument that isn’t deductively valid. In my paper, I contrast probabilistic proofs of the primality of a given number with certain arguments in favor of the Goldbach Conjecture and the Riemann Hypothesis, as well as lottery arguments. The primality arguments work by showing that if n is prime, then a certain result is guaranteed, while if n is composite, then the probability of the same sort of result is at most 1/4k, where k can be made as large as one wants by generating more random numbers. As a result the evidence “tracks” the truth of the claim involved (that is, the evidence is very likely if the claim is true and very unlikely if the claim is false).

In the case of the Goldbach Conjecture, one of the pieces of “probabilistic” evidence that we have is that none of the first several million even numbers is a counterexample to the claim. However, although this evidence is very likely if the claim is true, we have no good a priori bounds on how likely this evidence would be if the claim were false. Thus, the evidence may give us some reason to believe the Goldbach Conjecture, but it’s hard to see just how much more evidence we’d need in order for this to give us knowledge that the Goldbach Conjecture is in fact true.

In the case of the Riemann Hypothesis, I don’t have room (or the knowledge) to state all the relevant pieces of evidence, but in this case the evidence is more of the type of “inference to the best explanation”, but again lacks any sort of probabilistic bounds on the possibility of it arising in misleading cases. This is not to say that this type of evidence can’t provide knowledge – in fact, we may already have enough evidence to count as knowing that the Riemann Hypothesis is true, though I’m definitely not certain of this. At any rate, if this does provide knowledge, it’s of a different sort than probabilistic proof, and may well have its own problems for acceptability in the mathematical canon.

In “lottery cases”, we have strong probabilistic evidence for the claim that a particular ticket will not win (in particular, that this ticket is in a fair lottery, and that there are a billion separate tickets, exactly one of which is a winner). However, most people want to say that we don’t have knowledge of this claim. Again, at least part of the reason is that our evidence doesn’t track the truth of the claim – nothing about our evidence is sensitive to the ticket’s status, and the evidence is just as likely if the claim is false (that is, the ticket is a winner) as if it’s true (that the ticket isn’t a winner).

Therefore, it seems that for any sort of evidence that we’d count as a “probabilistic proof” that gives us knowledge of some mathematical claim, we’d want to have good bounds on the probability of getting such evidence if the claim were true, as well as good bounds on the probability of getting such evidence if the claim were false. At the moment, I can’t really conceive of ways of generating mathematical evidence where there would be clear bounds on these probabilities other than by randomly sampling numbers from a particular bounded set. (There may be possibilities of sampling from unbounded sets, provided that the sampling method doesn’t rely on a non-countably-additive probability, but it’s a bit harder to see how these sorts of things could be relevant.)

Now let’s consider some claim that is independent of Peano Arithmetic, and see if there might be any way to come up with a probabilistic technique that would give evidence for or against such a claim, that had the relevant probabilistic bounds.

Unfortunately, I don’t think there will be. Consider two models of PA – one in which the claim is true, and one in which it’s false. We can assume that one of these models is the standard model (that is, it just consists of the standard natural numbers) while the other is some end-extension of this model. If our probabilistic method involves generating natural numbers through some random process and then checking their properties, then this method will have the same probabilities of generating the same numbers if applied in either one of these models (since presumably, the method will only ever generate standard natural numbers, and not any of the extra things in the non-standard model). Thus, any result of such a test will have the same probability whether the statement is true or false, and thus can’t be sensitive in the sense demanded above.

Now, one might worry that this argument assumes that conditional probabilities will have to be probabilities given by considering a particular non-standard model. This clearly isn’t what we’re doing in the case of primality testing however – we don’t consider the probabilities of various outcomes in a model where n is prime and in a model where n is not prime, not least because we have no idea what a model of the false one would even look like.

However, I think this is what we’d have to do for statements that could conceivably be independent of PA. Consider any process that involves sampling uniformly among the integers up to k, and then performing some calculations on them. Knowing the truth of PA, we can calculate what all the possible results of such a process could be, and verify that these will be the same whether our independent statement is true or false. Since this calculation will tell us everything we could possibly gain from running the probabilistic process, and it hasn’t given us knowledge about the independent statement, the probabilistic process can’t either, since it adds no new information, and therefore no new knowledge.

If we allow for probabilistic processes that sample from all natural numbers (rather than just sampling uniformly from a bounded range of them) I think we still can’t get any relevant knowledge. Assuming that the sampling process must have some countably additive probability distribution, then for any ε there must be some sufficiently large N that the probability that the process never generates a number larger than N is at most ε. By the above argument, the process can’t tell us anything about the independent statement if it never generates a number larger than N. Therefore, the difference in probability of various outcomes given the statement and its negation can’t be any larger than ε. But since ε was arbitrary, this means that the difference in probabilities must be 0. Thus, the process still fails to meet the sensitivity requirement, and therefore can’t give knowledge about the independent statement.

Thus, we can’t get probabilistic information about independent claims. Of course, since many people will be willing to say that we have good reason to believe the Goldbach Conjecture, and the Riemann Hypothesis, on the evidence mentioned above, we may be able to get arguments of those sorts for independent claims. But such arguments aren’t really “probabilistic” in the sense standardly discussed.

Additionally, there are many claims independent of Peano Arithmetic that we can prove using standard mathematical means. Some examples include Goodstein’s Theorem, the Paris-Harrington Theorem, and Borel Determinacy. (The latter is in fact so strong that it can’t be proven in Zermelo set theory, which is ZF set theory without the Axiom of Replacement.) In fact, the arguments given above show that probabilistic methods not only fail to decide claims that are undecided in PA, but in fact fail to decide claims that are independent of substantially weaker systems, like Robinson arithmetic and probably even Primitive Recursive Arithmetic. Since basically everything that is of mathematical interest is independent of PRA (basically all it can prove is statements saying that a recursive function has a specific value on specified inputs, like “376+91=467”) this means that probabilistic proofs really can’t do much. All they can do is give feasible proofs of claims that would otherwise take a very long time to verify, as in the primality of a particular number.

6 responses

7 05 2009

[…] finally, a smattering of logic: Andrew Bacon discusses Restall’s Paradox, Kenny Easwaran endorses a particular kind of probabilistic proof, and I consider a slight variation on an old […]

16 06 2009

Suppose I flip a coin 2 million times. It is almost certain that the resulting bit stream will have Kolmogorov complexity larger than 1 million (i.e. the string can’t be compressed down to half its original length). But by Chaitin’s version of Goedel’s incompleteness theorem, that assertion (about the Kolmogorov complexity) is Pi-0-1 statement undecidable in any reasonable axiom system (or you could make it even worse by increasing the number of flips). There ya go.

(This example inspired by Leonid Levin’s article here: http://www.cs.bu.edu/fac/lnd/expo/gdl.htm see his example of rolling dice to make an incompressible string)

17 06 2009

That’s an interesting idea. However, I’m not sure if this gives what would be called a “probabilistic proof” that such a string has high Kolmogorov complexity. After all, nothing about the evidence would be at all different if the string had complexity of around 900,000 instead.

Consider the idea of a normal number (that is, one such that in every base, all digits have equal limiting frequency in the representation of the number). It is not hard to prove that in the interval [0,1], the set of normal numbers has measure 1. However, no one thinks that this gives anything like a probabilistic proof that 1/pi or 1/e is normal. It’s not clear to me how your example differs significantly from this, except that there is a positive probability of error rather than probability 0 of error.

The Chaitin-type statements are interesting ones to consider though, since there is some very natural sense in which they are “very likely” to be true, despite the fact that they are unprovable.

17 06 2009

I think “1/pi is normal” is like Goldbach’s conjecture with regard to probabilistic proof, while the Kolmogorov complexity statement about 2 million coin flips is more like a pseudoprime test. The difference is as follows.

Let N be a nice big number, like Ackermann(10000000000) or something like that. Suppose we can experimentally verify Goldbach’s conjecture for all even numbers up to N. We might use our experiment to form a Bayesian prior hypothesis about numbers larger than N, i.e. “for k>N, the probability that k is a counterexample to Goldbach’s conjecture is at most 1/N”. But, big as N is by the standards of normal human experience, it’s smaller than almost all of the finite natural numbers. We really can’t infer anything about “for all k” from that hypothesis. There is no probability distribution over the entire set of naturals that we can claim to have sampled.

The coin flipping experiment, on the other hand, samples from a known, finite distribution. The sample space is the 2**(2e6) strings of 2 million bits. The number of such strings with Kolmogorov complexity <= 1 million is approx 2**(1e6) which means the probability of getting such a string from random flips is approx 2**(-1e6), and the probability that the assertion is true is about (1-2**(-1e6)). Of course you can make it arbitrary close to 1 by flipping more times. We really do get a quantifiable probability bound on the statement, just like with a pseudoprime test.

There is a rather powerful implicit axiom here, namely that flipping coins actually gives Kolmogorov-random bits. They may not be perfectly p=0.5 iid since they are real coins which might have a tiny bias or correlation, but we can make fairly good estimates of the Shannon entropy which should be close to 1 bit per flip. So, our creaky old physical universe is somehow capable of accessing mathematical truth (abductive inference) that is beyond being deducible from axioms.

What’s more, if we subscribe to current beliefs about complexity theory, we really do need testimonial evidence (someone saying they actually saw the coins being flipped) to accept the statement. The stronger forms of P/=NP say that we can make a cryptographic pseudorandom number generator that, given (say) a 128-bit uniform-random seed, can expand it to arbitrary size pseudorandom strings that can’t be distinguished from true random bits by any feasible (i.e. P-time) computational process (of course it’s trivial in exp-time).

17 06 2009

I should have said, about your normal number example, it’s impossible to pick a random number with uniform probability. We can only pick a definable real, and the definables are a set of measure 0 in any interval. We don’t have any reason to think they’re not somehow special with regard to normality. But with the coin flipping thing, we’re hypothesizing that we really are (approximately) choosing independent samples from a specific, well-behaved probability distribution.

18 06 2009

After a little more thought, there are actually 3 situations.

1. primality test: for any arbitrary P fixed in advance, for any epsilon, you can verify P’s primality with confidence (1-epsilon) by doing 1-epsilon. But, unlike primality tests, this procedure tells you nothing about the complexity of an R fixed in advance.