July 17, 2006
Uncertainty about probability
In the past few days, I've been reading about different interpretations of probability, i.e., the frequentist and bayesian approaches (for a primer, try here). This has, of course, led me back to my roots in physics, since both quantum physics (QM) and statistical mechanics rely on probabilities to describe the behavior of nature. Amusingly, I must not have been paying much attention while I was taking QM at Haverford. Niels Bohr once said, "If quantum mechanics hasn't profoundly shocked you, you haven't understood it yet," and back then I was neither shocked nor confused by things like the uncertainty principle, quantum indeterminacy or Bell's Theorem. Today, however, it's a different story entirely.
John Baez has a nice summary and selection of news-group posts that discuss the idea of frequentism versus bayesianism in the context of theoretical physics. This, in turn, led me to another physicist's perspective on the matter. The late Ed Jaynes has an entire book on probability from a physics perspective, but I most enjoyed his discussion of the physics of a "random experiment", in which he notes that quantum physics differs sharply in its use of probabilities from macroscopic sciences like biology. I'll just quote Jaynes on this point, since he describes it so eloquently:
In biology or medicine, if we note that an effect E (for example, muscle contraction) does not occur unless a condition C (nerve impulse) is present, it seems natural to infer that C is a necessary causative agent of E... But suppose that condition C does not always lead to effect E; what further inferences should a scientist draw? At this point the reasoning formats of biology and quantum theory diverge sharply.
... Consider, for example, the photoelectric effect (we shine a light on a metal surface and find that electrons are ejected from it). The experimental fact is that the electrons do not appear unless light is present. So light must be a causative factor. But light does not always produce ejected electrons... Why then do we not draw the obvious inference, that in addition to the light there must be a second causative factor...?
... What is done in quantum theory is just the opposite; when no cause is apparent, one simply postulates that no cause exists; ergo, the laws of physics are indeterministic and can be expressed only in probability form.
... In classical statistical mechanics, probability distributions represent our ignorance of the true microscopic coordinates - ignorance that was avoidable in principle but unavoidable in practice, but which did not prevent us from predicting reproducible phenomena, just because those phenomena are independent of the microscopic details.
In current quantum theory, probabilities express the ignorance due to our failure to search for the real causes of physical phenomena. This may be unavoidable in practice, but in our present state of knowledge we do not know whether it is unavoidable in principle.
Jaynes goes on to describe how current quantum physics may simply be in a rough patch, where our experimental methods are too crude to isolate the physical causes of the apparently indeterministic behavior of our physical systems. But I don't quite understand how this idea squares with the refutation of hidden variable theories after Bell's Theorem basically laid local realism to rest. It seems to me that Jaynes and Baez, in fact, invoke similar interpretations of all probabilities, i.e., that they only represent our (human) model of our (human) ignorance, which can be about either the initial conditions of the system in question, the rules that cause it to evolve in certain ways, or both.
It would be unfair to the statistical physicists who work on complex networks to say that they share the no-causal-factor assumption that their quantum physics colleagues may accept. In statistical physics, as Jaynes points out, the reliance on statistical methodology is forced on statistical physicists by our measurement limitations. Similarly, in complex networks, it's impractical to know the entire developmental history of the Internet, the evolutionary history of every species in a foodweb, etc. But unlike the systems of statistical physics, in which experiments are highly repeatable, every complex network has a high degree of uniqueness, and is thus more like biological and climatological systems, where there is only one instance to study. To make matters even worse, complex networks are also quite small, typically having between 10^2 and 10^6 parts; in contrast, most systems that concern statistical physics have 10^22 or more parts. For such enormous systems, it's probably not terribly wrong to take a frequentist perspective and assume that relative frequencies behave like probabilities. But when you have only a few thousand or million parts, such a claim seems less tenable, since it's hard to argue that you're close to asymptotic behavior. Bayesianism, being more capable of dealing with data-poor situations in which many alternative hypotheses are plausible, seems to offer the right way to deal with such problems. But, perhaps owing to the history of the field, few people in network science seem to use it.
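To make the sample-size worry concrete, here is a minimal Python sketch (a toy Bernoulli simulation of my own, not anything drawn from Jaynes or Baez) of how far a relative frequency can stray from the underlying probability at different system sizes:

```python
import math
import random

random.seed(0)

def relative_frequency(p, n):
    """Fraction of successes in n independent Bernoulli(p) trials."""
    return sum(random.random() < p for _ in range(n)) / n

p = 0.3  # an arbitrary "true" probability, known only to the simulation
for n in (10**2, 10**4, 10**6):
    freq = relative_frequency(p, n)
    typical = math.sqrt(p * (1 - p) / n)  # expected scale of the fluctuation
    print(f"n = {n:>9,d}  freq = {freq:.5f}  "
          f"|freq - p| = {abs(freq - p):.5f}  typical ~ {typical:.5f}")
```

Nothing deep here: the fluctuation shrinks like 1/sqrt(n), so with 10^22 parts the relative frequency is, for all practical purposes, the probability, while with 10^2 parts it plainly is not.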
For my own part, I find myself being slowly seduced by the bayesians' siren call of mathematical rigor and the notion of principled approaches to these complicated problems. Yet, there are three things about the bayesian approach that make me a little uncomfortable. First, given that, with enough data, it doesn't matter what your original assumption about the likelihood of any outcome is (i.e., your "prior"), shouldn't bayesian and frequentist arguments lead to the same inferences in a limiting, or simply very large, set of identical experiments? If this is right, then it seems quite reasonable that statistical physicists have been using frequentist approaches for years with great success. Second, in the case where we are far from the limiting set of experiments, doesn't being able to choose an arbitrary prior amount to a kind of scientific relativism? Perhaps this worry is misplaced, because the manner in which you update your prior, given new evidence, is what distinguishes a bayesian analysis from certain crackpot theories.
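As a sanity check on that first point, here is a small sketch, assuming a toy Beta-Bernoulli coin-flipping model (my example, not one from the post's sources), showing two quite different priors being washed out by the data:

```python
import random

random.seed(1)
p_true = 0.7  # the unknown coin bias we are trying to estimate

# Two deliberately different Beta(a, b) priors over the bias.
priors = {"skeptical Beta(1, 9)": (1.0, 9.0),
          "uniform Beta(1, 1)": (1.0, 1.0)}

data = [random.random() < p_true for _ in range(10**5)]

for n in (10, 1000, 10**5):
    heads = sum(data[:n])
    pieces = [f"frequency = {heads / n:.4f}"]
    for name, (a, b) in priors.items():
        post_mean = (a + heads) / (a + b + n)  # Beta-Bernoulli posterior mean
        pieces.append(f"{name} -> {post_mean:.4f}")
    print(f"n = {n:>7,d}   " + "   ".join(pieces))
```

With ten observations the two priors disagree visibly; by 10^5 both posterior means sit essentially on top of the raw frequency, which is the limiting agreement the question above is pointing at.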
Finally, choosing an initial prior seems highly arbitrary, since one can always recurse a level and ask what prior on priors you might take. Here, I like the ideas of a uniform prior, i.e., that everything is equally plausible, and of using the principle of maximum entropy (MaxEnt; also called the principle of indifference by Laplace). Entropy is a nice way to connect this approach with certain biases in physics, and may say something very deep about the behavior of our incomplete description of nature at the quantum level. But it's not entirely clear to me (or, apparently, to others: see here and here) how to use maximum entropy when previous knowledge constrains our estimates of the future. Indeed, one of the main things I still don't understand is how, if we model the absorption of knowledge as a sequential process, to update our understanding of the world in a rigorous way while guaranteeing that the order in which we see the data doesn't matter.
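On that last point, at least in the simple conjugate setting above, sequential updating is provably order-independent, because the posterior depends only on the product of the likelihoods; a small sketch (again a toy example of my own) makes the point:

```python
import random

def sequential_posterior(a, b, observations):
    """Update a Beta(a, b) prior one Bernoulli observation at a time."""
    for heads in observations:
        if heads:
            a += 1.0
        else:
            b += 1.0
    return a, b

random.seed(2)
data = [random.random() < 0.4 for _ in range(1000)]
shuffled = list(data)
random.shuffle(shuffled)

print(sequential_posterior(2.0, 2.0, data))      # evidence in original order
print(sequential_posterior(2.0, 2.0, shuffled))  # same evidence, scrambled order
# Both lines print the same (a, b): the posterior depends only on the total
# counts of heads and tails, not on the sequence in which they were seen.
```

This works because the observations are modeled as conditionally independent, so the posterior depends only on the counts and not on their order; whether a MaxEnt-style sequential updating of constraints enjoys the same guarantee is exactly the murkier part.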
Update July 17: Cosma points out that Jaynes's Bayesian formulation of statistical mechanics leads to unphysical implications, like a backwards arrow of time. Although it's comforting to know that statistical mechanics cannot be reduced to mere Bayesian crank-turning, it doesn't resolve my confusion about just what it means that the quantum state of matter is best expressed probabilistically! His article also reminds me that there are good empirical reasons to use a frequentist approach, reasons based on Mayo's arguments that should be familiar to any scientist who has actually worked with data in the lab. Interested readers should refer to Cosma's review of Mayo's Error, in which he summarizes her critique of Bayesianism.
posted July 17, 2006 03:30 PM in Scientifically Speaking | permalink