The weak likelihood principle (WLP) has been summarized as follows: if a sufficient statistic computed on two different samples takes the same value on both, then the two samples contain the same inferentially useful information. The WLP is usually described as "widely accepted" or "very reasonable," but not as a theorem. From this I infer that no proof of the WLP exists.
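To make the setup concrete, here is a minimal sketch (my own illustration, not part of the principle's literature) using Bernoulli samples, where the number of successes is sufficient for the success probability p. Two differently ordered samples that agree on the sufficient statistic have identical likelihood functions, which is the situation the WLP speaks to:

```python
import math

# Two different Bernoulli samples whose sufficient statistic
# (the number of successes) agrees.
sample_a = [1, 0, 1, 1, 0]   # successes in positions 1, 3, 4
sample_b = [0, 1, 1, 0, 1]   # a different ordering, same total

def likelihood(sample, p):
    """Bernoulli likelihood L(p) = p^k (1-p)^(n-k), where k = #successes."""
    k = sum(sample)
    n = len(sample)
    return p ** k * (1 - p) ** (n - k)

# The sufficient statistic takes the same value on both samples...
assert sum(sample_a) == sum(sample_b) == 3

# ...so the likelihood functions coincide at every parameter value.
# The WLP asserts the two samples are therefore inferentially equivalent.
for p in [0.1, 0.3, 0.5, 0.7, 0.9]:
    assert math.isclose(likelihood(sample_a, p), likelihood(sample_b, p))
print("identical likelihoods")
```

Note that the code only verifies the (trivial) mathematical fact that the likelihoods agree; the WLP is the further, unproven claim that nothing else in the samples is inferentially useful.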
My question is, why does the absence of a proof for the WLP not prompt skepticism about it? Or, if not skepticism, then at least pressure to find a proof (or a disproof) among statisticians? Or among mathematicians, for that matter? Why is the WLP not the subject of a Millennium Prize (or its statistical equivalent)? Why do we not regard it as the Fermat's Last Theorem of probability and statistics? (Maybe the parallel postulate would be a better analogy....)
As far as answers go, I'd appreciate either or both of two types: theoretical explanations ("We don't need to prove it because..." or "Actually, there is a theorem; see reference...") or historical explanations ("Early statisticians went through a phase of trying to find a proof but ultimately settled for an axiom..." or "Fisher bullied people into it..."). (My own search turns up no evidence along any of these lines, but I'd welcome examples if they exist.)
Clarification based on input from responders: I'm calling this the WLP, but some may prefer to identify it as the sufficiency principle (SP). I'm okay with that, because the SP implies the WLP. Alternatively, you could say the SP is the mathematical statement made by Fisher and established by the factorization theorem--that the sufficient statistic contains all the parameter information in the sample, and that the distribution of the sample conditional on a sufficient statistic does not depend on the parameter--and that the WLP takes this a step further by insisting there is no non-likelihood information in the sample that is inferentially useful. I'm okay with that, too. Whether it's called the WLP or the SP, and whether it involves only the likelihood function or also the sampling distribution, both are empirical claims about the best possible estimate calculable from a sample in practice, and there seems to be no imperative to prove either one.
Edit 2: I think an answer is materializing across both answers and both sets of comments. If someone agrees and wants to write this up, or modify it further, I'll give it a checkmark. a) Statistics lacks a formal axiomatic system. b) Instead, statistics relies on "principles," which are like axioms or postulates except that they arise by convention and are adopted by consensus (implicit or explicit). c) No one expects or even hopes to turn these principles into theorems, because without a formal system of axioms it may well be impossible, and in any event it's very hard to know how or where to start. d) Birnbaum's proof of the SLP is the exception that proves the rule, in that he was able (controversially) to deduce a strong principle from two weaker ones. e) If someone were to prove or contradict the WLP, it would be another such exception.