9

If I am correct the scientific method is an application of induction to science.

Is the scientific method entirely based on statistics (statistical inference)? (I guess so, because it makes use of sampling and experimental design, which fall into the statistical discipline. But I am not sure about the inference part.)

If not, what others is it based on?

Thanks.

15
  • 5
    See Scientific Method: "scientific activity varies so much across disciplines, times, places, and scientists that any account which manages to unify it all will either consist of overwhelming descriptive detail, or trivial generalizations." Commented Sep 12, 2023 at 5:53
  • 10
    Obviously not: the greatest scientific discoveries made by Galileo, Newton, Darwin, Einstein, and Heisenberg had nothing to do with statistics. Commented Sep 12, 2023 at 5:54
  • 3
    Modern statistics has become such a specialized discipline since Fisher in the twentieth century that scientists have had to outsource their data analysis to hired statisticians. The staunch Bayesian and physicist Jaynes once recommended using simpler Bayesian statistics instead of the methods of frequentists such as Fisher, so that scientists could once again do the data manipulation work themselves intuitively, probably … Commented Sep 12, 2023 at 7:38
  • 3
    See also this post on "the" scientific method. Commented Sep 12, 2023 at 9:46
  • 5
    A uses X, therefore, A is entirely based on X. Do you see the problem with that reasoning? Science was done long before statistics was even developed, although using it certainly sharpened evidence collection and testing. But you cannot get scientific hypotheses or experiment designs out of statistics. Nor does statistics participate in deriving consequences from those hypotheses that can be experimentally tested. And without all of that there would be nothing to apply statistics to. A wood chisel does not equal carpentry; a single tool does not make a method.
    – Conifold
    Commented Sep 12, 2023 at 10:22

7 Answers

22

Ernest Rutherford said:

"If your experiment needs statistics, you ought to have done a better experiment"

which is deeply ironic, given that his subject (atomic/nuclear/particle physics) is now one of the sciences most reliant on statistics!

However, what he said does have some truth in it. Some experiments can be made so definitive that you don't need statistics to draw conclusions, for instance astronauts dropping feathers and hammers on the moon. If you can come up with such a definitive experiment to test your hypothesis then undoubtedly you should do so. The problem is that as science advances and more and more scientists are working on problems, the proportion of problems where this is possible becomes very small.

It should be noted that statistics is often used badly in science to give an impression of rigour, sometimes without the substance. Common examples include: use of null hypothesis statistical tests where it is obvious that there is a meaningful effect; claiming statistical significance where the effect size is too small for it to be of practical significance; not considering whether 95% is the appropriate level of significance; using a "null hypothesis" of "no difference" when that is not a sensible skeptic hypothesis to be nullified; performing an NHST for the existence of a difference when you know a priori there will be a difference (and the test is whether you have enough data to detect something you know to exist). For more details, see "Mindless Statistics" by Gerd Gigerenzer (many of the above form what he calls "The Null Ritual").
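To make the "statistically significant but practically negligible" point concrete, here is a minimal sketch. The numbers (a difference of 0.02 standard deviations, the two sample sizes) are made up for illustration, and the test is a plain two-sample z-test with a known common standard deviation, which is an assumption chosen for simplicity rather than a recommendation.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(mean_diff, sd, n):
    """Two-sided z-test p-value for a difference in means between two
    groups of n observations each, assuming a known common sd."""
    standard_error = sd * sqrt(2.0 / n)
    z = mean_diff / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def cohens_d(mean_diff, sd):
    """Standardized effect size: the difference in units of sd."""
    return mean_diff / sd


# Hypothetical numbers: a difference of 0.02 standard deviations.
tiny_effect = 0.02
sd = 1.0

effect_size = cohens_d(tiny_effect, sd)
p_huge_sample = two_sided_p_value(tiny_effect, sd, n=1_000_000)
p_small_sample = two_sided_p_value(tiny_effect, sd, n=100)

print("effect size (Cohen's d):", effect_size)
print("p-value with n = 1,000,000 per group:", p_huge_sample)
print("p-value with n = 100 per group:", p_small_sample)
```

The effect size is the same in both runs; only the sample size changes. With enough data the test rejects "no difference" for an effect that nobody would care about in practice, which is why the p-value alone says little about importance.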

However, while that is true, good use of statistics in science is very important:

"It’s easy to lie with statistics. It’s hard to tell the truth without statistics." – Andrejs Dunkels

2
  • 4
    +1 "Lies, damned lies, and statistics!".
    – J D
    Commented Sep 12, 2023 at 20:14
  • 2
    "Some circumstantial evidence is very strong, as when you find a trout in the milk." --Henry David Thoreau
    – Wastrel
    Commented Sep 13, 2023 at 13:34
10

The experimental portion of the scientific method requires statistics. The model-building process, on the other hand, relies instead on mathematical formalism.

7
  • 1
    Thanks. By "mathematical formalism", do you mean deductive logic, specifically, first order logic?
    – Tim
    Commented Sep 12, 2023 at 2:56
  • 1
    No, I mean using mathematics to write the governing equations that describe the model. Commented Sep 12, 2023 at 3:13
  • 4
    That is a whole different question, which I suggest you post here. I'd be curious to see the responses. -NN Commented Sep 12, 2023 at 3:30
  • 5
    @Tim You might get more useful responses on history of science and mathematics, and "what did mathematicians consider to be the bases of mathematics over the course of the millennia and the present?" might be a more useful question. Math may be an exercise of deductive logic now, after much development in abstract mathematics (late 18th century through present day), but its history includes empiricism, theology, and sorcery in large measure.
    – g s
    Commented Sep 12, 2023 at 4:16
  • 2
    Math is still not just an exercise in deductive logic. In fact, one might say that deductive logic is to math as.... statistics is to science! As pointed out in comments and answers up above, there's a lot of observation and experimental design and so on before the statistics kicks in. In math, there's a lot of constructing examples and counterexamples and searching for patterns before the deductive logic kicks in.
    – Lee Mosher
    Commented Sep 13, 2023 at 23:01
7

I mean, yes but also no. Statistics is very useful for science, and philosophers have made attempts to formalise the process of science in terms of the logic of inductive inference etc. But in reality that's just an attempt to try and understand what it is that scientists were already doing. Scientists don't worry about any of that formal stuff 99% of the time. Scientific training mainly consists of studying past experiments, different theories/models, learning how to critically read the literature, and so on. There is no class on "the scientific method". That's a philosophy class, and most scientists don't take it. So in the end, "the scientific method" is kind of just "the way scientists do things", and is mostly just trial and error where the results of experiments are prioritised above theorising.

11
  • 3
    " Scientists don't worry about any of that formal stuff 99% of the time" but they should. Commented Sep 12, 2023 at 8:52
  • 1
    Should they though? Yes there are cases where they should, like in certain fields like social sciences in which performing controlled experiments is extremely tough so you have to be extremely rigorous with your methods. And indeed they do tend to worry about these things more than say physicists, where usually it's more fruitful to just build a better machine to do a better experiment than to sit around debating about philosophical points of methodology
    – Ben Farmer
    Commented Sep 12, 2023 at 9:37
  • 1
    Yes, it isn't hard to find cases of papers in e.g. environmental science where the findings have been deeply misleading because the scientists were following a recipe from the "statistics cookbook" without really understanding what they were doing. Hard to think of a better machine than CERN and they need their statistics to be rigorous. Commented Sep 12, 2023 at 10:30
  • 1
    Well, sort of. They are again pretty practically-oriented with that at CERN. I've seen all sorts of ad-hoc hybrid Bayesian-frequentist methods thrown around in particle physics that would make statisticians cry.
    – Ben Farmer
    Commented Sep 12, 2023 at 11:07
  • 2
    The caricature of social science does you no favours. Commented Sep 13, 2023 at 5:56
4

Bayesians sure like to pretend it is!

There have been many attempts to try to describe the scientific process as a knowledge-accumulating process based on rational, formal (or formalisable, read: automatible) rules.

One thing that will become clear as you look further into these attempts is that somehow, as a species, we managed to amass a considerable body of knowledge on the basis of a (comparatively speaking) extremely thin body of data. To ask how that could have been possible is asking the right question.

1
  • That question is something that vexed C.S. Peirce, in many ways prompting him to treat abductive inferences as so essential to the scientific process.
    – Hokon
    Commented Sep 13, 2023 at 9:24
2

Bayes

If you take the Bayesian view of the world, all knowledge is inherently statistical. It's just that some knowledge involves statistics at the extremes (probability 0 and 1). Human brains seem to be wired for what psychologists call "folk theories". We have "folk physics", "folk biology", "folk chemistry", etc. These are what many people call "intuition". You don't need to teach a child algebra or ballistics to teach them how to catch a baseball. After observation and many attempts, they learn automatically. And the way they learn is inherently statistical.

That's because the learning machine is literally a brain in a vat. It's a little 5 lb. computer swimming in a pool of cerebrospinal fluid, and it's doing its business entirely on the basis of electrical impulses coming in from the rest of the body. When this brain is first trying to catch a ball, it sends signals to the muscles which cause body motions that may or may not get the body closer to catching it. When it succeeds, the signal pathways are strengthened. When it fails, the pathways are weakened. But at no point does the brain start with a precise mathematical description of ballistic trajectories and then issue commands to the muscles in accordance with precise geometry and mechanics. Rather, the brain just issues motor patterns similar to other scenarios in which a comparable task was performed successfully. When the brain has no experience, it just issues motor commands randomly, because that is what its structure is primed to do (meaning, the neural connections themselves are created randomly). That's why babies spend a lot of time just moving their limbs around aimlessly. They are learning to control them.

Eventually, the brain encounters many examples of ballistic motion, and these experiences implicitly form the folk physics that it uses going forward. If you then toss an apple to that brain as an adult, and they deftly catch it absent-mindedly, and you ask: "How were you able to catch that apple?" They will say: "I didn't even think about it. I just knew how to do it. Catching things is just intuitive." But "intuitive" just means: "learned so well it no longer rises to the level of conscious experience", which is why we can't rationalize intuition. The reasons motivating it are lost to us, because they are no longer necessary.

Science

What science does is simply formalize the learning process to make things explicit. Newtonian physics would not be possible without folk physics. A scientist locked in a room, strapped to a chair with no objects to interact with would almost certainly not derive a system of mechanics. All knowledge is based on observation, and all models are a compression of those observations into a more compact representation. That is what folk physics is, after all: a compressed representation of the prediction of the next observation of a ballistic object in motion. It would simply not do to generate all possible ballistic trajectories and objects and store them in one's brain, then consult this table to decide where the object will land. That would take up too much space.

So science is ultimately about patterns: things that occur more than once. The Mongols invading Asia is not a scientific phenomenon because it does not happen regularly. It's a one-time historical event. Science is also about objective phenomena: art criticism is not a scientific endeavor because it depends partly on the subjective feelings of the viewer, and scientists cannot observe or measure these feelings, only the way the viewer reports them (which is notoriously unreliable). And finally, science is about claims that can be wrong: religion is in fundamental tension with science because it makes claims that must be accepted and cannot be challenged.

That is why experiment is central to science: it is the test which tells us if a scientific claim is wrong. The problem is that an experiment can only tell us if a claim is wrong. It cannot tell us if a claim is right. That's because the positive outcome might have occurred for a reason having nothing to do with the model. And thus, the most strictly accurate statement one can make about a scientific claim is: "The probability that the hypothesis is wrong given this positive experimental outcome is less than epsilon." When scientists insist that some demonstration proves a model beyond all doubt, what they are really claiming is: "I believe that if this experiment were repeated an arbitrary number of times, you would get the same result." At the end of the day, it's all about making predictions. And experiments are the tailor-made events designed to test those predictions.

But a prediction is ultimately a single event in spacetime. Just because you predicted one event doesn't mean you can predict 2 or 1,000 or a googol. Each successful prediction increases the confidence in your model, which is why Newtonian mechanics is regarded as "true" in its domain of applicability: it has been validated so many times in so many ways that physicists would be shocked to observe a bona fide violation. Basically all of human technology depends on it.
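As a rough sketch of how repeated successful predictions raise confidence without ever reaching certainty, here is Bayes' rule applied ten times in a row. The prior of 0.5 and the two likelihoods (0.99 if the model is right, 0.5 if it is wrong) are made-up values for illustration only.

```python
def update(prior, p_success_if_true, p_success_if_false):
    """Posterior probability that the model is right after observing
    one successful prediction, by Bayes' rule."""
    numerator = prior * p_success_if_true
    evidence = numerator + (1.0 - prior) * p_success_if_false
    return numerator / evidence


# Hypothetical starting point and likelihoods.
belief = 0.5
history = [belief]
for _ in range(10):
    belief = update(belief, p_success_if_true=0.99, p_success_if_false=0.5)
    history.append(belief)

for trial, value in enumerate(history):
    print(trial, round(value, 5))
```

Each success pushes the belief upward, but the value stays strictly below 1: there is always some residual probability that the successes occurred for a reason having nothing to do with the model.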

Model Building

When people say that science is more than statistics, they are taking umbrage at the fact that there is some intellectual process going on beyond mere counting of beans. The process of building a model feels like high art. And for some models that is indeed a fair description. But at the end of the day, a hypothesis is just a compression of a large number of observations. The trick is that the model does not represent the entire observation. It only represents some particular dimensions of interest. Science is as much about choosing what to ignore as it is choosing what to observe. And deciding such things is more sophisticated than mere counting. Or is it? Every set of observations contains an innumerable number of irrelevant details. Surely picking the interesting observational dimensions requires deep intelligence? Perhaps not.

At the end of the day, observations must be measured, and there are only so many dimensions we can measure reliably. Thus, our models are constrained to things we can measure in one way or another. When we try to determine how fast a horse can run, we don't bother to measure the effect of the movement of individual tail hairs, because the effect is small, and because we don't have a feasible way to measure that even if we wanted to. In many cases, the relevant dimensions to include in the model are somewhat obvious because there are only a handful of practically measurable observables.

And even if we naively recorded every dimension that we can measure precisely and reliably, it would still be possible to infer the most predictive dimensions...statistically. Principal Components Analysis does exactly this. It is not beyond reason to think that a robot scientist could simply collect all the data without a starting hypothesis, perform an analysis, determine that several dimensions are strongly predictive, and then build an obvious and straightforward model based on those dimensions. On some level, this is what the human scientists are doing, too. They just aren't being honest about it. The PCA is running in their own heads, below the level of consciousness. But even the process of building a model is ultimately statistical in nature.
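For the curious, here is a minimal sketch of what such a "robot scientist" step might look like, using only the standard library. The data are synthetic and the setup is an assumption made for illustration: two measured dimensions share one hidden driver and a third is pure noise. Note that PCA ranks directions by how much variance they capture, which is the sense of "most predictive" being illustrated here.

```python
import random
from math import sqrt


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length observations."""
    n = len(rows)
    dims = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1)
             for j in range(dims)] for i in range(dims)]


def first_principal_component(cov, iterations=200):
    """Leading eigenvector and eigenvalue of cov via power iteration:
    repeatedly multiply a vector by cov and renormalize it."""
    dims = len(cov)
    v = [1.0 / sqrt(dims)] * dims
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    cov_v = [sum(cov[i][j] * v[j] for j in range(dims)) for i in range(dims)]
    eigenvalue = sum(v[i] * cov_v[i] for i in range(dims))
    return v, eigenvalue


# Synthetic measurements: dimensions 0 and 1 follow a hidden driver,
# dimension 2 is irrelevant noise.
rng = random.Random(0)
data = []
for _ in range(500):
    driver = rng.gauss(0.0, 1.0)
    data.append([driver + rng.gauss(0.0, 0.1),
                 2.0 * driver + rng.gauss(0.0, 0.1),
                 rng.gauss(0.0, 0.1)])

cov = covariance_matrix(data)
component, eigenvalue = first_principal_component(cov)
total_variance = sum(cov[i][i] for i in range(len(cov)))
explained = eigenvalue / total_variance

print("loadings of first component:", [round(x, 3) for x in component])
print("share of variance explained:", round(explained, 3))
```

No hypothesis about which dimensions matter goes in. The first component loads almost entirely on the two driven dimensions, and the noise dimension gets a loading near zero.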

Logic

Now, saying that "science is just statistics" is being too glib. There is an entire branch of "data science" that deals with the special edge cases of statistics: things that are always true or always false. And we call that field "Logic". Logic is powerful because it enables deduction. If A implies B and B implies C, then A implies C. But if A implies B 20% of the time, and B implies C 15% of the time, does that mean that A implies C 3% of the time? No. It depends on which cases of B imply C, because there may be a correlation which makes the A->C implication larger or smaller. Trying to do inference over uncertainty is difficult in the best of circumstances.
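The 20% and 15% example can be checked by counting. In this sketch, "A implies B 20% of the time" is read as the conditional probability P(B|A) = 0.2, which is my interpretation, and the three populations are invented so that P(B|A) and P(C|B) are the same in every case while P(C|A) is not.

```python
def conditional(event, given):
    """P(event | given) for finite sets of equally likely cases."""
    return len(event & given) / len(given)


# Hypothetical population of numbered cases.
A = set(range(0, 100))    # 100 cases where A holds
B = set(range(80, 280))   # 200 cases where B holds; 20 of them are in A

# Three choices of C, each covering 30 of the 200 B cases.
scenarios = {
    "C independent of A within B": set(range(97, 127)),    # 3 cases in A
    "C concentrated where A holds": set(range(80, 110)),   # 20 cases in A
    "C only where A does not hold": set(range(100, 130)),  # 0 cases in A
}

results = {}
for name, C in scenarios.items():
    results[name] = (conditional(B, A), conditional(C, B), conditional(C, A))
    print(name, results[name])
```

Only the first scenario gives the naive product of 3%. The other two keep P(B|A) = 20% and P(C|B) = 15% yet give 20% and 0% for P(C|A), so the two percentages alone do not determine the answer.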

Quite a bit of model building does not entail observations themselves, but rather combining other models together. And in this case, we can use logic to constrain how the unified model must look, thus allowing us to eliminate a large number of candidates without running a single experiment. I think this process is part of why scientists bridle when you suggest that science = statistics. A pedant could argue that logic is just "degenerate statistics" because it's just statistics over the probabilities {0, 1}, and you could certainly derive logic from that foundation if you so desired. But emergent phenomena are things that can only exist at a particular level of complexity, and logic is one of those things. Framing it in terms of statistics is not helpful, which is why nobody does it.

Conclusion

To the extent that science is about building models that stand up to experimental test, I think it's fair to say that science is statistical. The fact that the process of science involves some things that are not statistical in nature (like using logic to build models) means that it is perhaps too strong to say that science is entirely based on statistics. But if you absolutely wanted to insist that it is, and you are not embarrassed by an excessive degree of pedantry, then you could probably make this case.

2
  • 1
    "we can't rationalize intuition" - just to be pedantic, catching an apple is roughly some combination of muscle memory, subconscious muscle control and reflex, none of which apply when using intuition in reasoning. Intuition in reasoning is not reliable and you can understand where that came from to a large degree. You might intuitively know the Sun will rise in the morning, but you know that because it's risen every morning, and also because you know about Earth's orbit. You might also "know" something because that's what your parents or a teacher told you, which is far less good of a reason.
    – NotThatGuy
    Commented Sep 13, 2023 at 8:16
  • @NotThatGuy I call "intuition" any knowledge that has been gained implicitly, rather than explicitly. So an outfielder catching a fly ball is the result of hours of practice. But a person catching a random object generally is not. If you ask Michael Jordan how he managed an 83% free-throw percentage when the league average was 76%, he could give you detailed descriptions of how he sets up, aims, positions his elbow, hands, legs, etc. But if you ask him how he scored 50+ points in a game, the answers will be a lot more nebulous and questionable. Commented Sep 13, 2023 at 18:49
2

No. There are at least three important counter-arguments to the claim that "the scientific method is entirely based on statistics", roughly in order of importance according to me:

  1. Many advances in science were made without any (even implicit) use of statistics. As noted in Dikran's answer many experiments/observations can just be performed so clearly that using statistics makes no sense.
  2. Statistics is useless without distinguishing causality and correlation. Recent-ish work on causal inference gives some useful answers in that regard (that many past scientists understood, just less formally). Your definition may vary, but causal inference, experimental design and related considerations are not an integral part of statistical inference, but are crucial to scientific inference from uncertain evidence and thus the word "entirely" in the claim is misplaced.
  3. There is a whole array of qualitative research methods, which almost by definition do not reduce to statistics.

The historical considerations are IMHO the most important ones. Claiming that the scientific method requires statistics would imply that what many consider great scientific achievements were not "real science". You can obviously define science as relying on statistical inference, but you'd get pretty far away from the normal meaning of the word.

Some (pretty random) examples of scientific developments completely without statistics:

  • Most development of observation/measurement methods, e.g.:
  • Linnaeus's taxonomy
  • Germ theory
  • Planetary observations (e.g. that there are mountains on the moon as observed by Galileo, the phases of Venus as refutation of the geocentric system, ...)
  • Demonstrations of superfluidity/superconductivity
  • Arago spot as a crucial demonstration that light has wave-like properties.

Finally, the premise that the scientific method is fundamentally inductive is at best debated: many successful philosophical accounts of science (notably Popper's and those of his intellectual descendants) would argue that induction has no place in science.

2
  • 1
    A related blog post.
    – Galen
    Commented Sep 13, 2023 at 19:46
  • +1 Outstanding form as rebuttal.
    – J D
    Commented Sep 13, 2023 at 20:09
2

If I am correct the scientific method is an application of induction to science.

You are not correct.

Is the scientific method entirely based on statistics (statistical inference)? (I guess so, because it makes use of sampling and experimental design, which fall into the statistical discipline. But I am not sure about the inference part.)

If not, what others is it based on?

An observation is a claim that a particular kind of event happened at a particular place and time. For example, Eddington observed the position of starlight on photographic plates during a solar eclipse on 29 May 1919 on the West African island of Príncipe.

Nothing follows from that fact alone other than that Eddington made that observation. If the observation agreed with general relativity, it wouldn't imply that general relativity is true. An agreement between a theory and any finite set of observations doesn't imply that the theory is true. Every observation for over a century agreed with Newtonian mechanics, but that theory was false.

In addition, a scientific theory is an account of what is happening in reality, not a digest of observations, and it may describe events that can't be observed by any physically possible measuring instrument, e.g. what happens in the core of a star during a supernova.

Now, if general relativity is true, then it may follow that the apparent position of stars should be changed by their light passing near the sun. If the position of the stars isn't changed by the sun, then that creates a problem. Either there is something wrong with the theory or there is something wrong with the experiment or the interpretation of its results. And even if the experiment agrees with the theory there may be something wrong with the experiment as illustrated by the controversy over whether Eddington's results were correctly interpreted.

As Karl Popper pointed out, science proceeds by guessing solutions to problems with current theories, and those problems may come from experiment, from theory, or from anywhere else. An observation may then be made that distinguishes between a current theory and a new theory that claims to solve some problem. The result of the experiment then poses a problem for whichever theory made the false prediction, and that problem has to be solved by rejecting either the theory or the experimental result, as described above. For some readings, see

https://fallibleideas.com/books#popper

Using statistics doesn't change this problem because the probabilities of outcomes also come from some theory about how the probabilities should be computed. And that theory also can't follow from observations. Rather, it has to be chosen to be consistent with whatever theory you're using. For example, in quantum theory the squared amplitudes that give the probabilities of observations don't always obey the rules of probability, so you have to be careful about applying those rules:

https://arxiv.org/abs/math/9911150

https://arxiv.org/abs/1508.02048

In addition, if you want to make a decision, then you have to select one option and reject the others, not attach a probability to a theory:

https://criticalfallibilism.com/yes-or-no-philosophy-summary/
