Statistics filters scientific research, separating facts from noise in the data. The p-value is one of the most important statistical calculations performed on a data set: it tells you how likely your experiment would be to produce the same results if run again (or at least I think I’ve got that right). But it does not tell you the magnitude of the effect or the strength of the evidence. It does not tell you whether the experiment was run correctly, nor whether the other statistics were done correctly. It certainly does not tell you how meaningful the results are. The p-value is useful, but it is easy to be deceived by it, and to deceive others with it.

Surprisingly, as this post by Christie Aschwanden claims, most scientists throw around p-values to support their claims yet do not really understand their meaning and potential for abuse. Now that is faith!

I illustrated the four main uses of the word “faith” here. The meaning I am using here is “trust”. And in an interview on that post, one scientist, when asked to define “the p-value” said:

“I know what many people that I have respected have written about [the p-value] and in fact quoted them. Is that a round about enough way to dodge your question.”

And indeed, that is what religious folks do. They listen to folks they trust, they read folks they trust, and though they may not really understand the issue themselves, they trust these people. They have faith.

**Take home message**: “Faith” is useful, but we still need to remain skeptical.

First of all, let us keep in mind that the application of statistics is not an exact science and mistakes are possible. Moreover, many (fortunately not all!) statisticians themselves suffer from the disease of faith. Why? Because they do not understand the tools they are using, and that is because of their inability to understand the mathematics. So faith plus lack of understanding brings statistics rather close to religion. Fortunately, there are many who do understand and are aware of the mistakes.

I’m not sure I understand your definition of the p-value. First of all, it appears that you have drawn a so-called Gaussian density. This is an approximation of how large sums behave under some assumptions that may not be valid in practice.

Let me try to define the p-value. We run an experiment and collect data. We make a hypothesis about the *probability law* governing the experiment; this is just a hypothesis that may or may not be true. Once we make the hypothesis, we have a mathematical formula that can be applied to the data. We also need a “statistic” X, which is something that depends on the data. For example, X could be the sum of the data (assuming the data is a collection of numbers). Our mathematical formula tells us how to compute, under the hypothesis, the probability p(s) that the observed sum is at least s. Since this is a formula, it can be applied to any s. Now we measure the actual sum we obtained from the data, call it x, plug it into the formula, and find p(x). This is the “p-value”, and it is a concrete number.

Independently of this, we choose a “significance level” a for the experiment. This is a small number and is rather arbitrary. Let us say that the significance level is a = 5%. And here is how we make the decision: if p(x) is less than a, we reject the hypothesis we made. If p(x) is bigger than a, we don’t reject it; in this case, perhaps, we run a further experiment or use other statistics in order to become more confident.
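The procedure above can be sketched in code. Here is a minimal illustration, with one liberty taken: instead of an exact formula for p(s), I estimate the tail probability by simulating the experiment under the hypothesized law (a standard Monte Carlo approach, not something the comment above prescribes). All function names are my own invention.

```python
import random

# Estimate p(x) = P(X >= x) under the hypothesized probability law by
# simulating the experiment many times. This replaces the exact formula
# mentioned above; it is only an approximation with its own error.

def monte_carlo_p_value(simulate_statistic, observed, trials=100_000):
    """Fraction of simulated experiments whose statistic is at least `observed`."""
    hits = sum(simulate_statistic() >= observed for _ in range(trials))
    return hits / trials

# Illustrative hypothesis: the data are 10 rolls of a fair die, X = their sum.
def sum_of_ten_fair_dice():
    return sum(random.randint(1, 6) for _ in range(10))

a = 0.05                                           # significance level, chosen in advance
p = monte_carlo_p_value(sum_of_ten_fair_dice, 45)  # p is roughly 0.04 here
print("reject" if p < a else "do not reject")      # -> reject
```

The decision rule at the end is exactly the one described above: reject the hypothesis when p(x) falls below a, otherwise withhold judgment.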

The whole thing depends: (i) on how we collect the data (errors are possible), (ii) on what kind of hypothesis we make (a bad hypothesis leads to bad conclusions), (iii) on which statistic we use (many choices are possible), (iv) on what significance level we choose (an empirical number). So whether we make the right decision or not (e.g., does watching TV influence schizophrenia?) depends on many factors. Faith may enter at several stages.

For example, let’s say we want to test whether a die is fair or not. We decide to roll the die 10 times. Let’s make the assumption that the die is fair. Let’s pick as test statistic the sum of the throws. Mathematics tells us how to compute the probability p(s) that the sum is at least s. I computed this and found

p(60)=0.0000000165, p(59)=0.000000182, p(58)=0.00000109, p(57)=0.00000472, p(56)=0.0000165, p(55)=0.0000495, p(54)=0.000132, etc.

Let us say that we observed the following throws: 6, 5, 5, 6, 6, 6, 6, 5, 6, 6. Their sum is 57. Let’s set a = 1% = 0.01 as the significance level. So the p-value is p(57) = 0.00000472. That’s far smaller than 0.01. So we can (safely) reject the hypothesis that the die is fair. Errors are, of course, possible.
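The numbers above can be reproduced exactly by enumerating the distribution of the sum of 10 fair dice, one die at a time. This is a sketch of one way to do the computation (the function names are mine), using exact rational arithmetic to avoid rounding error:

```python
from fractions import Fraction

def sum_distribution(num_dice, sides=6):
    """Exact probability of each possible sum, built up die by die (convolution)."""
    dist = {0: Fraction(1)}
    for _ in range(num_dice):
        new = {}
        for total, prob in dist.items():
            for face in range(1, sides + 1):
                new[total + face] = new.get(total + face, Fraction(0)) + prob / sides
        dist = new
    return dist

def p_value(dist, s):
    """Tail probability P(sum >= s): the p-value for observed sum s."""
    return float(sum(prob for total, prob in dist.items() if total >= s))

dist = sum_distribution(10)
rolls = [6, 5, 5, 6, 6, 6, 6, 5, 6, 6]
print(p_value(dist, sum(rolls)))   # p(57) ≈ 0.00000472, matching the value above
```

There are 286 ways for 10 dice to sum to 57 or more, out of 6^10 equally likely outcomes, which gives 286/6^10 ≈ 0.00000472.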

In your post you have apparently used the so-called Gaussian distribution. This is used as an approximation when the mathematical formula I talked about before is not known. Yet another source of possible error.

While I was writing this, the f*****g computer crashed. We have crappy systems at work. Fortunately, my draft was still there when I restored the system.

The moral of what I was saying is this:

Your linked article states “Not Even Scientists Can Easily Explain P-values”. This is clear. Many (but not all) statisticians cannot explain what the basis of their own science is, owing to (i) a lack of understanding of mathematics and (ii) prejudices and beliefs that they cannot shed because they never think about thinking.

Many scientists are religious, without even realizing it: they follow the trodden path without ever questioning it. (This is religion.) They believe what they read (this is religion) and, what is worse, what their bosses tell them (this is religion again).

Religion is prevalent in science.

And so, yes, “Not Even Scientists Can Easily Explain P-values”. No paradox here.

A true scientist should be willing to reject everything (for good reasons) but also have the b***s to reconstruct it.