# Binomial exact test - calculate p-value of two tailed test

#### obh

##### Well-Known Member
Hi,
In Example 1 below (x=2, n=20, p=0.25) I run a two-tailed test.
I would expect the p-value of the two-tailed test to be 2 * p(X≤2) = 2 * 0.09126043 = 0.18252086.

I understand that the probability in the right tail is discrete, so it can be 0.101811857 or 0.040925168.
It seems that R takes the value that is closer to 0.09126043 (in some cases bigger, in some cases smaller),
but it still doesn't feel correct to pick one of the discrete values.

Some worked examples compare the one-tailed value based on x (0.09126043 in this example) to alpha/2,
which is exactly like using 2 * p(X≤2).

So my question is: is there a correct method? And why not use 2 * p(X≤2) in this example?

Thanks

Example 1
> binom.test(x=2, n=20, p = 0.25, alternative="less", conf.level = 0.95)$p.value
 0.09126043
> binom.test(x=2, n=20, p = 0.25, alternative="two.sided", conf.level = 0.95)$p.value
 0.1930723

p(X≥8)=0.101811857
p(X≥9)=0.040925168
0.09126043+0.101811857=0.1930723

Example 2
> binom.test(x=4, n=20, p = 0.25, alternative="less", conf.level = 0.95)$p.value
 0.4148415
> binom.test(x=4, n=20, p = 0.25, alternative="two.sided", conf.level = 0.95)$p.value
 0.7976688

p(X≥5)=0.585158497
p(X≥6)=0.382827346
0.4148415+0.382827346=0.7976688
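The tail probabilities in both examples can be reproduced with `pbinom`. This is only a minimal sketch in base R; the helper name `tails` and its argument `x_right` are made up for illustration, and the right-tail starting values (8 and 6) are simply taken from the numbers above.

```r
# Left tail, right tail, the "doubling" candidate, and the sum of both tails.
tails <- function(x, n, p, x_right) {
  left  <- pbinom(x, n, p)                                # P(X <= x)
  right <- pbinom(x_right - 1, n, p, lower.tail = FALSE)  # P(X >= x_right)
  c(left = left, right = right, doubled = 2 * left, sum = left + right)
}

ex1 <- tails(2, 20, 0.25, x_right = 8)  # Example 1
ex2 <- tails(4, 20, 0.25, x_right = 6)  # Example 2
print(rbind(ex1, ex2))
```

The `sum` column should match the two-sided values reported by `binom.test`, while `doubled` is the 2 * p(X≤x) alternative asked about.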


#### obh

##### Well-Known Member
Okay, I searched several books and didn't find an explanation.

I checked the R code. For the other tail, instead of looking at the cumulative tail probability, it looks at the point probability of the observed value, p(X=x).
It then looks for the boundary value x' of the other tail: the value closest to the center with p(X=x')≤p(X=x).

So it is a different method from the one used for a continuous distribution, which puts an equal cumulative probability in each tail.

Example (x=1, n=8, p=0.25)
The last value in the left tail:
p(X=1)=0.266968

The first value in the right tail:
p(X=3)=0.207642, because the next value toward the center, p(X=2)=0.311462, is greater than 0.266968.
So we take x'=3.

p-value = p(X≤1) + p(X≥3) = 0.367081 + 0.321457 = 0.688538

> binom.test(x = 1, n = 8, p = 0.25, alternative="two.sided", conf.level = 0.95)$p.value
 0.6885376
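The search for x' described above can be sketched in a few lines of R. This is not the `binom.test` source, only an illustration of the rule for an observation below the mean; the function name `two_sided_left` and the small relative tolerance are assumptions of the sketch.

```r
# Point-probability rule for the case x < n * p (observation in the left tail).
two_sided_left <- function(x, n, p) {
  stopifnot(x < n * p)              # this sketch only covers the left-tail case
  d     <- dbinom(x, n, p)          # P(X = x), the observed point probability
  right <- ceiling(n * p):n         # candidate values on the other side of the mean
  # keep candidates that are no more likely than x
  # (the tiny slack is an assumed guard against floating-point ties)
  cand  <- right[dbinom(right, n, p) <= d * (1 + 1e-7)]
  x_prime <- if (length(cand) > 0) min(cand) else n + 1  # n + 1 means "empty right tail"
  p_value <- pbinom(x, n, p) + pbinom(x_prime - 1, n, p, lower.tail = FALSE)
  list(x_prime = x_prime, p_value = p_value)
}

res <- two_sided_left(1, 8, 0.25)
print(res)
```

For the example above this should give x' = 3 and the same p-value as the `binom.test` call.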

#### GretaGarbo

##### Human
I did not notice this thread.

What is it that you want to investigate here? Is it about the nonparametric binomial test?

#### obh

##### Well-Known Member
I am trying to understand how to calculate the two-tailed p-value of the exact binomial test (using the discrete binomial distribution), and why it is calculated that way.

I read that there are two methods, but I could understand only one.

I found that R bases the calculation on the point probability (density), where x is the observed value and x' is the boundary value in the other tail.
For convenience, I write only the left-tail case: p-value = p(X≤x) + p(X≥x')
R chooses x' as follows:
the x' with the largest p(X=x') that satisfies p(X=x')≤p(X=x)

The questions (with some partial, tentative answers):

1. Why use the density instead of the cumulative distribution, p(X≥x')<=p(X≤x), as we do for a continuous distribution?
(Is this the other method?)

2. The resulting p-value will be smaller, i.e. more extreme. Why not "the x' with the smallest p(X=x') that satisfies p(X=x')≥p(X=x)", which gives a bigger p-value, i.e. less extreme?

My answer: take for example binomial(n=8, x=1, p0=0.35), where p(X=x)=0.266968. You move from the far right tail toward the left until you reach 0.2669, and the only discrete values that fall in the range [0, 0.2669] are smaller (for example 0.207642).

But why not start from 0.2669 and move up until a value exists, like 0.3114?

3. I understand that you can't get exactly the same probability in the other tail (x'), but why not just use the left tail alone, p-value = 2 * p(X≤x)? (here 2 * 0.367081)

My answer: because this value doesn't exist?
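To put numbers on question 3, the two candidate definitions can be compared side by side. This is a hedged sketch for the x=1, n=8, p=0.25 example: the doubling rule is written as twice the smaller tail, capped at 1, and the point-probability rule is written as the total probability of all outcomes that are no more likely than x, which gives the same result as the tail sum in this example. The variable names and the small tolerance are assumptions of the sketch.

```r
x <- 1; n <- 8; p <- 0.25

lower <- pbinom(x, n, p)                          # P(X <= x)
upper <- pbinom(x - 1, n, p, lower.tail = FALSE)  # P(X >= x)

# Doubling rule: twice the smaller tail, capped at 1
p_doubled <- min(1, 2 * min(lower, upper))

# Point-probability rule: add up every outcome that is no more likely than x
probs   <- dbinom(0:n, n, p)
p_point <- sum(probs[probs <= dbinom(x, n, p) * (1 + 1e-7)])

print(c(doubled = p_doubled, point = p_point))
```

Both numbers are valid probabilities under the null distribution; they answer slightly different definitions of "at least as extreme", so they need not agree for a skewed discrete distribution.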

Thanks


#### GretaGarbo

##### Human
I am sorry, but I don't understand the notation. The usual way to write it, as I understand it, is this: for the random variable X and the value x, the probability that X is equal to x is P(X = x). The binomial distribution with n = 8 trials and p = 0.35 gives, for x = 1, P(X = 1) = 0.1372624.

Code:
dbinom(x = 1, size = 8, prob=0.35)
 0.1372624
I tried to reproduce the number "0.266968", but I could not work out where it came from. I was also not sure whether "<=" means "less than or equal to". (Or is it an arrow?)

If this question is about how to get p-values, or limits for a confidence interval, for a distribution that is not symmetric, then (as I remember it) there is no agreed solution. The options are to make the interval symmetric around the estimated parameter, or to make the upper-tail probability equal to the lower-tail probability (e.g. 2.5% each).

Sorry for my lack of understanding.

#### obh

##### Well-Known Member
Hi Greta,

Thanks for looking.

Sorry, I was confused: I wrote p(x=X), with x as the variable and X as the value, but it should be as you wrote. You used a different p, though; it should be 0.25:
Code:
> dbinom(x = 1, size = 8, prob=0.25)
 0.2669678
Sorry: "<=" means "less or equal to", ≤ . I didn't write any arrows.

I updated all the thread to the correct notation, so it will be readable.

Thanks.
OB
