# A priori power of Welch's t-test

#### obh

##### Active Member
Hi All,

I couldn't find an answer for how to calculate the a priori power of Welch's t-test directly.
Some suggest using a simulation.
I know that calculating the power for the pooled-variance t-test should also give a reasonable estimate for Welch's t-test.

The power of the pooled-variance t-test is (right-tail example):
power = 1 - F_nct(t_critical, df, ncp), where F_nct is the CDF of the noncentral t distribution
t_critical = t_{1-α}
df = n1 + n2 - 2
n = 1/(1/n1 + 1/n2)
noncentrality parameter: ncp = d√n

Welch's df = (s1^2/n1 + s2^2/n2)^2 / ( s1^4/(n1^2(n1-1)) + s2^4/(n2^2(n2-1)) )

Is the power of Welch's t-test the same as for the pooled-variance t-test, except for the df?
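The pooled-variance recipe above can be sketched numerically. Here is a minimal Python sketch, using a normal approximation in place of the central and noncentral t distributions (an assumption that is reasonable when df is large); the input numbers are made up for illustration:

```python
import math
from statistics import NormalDist

# Illustrative inputs (not from the thread): two groups of 64, effect size d = 0.5
n1, n2 = 64, 64
d = 0.5            # standardized effect size
alpha = 0.05       # right-tail test

df = n1 + n2 - 2                     # pooled-variance df
n_eff = 1 / (1/n1 + 1/n2)            # effective sample size
ncp = d * math.sqrt(n_eff)           # noncentrality parameter

# Normal approximation: z_{1-alpha} stands in for t_{1-alpha, df},
# and P(T' > t_crit) is approximated by P(Z > z_crit - ncp).
z_crit = NormalDist().inv_cdf(1 - alpha)
power = 1 - NormalDist().cdf(z_crit - ncp)
print(round(power, 3))
```

For an exact answer, replace the normal quantile and CDF with the central-t quantile and the noncentral-t CDF (e.g. `qt` and `pt(..., ncp=...)` in R).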

#### obh

##### Active Member
Hi,

Under Welch's assumption that s1 ≠ s2,
and s_population = sqrt((s1^2 + s2^2)/2),

I assume also n = n1*n2*(s1^2 + s2^2) / (2*n2*s1^2 + 2*n1*s2^2)

Do you have any idea if this is correct?

#### obh

##### Active Member
I can also see that base R doesn't calculate the power for Welch's t-test?

#### obh

##### Active Member
hlsmith, gretagarbo, karabiner, do you have any idea?

#### obh

##### Active Member
Thank you Dason!

I looked at pwr.t2n.test: https://www.statmethods.net/stats/power.html
But I suppose I shouldn't think of R as one piece of software but as many packages. I will now try power.t.test
and see if my calculation is correct ...

#### obh

##### Active Member
Hi Dason,

I loaded the "MESS" package and it still doesn't recognize the df.method parameter (it works without this parameter).
So I guess it picks the power.t.test function from a different package?

Any idea?

library(MESS)
> power.t.test(n=64, delta=5, sd=10, sig.level = 0.05, power = NULL, type = c("two.sample"), alternative = c("two.sided"), df.method = "welch")
Error in power.t.test(n = 64, delta = 5, sd = 10, sig.level = 0.05, power = NULL, :
unused argument (df.method = "welch")

#### obh

##### Active Member
Okay, it is power_t_test, not power.t.test ...

> power_t_test(n = 34, delta = 5, sd = 10, sig.level = 0.05, power = NULL, ratio = 1, sd.ratio = 2, type = c("two.sample"), alternative = c("one.sided"), df.method = c("welch"))

Two-sample t test power calculation

n = 34
delta = 5
sd = 10
sig.level = 0.05
power = 0.6537609
alternative = one.sided

NOTE: n is number in *each* group

#### obh

##### Active Member
Hi Dason,

I tried to calculate the power of Welch's t-test myself, for a simple right tail: power = 0.5540257.
The power_t_test function gives a different result: power = 0.6779871.
(I think power_t_test uses ncp = diff/(s1/sqrt(n1/(1+(s2*n1)/(s1*n2)))), not that I understand why.)

Do you have any idea what is wrong with the following simple calculation?

> diff=5; s1<-10; s2<-20; n1<-30; n2<-90
> df<-(((s1^2/n1)+(s2^2/n2))^2) / ( ((s1^2/n1)^2)/(n1-1) + ((s2^2/n2)^2)/(n2-1) ) #the known formula for Welch DF
> s_stat<- sqrt((s1^2/n1)+ (s2^2/n2)) # the known calculation for the statistic's standard deviation
> t_cr<-qt(0.95,df)
>
> #H1
> ncp<-diff/s_stat
>
> #power using non central t distribution
> 1-pt(t_cr, df, ncp)
[1] 0.5540257
>
> df
[1] 99.97549
> t_cr
[1] 1.660238
> s_stat
[1] 2.788867
> ncp
[1] 1.792843
=========================
>library(MESS)
> power_t_test(n = 30, delta = 5, sd = 10, sig.level = 0.05, power = NULL, ratio = 3, sd.ratio = 2, type = c("two.sample"), alternative = c("one.sided"), df.method = c("welch"))

Two-sample t test power calculation with unequal sizes

n = 30, 90
delta = 5
sd = 20
sig.level = 0.05
power = 0.6779871
alternative = one.sided

NOTE: n is vector of number in each group
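For what it's worth, the manual quantities above can be cross-checked outside R. A small Python sketch with the same inputs (the final power step itself needs a noncentral t CDF, e.g. R's pt, so only the df, standard error, and ncp are reproduced here):

```python
import math

# Same inputs as the R transcript above
diff, s1, s2, n1, n2 = 5, 10, 20, 30, 90

v1, v2 = s1**2 / n1, s2**2 / n2      # variance of each sample mean
s_stat = math.sqrt(v1 + v2)          # SE of the difference in means
df = (v1 + v2)**2 / (v1**2/(n1-1) + v2**2/(n2-1))  # Welch-Satterthwaite df
ncp = diff / s_stat                  # noncentrality parameter under H1

print(round(df, 5), round(s_stat, 6), round(ncp, 6))
# matches the R output: df = 99.97549, s_stat = 2.788867, ncp = 1.792843
```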


#### Dason

I would simulate to figure out which makes more sense

#### obh

##### Active Member
Thanks Dason,

I'm new to R, and I've never run a simulation before ... but it wasn't complex.

Is the following okay?

> n1 <- 30; n2 <- 90 # sample size
> sigma1 <- 10; sigma2 <- 20 # true SD
> delta <- 5 # change
> mu1 <- 100 # mean under the null hypothesis
> mu2 <- 100+delta # mean under the alternative hypothesis
>
> reps <- 5000000 # number of simulations
>
> ## p-value approach:
>
> pvalues <- numeric(reps)
>
> set.seed(1)
>
> for (i in 1:reps) {
+ x1 <- rnorm(n1, mu1, sigma1)
+ x2 <- rnorm(n2, mu2, sigma2)
+ s1=sd(x1)
+ s2=sd(x2)
+ df<-(((s1^2/n1)+(s2^2/n2))^2) / ( ((s1^2/n1)^2)/(n1-1) + ((s2^2/n2)^2)/(n2-1) ) #the known formula for Welch DF
+ s_stat<- sqrt((s1^2/n1)+ (s2^2/n2)) # the known calculation for the statistic's standard deviation
+ t.stat <- (mean(x2) - mean(x1))/ s_stat
+ pvalues[i] <- 1 - pt(t.stat, df)
+ }
>
> mean(pvalues < 0.05)
[1] 0.5539248
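A stdlib-only Python version of the same check (a sketch: it approximates the critical value qt(0.95, df) by the normal quantile, which is close here since df ≈ 100, and uses fewer replicates):

```python
import math
import random
from statistics import NormalDist, mean, stdev

random.seed(1)
n1, n2 = 30, 90           # sample sizes
sigma1, sigma2 = 10, 20   # true SDs
mu1, mu2 = 100, 105       # means under H1 (delta = 5)
reps = 20000              # number of simulated experiments

z_crit = NormalDist().inv_cdf(0.95)   # stands in for qt(0.95, df)
rejections = 0
for _ in range(reps):
    x1 = [random.gauss(mu1, sigma1) for _ in range(n1)]
    x2 = [random.gauss(mu2, sigma2) for _ in range(n2)]
    v1, v2 = stdev(x1)**2 / n1, stdev(x2)**2 / n2
    t_stat = (mean(x2) - mean(x1)) / math.sqrt(v1 + v2)  # Welch statistic
    if t_stat > z_crit:
        rejections += 1

print(rejections / reps)   # estimated power, roughly 0.55
```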

#### Dason

Even better would just be to simulate the data and let the functions in R take care of running the test for you.

#### obh

##### Active Member
Thanks Dason. Good advice, this will reduce the possibility of making the same mistake in the simulation too ...
Okay, at last, I found the $p.value ... Is this okay now?

> n1 <- 30; n2 <- 90 # sample size
> sigma1 <- 10; sigma2 <- 20 # true SD
> delta <- 5 # change
> mu1 <- 100 # mean under the null hypothesis
> mu2 <- 100+delta # mean under the alternative hypothesis
>
> reps <- 1000000 # number of simulations
>
> ## p-value approach:
>
> pvalues <- numeric(reps)
>
> set.seed(1)
>
> for (i in 1:reps) {
+ x1 <- rnorm(n1, mu1, sigma1)
+ x2 <- rnorm(n2, mu2, sigma2)
+ pvalues[i] <- t.test(x2,x1,alternative="greater")$p.value
+ }
>
> mean(pvalues < 0.05)
[1] 0.55509
>

#### GretaGarbo

##### Human
Even better would just be to simulate the data and let the functions in R take care of running the test for you.
Even better would be to just put the code in a code window. But I just don't understand how that code can work without saving the individual "pvalues":

pvalues[i]

Aha, the brackets disappear outside of a code window, due to TalkStats' treatment.

(And for me, 10000 replicates would be enough.)

Code:
n1 <- 30; n2 <- 90 # sample size
sigma1 <- 10; sigma2 <- 20 # true SD
delta <- 5 # change
mu1 <- 100 # mean under the null hypothesis
mu2 <- 100+delta # mean under the alternative hypothesis

reps <- 10000 # number of simulations

## p-value approach:

pvalues <- numeric(reps)

set.seed(1)

for (i in 1:reps) {
x1 <- rnorm(n1, mu1, sigma1)
x2 <- rnorm(n2, mu2, sigma2)
s1=sd(x1)
s2=sd(x2)
df<-(((s1^2/n1)+(s2^2/n2))^2) / ( ((s1^2/n1)^2)/(n1-1) + ((s2^2/n2)^2)/(n2-1) ) #the known formula for Welch DF
s_stat<- sqrt((s1^2/n1)+ (s2^2/n2)) # the known calculation for the statistic's standard deviation
t.stat <- (mean(x2) - mean(x1))/ s_stat
pvalues[i] <- 1 - pt(t.stat, df )
}

mean(pvalues < 0.05)

#####

pvalues <- numeric(reps)

set.seed(1)

for (i in 1:reps) {
x1 <- rnorm(n1, mu1, sigma1)
x2 <- rnorm(n2, mu2, sigma2)
pvalues[i] <- t.test(x2,x1,alternative="greater")$p.value
}

mean(pvalues < 0.05)
#  0.5562

hist(pvalues, breaks = 30)


#### obh

##### Active Member
Thanks Greta. I'm new to R ... I assume you mean I should run the code in the "R Editor" instead of the console?
Definitely, 10000 is sufficient. I just wanted to see whether I get exactly the same result as my "manual" calculation, and I don't mind the computer working hard for 2 minutes ... I can see the simulation supports my manual calculation and not power_t_test (which gives power = 0.6779871).

So can I assume the manual calculation for Welch t-test is correct and power_t_test is wrong? (or misused by me ...)

Thanks,
Obh

Manual calculation
> diff=5; s1<-10; s2<-20; n1<-30; n2<-90
> df<-(((s1^2/n1)+(s2^2/n2))^2) / ( ((s1^2/n1)^2)/(n1-1) + ((s2^2/n2)^2)/(n2-1) ) #the known formula for Welch DF
> s_stat<- sqrt((s1^2/n1)+ (s2^2/n2)) # the known calculation for the statistic's standard deviation
> t_cr<-qt(0.95,df)
>
> #H1
> ncp<-diff/s_stat
>
> #power using non central t distribution
> 1-pt(t_cr, df, ncp)
[1] 0.5540257

#### obh

##### Active Member
Okay, I checked with the nice person from R who wrote the function, and it appears he missed a square (for the SD), so the manual calculation is correct. I can close this long thread.
Thanks Dason and Greta for your help.
This is the best statistics forum.

#### GretaGarbo

##### Human
I'm new with R ...I assume you mean I should run the code in "R Editor" instead of the console?
I am not sure what you mean here. Well, I use RStudio, and I believe most people do. I suggest you download and use it. It is great.

Someone said that statisticians tend to forget that they are statisticians when they simulate. Notice that you can compute a confidence interval, since the number of significant results is binomially distributed and its variance is the usual n*p*(1-p).

#### obh

##### Active Member
Thanks Greta.
I will try the RStudio
I couldn't understand why the sum of the p-values is binomially distributed. But maybe it's the late hour in Melbourne.

Yes, the simulation is great

#### GretaGarbo

##### Human
I couldn't understand why the sum of the p-values is binomially distributed. But maybe it's the late hour in Melbourne.
In each run it is significant or not: it is 1 or 0. That is a Bernoulli trial with probability p. (That p is the power that you want to estimate.) The sum of Bernoulli trials is binomially distributed. Here I had an n of 10000, so the confidence interval for the proportion p will be p +/- 1.96*sqrt(p*(1-p)/n).
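Greta's interval can be sketched in Python with the thread's numbers (p_hat = 0.5562 from 10000 replicates):

```python
import math

p_hat, n = 0.5562, 10000                 # simulated power, replicates
se = math.sqrt(p_hat * (1 - p_hat) / n)  # binomial standard error
lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se  # ~95% Wald interval

print(round(lo, 4), round(hi, 4))
# about (0.5465, 0.5659); the analytic power 0.5540257 falls inside
```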