# Minimum sample size for multiple linear regression

#### obh

##### Well-Known Member
Hi,

I tried to calculate the minimum sample size for multiple linear regression.

I tried to check the sample size for predictors=4, effect size f=0.2/d=0.2, sig.level =0.05, power=0.8

1. When I checked the power of the entire model (F power): n = 304
2. When I checked the power of one coefficient (t power): n = 198
3. When I checked the power of one coefficient with a Bonferroni correction (t power, sig.level = 0.05/4): n = 281

I am probably doing something wrong, as I get a smaller sample size when evaluating each coefficient??? (R code below)

You may say the regression effect size f and the t-test d are not the same. So take a low effect as d = 0.2 and f = 0.14: the F power then gives an even larger sample size (last code block below).

Thanks

----------------------------------------------------------------------------------------------------------------

Per Green, if you check only for R squared, i.e. whether the entire model is significant: n = 50 + 8*predictors;
and if you want to evaluate the individual coefficients: n = 104 + predictors.
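For p = 4 predictors, these two rules of thumb work out as follows (a quick base-R check of the formulas quoted above):

```r
# Green's rules of thumb for p = 4 predictors (formulas as quoted above)
p <- 4
n_model <- 50 + 8 * p   # testing the overall model (R squared) -> 82
n_coef  <- 104 + p      # testing individual coefficients       -> 108
c(n_model, n_coef)
```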

Code:
> pwr.f2.test(u =4, v=(304-4-1), f2=0.04, sig.level =0.05)

Multiple regression power calculation

u = 4
v = 299
f2 = 0.04
sig.level = 0.05
power = 0.8012571

> pwr.t.test(power=0.8, d = 0.2 , sig.level = 0.05 ,alternative="two.sided" , type = "one.sample")

One-sample t test power calculation

n = 198.1508
d = 0.2
sig.level = 0.05
power = 0.8
alternative = two.sided

> pwr.t.test(power=0.8, d = 0.2 , sig.level = 0.05/4 ,alternative="two.sided" , type = "one.sample")

One-sample t test power calculation

n = 281.903
d = 0.2
sig.level = 0.0125
power = 0.8
alternative = two.sided

==============================

> pwr.f2.test(u =4, v=(614-4-1), f2=0.14^2, sig.level =0.05)

Multiple regression power calculation

u = 4
v = 609
f2 = 0.0196
sig.level = 0.05
power = 0.8002173

#### hlsmith

##### Less is more. Stay pure. Stay poor.
I don't get a lot of linear reg, so I am not too up on this. What is with the one-sample t-test — is that for a continuous and/or proportion test not equal to a null of zero? Also, why do you think you need a Bonferroni correction?

#### obh

##### Well-Known Member
> What is with the one-sample t-test — is that for a continuous and/or proportion test not equal to a null of zero? Also, why do you think you need a Bonferroni correction?

The one-sample t-test checks the H0 that the coefficient equals zero.

If you know the model, you are probably interested only in the F test for the entire model.

If you are not sure which IVs to keep in the model, then in the edge case you have as many t-tests as IVs, so multiple tests.
So the solution should be somewhere between no correction and the Bonferroni correction — but even with the Bonferroni correction the sample size is smaller than for the entire model?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Given I am not thinking too hard about this: what does the u stand for in the first function call? Is the difference partly due to the different degrees of freedom, besides the fact that they are looking at different tests?


#### obh

##### Well-Known Member
u = 4 is the number of IVs (predictors).

#### noetsi

##### Fortran must die
I have heard that in ANOVA, contrasts have different power than the overall F test even though the sample size is likely the same. It has to do with the assumptions behind the F test and contrasts. Various statistics have different power aside from sample size; that is why new statistics are often created to increase power.

Not sure if that is what is being asked.

#### obh

##### Well-Known Member
> I have heard that in ANOVA, contrasts have different power than the overall F test even though the sample size is likely the same. It has to do with the assumptions behind the F test and contrasts. Various statistics have different power aside from sample size; that is why new statistics are often created to increase power.
>
> Not sure if that is what is being asked.
Hi Noetsi,

I don't understand what you are trying to say.

#### noetsi

##### Fortran must die
What I tried to say was that different statistics, like the F and t tests, could have different power even with the same sample size. So you might need different sample sizes for different statistics.

Not sure that is any better.

#### obh

##### Well-Known Member
> What I tried to say was that different statistics, like the F and t tests, could have different power even with the same sample size. So you might need different sample sizes for different statistics.
>
> Not sure that is any better.
Thanks noetsi

I know this, but I read that the t statistic in the regression should produce a larger sample size, yet I got different results. Am I wrong? Or...

#### Dason

You are using a power calculator for a one-sample t-test. That isn't really the test you are doing. I think that's the main issue in the results you're seeing. I don't typically use these power calculator functions, so I can't recommend what you need exactly. I'd probably just run a simulation.
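A minimal simulation along those lines might look like this (a sketch in base R; the sample size, number of predictors, and coefficient values are illustrative assumptions, not an exact match to the scenario above):

```r
# Monte Carlo estimate of the power of the t-test on one coefficient
# in a multiple regression (illustrative values only)
set.seed(1)
n    <- 200                 # candidate sample size
p    <- 4                   # number of predictors
beta <- c(0.2, 0, 0, 0)     # only the first predictor has a nonzero effect
reps <- 2000
hits <- 0
for (i in seq_len(reps)) {
  X <- matrix(rnorm(n * p), n, p)   # independent standard-normal IVs
  y <- drop(X %*% beta) + rnorm(n)  # error sd = 1
  pval <- summary(lm(y ~ X))$coefficients["X1", "Pr(>|t|)"]
  if (pval < 0.05) hits <- hits + 1
}
hits / reps   # estimated power for the first coefficient's t-test
```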

#### obh

##### Well-Known Member
Thanks Dason

I compare the t statistic to a constant value, so why isn't the one-sample t-test the right test?

Yes, simulation is the magical solution.

#### Dason

When you're doing regression, the other predictors impact the power of your test even for single coefficients. For intuitive reasoning, think about multicollinearity and how, when you were using the power code for a one-sample t-test, it had no way of incorporating that. Also think about just adding LOTS of variables: at some point you have more variables than observations, so you need to increase the sample size just because of that.

All of this is just hopefully to get you to see why the one sample t-test power code isn't completely appropriate.
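To make the multicollinearity point concrete, here is a sketch (base R; rho = 0.8 and the other values are illustrative assumptions) comparing the estimated power for one coefficient when a second predictor is independent versus correlated:

```r
# Power for beta1's t-test with an uncorrelated vs. correlated predictor
# (illustrative values; rho is the correlation between x1 and x2)
set.seed(2)
sim_power <- function(rho, n = 200, beta1 = 0.2, reps = 1000) {
  hits <- 0
  for (i in seq_len(reps)) {
    x1 <- rnorm(n)
    x2 <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)  # corr(x1, x2) = rho
    y  <- beta1 * x1 + rnorm(n)                  # x2 has no true effect
    if (summary(lm(y ~ x1 + x2))$coefficients["x1", "Pr(>|t|)"] < 0.05)
      hits <- hits + 1
  }
  hits / reps
}
res <- c(independent = sim_power(0), correlated = sim_power(0.8))
res   # power for beta1 drops when x2 is correlated with x1
```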

#### obh

##### Well-Known Member
Thanks Dason

Correct, the degrees of freedom of the regression t-test are n-p-1, so the number of predictors has some influence. But the influence on the test power is minor. I can calculate again and see if it changes the picture, or whether the F test still determines the sample size.

#### obh

##### Well-Known Member
Hi Dason,

I checked, and when reducing the DF to (n-p-1) instead of (n-1), the change in power is minor.
I assume it doesn't change the big picture, so it must be something else?

#### Dason

Are you still using the one sample t-test function? I'm telling you that is the wrong function to use.

#### obh

##### Well-Known Member
Hi Dason,

You can't use the R one-sample t-test power function, as you can't control the DF.
I calculated it manually, similarly to the one-sample t-test function but with the correct df, based on the following:

Can you please be more specific about the difference in the way to calculate the power?

Change: the allowed change in the coefficient
d = Change / SE (actually I used d = 0.2)

df = n - p - 1

H0: the statistic is distributed t(df)
H1: the statistic is distributed noncentral t(df, d√n)
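In base R that manual calculation can be sketched as follows (using the d = 0.2, n = 198, p = 4 values from earlier in the thread):

```r
# Two-sided power of the t-test on one regression coefficient,
# with df = n - p - 1 and noncentrality d * sqrt(n), as described above
n <- 198; p <- 4; d <- 0.2; alpha <- 0.05
df    <- n - p - 1
ncp   <- d * sqrt(n)
tcrit <- qt(1 - alpha / 2, df)
power <- pt(-tcrit, df, ncp) + (1 - pt(tcrit, df, ncp))
power   # close to 0.8, slightly below the df = n - 1 result
```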

Thanks
