Power Calculation for a Multilevel Linear Regression

hlsmith

Less is more. Stay pure. Stay poor.
#1
Well, I am getting jealous by all of the posting on this topic. I have my own power calculation that I need to run and thought I would see if you all could provide a little guidance. I haven't done one for MLM data before.

Study:
-Two treatments groups (Treatment1 = Tx1; Treatment2 = Tx2)
- 30 subjects (#subjects)
- Crossover design: each subject will receive each treatment five times then the other treatment. Subjects treatment order is balanced via randomization.
- Outcome: continuous variable (DV), 5 serial measures for Tx1 and 5 serial measures for treatment Tx2, order of serial Tx randomized

Assumptions:
Tx1_1 will have an effect on DV of rnorm(#subject, 25, 5) and the value will go down with each Tx1 application, so autoregressive correlation across 5 serial measures

Tx2_1 will have an effect on DV of Tx1_1 + rnorm(subject#, 1.5, 0.5) and the DV value will go down with each Tx2 application, so autoregressive
correlation across 5 serial measures)

The second treatment given (either Tx1 or Tx2) will start off with an effect on DV of rnorm(subject#, 5, 1.5) lower than the first observation for the first time in the first period.

I am happy to do this in R. Below is what the dataframe could look like, these data are manually made up.

Code:
Sequence  Time  Period  Subject  Treatment  DV
1 1 1 1 1 25
1 2 1 1 1 24
1 3 1 1 1 22
1 4 1 1 1 20
1 5 1 1 1 20
1 1 2 1 2 20
1 2 2 1 2 18
1 3 2 1 2 16
1 4 2 1 2 15
1 5 2 1 2 20
2 1 1 2 2 19
2 2 1 2 2 19
2 3 1 2 2 17
2 4 1 2 2 16
2 5 1 2 2 15
2 1 2 2 1 16
2 2 2 2 1 16
2 3 2 2 1 14
2 4 2 2 1 13
2 5 2 2 1 13
...
1 5 2 30 1 15
So I guess I need to write out a function, but the correlation between DV values will need to be accounted for.

Tx1_1_1 = treatment 1 first period and first timepoint
E(DV_Tx1) = (Tx1_1_1) + (AR(Tx1_1_1) +,..., (AR(Tx1_1_4), something like this gets the first period obs for Tx1
E(DV_Tx2) = (Tx1_1_1 + Tx2_1_1) + (AR(Tx2_1_1) +,..., (AR(Tx2_1_4), something like this gets the first period obs for Tx2
Then I need to add an indicator to halve of the obs if they occurred during the second period, so Tx?_2_1 will be lower.

So moreover, DV is slightly lower for Tx1 than Tx2 and the DV goes down a little with each treatment during the period, then the starting value in the second period should go down - so not a clean washout between the periods.

Getting ready to leave work, but I will check back to see if there are any comments. The autoregressive correlation with the values going down will be the interesting part.
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
My first crack at the simulation is below. I am sure there are better approaches but the coefficients are pretty close to my target values. I just need to tune this up and replicate over and over. I am guessing that requires an apply or one of those functions.

Code:
set.seed(1)
#Period 1 Tx1
Y111 = rnorm(15,30,2)
Y112 = Y111*rnorm(1,0.95,0.05)
Y113 = Y112*rnorm(1,0.95,0.05)
Y114 = Y113*rnorm(1,0.95,0.05)
Y115 = Y114*rnorm(1,0.95,0.05)

#Period 2 Tx2
Y221 = Y111 + (rnorm(1,4,2.5)) - (rnorm(1,4,1.5))
Y222 = Y221*rnorm(1,0.95,0.05)
Y223 = Y222*rnorm(1,0.95,0.05)
Y224 = Y223*rnorm(1,0.95,0.05)
Y225 = Y224*rnorm(1,0.95,0.05)

#Period 1 Tx2
Y211 = rnorm(15,35,2.5)
Y212 = Y211*rnorm(1,0.95,0.05)
Y213 = Y212*rnorm(1,0.95,0.05)
Y214 = Y213*rnorm(1,0.95,0.05)
Y215 = Y214*rnorm(1,0.95,0.05)

#Period 2 Tx1
Y121 = Y211 - (rnorm(1,5,2)) - (rnorm(1,4,1.5))
Y122 = Y121*rnorm(1,0.95,0.05)
Y123 = Y122*rnorm(1,0.95,0.05)
Y124 = Y123*rnorm(1,0.95,0.05)
Y125 = Y124*rnorm(1,0.95,0.05)
DV = c(Y111,
         Y112,
         Y113,
         Y114,
         Y115,
         Y221,
         Y222,
         Y223,
         Y224,
         Y225,
         Y211,
         Y212,
         Y213,
         Y214,
         Y215,
         Y121,
         Y122,
         Y123,
         Y124,
         Y125)
str(DV)
DV

Tx = rep(c(1,2,2,1),times = c(75,75,75,75))

Period = rep(c(1,2,1,2),times = c(75,75,75,75))

Attempt=rep(1:5, times=4, each=15)

ID1 = rep(1:15, times=10, each=1)
ID2 = rep(16:30, times=10, each=1)
ID =  c(ID1, ID2)
MLM = data.frame(Tx, Period, Attempt, ID, DV)


#To visualize DV values next to each other
glimpse = data.frame(Y111,
                       Y112,
                       Y113,
                       Y114,
                       Y115,
                       Y221,
                       Y222,
                       Y223,
                       Y224,
                       Y225,
                       Y211,
                       Y212,
                       Y213,
                       Y214,
                       Y215,
                       Y121,
                       Y122,
                       Y123,
                       Y124,
                       Y125)
 

hlsmith

Less is more. Stay pure. Stay poor.
#3
My data generating function for simulating multilevel (repeated measure) data. I simulated more data than the desired sample size called for, then grouped data into sets the size of the desired n-value in order to generate more than one simulation set. I am guessing this is appropriate enough but not the absolute best approach.

Code:
m=100
n=15*m
set.seed(1)
#Period 1 Tx1
Y111 = rnorm(n,30,3) + rnorm(n,0,1)
Y112 = Y111*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y113 = Y112*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y114 = Y113*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y115 = Y114*rnorm(n,0.95,0.08) + rnorm(n,0,1)

#Period 2 Tx2
Y221 = Y111 + (rnorm(n,4,3)) - (rnorm(n,4,1.5)) + rnorm(n,0,1)
Y222 = Y221*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y223 = Y222*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y224 = Y223*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y225 = Y224*rnorm(n,0.95,0.08) + rnorm(n,0,1)

#Period 1 Tx2
Y211 = rnorm(n,35,3) + rnorm(n,0,1)
Y212 = Y211*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y213 = Y212*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y214 = Y213*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y215 = Y214*rnorm(n,0.95,0.08) + rnorm(n,0,1)

#Period 2 Tx1

Y121 = Y211 - (rnorm(n,4,3)) - (rnorm(n,4,1.5)) + rnorm(n,0,1)
Y122 = Y121*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y123 = Y122*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y124 = Y123*rnorm(n,0.95,0.08) + rnorm(n,0,1)
Y125 = Y124*rnorm(n,0.95,0.08) + rnorm(n,0,1)
DV = c(Y111,
       Y112,
       Y113,
       Y114,
       Y115,
       Y221,
       Y222,
       Y223,
       Y224,
       Y225,
       Y211,
       Y212,
       Y213,
       Y214,
       Y215,
       Y121,
       Y122,
       Y123,
       Y124,
       Y125)

Tx = rep(c(1,2,2,1),times = c(75*m,75*m,75*m,75*m))

Period = rep(c(1,2,1,2),times = c(75*m,75*m,75*m,75*m))

Attempt = rep(1:5, times=4, each=15*m)

mset = rep(c(1:m), times = 20, each = 15)

ID1 = rep(1:(n), times=10, each=1)
ID2 = rep(((n)+1):(n*2), times=10, each=1)
ID =  c(ID1, ID2)
MLMM = data.frame(Tx, Period, Attempt, ID, DV, mset)