Repeated measurements with unequal sample sizes - which test is suitable?

#1
I am struggling to find the appropriate statistical test to analyze my data. I hope that my question will be understandable.
I have the following setup:
  • A porcine spine with three vertebral bodies (L1, L2, L3).
  • The spine was scanned with three different imaging modalities (Modality A, B, C).
  • On each of the modalities, different rings of fat were wrapped around the spine, resulting in 5 different simulated sizes (size 1 to 5).
  • For each vertebral body, at each size and on each modality, I can measure the bone density (BD), giving BD.L1, BD.L2 and BD.L3.
For reproducibility, a dataset with fictitious values that mirrors the structure of the full data is given below as R code.

Code:
set.seed(23)

df <- data.frame(
  Modality = c(rep("A",30),rep("B",45),rep("C",45)),
  Size = factor(c(rep(rep(1:5,each=2),3),rep(rep(1:5,each=3),6)), levels=c(1,2,3,4,5),ordered=TRUE),
  Repeat = factor(c(rep(1:2,15),rep(rep(1:3,15),2))),
  Level = c(rep(c("L1","L2","L3"),each=10),rep(rep(c("L1","L2","L3"),each=15),2)),
  BD = c(runif(30,1,3),runif(45,2,4),runif(45,3,5)),
  stringsAsFactors = TRUE  # default is FALSE since R 4.0; needed for the factors shown below
)

str(df)
#> 'data.frame':   120 obs. of  5 variables:
#>  $ Modality: Factor w/ 3 levels "A","B","C": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ Size    : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 2 2 3 3 4 4 5 5 ...
#>  $ Repeat  : Factor w/ 3 levels "1","2","3": 1 2 1 2 1 2 1 2 1 2 ...
#>  $ Level   : Factor w/ 3 levels "L1","L2","L3": 1 1 1 1 1 1 1 1 1 1 ...
#>  $ BD      : num  2.15 1.45 1.66 2.42 2.64 ...

I would like to answer the following questions:
  1. Are there significant differences in bone density (BD) measurements among the three modalities for each phantom size?
  2. Are there significant differences in bone density (BD) measurements among the sizes within each modality?
Here is the tricky part: for modality A all sizes were scanned twice (2 repeats), while for modalities B and C all sizes were scanned three times (3 repeats).

Because there are very few data points, I thought I would compare the BD measurements for each size not on a per-vertebra basis, but by pooling the BD measurements of all three vertebrae for each modality and size.
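To make the imbalance concrete, here is a small sketch that rebuilds only the design described above (no BD values) and counts the measurements per cell, first per vertebra and then pooled over the three vertebrae. The object names (`design`, `per_vertebra`, `pooled`) are only for illustration.

```r
# Design only: 5 sizes x 3 vertebral levels, 2 repeats for modality A,
# 3 repeats for modalities B and C (as described in the post).
lvls <- c("L1", "L2", "L3")
design <- rbind(
  expand.grid(Modality = "A", Size = 1:5, Level = lvls, Repeat = 1:2,
              stringsAsFactors = FALSE),
  expand.grid(Modality = c("B", "C"), Size = 1:5, Level = lvls, Repeat = 1:3,
              stringsAsFactors = FALSE)
)

# Measurements per modality/size/vertebra: 2 for A, 3 for B and C.
per_vertebra <- with(design, table(Modality, Size, Level))
print(ftable(per_vertebra))

# Measurements per modality/size after pooling the vertebrae: 6 for A, 9 for B and C.
pooled <- with(design, table(Modality, Size))
print(pooled)
```

So pooling raises the group sizes from 2 or 3 to 6 or 9, but the values within a pooled group come from three different vertebrae and are therefore not simple replicates of each other.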

Earlier suggestions I received were to fit linear models and assess whether they differ significantly from each other.
Or would a permutation test be more suitable?
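As a rough sketch of what these two suggestions could look like on the fictitious data, using base R only. This is my reading of the suggestions, not a recommended analysis: the vertebrae are pooled as described above, the helper `perm_f_test` is made up for illustration, and freely shuffling the modality labels assumes the pooled values are exchangeable, which ignores the vertebra and repeat structure.

```r
set.seed(23)
# Same fictitious data as in the post (simulated values, not real measurements).
df <- data.frame(
  Modality = c(rep("A", 30), rep("B", 45), rep("C", 45)),
  Size     = factor(c(rep(rep(1:5, each = 2), 3), rep(rep(1:5, each = 3), 6)),
                    ordered = TRUE),
  Level    = c(rep(c("L1", "L2", "L3"), each = 10),
               rep(rep(c("L1", "L2", "L3"), each = 15), 2)),
  BD       = c(runif(30, 1, 3), runif(45, 2, 4), runif(45, 3, 5)),
  stringsAsFactors = TRUE
)

# Linear-model route: compare two nested models with an F test.
# lm() copes with the unequal group sizes (6 vs 9) without special handling.
m_add <- lm(BD ~ Modality + Size, data = df)  # modality effect equal at all sizes
m_int <- lm(BD ~ Modality * Size, data = df)  # modality effect may differ by size
lm_cmp <- anova(m_add, m_int)
print(lm_cmp)

# Permutation route for question 1 at one size: shuffle the modality labels
# and recompute the one-way F statistic each time.
perm_f_test <- function(y, g, n_perm = 999) {
  f_stat <- function(groups) anova(lm(y ~ groups))[["F value"]][1]
  observed <- f_stat(g)
  permuted <- replicate(n_perm, f_stat(sample(g)))
  # Proportion of statistics at least as large as the observed one,
  # counting the observed labelling itself.
  (sum(permuted >= observed) + 1) / (n_perm + 1)
}

size1 <- subset(df, Size == "1")
p_size1 <- perm_f_test(size1$BD, size1$Modality)
print(p_size1)
```

The same permutation call could be repeated for sizes 2 to 5, and with `Size` as the grouping variable within one modality for question 2. The resulting p-values would then need a correction for multiple comparisons.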

I would highly appreciate any suggestions. Thank you.