My impression is that people who have worked a lot with experimental design will accept only small deviations from perfect balance, whereas people who are used to observational data, which often has a lot of collinearity (that is, severe imbalance), would gladly run an imbalanced model. Opinions differ, so it is interesting to hear different views.
I started by asking about sample size because an imbalanced model can be viewed as unacceptable if the imbalance is severe, even if the data are normally distributed.
If you estimate a full model with 48 parameters, there is plenty of room to make the residuals look normal, since least squares gives the maximum likelihood estimates under normally distributed errors.
“The interactions are, btw, all non-significant.”
And the variance is constant. Maybe the situation is not that bad.
“I wonder what would happen if i combined some groups together thus decreasing the number of cells and enlargening the sample sizes in each cell.”
I think that could cause other problems, so I would avoid that.
Some people want to start with the full model, from the top of jpkelly's (great) list. Others want to start with just the main effects, from the bottom of the same list.
One possibility is to add significant terms (if you start from the bottom) or to drop non-significant terms (if you start from the top). Then you can draw a normal QQ-plot of the 400 residuals and check whether the points lie on a straight line. If they do, the residuals are consistent with a normal distribution.
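To make the QQ-plot check concrete, here is a minimal sketch using only the Python standard library. It assumes the full (cell-means) model, so each residual is an observation minus its cell mean, and it summarises straightness by the correlation between theoretical and observed quantiles. The data, the 48 cell labels, and all function names are made up for illustration; they are not knedlica's data or model.

```python
import random
from statistics import NormalDist, mean


def cell_mean_residuals(cells, y):
    """Residuals from the full (cell-means) model: observation minus its cell mean."""
    groups = {}
    for c, v in zip(cells, y):
        groups.setdefault(c, []).append(v)
    cell_means = {c: mean(vs) for c, vs in groups.items()}
    return [v - cell_means[c] for c, v in zip(cells, y)]


def qq_points(residuals):
    """Return (theoretical, observed) quantiles for a normal QQ-plot."""
    n = len(residuals)
    observed = sorted(residuals)
    # Plotting positions (i - 0.5) / n; other conventions exist.
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return theoretical, observed


def qq_correlation(residuals):
    """Pearson correlation of the QQ-plot points; close to 1 means close to a straight line."""
    x, y = qq_points(residuals)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Hypothetical, unbalanced data: 400 observations spread unevenly over 48 cells.
rng = random.Random(0)
cells = [rng.randrange(48) for _ in range(400)]
y = [0.1 * c + rng.gauss(0, 1) for c in cells]

res = cell_mean_residuals(cells, y)
r = qq_correlation(res)
print(f"QQ-plot correlation: {r:.3f}")
```

The correlation is only a rough numerical summary; looking at the plot itself (the pairs returned by `qq_points`) is still the better way to judge where and how the residuals depart from the line.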
Having said that, I don't want to recommend either approach, since any recommendation from me would be arbitrary.
But I believe that knedlica has said that the data are still non-normal. Then what remains is either a link function with some other distribution or a normalising transformation. And what would be wrong with that?
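As a small illustration of the transformation route, the sketch below simulates a positive, right-skewed response and compares the sample skewness of the residuals before and after a log transform. The lognormal data and the choice of the log are assumptions made for the example; nothing here says which transformation suits knedlica's data.

```python
import math
import random
from statistics import mean


def skewness(values):
    """Sample skewness (third standardised moment); roughly 0 for symmetric data."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical response with multiplicative errors: log(y) is normal, y is skewed.
rng = random.Random(42)
y = [math.exp(rng.gauss(1.0, 0.8)) for _ in range(400)]

# Residuals around the mean, on the raw scale and on the log scale.
raw_res = [v - mean(y) for v in y]
log_y = [math.log(v) for v in y]
log_res = [v - mean(log_y) for v in log_y]

skew_raw = skewness(raw_res)
skew_log = skewness(log_res)
print(f"skewness raw: {skew_raw:.2f}, after log: {skew_log:.2f}")
```

A single group is used here to keep the example short; with a real factorial model the same comparison would be made on the residuals from the fitted model on each scale.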