The dataset itself is rather complex, but attached is a screenshot of an Excel file with simplified tables demonstrating what I think is the problem. I would be grateful for a confirmation or refutation.

- Thread starter GAL
- Tags regression anova

On the other hand, if you have a few highly collinear variables and the effect size is very small, then you will need much more than 10 times the number of variables. The "10 cases per variable" rule is just an internet rumor; it is not true.
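To put a rough number on the collinearity point, here is a minimal sketch for the two-predictor case, where the variance inflation factor is 1 / (1 - r^2) and r is the correlation between the predictors. The data and function names are made up for illustration and are not from the OP's dataset. Since the variance of a coefficient scales with the VIF, you need roughly VIF times as many cases to get the same precision you would have with uncorrelated predictors.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical, nearly collinear predictors (illustration only).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]

vif = vif_two_predictors(x1, x2)
print(f"VIF = {vif:.1f}")
```

With more than two predictors the VIF for each variable comes from regressing it on all the others, so this shortcut no longer applies.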

I would not run a regression with fewer than 30 cases (the usual rule of thumb for when the CLT kicks in), although I would prefer an analysis with hundreds of cases to be reasonably sure of the results. This is true whether you have multicollinearity or not.

@GAL Can you provide some clarification on what your data looks like since it's not clear to me what you actually have.

I disagree.

Time to go back to descriptive statistics.

When I predict (which is exclusively in time series) I use univariate models such as exponential smoothing models (ESM) or state space models. I think it is generally agreed that they predict time series better than multivariate models and are robust to violations of assumptions (they make almost no assumptions at all if you are making point estimates). They are of course much easier to do as well.
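For anyone who has not used ESM, here is a minimal sketch of the simplest variant, simple exponential smoothing, which gives a one-step-ahead point forecast. The series and the smoothing weight `alpha` are made up for illustration; in practice `alpha` is usually estimated from the data, and real software also handles trend and seasonality.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the one-step-ahead point forecast for `series`.

    The level starts at the first observation and is updated as
    level = alpha * y + (1 - alpha) * level for each later observation.
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]
    for y in series[1:]:
        level = alpha * y + (1.0 - alpha) * level
    return level


# Hypothetical series and smoothing weight (illustration only).
series = [10.0, 12.0, 11.0, 13.0]
forecast = simple_exponential_smoothing(series, alpha=0.5)
print(forecast)
```

A larger `alpha` weights recent observations more heavily; `alpha = 1` just repeats the last observed value.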

I use multivariate models for one thing only, to see the relationship of variables to each other.

Side note, I don't see any issues with the meta-data presented by the OP. As @Dason noted, the person used ellipses to signify this is a partial presentation of data. The biggest issue for me was not showing us the percentages and sample size, so we could truly evaluate sparseness concerns. @GAL also did not tell us what the modeling issue was.