Hi all,

I have a data set with 3 proposed dichotomous independent variables: sex (male or female), side (left or right), and region (lower or upper). I want to know which are significant / affect my dependent variable: distance.

Apparently, in the medical research community, it is common to use univariate tests to screen independent variables, and only the ones that pass are ultimately entered into a multivariable model.

To follow this convention, my initial idea was to run a single-factor ANOVA on each variable as a filter and then fit a linear regression on the variables that pass. However, I am getting caught up on a few things...
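To make the filter step concrete, here is a rough sketch of what I have in mind, using only the Python standard library. The rows are made-up placeholder values, not my real data, and the names `rows`, `FACTORS`, `pooled_t`, and `univariate_stats` are just illustrative. Because each factor has only two levels, the single-factor ANOVA reduces to a pooled two-sample t-test, with F equal to t squared. The sketch stops at the test statistic; a p-value would need a t or F distribution from a statistics package.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical rows: (sex, side, region, distance). All values are made up.
rows = [
    ("male", "left", "lower", 12.1),
    ("male", "right", "upper", 15.3),
    ("male", "left", "upper", 14.8),
    ("male", "right", "lower", 11.9),
    ("female", "left", "lower", 10.2),
    ("female", "right", "upper", 13.7),
    ("female", "left", "upper", 13.1),
    ("female", "right", "lower", 9.8),
]
FACTORS = {"sex": 0, "side": 1, "region": 2}  # factor name -> column index


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance (assumes equal variances)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(sp2 * (1 / na + 1 / nb))


def univariate_stats(rows):
    """Return the t and F statistics for each two-level factor, one at a time."""
    out = {}
    for name, col in FACTORS.items():
        levels = sorted({r[col] for r in rows})
        groups = [[r[3] for r in rows if r[col] == lv] for lv in levels]
        t = pooled_t(groups[0], groups[1])
        out[name] = {"t": t, "F": t * t}  # with two groups, F = t squared
    return out


stats = univariate_stats(rows)
for name, s in stats.items():
    print(f"{name}: t = {s['t']:.3f}, F = {s['F']:.3f}")
```

Each factor is tested in isolation here, which is exactly the part I am unsure about.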

1) What is the value of a post hoc test in this case? Looking at each category, it is clear that the individual distance distributions are not normal (I mean, they are roughly bell-shaped, but they are definitely left-skewed).

2) If a post hoc test is necessary, which would I use? The Wilcoxon rank-sum test seems to be the most appropriate nonparametric option, but how non-normal does the data really have to be before I need it?

3) Assuming two variables pass the filter as potentially significant and I can enter them into a multivariable model, should I just use an ordinary multiple linear regression?

4) As a bonus: is this a simple task in Excel?

Thanks to whoever answers. The overlap in statistical methods and general concepts has made me overthink this!

5) My understanding is that ANOVA is just a special case of regression analysis (a linear model with categorical predictors). If so, why am I even supposed to run it as a separate step?!
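For question 5, here is a small check I tried to convince myself, again with made-up numbers and only the standard library. As far as I can tell, for a single two-level factor the one-way ANOVA F statistic and the F statistic from regressing distance on a 0/1 dummy variable come out the same, and the regression slope is just the difference in group means. The variable names and values below are hypothetical.

```python
from statistics import mean

# Hypothetical distances for one two-level factor (values are made up).
lower = [12.1, 11.9, 10.2, 9.8]
upper = [15.3, 14.8, 13.7, 13.1]

y = lower + upper
x = [0] * len(lower) + [1] * len(upper)  # dummy coding: 0 = lower, 1 = upper
n = len(y)
grand = mean(y)

# One-way ANOVA: between-group and within-group sums of squares.
ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in (lower, upper))
ss_within = sum((v - mean(g)) ** 2 for g in (lower, upper) for v in g)
f_anova = (ss_between / 1) / (ss_within / (n - 2))

# Simple linear regression of y on the dummy, by ordinary least squares.
xbar = mean(x)
sxy = sum((xi - xbar) * (yi - grand) for xi, yi in zip(x, y))
sxx = sum((xi - xbar) ** 2 for xi in x)
slope = sxy / sxx
intercept = grand - slope * xbar
fitted = [intercept + slope * xi for xi in x]
ss_model = sum((f - grand) ** 2 for f in fitted)
ss_resid = sum((yi - f) ** 2 for yi, f in zip(y, fitted))
f_regression = (ss_model / 1) / (ss_resid / (n - 2))

print(f"ANOVA F      = {f_anova:.6f}")
print(f"Regression F = {f_regression:.6f}")
print(f"slope = {slope:.4f}, mean difference = {mean(upper) - mean(lower):.4f}")
```

If that holds in general, the filter step and the regression are testing the same thing whenever only one variable is in the model, which is part of why I am confused about needing both.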
