Which non-parametric(?) test? 2 groups, 4 dependent variables, 2 control variables

pp12

New Member
#1
Dear all,

I would like to compare two groups (treatment: yes/no) at one point in time (the end of the program) on four dependent variables (two of which seem to correlate with each other). Additionally, I have two control variables (which I suppose moderate the effect of the treatment). I have roughly 100 participants divided between the two groups (not exactly equally). The overall idea is to compare four different skills between two similar groups, one of which received an educational program.

I have found some parametric tests on the internet which seemed useful (multivariate multiple linear regression; hierarchical regression analysis with two blocks; principal component regression; partial least squares regression). The problem is that almost none of the assumptions are met (and frankly, I also did not understand which test would have been better, or why). When I searched for non-parametric tests, I only found robust t-tests and the Kruskal-Wallis test / Jonckheere-Terpstra test (but I only have two groups, and can I even include covariates there?); those do not seem to allow controlling for other variables.

I have also heard that it might be possible to log-transform my data so that it becomes normally distributed (so that I could use parametric tests), but I was told that this would most likely not work for all my variables. To be honest, I did not really understand why. What do you think of it, and could you please explain?
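(If I understood that suggestion correctly, in SPSS it would be something like the following, where score1 and group are just placeholder names for one of my test scores and my group variable:)

* Hypothetical sketch: log-transform one score and look at its distribution again.
COMPUTE score1_log = LN(score1 + 1).
EXECUTE.
EXAMINE VARIABLES=score1_log BY group
  /PLOT=HISTOGRAM NPPLOT.

(The +1 would just be there to avoid taking the log of zero scores.)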

I work with SPSS (and in the worst case could maybe try R, but would really prefer not to).

Could anyone please help me in choosing the right (non-parametric) test? Thank you so much!
 

Karabiner

TS Contributor
#2
What scale level do your variables have; is it interval?

Do you wish to analyse them separately, or do they jointly operationalize one construct, so that you want to analyse them together?

You only mention the irrelevant assumption ("normality"), but you say that almost all other assumptions were not met. Which ones were those?

With kind regards

Karabiner
 

pp12

New Member
#3
Hello and thank you for your fast reply. It took me a bit because I had to go over the data again, as I was unsure about some aspects (and frankly, I did not expect such a fast reply, thank you!).

1. If I understand it right, they are all interval scaled. The control variables are age and a time span in months. The dependent variables are test scores (from tests of different school-related skills). The first is a raw value ("items correctly identified"); the second and third are counts of correctly identified items transformed to norm-adjusted scales that range roughly from 1 to 14; the fourth is the summed score of several subscales. The independent variable has two groups.



2. I would prefer to analyse the dependent variables separately, because two of the variables have 3 and 5 subscales respectively, which I would also like to compare between the groups. But then there is the problem of alpha-error inflation if I run that many tests, isn't there? Also, two of the four dependent variables seem to correlate with each other, but the others do not (if I simply run a correlation without any assumption testing). I would still like to present the outcomes for all four variables separately later (and of the two with subscales, I assume that only some subscales will turn out significant).



3. When I test the assumptions for the dependent variables, the criteria are met for some of them and not for others.

- E.g., when testing the assumptions for a multiple linear regression or a hierarchical regression (separately for each dependent variable), the following assumptions are not met:

A. Normal distribution of the dependent variable: met for none of the dependent variables or their subscales (strongly skewed towards the low end; participants have very low results). Sample size: N = 98.

B. No multicollinearity between predictor variables (no correlations above .7): the independent variable correlates with one control variable at .87 and with the other at .58.

C. Linear relationship between the independent variables and the dependent variable: given for three variables (though for one of them only on 4 out of 5 subscales), so it is not given for one variable and one subscale.

D. Homoscedasticity (standardised residuals between -3 and 3?): violated for one variable and for three subscales of another variable (but only by 1-3 data points each).

E. I have outliers in three of the dependent variables (2-3 each); could I just exclude them from the specific test? (See also the syntax sketch below this list.)
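For reference, this is roughly the kind of syntax I would use to inspect the distributions and outliers (again with placeholder variable names):

* Hypothetical sketch: descriptives, extreme values, boxplots and normality plots per group.
EXAMINE VARIABLES=score1 score2 score3 score4 BY group
  /STATISTICS=DESCRIPTIVES EXTREME
  /PLOT=BOXPLOT NPPLOT.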

Please excuse my insecure explanations; I still feel quite insecure applying statistics. Thank you and with kind regards!
 

Karabiner

TS Contributor
#4
"A. Normal distribution of the dependent variable: met for none of the dependent variables or their subscales (strongly skewed towards the low end; participants have very low results). Sample size: N = 98."
There is no assumption that the dependent variable itself has to be normally distributed, neither for ANOVA nor for regression.
To the extent that normality matters at all, it concerns the residuals, and with a sample size of N = 98 even that is usually not critical.
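If you want to look at normality at all, a minimal sketch in SPSS syntax would be to save the standardised residuals of one model and inspect those instead of the raw scores (score1, group, age, months are placeholder names; group is assumed to be coded 0/1):

* Hypothetical sketch: save standardised residuals and check their distribution.
REGRESSION
  /DEPENDENT score1
  /METHOD=ENTER group age months
  /SAVE ZRESID.
EXAMINE VARIABLES=ZRE_1
  /PLOT=HISTOGRAM NPPLOT.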


"B. No multicollinearity between predictor variables (no correlations above .7): the independent variable correlates with one control variable at .87 and with the other at .58."
You might perhaps consider leaving out the first control variable, but if the standard errors of the predictor variables
do not explode, you could also leave it in.
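A rough way to check that would be to request collinearity diagnostics and compare the coefficients' standard errors with and without the highly correlated control variable (placeholder names again):

* Hypothetical sketch: tolerance/VIF and coefficient standard errors with both covariates.
REGRESSION
  /STATISTICS=COEFF R ANOVA COLLIN TOL
  /DEPENDENT score1
  /METHOD=ENTER group age months.
* Rerun without the highly correlated control variable and compare the standard errors.
REGRESSION
  /STATISTICS=COEFF R ANOVA
  /DEPENDENT score1
  /METHOD=ENTER group months.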


"C. Linear relationship between the independent variables and the dependent variable: given for three variables (though for one of them only on 4 out of 5 subscales), so it is not given for one variable and one subscale."
"D. Homoscedasticity (standardised residuals between -3 and 3?): violated for one variable and for three subscales of another variable (but only by 1-3 data points each)."

OK, this can sometimes get complicated. But you are interested in the group effect in the first place,
and if you use ANOVA, then the regression-specific problems do not apply.
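For a single dependent variable, a minimal sketch of such an analysis with the group factor and your two covariates (i.e. an ANCOVA) could look like this in SPSS syntax; score1, group, age and months are placeholder names:

* Hypothetical sketch: group effect on one score, adjusted for the two covariates.
UNIANOVA score1 BY group WITH age months
  /METHOD=SSTYPE(3)
  /PRINT=ETASQ PARAMETER
  /EMMEANS=TABLES(group) WITH(age=MEAN months=MEAN)
  /DESIGN=age months group.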


"E. I have outliers in three of the dependent variables (2-3 each); could I just exclude them from the specific test?"

No, only if they are due to measurement error, coding error, or something else that might invalidate them.



All in all, there is no such thing as a multifactorial nonparametric analysis, but as far as I can see
you can use ANOVA or MANOVA anyway and deal with the assumption problems one by one.
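And if you decide to analyse the four scores jointly instead of one by one, a MANOVA sketch along the same lines, including the two covariates (again with placeholder names), would be:

* Hypothetical sketch: all four scores jointly, with group as factor and the two covariates.
GLM score1 score2 score3 score4 BY group WITH age months
  /METHOD=SSTYPE(3)
  /PRINT=ETASQ DESCRIPTIVE
  /DESIGN=age months group.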

With kind regards

Karabiner