Help with pre/post design and different subgroups

#1
Hello everyone,
I'm trying to finish my thesis but, unsurprisingly, I'm having trouble with the statistics.

The study I'm conducting evaluates an intervention for children that aims to foster two dependent variables (knowledge and help-seeking), which are considered protective factors against psychopathology. There is no control group; all students (200+) received the treatment.

All variables are measured via questionnaire.

My first hypothesis was that the treatment is effective for the whole sample and paired sample t-tests indicate that it was.

In my second hypothesis I want to know if the treatment is as effective in four a priori defined (high-risk) subgroups as it is in the whole sample. Naturally, I expect to find no difference, as high-risk subgroups should benefit at least as much as the rest.

I have parametric data for the four subgroups: depressed, suicidal, impulsive and avoidant. I plan to define cut-off values to transform these data into categorical variables (e.g. depressed/not depressed).
My rationale would now be to perform a mixed ANOVA with the within-subject factor "time" and the between-subject factor "risk group".


The following questions now arise for me:
1) Is a mixed ANOVA the go-to procedure?
2) At least one of my subgroups is not normally distributed. Also, there is no pre-existing cut-off value for its questionnaire. How do I define a cut-off, and what follows from the non-normality? For info, the highest quartile still contains 30+ subjects.
3) It's safe to assume that the subgroups are not independent, which is also indicated by considerable overlap (depressed and suicidal). Are there any problems that result from this? I know I could still perform the mixed ANOVA just as well, right?

Thanks so much for your help. I hope this is my last run-in with statistics of this kind, as I'm really not fond of doing research.

Best Regards
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
Why do you need a cutoff value? Categorizing continuous variables just results in a loss of information and doesn't allow you to examine non-linear effects.
 

Karabiner

TS Contributor
#3
My first hypothesis was that the treatment is effective for the whole sample and paired sample t-tests indicate that it was.
It is just indicative that there is a pre-post difference.
I have parametric data for the four subgroups: depressed, suicidal, impulsive and avoidant. I plan to define cut-off values to transform these data into categorical variables (e.g. depressed/not depressed).
So you have 4 scores which you want to categorize? "I have a depressed subgroup and I want to transform depression data into depressed/non-depressed" makes only limited sense.
My rationale would now be to perform a mixed ANOVA with the within-subject factor "time" and the between-subject factor "risk group".
That looks like a possible solution.
2) At least one of my subgroups is not normally distributed.
A group is a group. Groups cannot be normally distributed.
3) It's safe to assume that the subgroups are not independent, which is also indicated by considerable overlap (depressed and suicidal). Are there any problems that result from this? I know I could still perform the mixed ANOVA just as well, right?
This is quite unintelligible. So you could either try to describe how you plan to do the categorizing/grouping, in such a way that it is comprehensible; or, if you have 4 predictors, you could simultaneously use them as (categorical or interval scaled) covariates in the mixed ANOVA.
 
#4
Why do you need a cutoff value? Categorizing continuous variables just results in a loss of information and doesn't allow you to examine non-linear effects.
Well, I want to show that highly depressed students benefit equally from the treatment. That's why I want to create a highly depressed subgroup.

It is just indicative that there is a pre-post difference.

So you have 4 scores which you want to categorize? "I have a depressed subgroup and I want to transform depression data into depressed/non-depressed" makes only limited sense.
Yes. That's what I want to do, because I want to show that highly depressed students (same as highly impulsive etc.) benefit equally from the treatment, i.e. a significant main effect of time but no significant interaction.


A group is a group. Groups cannot be normally distributed.
Pardon my French. So one of my 4 between subject variables (impulsivity) is not normally distributed.

This is quite unintelligible. So you could either try to describe how you plan to do the categorizing/grouping, in such a way that it is comprehensible; or, if you have 4 predictors, you could simultaneously use them as (categorical or interval scaled) covariates in the mixed ANOVA.
I am asking for help because I do not have any idea how to do the categorising/grouping. If I treat my supposed between-subject factors as covariates, then I'd only have a one-factor repeated-measures ANOVA. I don't know if that works in the same way a mixed ANOVA would.
 

Karabiner

TS Contributor
#5
If you use the 4 scores simultaneously as covariates, then you will have 4 interactions with
the repeated-measures factor.
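
With only two time points, each of these interactions corresponds to the slope of that covariate when the change score (post minus pre) is regressed on all four scores at once. The sketch below assumes that change-score formulation. The `ols` helper, the simulated scores, and the coefficients used to generate them are hypothetical and serve only as illustration.

```python
import random

def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y.
    X is a list of rows; an intercept column of 1.0 is added in front."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for k in range(p):
            if k != col:
                f = A[k][col] / A[col][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

random.seed(0)
n = 200
names = ["depression", "suicidality", "impulsivity", "avoidance"]
# Simulated (made-up) risk scores, one row per participant.
scores = [[random.gauss(0, 1) for _ in names] for _ in range(n)]
# Simulated change scores; only the first two covariates matter here.
change = [2.0 + 0.5 * r[0] - 1.0 * r[1] + random.gauss(0, 1) for r in scores]

# Centre each covariate so the intercept is the mean change at average scores.
means = [sum(r[j] for r in scores) / n for j in range(len(names))]
centred = [[v - m for v, m in zip(r, means)] for r in scores]

coefs = ols(centred, change)
print(f"mean change (effect of time): {coefs[0]:.2f}")
for name, b in zip(names, coefs[1:]):
    print(f"time-by-{name} slope: {b:.2f}")
```

A slope near zero for a covariate means that pre-post change does not vary with that score in this sample. Standard errors and p-values are omitted and would come from a statistics package.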

You can do 4 separate mixed ANOVAs, each with a categorized variable as between-subjects factor.
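
For a two-level factor and two time points, the time-by-group interaction can also be checked through change scores: a pooled-variance t-test comparing mean change between the two groups gives a t whose square equals the interaction F. A minimal sketch under that assumption; the function name and the toy data are made up for illustration, and the p-value is left to a statistics package.

```python
import math
import statistics

def change_score_t(pre, post, high_risk):
    """Pooled-variance t statistic comparing mean pre-post change
    between the high-risk group and everyone else."""
    change = [b - a for a, b in zip(pre, post)]
    g1 = [c for c, h in zip(change, high_risk) if h]
    g0 = [c for c, h in zip(change, high_risk) if not h]
    n1, n0 = len(g1), len(g0)
    pooled_var = ((n1 - 1) * statistics.variance(g1)
                  + (n0 - 1) * statistics.variance(g0)) / (n1 + n0 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n0))
    t = (statistics.mean(g1) - statistics.mean(g0)) / se
    return t, n1 + n0 - 2  # t statistic and degrees of freedom

# Toy data (not from the study): 8 participants, 4 flagged as high risk.
pre = [10, 12, 9, 11, 14, 8, 13, 10]
post = [13, 14, 12, 15, 15, 11, 17, 12]
high_risk = [True, True, True, False, False, False, False, True]

t, df = change_score_t(pre, post, high_risk)
print(f"t({df}) = {t:.3f}, equivalent interaction F(1, {df}) = {t * t:.3f}")
```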

You can count the number of values above cut-off for each participant (so in an extreme case,
someone has k=4 values above cutoff for depression, suicidality, impulsivity, avoidance), and
use it as covariate or factor in the repeated-measures ANOVA.
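
A sketch of that counting step, assuming the upper quartile of each scale as the cut-off. That choice is arbitrary and purely illustrative, not an established clinical threshold; the function names and toy scores are hypothetical.

```python
import statistics

def upper_quartile(values):
    """75th percentile, used here as an illustrative cut-off."""
    return statistics.quantiles(values, n=4)[2]

def risk_counts(scores):
    """scores maps scale name -> list of scores, one per participant.
    Returns, per participant, how many scales exceed their cut-off (0..k)."""
    cutoffs = {name: upper_quartile(vals) for name, vals in scores.items()}
    n = len(next(iter(scores.values())))
    return [sum(scores[name][i] > cutoffs[name] for name in scores)
            for i in range(n)]

# Toy scores (not from the study) for 8 participants on 4 scales.
scores = {
    "depression":  [1, 2, 3, 4, 5, 6, 7, 8],
    "suicidality": [8, 1, 2, 3, 4, 5, 6, 7],
    "impulsivity": [1, 2, 3, 4, 5, 6, 8, 7],
    "avoidance":   [2, 1, 4, 3, 6, 5, 7, 8],
}

counts = risk_counts(scores)
print(counts)  # one count between 0 and 4 per participant
```

The resulting count can then enter the model as a covariate, or as a factor if some count levels are merged to avoid very small cells.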

So one of my 4 between subject variables (impulsivity) is not normally distributed.
Well, that doesn't matter in a sample > 200 (and it is not the variable's distribution that is
considered, but the residuals of the ANOVA model), but it is nonetheless surprising that suicidality or
depression scores appear normally distributed in a non-clinical sample.
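
As a rough illustration of looking at residuals rather than raw scores: in the change-score view of the time-by-group test, a residual is a participant's change minus the mean change of their group. This simplified sketch (hypothetical function names, toy data) computes those residuals and their skewness. It is an assumption-laden shortcut, not the full residual structure of a mixed ANOVA.

```python
import statistics

def change_residuals(pre, post, group):
    """Each participant's change score minus their group's mean change."""
    change = [b - a for a, b in zip(pre, post)]
    group_means = {
        g: statistics.mean([c for c, gg in zip(change, group) if gg == g])
        for g in set(group)
    }
    return [c - group_means[g] for c, g in zip(change, group)]

def skewness(x):
    """Moment-based sample skewness; values near 0 suggest symmetry."""
    m = statistics.mean(x)
    s = statistics.pstdev(x)
    return sum(((v - m) / s) ** 3 for v in x) / len(x)

# Toy data (not from the study).
pre = [10, 12, 9, 11, 14, 8, 13, 10]
post = [13, 14, 12, 15, 15, 11, 17, 12]
group = ["high", "high", "high", "low", "low", "low", "low", "high"]

residuals = change_residuals(pre, post, group)
print("residuals:", residuals)
print(f"skewness of residuals: {skewness(residuals):.3f}")
```

With a real sample, a histogram or Q-Q plot of these residuals is more informative than a single skewness value.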

You did not yet describe the conceptual background, the theory behind this, or the practical impact
of the study. For example it is unknown why there are 4 variables, why there is a mix of
clinical (suicidality, depression), and psychological (impulsivity, avoidance) variables, and
whether they are equally important. Also it is unknown which kind of population we are speaking
of. If this were "normal" students, a medium or severe depression would be expected in only 10%
or so of the sample. All this matters if a typology of the participants is wanted.
 