Removing outliers when there are multiple ANOVA and correlational analyses in a single results section

#1
I would be grateful for opinions on which of the two options below (or an alternative) is best:

Summary of study: In a single results section, separate ANOVAs are run on different metrics: raw scores (such as RT, d-prime, and accuracy) and collapsed/composite/index scores (for example, combining RT and d-prime, or normalised scores). These behavioural measures are then correlated with survey measures, and finally a multiple regression is run on the behavioural and survey data.

Definitions:

Genuine outliers = participants with below-chance accuracy, d-prime scores >3 SDs below the condition mean, or participants not following instructions / technical issues
Outlying values = data points simply >3 SDs away from the condition mean
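For concreteness, here is a minimal sketch (Python, with a hypothetical long-format data frame containing `participant`, `condition`, and `score` columns; none of these names come from the study itself) of how values more than 3 SDs from their condition mean could be flagged:

```python
import pandas as pd

def flag_outlying_values(df, value_col="score", group_col="condition", criterion=3.0):
    """Return a boolean Series marking values > `criterion` SDs from their condition mean."""
    z = df.groupby(group_col)[value_col].transform(
        lambda x: (x - x.mean()) / x.std(ddof=1)
    )
    return z.abs() > criterion

# Hypothetical usage:
# df["outlying"] = flag_outlying_values(df)
# df[df["outlying"]]  # inspect the flagged rows before deciding what to do with them
```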

Option 1: Remove genuine outliers at the start, but note outlying values on a per-analysis basis (a rough code sketch of this per-analysis check follows after this outline)

Remove genuine outliers from the whole dataset.
Run each analysis for RT, d-prime, normalised scores, correlations, etc.
In each case (separately for each analysis), note whether there are outlying values (e.g. >3 SDs from condition means).

If the data are normally distributed:
    Run the analysis with the outlying values in.
    Run the analysis with the outlying values removed.
    If the results and assumptions are not affected:
        Write up with the outlying values in, to keep the same N across analyses (and report the outlying values and the above checks).

If the data are not normally distributed:
    Remove the outlying values.
    If the data are now normally distributed:
        Run the analysis with the outlying values removed.
        Run a non-parametric analysis with the outlying values included.
        If the results are the same:
            Write up the non-parametric test with the outlying values in (keeps the same N for all analyses; report all of the above analyses).
    If the data are still not normally distributed:
        Run a non-parametric analysis with the outlying values included.
        If the results are the same:
            Write up the non-parametric test with the outlying values in (keeps the same N for all analyses; report all of the above analyses).

If a participant did not complete a survey, however, run the correlational analyses and the regression with this smaller N.
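Purely as an illustration of Option 1's per-analysis logic (not the actual pipeline, and with assumed column names and an assumed one-way design), the check might look roughly like the sketch below; Shapiro-Wilk, one-way ANOVA and Kruskal-Wallis stand in for whatever normality check and tests are actually used:

```python
import pandas as pd
from scipy import stats

def per_analysis_check(df, value_col, group_col="condition", criterion=3.0):
    """Option-1-style check for one measure: flag outlying values, then compare
    the analysis with and without them rather than silently dropping them."""
    z = df.groupby(group_col)[value_col].transform(
        lambda x: (x - x.mean()) / x.std(ddof=1)
    )
    kept = df[z.abs() <= criterion]

    groups_all = [g[value_col].to_numpy() for _, g in df.groupby(group_col)]
    groups_kept = [g[value_col].to_numpy() for _, g in kept.groupby(group_col)]

    # Rough normality check within each condition (Shapiro-Wilk p-value).
    normal = all(stats.shapiro(g)[1] > .05 for g in groups_all)

    if normal:
        with_values = stats.f_oneway(*groups_all)     # outlying values included
        without_values = stats.f_oneway(*groups_kept)  # outlying values removed
    else:
        with_values = stats.kruskal(*groups_all)       # rank-based alternative
        without_values = stats.kruskal(*groups_kept)

    return {"normal": normal, "with": with_values, "without": without_values}
```

The point of returning both results is simply to make the with/without comparison explicit, so the write-up can keep the outlying values in (same N everywhere) while reporting that their removal did not change the conclusions.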

Option 2: Remove all genuine outliers and outlying values at the start (a sketch of this follows after the outline)

Remove genuine outliers from the whole dataset.

Then check all raw scores, collapsed scores/indices, and survey measures for outlying values (>3 SDs from condition means).

Remove all of these, including participants who did not complete the survey measures (i.e. losing their behavioural data).

Run each of the analyses with this N.

If there are new outlying values and/or the data are not normally distributed:
    Run parametric tests only.

My concern with Option 2 is that there are no checks to determine whether these outlying values (kept in or removed) affect the findings. Also, participants with no outlying values in their raw scores are removed from all analyses for having outlying values in collapsed/composite scores. And behavioural data are removed from purely behavioural analyses because of missing survey data.

However, removing all of these outlying values (participants) at the start, and then testing whether the removal of each one affects the results for each of the separate analyses, and also in combination, requires a huge number of analyses!

I would be grateful for any suggestions or improvements to Option 1 or 2.

Thank you!
 

Karabiner

TS Contributor
#3
I do not understand why you consider normality. That seems completely
irrelevant.

You did not mention your sample size.

Before analysis, you remove data which can be identified as false, ok.
Then you perform the analyses.
What do you mean by outliers, then? Simply values > 3 SD from the mean?
Why do you consider them to be outliers; what does that mean to you?

You must have substantive considerations, not just arbitrary thresholds.
For example, reaction times are often extremely right-skewed. Maybe
it would make more sense in your research to consider log-transformed RT.
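(A quick, purely illustrative sketch of what that transform does, with made-up numbers:)

```python
import numpy as np
import pandas as pd

# Made-up RTs in ms, with the long right tail typical of reaction-time data.
rt = pd.Series([320, 350, 410, 480, 950, 2100])
log_rt = np.log(rt)

print(rt.skew(), log_rt.skew())  # the log-transformed values are far less skewed
```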

With kind regards

Karabiner
 
#5
Thank you for your answer :) I was considering normality because some variables may be normally distributed and others not, so some would be suitable for parametric analyses and others for non-parametric. Sample size is 30.

By outliers, I mean values that arose from, for example, very low accuracy, not following instructions, or technical issues; for example, accuracy of 0% in a condition and so on. I just termed these genuine outliers (maybe "bad data" is better). I understand that what I termed 'outlying values' could, for example, come from another population, or may again indicate a lack of effort/attention during the task, etc.

Some of the variables appear normally distributed, others not. The >3 SDs from the mean criterion has been suggested as a threshold at which one should remove participants in the case of normally distributed data, because such values indicate, for example, lapses of attention/effort throughout the task. For non-parametric analyses it has been suggested that one can use the IQR*3 method, and when the variables in one study are both normally and non-normally distributed, the suggestion is to just use IQR*3 (a small sketch of that rule is below).

Like you say, though, I wasn't clear on why these 'outlying' values should be removed at all. Should not non-parametric tests be run with these outlying values in (with checks to determine whether the results change with them removed), rather than just removing outlying values at the start, before conducting all the analyses?
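A minimal sketch of the IQR*3 rule mentioned above (column names assumed for illustration; this just writes out the fence, it is not a recommendation):

```python
import pandas as pd

def flag_iqr_outliers(x, multiplier=3.0):
    """Flag values below Q1 - multiplier*IQR or above Q3 + multiplier*IQR."""
    q1, q3 = x.quantile(.25), x.quantile(.75)
    iqr = q3 - q1
    return (x < q1 - multiplier * iqr) | (x > q3 + multiplier * iqr)

# Hypothetical usage, per condition:
# df["iqr_outlier"] = df.groupby("condition")["score"].transform(flag_iqr_outliers)
```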
 

Karabiner

TS Contributor
#6
> I was considering normality because some variables may be normally distributed, others not. So some suitable for parametric analyses, others for non-parametric. Sample size is 30.
Normality of the unconditional distribution is almost never relevant (except for the significance test of the Pearson coefficient).
Only in the case of small samples (n < 30) might the distribution within the respective subgroups (e.g. in the case of a t-test), or the distribution of the
model's error (e.g. in the case of a linear regression), be relevant.
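To make that concrete, a small sketch (with a made-up data frame; the one-way, condition-mean residuals are an assumption for illustration) of checking the model error rather than the unconditional distribution:

```python
import pandas as pd
from scipy import stats

# Made-up example data: one score per participant per condition.
df = pd.DataFrame({
    "condition": ["A"] * 5 + ["B"] * 5,
    "score": [1.2, 1.5, 1.1, 1.4, 1.3, 2.1, 2.4, 2.0, 2.6, 2.2],
})

# Residuals for a one-way design: each score minus its own condition mean.
resid = df["score"] - df.groupby("condition")["score"].transform("mean")

print(stats.shapiro(resid))        # distribution of the model error
print(stats.shapiro(df["score"]))  # unconditional distribution (usually not the relevant check)
```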
> Should not non-parametric tests be run with these outlying values in (with checks to determine whether the results change with them removed), rather than just removing outlying values at the start, before conducting all the analyses?
Non-parametric here would mean tests on rank-transformed data. These are of course robust against
very high/very low values, but they answer questions different from those answered by a "parametric"
test such as the t-test or one-way analysis of variance, etc.
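A tiny sketch of that distinction, with made-up scores in three conditions (Kruskal-Wallis as the rank-based test, one-way ANOVA as the parametric one):

```python
from scipy import stats

# Made-up scores in three conditions.
groups = [
    [1.2, 1.5, 1.1, 1.4, 1.3],
    [2.1, 2.4, 2.0, 2.6, 2.2],
    [1.8, 1.9, 1.7, 2.0, 9.5],   # one extreme value
]

print(stats.f_oneway(*groups))  # compares means; sensitive to the extreme value
print(stats.kruskal(*groups))   # rank-based; asks a question about distributions, largely unaffected by it
```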

With kind regards

Karabiner