My data consists of values of a dependent variable (call it X) from two groups of subjects (call them group A and group B).

In my field, X can be calculated from raw data using a variety of preprocessing steps. Some of these are binary choices (i.e. perform a specific step or omit it; either is valid) and some have more than 2 options. Let's say in this example there are 4 preprocessing steps:

Raw data -> Step 1 (4 options) -> Step 2 (2 options) -> Step 3 (2 options) -> Step 4 (7 options) -> X

So in this example you have 4 x 2 x 2 x 7 = 112 total ways to preprocess the raw data to arrive at values of X.
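
To make the count concrete, here is a minimal Python sketch that enumerates every pipeline. Only the option counts (4, 2, 2, 7) come from my example; the step names and option labels such as `s1_opt1` or `omit`/`perform` are made-up placeholders.

```python
from itertools import product

# Hypothetical option labels; only the counts (4, 2, 2, 7) come from the example.
step_options = {
    "step1": [f"s1_opt{i}" for i in range(1, 5)],  # 4 options
    "step2": ["omit", "perform"],                  # 2 options
    "step3": ["omit", "perform"],                  # 2 options
    "step4": [f"s4_opt{i}" for i in range(1, 8)],  # 7 options
}

# One dict per pipeline: a full crossing of all step options.
pipelines = [
    dict(zip(step_options, combo))
    for combo in product(*step_options.values())
]

print(len(pipelines))  # 4 * 2 * 2 * 7 pipelines
```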

My dataset consists of values of X obtained using all 112 ways of performing this analysis for the same set of subjects (groups A & B).
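
In case the layout matters for the answer, this is roughly how I picture the data in "long" format: one row per subject per pipeline, with the four step choices as categorical columns. It is only a sketch. The subject IDs, the three-subjects-per-group size, and the random X values are placeholders, not my real data.

```python
from itertools import product
import random

random.seed(0)  # placeholder values only, so the sketch is reproducible

# Hypothetical subjects: 3 per group (the real group sizes are different).
subjects = [("A", f"A{i}") for i in range(1, 4)] + [("B", f"B{i}") for i in range(1, 4)]

# Every combination of step options, encoded as integer indices.
pipelines = list(product(range(4), range(2), range(2), range(7)))

# Long format: each subject contributes one row per pipeline.
rows = []
for (group, subject), (s1, s2, s3, s4) in product(subjects, pipelines):
    rows.append({
        "subject": subject,
        "group": group,
        "step1": s1,
        "step2": s2,
        "step3": s3,
        "step4": s4,
        "X": random.gauss(0.0, 1.0),  # stand-in for the real computed X
    })

print(len(rows))  # number of subjects * number of pipelines
```

Each subject appears in 112 rows here, which is the non-independence I am worried about.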

The questions I would like to answer are:

1) Is there an overall difference in X between groups A & B (regardless of method chosen)?

2) Does any particular choice in preprocessing method modulate (i.e. mitigate or exaggerate) this difference in X between groups A & B?

I understand that these samples are not independent, since all 112 values of X for a given subject are derived from the same raw data. Should these be treated as "repeated measures"? Is regression analysis appropriate in this situation?

Any help would be greatly appreciated!
