Using Generalized Estimating Equations to test item validity on a scale


I am creating a new measurement scale for cannabis withdrawal, and have daily withdrawal data from 50 people for 1 week before abstinence (baseline, smoking as usual) and then for 2 weeks of abstinence. There are 26 items on this new scale, and essentially what I want to do is rank the items by the amount that they increase above baseline during the withdrawal phase. I am calling this "validity" - and it can be conceptualized as taking the integral (area under the curve) between the average baseline score and the score given on each day of abstinence - for every item on the scale.

I did exactly this - i.e. calculated the integral between the mean daily withdrawal score and the average baseline score for the week of smoking as usual - and used that single value to rank the scale items.

All well and good, but what I really need is to put some stats next to the items that somehow reflect my "integral" approach (i.e. parameter values that fall in the same descending order as the sorted integral values). In theory this should be possible.

However I am having a lot of difficulty making my stats match the ranking given by the integral...

Because I have repeated measures data, I am using Generalized Estimating Equations to allow for the within-person correlations in the answers. I build 26 models (1 for each item on the scale) with withdrawal score as the dependent variable and time (days in abstinence) as the independent variable. Because withdrawal tends to increase at first and then decrease, I also add time squared to get a quadratic effect, which is almost always significant. I also include the baseline score as a covariate in this model.
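Written out, the mean model for a single item looks roughly like this. The notation is my own, and I am assuming an identity link: $y_{ij}$ is the withdrawal score of person $i$ on abstinence day $t_j$ ($j = 1, \dots, 14$), and $b_i$ is that person's average baseline score for the item.

$$
E[y_{ij}] = \beta_0 + \beta_1 t_j + \beta_2 t_j^2 + \beta_3 b_i
$$

Here $\beta_1$ and $\beta_2$ are the linear and quadratic time slopes, and $\beta_3$ is the slope for the baseline covariate.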

So I get a bunch of parameters out for each item (slope parameters and Wald chi-square statistics), and I want to use these parameters or statistics to describe the trajectory of withdrawal through abstinence - ultimately describing which items are most valid (i.e. respond most to abstinence). Does anybody have any ideas why none of my Wald statistics or slope values match my original integration ranking of the items? Also, I can get my stats program (SPSS) to output predicted mean values for each item. When I take the integral of these predicted values and rank the items, they rank differently from the original integral ranking based on the raw values. Any ideas why this is?

I realise that I am going "off road" with my approach to stats here, not using the traditional approaches and perspectives, so thanks to anybody who can offer an alternative perspective on this.

Just to add some more information on the approaches that I have taken so far and the research that I have done on this topic:

My overarching aim is to rank my 26 scale items on some measure of validity for cannabis withdrawal, and I decided to conceptualise that measure as the integral between the baseline value and the withdrawal value on each day of abstinence. I first ran my GEE model using time as a categorical variable, comparing each day of abstinence to the average of the baseline week. This was great, as it allowed me to specify on which day of abstinence withdrawal scores were first significantly above baseline, and how long withdrawal for that item lasted (i.e. stayed statistically above baseline). However, I then decided that this was using too many degrees of freedom for the small sample size (14 df for the 15 time points: 14 abstinence days vs 1 average baseline day). So instead I wanted to try using time as a continuous variable. Doing this gave me the additional bonus of being able to include the quadratic time term to capture the inverted U shape that this withdrawal seems to be characterised by.

My plan at the outset was to rank the items by the Wald statistic that SPSS outputs for the GEE. However, when time is continuous, I have to include the baseline data as a covariate, instead of as a reference group as I did when time was categorical. Thus I end up with only a set of what essentially look like multiple regression parameters (slopes and standard errors, with a Wald chi-square statistic for each parameter). I was going to try to rank the items based on some interpretation of the two parameters describing time (the linear and quadratic terms), but interpreting quadratic terms is not very easy, at least in this comparative context. So I then went on to calculate the area under the curve for each item by simply integrating the "predicted" withdrawal scores for each item (predicted values output by SPSS when requested).

However, ranking items by this "integral of the predicted withdrawal scores" doesn't seem to match a ranking by any of the parameters from each model (e.g. linear slope, quadratic slope, Wald chi-square etc). Why do I want a statistic to put in my ranked table? Because I want to publish this in a good journal, and it has to have a statistic. And if there is a statistic next to each item, and it matches the "integral" rank, then I will know it's good solid stats.
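One thing I can check by hand is how the area under a fitted quadratic relates to its coefficients. The sketch below assumes the fitted excess above baseline for an item has the form c0 + c1*t + c2*t^2 over days 1 to 14. The coefficient values and item labels are hypothetical, chosen only to show that the area depends on all three coefficients together, so sorting on the linear or quadratic slope alone need not reproduce the area ranking.

```python
def quadratic_area(c0, c1, c2, t_start=1.0, t_end=14.0):
    """Exact integral of c0 + c1*t + c2*t**2 from t_start to t_end."""
    w0 = t_end - t_start                    # weight on the intercept
    w1 = (t_end**2 - t_start**2) / 2.0      # weight on the linear slope
    w2 = (t_end**3 - t_start**3) / 3.0      # weight on the quadratic slope
    return c0 * w0 + c1 * w1 + c2 * w2


# Hypothetical (intercept, linear, quadratic) coefficients for two items.
coefs = {
    "item_A": (0.0, 0.6, -0.04),    # steeper rise, starts at zero
    "item_B": (2.0, 0.2, -0.015),   # flatter rise, but elevated from day 1
}

areas = {name: quadratic_area(*c) for name, c in coefs.items()}

rank_by_area = sorted(areas, key=areas.get, reverse=True)
rank_by_slope = sorted(coefs, key=lambda name: coefs[name][1], reverse=True)
print(areas, rank_by_area, rank_by_slope)
```

In this made-up case item_A has the larger linear slope but item_B has the larger area, because the intercept carries weight in the integral too.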