# Overthinking a 'simple' count variable analysis over time

#### Bcteagirl

##### New Member
Hello everyone. New member, hoping I will be able to contribute and help out!

I have been working on this for about 5 days, and have gotten to the point where my brain is going in circles. Hopefully you can help me understand what type of analysis this requires.

I am conducting what should be a ‘simple’ analysis of count data of a rare condition/outcome. The data I have to work with is simply counts by year. I have been asked to identify whether or not this condition is increasing. No covariates, I have 11 observations (11 years).

My first step was to take a look at the data graphically. Using graphs I discovered one outlier in the third year; however, having few data points and not being asked to make any predictions, I am hoping to leave it in for now (it is in the same direction, just more extreme; see the graph at the bottom of the PDF). I created two- and three-year moving averages, graphed each, and am using the three-year moving average.
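To make the smoothing step concrete, here is a minimal sketch of a three-year moving average in plain Python. The counts are made up for illustration (they are not the data in this thread), and `moving_average` is a hypothetical helper name, not an SPSS function.

```python
def moving_average(values, window=3):
    """Simple moving average; returns len(values) - window + 1 points."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical yearly counts (NOT the data from this thread);
# the third year is deliberately an outlier.
counts = [4, 5, 15, 6, 7, 9, 10, 13, 15, 19, 22]

smoothed = moving_average(counts, window=3)
print(len(counts), "raw points ->", len(smoothed), "smoothed points")
```

One thing the sketch makes visible: with a three-year window, 11 yearly counts become 9 smoothed values, and neighbouring smoothed values share two of their three raw counts, so they are not independent of each other.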

Using SPSS, I tried the curve estimation function against time with the original data, the two-year moving average, and the three-year moving average. I found that a quadratic function fit the data best, and a linear function not at all (significant, R-square = .85). (See the bottom of the attached PDF for a graph with the two lines of fit and confidence intervals from SPSS curve estimation.) I double-checked this with Excel curve fitting as well, to be on the safe side. The count data are not significantly skewed or kurtotic.

However, I understand the SPSS curve fitting analysis may not be appropriate because the data are counts, because of autocorrelation (does this apply to independent health events? These would be different people each year), etc. Also, the confidence intervals would likely be off?

This took me on a wild goose chase through various high-level trend analysis texts, which gave me a headache. I am likely overthinking things at this point, as what I need is a very simple p-value to attach saying yes, this is a quadratic relationship, from an appropriate (or at least acceptable to reviewers!) test.

Eventually I settled back down into Poisson and negative binomial models, both of which I am attempting to learn. I created a quadratic variable of the three year moving average count (count squared). Not being certain about the dispersion in the data (I have not tested this before, am working on it) I conducted a negative binomial model on the data (attached). Getting turned around, so I hope I have it right.
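For reference, here is a minimal sketch of what a Poisson trend model with a quadratic term could look like, written in plain Python. It rests on several assumptions that are not from this thread: the counts are made up, the model is fitted to the raw yearly counts rather than the moving average, and the quadratic term is squared *time* (centered year), which is the usual way to express a quadratic trend. `solve` and `fit_poisson` are hypothetical helper names. The sketch compares a linear and a quadratic model with a likelihood-ratio test and reports the Pearson dispersion statistic as a rough check on whether Poisson is adequate.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_poisson(X, y, max_iter=50, tol=1e-10):
    """Poisson log-linear regression by Newton-Raphson.

    Assumes the first column of X is the intercept (all ones).
    Returns (coefficients, fitted means, log-likelihood).
    """
    p = len(X[0])
    # Start from the intercept-only fit.
    beta = [math.log(sum(y) / len(y))] + [0.0] * (p - 1)
    for _ in range(max_iter):
        mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
        grad = [sum(row[j] * (yi - mi) for row, yi, mi in zip(X, y, mu))
                for j in range(p)]
        hess = [[sum(row[j] * row[k] * mi for row, mi in zip(X, mu))
                 for k in range(p)] for j in range(p)]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    mu = [math.exp(sum(b * x for b, x in zip(beta, row))) for row in X]
    loglik = sum(yi * math.log(mi) - mi - math.lgamma(yi + 1)
                 for yi, mi in zip(y, mu))
    return beta, mu, loglik


# Hypothetical yearly counts (NOT the data from this thread).
counts = [4, 5, 15, 6, 7, 9, 10, 13, 15, 19, 22]
n = len(counts)
t = [i - (n - 1) / 2 for i in range(n)]  # centered year index, -5 .. 5

X_lin = [[1.0, ti] for ti in t]
X_quad = [[1.0, ti, ti * ti] for ti in t]

_, _, ll_lin = fit_poisson(X_lin, counts)
beta, mu, ll_quad = fit_poisson(X_quad, counts)

# Likelihood-ratio test for the quadratic term (chi-square, 1 df).
lr_stat = max(2.0 * (ll_quad - ll_lin), 0.0)
p_value = math.erfc(math.sqrt(lr_stat / 2.0))

# Pearson chi-square divided by residual df; values well above 1
# suggest overdispersion relative to Poisson.
pearson = sum((y - m) ** 2 / m for y, m in zip(counts, mu))
dispersion = pearson / (n - len(beta))

print("LR statistic:", round(lr_stat, 3), "p-value:", round(p_value, 4))
print("Pearson dispersion:", round(dispersion, 3))
```

Centering the year index before squaring it keeps the linear and quadratic terms from being strongly correlated. If the dispersion statistic comes out well above 1, that would be one argument for a negative binomial model over Poisson; with only 11 points, though, both the test and the dispersion estimate are rough.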

The issue here is that while the test of model effects is significant, the omnibus test is not? The confidence interval also crosses zero. It looks as though this model is a bust, which led me to think that I am doing something wrong.

Not sure what my next step should be at this point. I thought I should check in case I am barking up the wrong tree entirely.

Questions:
1) Am I correct in that the curve estimation is a good first step, but would not be appropriate as a ‘final answer’ for obtaining a p-value for a shape of a count curve?

2) Is negative binomial analysis what I am looking for here?

3) If so, what would be my next step?

So many grateful thanks for any help you can give me, even if it is simply directing me to resources!

I am using SPSS 22. Reasonably familiar with other regressions (more logistic, and more Cox survival analysis).

#### Bcteagirl

##### New Member
Looking at this again... would a time series analysis be better? 'SPSS curve estimation can be used with time series data to generate curve estimation regression models'. It sounds as if I can use this model after checking the assumptions, rather than starting over with Poisson or a separate time series analysis?

Edit: Ok at this point I am treating the SPSS curve estimation as a time series, and have tested autocorrelation and found none using:

1) Runs test of residuals (Around mean or around median: I am assuming mean would be what I would use) is non-sig.
2) Graphically, no evidence of autocorrelation (or partial autocorrelation) outside the Bartlett two-standard-error bars.
3) Sig tests associated with this in SPSS non significant except for one, which is .44 or so... so I am hoping I am good here since the other tests are negative.
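The checks listed above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the residuals are made up (not from this analysis), `runs_test` and `autocorr` are hypothetical helper names, the runs test uses the large-sample normal approximation (rough with 11 points), values equal to the cut point are counted with the upper group, and the ±2/sqrt(n) bounds are a simple white-noise approximation that may not match the bars SPSS draws.

```python
import math


def runs_test(values, cut=None):
    """Wald-Wolfowitz runs test around `cut` (default: the mean).

    Returns (runs, expected_runs, z, two_sided_p) using the
    normal approximation.
    """
    if cut is None:
        cut = sum(values) / len(values)
    signs = [v >= cut for v in values]  # ties go to the upper group
    n1 = sum(signs)
    n2 = len(signs) - n1
    if n1 == 0 or n2 == 0:
        raise ValueError("all values fall on one side of the cut point")
    runs = 1 + sum(a != b for a, b in zip(signs, signs[1:]))
    n = n1 + n2
    expected = 2.0 * n1 * n2 / n + 1.0
    variance = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n * n * (n - 1))
    if variance <= 0:
        raise ValueError("too few observations for the normal approximation")
    z = (runs - expected) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return runs, expected, z, p


def autocorr(values, lag):
    """Sample autocorrelation at the given lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    num = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag))
    return num / denom


# Hypothetical residuals from a fitted trend (NOT from this analysis).
residuals = [1.2, -0.8, 0.5, -1.1, 0.9, 0.3, -0.6, 0.4, -0.7, 0.2, -0.3]

runs, expected, z, p = runs_test(residuals)
print("runs:", runs, "expected:", round(expected, 2), "p:", round(p, 3))

bound = 2.0 / math.sqrt(len(residuals))  # approximate white-noise bound
for lag in (1, 2, 3):
    r = autocorr(residuals, lag)
    flag = "outside" if abs(r) > bound else "inside"
    print("lag", lag, "r =", round(r, 3), flag, "the +/-", round(bound, 3), "bounds")
```

With so few observations, neither check has much power, so a non-significant result is weak evidence of no autocorrelation rather than proof of it.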
