Include Month Variable in GLM?

Buckeye

Active Member
#1
Hi, I think I've confused myself on this topic. In general, is it acceptable practice to include a month variable in cross-sectional data for a non-time-series application? I've collected two years' worth of data in a single instance. I'm studying repair times for vehicles, and I believe there may be month-to-month variability and certainly a COVID impact. I'm not building a time series model, i.e., I don't care about forecasting future repair times. I simply want to create categorical variables for spring, summer, fall, winter, etc.

Thanks
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
Yeah, that seems fine. I will think about it more later, but it should be OK. For clarification, you wouldn't have cross-sectional data, correct? You just have retrospective data. You could have collected it at a cross-section of time and had people recall stuff, which could introduce recall bias. I guess that could be a scenario as well.

But if the data was pre-existing and was originally collected when it was generated, or at a set lag after the service, that would be retrospective. Not trying to get in the weeds, just clarifying.
 

Buckeye

Active Member
#3
Ya, that's the case. The data is pre-existing, and I decided on an end date for the data pull, which is also in the past. It's all in our databases.
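
For anyone landing on this thread later, here is a minimal sketch of the season and COVID indicators described in post #1. It only builds the predictor columns and does not fit anything. Everything named here is an assumption for illustration: the month-to-season mapping (Northern Hemisphere, meteorological seasons), the COVID cutoff date (the day the WHO declared a pandemic, which may not match when repair operations were actually affected), the choice of winter as the reference level, and the example repair dates.

```python
from datetime import date

# Assumption: meteorological seasons, Northern Hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}
SEASONS = ("spring", "summer", "fall", "winter")

# Assumption: illustrative cutoff only; choose the date that fits your data.
COVID_START = date(2020, 3, 11)


def season_of(d):
    """Map a repair date to its season label."""
    return SEASON_BY_MONTH[d.month]


def design_row(d, reference="winter"):
    """Dummy-code season (reference level dropped) plus a 0/1 COVID flag."""
    row = {
        f"season_{s}": int(season_of(d) == s)
        for s in SEASONS
        if s != reference
    }
    row["covid"] = int(d >= COVID_START)
    return row


# Hypothetical repair dates, not real data.
repair_dates = [date(2019, 7, 4), date(2020, 1, 15), date(2020, 10, 2)]
rows = [design_row(d) for d in repair_dates]
```

The resulting columns can be passed as ordinary categorical predictors to whatever GLM routine you use. Dropping one season as the reference level avoids perfect collinearity with the intercept.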