ar.ols(df$y, order.max = 1)

ar.ols(df$y, order.max = 2)

My dataset is as follows: I have yearly data and calculate generational averages, where one generation equals 30 years. To allow for overlapping generations, I calculate yearly moving averages. For example, the y of 1915 contains the average y for the generation 1900-1930, the y of 1916 the average y for the generation 1901-1931, and so on. For my previous analysis this worked fine.
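To make the construction concrete, here is a minimal sketch of such a moving average in R. The data frame `raw` and the columns `year` and `y_raw` are made-up names with simulated values, and the window is assumed to include both endpoints (1900 through 1930, i.e. 31 yearly values), which may differ from how the averages were actually computed.

```r
# Hypothetical yearly data; `raw`, `year` and `y_raw` are illustrative names.
set.seed(1)
raw <- data.frame(year = 1870:1990, y_raw = rnorm(121))

# Centered 31-term moving average (assumption: endpoints inclusive),
# so the value stored at 1915 averages the years 1900..1930.
w <- rep(1 / 31, 31)
raw$y <- as.numeric(stats::filter(raw$y_raw, w, sides = 2))

# The first and last 15 years have no complete window and stay NA.
raw[raw$year == 1915, ]
```

With this layout, neighbouring rows share 30 of their 31 underlying yearly values.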

Now I want to determine the correlation between two subsequent generations (an AR(1) process), or alternatively fit an AR(2) process. For example, I want to regress the y of 1915 on that of 1885 (the latter being the average for the generation 1870-1900) to find the correlation between the two generations. In a previous post (see my related post) I got some great advice on how to run these models using the moving averages.
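For illustration, the regression of the 1915 value on the 1885 value can be written as a lag-30 regression. This is only a sketch: the `df` below is simulated, and the columns `y_lag30` and `y_lag60` are hypothetical helpers. Note that `ar.ols()` applied directly to the yearly series uses a lag of one year, not one generation, so the lag has to be built explicitly.

```r
# Hypothetical yearly series of generation averages (simulated values).
set.seed(2)
df <- data.frame(year = 1885:2005, y = rnorm(121))

# Align each year with the value 30 (and 60) years earlier;
# years without a matching earlier observation become NA.
df$y_lag30 <- df$y[match(df$year - 30, df$year)]
df$y_lag60 <- df$y[match(df$year - 60, df$year)]

# "AR(1)" and "AR(2)" in generation time; lm() drops the NA rows by default.
fit1 <- lm(y ~ y_lag30, data = df)
fit2 <- lm(y ~ y_lag30 + y_lag60, data = df)
coef(fit1)
coef(fit2)
```

Because consecutive rows are built from overlapping windows, the usual standard errors from `lm()` are likely too optimistic here.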

However, I discovered that my moving-average approach has a drawback: it creates time dependence in the data even where there is none in the underlying series. I therefore propose using sharply separated (non-overlapping) generations for the AR(1) calculation. In my opinion, there should be no problem if I reduce the dataset so that it contains only the data point 1915 (the average of the generation 1900-1930), the data point 1945 (the average of the generation 1930-1960), and so on. Is that correct? Can I then estimate an AR(1) process for group means (or generation means) with this reduced dataset?
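A small simulation sketches both points. It assumes white-noise yearly data (no true dependence), a deliberately long hypothetical series of 3000 years, and disjoint 30-year blocks. In `ar.ols()`, `aic = FALSE` is set so that the order is fixed at `order.max` rather than chosen by AIC.

```r
set.seed(3)
n_years <- 3000                 # deliberately long, purely for illustration
year  <- seq_len(n_years)
noise <- rnorm(n_years)         # white noise: no true time dependence

# Overlapping 31-year centered averages, as in the moving-average approach.
ma <- as.numeric(stats::filter(noise, rep(1 / 31, 31), sides = 2))

# The lag-1 autocorrelation of the smoothed series is high purely by construction.
acf_ma <- acf(ma[!is.na(ma)], lag.max = 1, plot = FALSE)$acf[2]
acf_ma

# Sharp generations: means over disjoint 30-year blocks (no shared years).
gen <- as.numeric(tapply(noise, (year - 1) %/% 30, mean))

# AR(1) on the generation means; aic = FALSE forces order 1.
fit <- ar.ols(gen, order.max = 1, aic = FALSE)
fit
```

Two caveats on the proposal itself: the windows 1900-1930 and 1930-1960 still share the year 1930 if endpoints are inclusive, and a real series reduced this way will contain only a handful of generation means, so the AR(1) estimate will be imprecise.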