Hello,

I am studying predator-prey relationships using field data collected biannually over 5 years. For example, here is some abundance data:

pred,prey
23.0,0.1
31.5,0.7
18.0,1.4
26.0,2.6
25.5,3.3
36.7,5.0
52.3,20.9
38.7,11.1
47.0,13.9
43.3,13.7

I wish to estimate the numerical response (the slope of the linear regression predator ~ prey). I am using R for statistical computing.
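For anyone who wants to reproduce the output below, here is a minimal sketch that types the posted values into a data frame. The name `mydata` is taken from the lm() call later in this post; the assumption that the rows are in survey order is mine.

```r
# Posted abundance data, assumed to be in survey order (10 biannual surveys).
# `mydata` is the data frame name used in the lm() call in this post.
mydata <- data.frame(
  pred = c(23.0, 31.5, 18.0, 26.0, 25.5, 36.7, 52.3, 38.7, 47.0, 43.3),
  prey = c(0.1, 0.7, 1.4, 2.6, 3.3, 5.0, 20.9, 11.1, 13.9, 13.7)
)

# Quick structural check: two numeric columns, ten rows
str(mydata)
```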

I checked the cross-correlation. It peaks at lag 0, so there is no lagged response:

Autocorrelations of series predXprey, by lag

    -3     -2     -1      0      1      2      3
 0.121  0.278  0.599  0.925  0.463  0.468  0.131
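A cross-correlation check of this kind can be sketched with base R's ccf(). This is only an illustration on the values typed from the post, not necessarily the exact call behind the output above; `cc` and `best_lag` are names I made up.

```r
mydata <- data.frame(
  pred = c(23.0, 31.5, 18.0, 26.0, 25.5, 36.7, 52.3, 38.7, 47.0, 43.3),
  prey = c(0.1, 0.7, 1.4, 2.6, 3.3, 5.0, 20.9, 11.1, 13.9, 13.7)
)

# Sample cross-correlation of pred and prey for lags -3..3, without plotting
cc <- ccf(mydata$pred, mydata$prey, lag.max = 3, plot = FALSE)
print(cc)

# Lag at which the cross-correlation is largest
best_lag <- cc$lag[which.max(cc$acf)]
print(best_lag)
```

With only 10 observations, each lagged correlation is computed from even fewer pairs, so the off-zero lags are very noisy.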

Both series are autocorrelated, and the partial ACF suggested a first-order process:

Autocorrelations of series pred, by lag

     0      1      2      3      4      5
 1.000  0.489  0.396  0.074 -0.288 -0.293

Durbin-Watson test

data:  pred ~ 1
DW = 0.8386, p-value = 0.01694
alternative hypothesis: true autocorrelation is greater than 0

Autocorrelations of series prey, by lag

     0      1      2      3      4      5
 1.000  0.502  0.347  0.146 -0.207 -0.331

Durbin-Watson test

data:  prey ~ 1
DW = 0.7899, p-value = 0.01272
alternative hypothesis: true autocorrelation is greater than 0
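The per-series checks above can be sketched in base R as follows. The helper `dw_stat` is a hypothetical name of mine; it computes only the Durbin-Watson statistic, sum(diff(e)^2) / sum(e^2), not the p-value. The p-values in the output above would need a package test function (I am assuming something like dwtest() from the lmtest package was used, which this sketch does not depend on).

```r
mydata <- data.frame(
  pred = c(23.0, 31.5, 18.0, 26.0, 25.5, 36.7, 52.3, 38.7, 47.0, 43.3),
  prey = c(0.1, 0.7, 1.4, 2.6, 3.3, 5.0, 20.9, 11.1, 13.9, 13.7)
)

# Durbin-Watson statistic for a residual vector (hypothetical helper)
dw_stat <- function(e) sum(diff(e)^2) / sum(e^2)

for (v in c("pred", "prey")) {
  x <- mydata[[v]]
  cat("Series:", v, "\n")
  # Sample autocorrelations and partial autocorrelations up to lag 5
  print(round(drop(acf(x, lag.max = 5, plot = FALSE)$acf), 3))
  print(round(drop(pacf(x, lag.max = 5, plot = FALSE)$acf), 3))
  # Residuals of the intercept-only model x ~ 1 are the mean-centred series
  cat("DW =", round(dw_stat(residuals(lm(x ~ 1))), 4), "\n\n")
}

# Values kept for inspection
acf1_pred <- acf(mydata$pred, lag.max = 5, plot = FALSE)$acf[2]
dw_pred <- dw_stat(mydata$pred - mean(mydata$pred))
```

A DW value well below 2 points to positive first-order autocorrelation, which is consistent with the lag-1 autocorrelations of about 0.5.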

Here's the fitted model:

Call:
lm(formula = pred ~ prey, data = mydata)

Residuals:
    Min      1Q  Median      3Q     Max
-7.6614 -1.6837 -0.8583  2.2727  6.8586

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  23.6214     2.1015  11.240 3.52e-06 ***
prey          1.4571     0.2116   6.885 0.000126 ***

Residual standard error: 4.534 on 8 degrees of freedom
Multiple R-squared: 0.8556, Adjusted R-squared: 0.8376
F-statistic: 47.4 on 1 and 8 DF, p-value: 0.0001264
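The fit can be reproduced approximately with the sketch below. The values typed from the post appear to be rounded, so expect the estimates to differ slightly from the output above (the slope comes out near 1.46 rather than exactly 1.4571).

```r
mydata <- data.frame(
  pred = c(23.0, 31.5, 18.0, 26.0, 25.5, 36.7, 52.3, 38.7, 47.0, 43.3),
  prey = c(0.1, 0.7, 1.4, 2.6, 3.3, 5.0, 20.9, 11.1, 13.9, 13.7)
)

# Ordinary least squares: predator abundance as a linear function of prey
model <- lm(pred ~ prey, data = mydata)

# Coefficient table (estimate, standard error, t value, p-value)
print(coef(summary(model)))

# The slope is the estimated numerical response
slope <- unname(coef(model)["prey"])
r_squared <- summary(model)$r.squared
```

The standard errors and p-values printed here assume independent errors, which is exactly the assumption in question for these series.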

There was no significant autocorrelation in the linear model residuals, although it is not easy to estimate autocorrelation from such a short time series. I could fit a generalised least squares model with AR(1) errors, but I don't think it is required.

Autocorrelations of series model$residuals, by lag

     0      1      2      3      4      5
 1.000 -0.435  0.018 -0.239  0.279 -0.067

Durbin-Watson test

data:  model
DW = 2.8667, p-value = 0.09089
alternative hypothesis: true autocorrelation is less than 0
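The residual check, and the GLS alternative with AR(1) errors mentioned above, can be sketched like this. The GLS part assumes the nlme package is installed (it ships with standard R distributions) and is skipped otherwise; `fit_gls` and `dw_resid` are names I made up, and results will differ slightly from the posted output because the typed values appear to be rounded.

```r
mydata <- data.frame(
  pred = c(23.0, 31.5, 18.0, 26.0, 25.5, 36.7, 52.3, 38.7, 47.0, 43.3),
  prey = c(0.1, 0.7, 1.4, 2.6, 3.3, 5.0, 20.9, 11.1, 13.9, 13.7)
)
model <- lm(pred ~ prey, data = mydata)
e <- residuals(model)

# Residual autocorrelations up to lag 5
print(round(drop(acf(e, lag.max = 5, plot = FALSE)$acf), 3))

# Durbin-Watson statistic; values above 2 indicate negative autocorrelation
dw_resid <- sum(diff(e)^2) / sum(e^2)
print(dw_resid)

# Optional: same regression with AR(1) errors, rows taken in survey order
if (requireNamespace("nlme", quietly = TRUE)) {
  fit_gls <- nlme::gls(pred ~ prey, data = mydata,
                       correlation = nlme::corAR1(form = ~ 1))
  print(summary(fit_gls)$tTable)
}
```

Comparing the GLS slope and standard error with the OLS ones gives a rough sense of how sensitive the inference is to the error structure, although with T = 10 the AR(1) parameter itself is poorly estimated.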

I've been reading about autocorrelation and regressions. It seems that pred and prey are cointegrated. The strong autocorrelation in the prey series is driving the strong autocorrelation in the pred series. There is no autocorrelation in the pred-prey model. The regression is not spurious.

So how do I interpret the Ordinary Least Squares (OLS) regression results?

R-squared is just the square of the Pearson correlation (0.925^2 = 0.856), and its significance must be wrong, because autocorrelated observations are not independent.

I've read that the OLS estimates for cointegrated regressions are unbiased but the t-statistic diverges at rate T^(1/2) for I(1) processes, where T is the time series length. Type I errors can result. A rough corrected t-statistic and P-value for the slope estimate is:

> 6.885/(sqrt(10))
[1] 2.177
> 2*(1-pt(2.177228, 8))
[1] 0.061

But this statistic surely does not precisely follow a t-distribution.

Is there a simple way to make inferences from my data? Some limitations and problems I can see:

1) short time series (T = 10)

2) fractional autocorrelation

3) deterministic time trend (at least in the first few years where there is strong prey population growth)

4) cointegration

Most of my references are from econometrics, where they have long time series.

Stephen.
