Edit: I agree with Jesper. Why are you including both income and GDP? What's the motivation? You should also consider adding interest rates, which are quite common in the literature.

Edit: Percentages versus dollars doesn't matter. Just take the natural log of all the variables and they'll be on a similar scale (it's 100% standard to log these variables in cointegration analysis).
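For instance (a minimal sketch in Python/pandas — the column names and numbers are purely illustrative, not your data):

```python
import numpy as np
import pandas as pd

# Illustrative data: one series in billions of dollars, one in dollars
df = pd.DataFrame({
    "gdp": [18_000.0, 18_500.0, 19_100.0],     # billions of dollars
    "income": [45_000.0, 46_200.0, 47_500.0],  # dollars
})

# Natural log puts both series on a comparable scale,
# and differences in logs approximate growth rates
log_df = np.log(df)
print(log_df.round(3))
```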

You don't want OLS with these variables. First, make sure that your GDP and income data are in real (constant-price) terms rather than nominal (current-price) terms. Your data are already in this form, so that's good. This matters because a sufficiently powered statistical test will always detect a relationship between inflation and any nominal series, since inflation constitutes a portion of that series' variance. You might also consider adding stock market levels to the set of variables if that aligns with your research question.

These variables are non-stationary and most likely cointegrated. You should first run a Johansen test, with the lag length determined by AIC (inconsistent, but the best choice for this test) over a VAR in the log-levels of the data. Granger argues against using log-levels for monthly or quarterly economic data, but it seems to be accepted practice in order to have comparable scales and to limit the extent of uneven growth rates in the levels series.

If you determine that the cointegrating rank is 1 or 2 through the maximum eigenvalue and trace tests, then you want to specify a VECM (with the same lag length as determined in the preceding step). Estimate it with OLS, imposing the cointegrating vector that corresponds to the largest eigenvalue from step one. Once the VECM is specified, look at the statistical significance, magnitude, and sign of the error correction term's coefficient estimate in all three equations of your VECM system. This tells you which variable is responsible for adjusting to, and thereby correcting, any disequilibria.

Important: include an unrestricted constant in the restricted VECM model so that you account for the growth (drift) in your GDP and income variables. Time trends are not needed but are sometimes used in the literature.

If you determine that your cointegrating matrix is full rank, then something has gone wrong: a full-rank matrix implies that the variables are stationary in levels, which contradicts them being I(1).

Once you've done this, it's good to look at the forecast error variance decompositions and the impulse response functions of the VECM. These will tell you the speed and direction of the response of one series to an unexpected unit (or one standard deviation) shock in another series. They may also show that one variable in the system does not react to shocks to another variable, which is an interesting finding in itself.

If you don't detect a cointegrating relationship in step one, you should simply specify a VAR in differences and conduct impulse response analysis. You cannot estimate IRFs from a VAR in levels (VARL) when the variables aren't cointegrated, because the IRF estimator will not be root-n-consistent and will converge to a random variable instead of the population value; the finite sample properties are not pretty either (though one working paper suggests it's fine for < 10 time steps). Alternatively, you could apply Johansen (2000) and Joyeux (2007) and specify an exogenous break in the trend or levels of the cointegrating equation or the unrestricted constant. If there is no gigantic financial crisis or other "obvious" exogenous shift, you should determine the break endogenously with Lütkepohl, Saikkonen and Trenkler (2004). Another way would be to go back to Engle-Granger (1987) and use the Gregory and Hansen (1996) method for detecting an endogenous level shift. Alternatively, you could use the Seo (2006) test of a null of no linear cointegration against an alternative of threshold cointegration. Note that even if you find linear cointegration in step one, it's good to use Hansen and Seo (2002) to test a null of linear cointegration against an alternative of threshold cointegration, so that you're modelling the true data generating process as accurately as possible.

Once you've done this, it's interesting to look at the "causal" properties of the system through Granger causality and instantaneous causality. Both are tests of whether the MSE of an optimal prediction of one series is improved by adding the next-period observation of another series to the information set (instantaneous causality) or by adding the entire history of another series to the information set (Granger causality). To test for Granger causality, it's by far the best to apply Toda and Yamamoto (1995): run a VAR in levels of order p + m + l, where p is the optimal lag length by SBIC for the VAR in levels (VARL), m is the maximum order of integration of any of the variables (i.e., 1 in the data that you're using), and l is any additional lags needed to ensure that the residuals are not serially correlated according to an LM test. This procedure ensures that the Wald statistic for the test that the first p + l lags of one series are jointly zero is asymptotically chi-squared distributed (and it also has nice finite sample properties). Once you reject the null hypothesis, you have established Granger causality.

For instantaneous causality, look at the causality() function in R's vars package.

Advanced modelling options that you may be interested in (depending on the data, theory and research question) include: time-varying cointegration using Chebyshev polynomials (http://econ.la.psu.edu/~hbierens/TVCOINT.PDF), for which Matlab and GAUSS code is available; Granger causality with structural breaks; non-linear Granger causality; controlling for exogenous variables; pre-processing the time series before any analysis using signal processing methods (e.g., extracting a cyclical I(0) component from an I(1) trending variable); Markov-switching VECMs; structural VECs/VARs; having the regime state be determined by an exogenous variable; and smooth transition VARs/VECs.