Paired or unpaired test?

Forum experts,

I'm doing some environmental data analysis and I'm struggling with the correct statistical approach. I have data from four 3-week studies in which a number of parameters were collected both upwind and downwind of an emissions site. I have 1-min data that I have also averaged into 1-hour intervals. I want to determine whether the difference between the upwind and downwind measurements during these time periods was statistically significant.

So far:
My sites are East and West of the emissions source, about 1/2 km away in each direction (>1km distance between sites). I have grouped data into easterly and westerly flow regimes to separate the upwind and downwind observations from the rest of the data (I used 30 degree bins centered on 90 and 270 degrees).
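To make the binning concrete, here is a rough sketch of how I sort each record by wind direction. The function name `flow_regime` and the inclusive +/- 15 degree edges are only illustrative assumptions (a 30 degree bin read as 75 to 105 and 255 to 285 degrees), not the exact code I ran.

```python
def flow_regime(wind_dir_deg, half_width=15.0):
    """Classify a wind direction (degrees, direction the wind blows FROM).

    Assumption: a 30-degree bin means +/- 15 degrees around the bin centre,
    with the edges counted as inside the bin.
    """
    d = wind_dir_deg % 360.0  # wrap values such as 450 or -90 into [0, 360)
    if abs(d - 90.0) <= half_width:
        return "easterly"   # East site is upwind, West site is downwind
    if abs(d - 270.0) <= half_width:
        return "westerly"   # West site is upwind, East site is downwind
    return None             # outside both bins: record is not used


# Hypothetical wind directions, for illustration only
regimes = [flow_regime(d) for d in (92.0, 280.0, 180.0)]
```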

Is it appropriate to use a paired comparison test (I'm looking at the paired t-test or the Mann-Whitney test)? The sites are not collocated, which means there is transport time between the samples measured at either location. I was thinking I could do the test on the 1-hour averages to smooth out the effects of transport time between sites.

Many thanks for any help or advice you can give me!



Dear Cassarch

I think that you could probably use a paired t-test, if I understand your study design correctly.

You would compare upwind versus downwind measurements for each 1-hour time interval.

I think you are right that you have to make sure the time interval is much greater than the transport time. You might even need to use 2 h intervals, depending on the wind speed and transport time. I have no idea, however, how large the time intervals would have to be relative to the wind speed to smooth out these effects (I'd guess that if the transport time over the roughly 1 km between sites is 5 minutes, i.e. wind speed = 12 kph, 1 h intervals are reasonable).

I guess that you would pair the data from each hourly (or other) time interval, calculate the difference between the two values measured, and then perform a one-sample t-test on those differences against the null hypothesis that the mean difference = 0.
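A minimal sketch of that calculation, in plain Python, might look like the following. The function name `paired_t_statistic` and the numbers are made up for illustration; they are not your data. It only computes the t statistic and degrees of freedom, so the p-value would still have to come from a t distribution with n - 1 degrees of freedom (a statistics package such as SciPy's `scipy.stats.ttest_rel` does the whole thing in one call).

```python
import math
import statistics


def paired_t_statistic(upwind, downwind):
    """Return (mean difference, t statistic, degrees of freedom).

    Each position in the two lists must be the same time interval,
    e.g. the same 1-hour average at the upwind and downwind site.
    """
    if len(upwind) != len(downwind):
        raise ValueError("paired samples must have equal length")
    diffs = [d - u for u, d in zip(upwind, downwind)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    t_stat = mean_diff / (sd_diff / math.sqrt(n))  # H0: mean difference = 0
    return mean_diff, t_stat, n - 1


# Hypothetical hourly averages (made-up values, illustration only)
upwind = [4.0, 5.0, 6.0, 5.0]
downwind = [6.0, 6.0, 9.0, 7.0]
mean_diff, t_stat, df = paired_t_statistic(upwind, downwind)
```

Note that this treats the hourly differences as independent of one another, which is an assumption worth checking for time series data.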

As long as the sample size is > 30 or so (which I guess it is, from your description), the t-test should be fine due to the central limit theorem, even if the underlying data are not normally distributed (I think the central limit theorem can often be applied with sample sizes as small as 10 or so).

I hope this helps. I'd be interested to know if anyone else agrees.

You might also want to look at whether the magnitude of the differences between measurements is correlated with wind speed, as I suspect it might be.
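As a rough sketch of that check, you could compute a Pearson correlation coefficient between the absolute hourly difference and the hourly mean wind speed. The function name `pearson_r` and the values below are invented for illustration only, and Pearson's r only picks up a linear relationship.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length (at least 2 values)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical hourly values (made up, not real measurements)
wind_speed = [1.0, 2.0, 3.0, 4.0]   # hourly mean wind speed
abs_diff = [8.0, 6.0, 4.0, 2.0]     # |downwind - upwind| for the same hours
r = pearson_r(wind_speed, abs_diff)
```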

With best wishes

Thanks for the response, Simon!

Another question:

Someone asked if I was going to consider autocorrelation of the data. I'm not sure I understand the concept of autocorrelation or how it would apply to these data. Would this be similar to looking at something like the gaseous data (ammonia in this case) compared to wind speed, as you suggested?