Comparing pooled variances?

Hi all,

I have a problem where I want to show that the data belonging to individuals of one group has a higher variance than the data of the other group.

The experiment involves a parameter of movement that changed in one group of organisms, but without a clear trend. The parameter remained stable in individuals belonging to the control group, whereas in the test group it fluctuated both up and down depending on the individual. I want to show that it changes, but this is hard to test because I am comparing variances rather than a trend (where regression could be used).

There is a significant effect there. In a crude (and weak) analysis I computed the variance for each individual organism, and then compared the variances between the control and test groups in a simple independent-samples t-test using SPSS. The data were "normally" distributed with unequal variances (although in this case the variance refers to the variance of the variances... quite confusing), and the t-test showed a significant difference... just. The results of this t-test comparison are attached.

A few things bothered me about this analysis:
-The data for each individual organism is compressed into a single value, weakening the comparison.
-It is a two-stage test. A statistical test of statistical outputs: not very beautiful.
-The t-test used is quite basic and antiquated. A more robust test would be preferable.

Would it be possible to compare the variances of two populations when the data is stored per individual?*

*unfortunately pooling the data prior to calculating variance isn't possible

Alternatively, would there be some kind of sideways manner to attack this problem such as converting the data somehow?



Well-Known Member
The original question wasn't so clear; I thought some example data might help me or others to understand.
I see, then I will try to outline the principles: the experiment was to try to quantify changes in movement caused by disease. The data here describes the ratio between two parameters of movement: essentially whether the movement is fast and jerky or slower and laborious. A series of measurements was taken over a period of days to quantify this change in the movement. Each organism tested thus has a number of values (between 3 and 7) depending on the duration of the experiment (which was decided by how long the disease took).

What was found in the experimental group (n=18) is that from day to day the value would change, and quite drastically in some cases. However, not all organisms reacted in the same way, so there was no overall trend when the data was pooled together. In some cases the value would increase throughout the experiment, in others it would decrease, and in some it would increase and decrease depending on the day.

To take a couple of representative individuals (to 2 significant figures):

Experimental #1:
Day 0 - 3.3
Day 1 - 3.6
Day 2 - 3.5
Day 3 - 3.8
Day 4 - 2.4
Day 5 - 2.9

Variance = 0.24

Experimental #2:
Day 0 - 2.7
Day 1 - 2.4
Day 2 - 3.2
Day 3 - 3.9
Day 4 - 3.0
Day 5 - 3.8

Variance = 0.34

.... and then of the control group (n=10):

Control #1:
Day 0 - 3.4
Day 1 - 3.4
Day 2 - 3.0
Day 3 - 3.0
Day 4 - 3.4
Day 5 - 3.3

Variance = 0.04

Control #2:
Day 0 - 3.8
Day 1 - 3.7
Day 2 - 4.1
Day 3 - 3.3
Day 4 - 3.9
Day 5 - 3.7

Variance = 0.06

If one was to compare the means of the variances, the mean variance of the experimental group was 0.17, and 0.07 for the control group. I want to prove statistically that the variance of this value is higher in the experimental group. I did a t-test to compare the means of the variances, which was significant, but not very robust.
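
For anyone following along, the two-stage comparison described above (one variance per individual, then a t-test on those variances) could be sketched roughly as follows in Python. This is only an illustration on the four example individuals listed above, not the SPSS analysis itself: the function names are made up for the sketch, Welch's unequal-variance form of the t statistic is assumed, and because the values are rounded the recomputed variances will not exactly match the ones quoted.

```python
from math import sqrt
from statistics import mean, variance

# Example individuals quoted in the thread (values rounded to 2 significant figures).
experimental = {
    "Experimental #1": [3.3, 3.6, 3.5, 3.8, 2.4, 2.9],
    "Experimental #2": [2.7, 2.4, 3.2, 3.9, 3.0, 3.8],
}
control = {
    "Control #1": [3.4, 3.4, 3.0, 3.0, 3.4, 3.3],
    "Control #2": [3.8, 3.7, 4.1, 3.3, 3.9, 3.7],
}


def per_individual_variances(group):
    """Stage 1: sample variance (n - 1 denominator) of each individual's series."""
    return {name: variance(values) for name, values in group.items()}


def welch_t(a, b):
    """Stage 2: Welch's t statistic and approximate degrees of freedom."""
    se_a = variance(a) / len(a)
    se_b = variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(se_a + se_b)
    df = (se_a + se_b) ** 2 / (se_a ** 2 / (len(a) - 1) + se_b ** 2 / (len(b) - 1))
    return t, df


exp_vars = per_individual_variances(experimental)
ctl_vars = per_individual_variances(control)

# The t-test is run on the variances themselves, one value per individual.
t_stat, df = welch_t(list(exp_vars.values()), list(ctl_vars.values()))

print(exp_vars)
print(ctl_vars)
print(t_stat, df)
```

With only two individuals per group the numbers here mean very little; with the full n=18 and n=10 the same two functions would apply, and the p-value would be read from a t distribution with `df` degrees of freedom.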

Would it be possible to compare the variance difference between the experimental and control groups, even though the data is split into individuals?

Many thanks for taking the time to read this! - It is always hard to get the head round new concepts, especially tests of tests and variances of variances. Maddening almost!
I gathered that the statistical requirements of this experiment are a little too specific (I am surprised that I couldn't find anything on the subject). For the purposes of people reading the thread who might be in a similar situation, though, I had a simple numerical workaround:

Simply put, what I did was to normalise the data from every individual so that it has a mean value of 1, which enabled me to pool the data from all the individuals. I then did a simple F-test of the pooled data which showed unequal variances between the test and control groups (hooray).

What I mean by normalising the data is that each data point for an individual would be multiplied by the reciprocal of that individual's mean. For example, the mean value of the first individual was about 3.25. The reciprocal of this is 0.308. A data point making up the mean was 3.55, which multiplied by 0.308 gives a "normalised" value of 1.09. This was repeated (using functions... not by hand!) for all individuals from the test and control groups, and the data pooled together.
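
A rough Python sketch of this workaround, for anyone wanting to try it: it divides each individual's values by that individual's mean, pools the results per group, and forms the ratio of the two pooled variances. The function names are invented for the illustration, only the four example individuals from earlier in the thread are used, and the sketch stops at the F ratio; a p-value would need an F distribution (for instance `scipy.stats.f.sf`, left out here to keep the example free of dependencies).

```python
from statistics import mean, variance


def normalise(values):
    """Multiply every value by the reciprocal of the individual's own mean."""
    scale = 1.0 / mean(values)
    return [v * scale for v in values]


def pooled_normalised(group):
    """Normalise each individual separately, then pool the group's values."""
    pooled = []
    for values in group:
        pooled.extend(normalise(values))
    return pooled


# Example individuals quoted in the thread (rounded values).
experimental = [
    [3.3, 3.6, 3.5, 3.8, 2.4, 2.9],
    [2.7, 2.4, 3.2, 3.9, 3.0, 3.8],
]
control = [
    [3.4, 3.4, 3.0, 3.0, 3.4, 3.3],
    [3.8, 3.7, 4.1, 3.3, 3.9, 3.7],
]

exp_pool = pooled_normalised(experimental)
ctl_pool = pooled_normalised(control)

# Ratio of pooled sample variances, larger group variance on top.
f_ratio = variance(exp_pool) / variance(ctl_pool)

# Degrees of freedom as a plain two-sample F-test would count them.
df_exp, df_ctl = len(exp_pool) - 1, len(ctl_pool) - 1

# The worked example from the post: 3.55 times the reciprocal 0.308.
worked_example = 3.55 * 0.308

print(f_ratio, df_exp, df_ctl)
print(round(worked_example, 2))
```

Every normalised individual has a mean of exactly 1, so the pooled data of each group is centred on 1 as well.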

The purpose of this is to bring the levels of different individuals together, whilst maintaining the variance. This enabled testing of the pooled variance (which wouldn't have been possible previously).


Ambassador to the humans
I don't think I understand - wouldn't multiplying by the reciprocal of the mean also impact the variances? Why not just subtract the mean instead of multiplying by the reciprocal if you wanted to eliminate the mean but preserve the variances?
The data points for each individual would be multiplied by the same number, so all the values could become bigger or smaller but will maintain the same size relative to each other (and thus the variance). The goal was to standardise (this is perhaps a more accurate word than normalise) the values of different individuals to the same level.

Subtracting the mean might work; in fact I think that it could hugely amplify any differences in variance. The problem would be that the perceived variance would scale with the mean. If one creature had much higher values, then after subtracting the mean the differences would still be relative to their initial size. This would make individuals with a higher mean appear to have more variance than those with a lower mean value, even if the variance "spread" is relatively the same. It wouldn't necessarily bias the results one way or another, but for my purposes I would rather play it safe and "fair".
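
To make the two options in this exchange concrete, here is a small Python sketch with two made-up individuals (not data from the experiment) whose fluctuations are the same in relative terms but whose means differ by a factor of two. Subtracting the mean leaves each individual's variance exactly as it was, whereas multiplying by the reciprocal of the mean divides the variance by the square of the mean, so what is kept is the relative spread rather than the variance itself.

```python
from statistics import mean, variance

# Hypothetical individuals: same relative fluctuation, different level.
low = [1.8, 2.0, 2.2]          # mean 2
high = [x * 2 for x in low]    # mean 4, i.e. 3.6, 4.0, 4.4


def centre(values):
    """Subtract the individual's mean from every value."""
    m = mean(values)
    return [v - m for v in values]


def scale(values):
    """Multiply every value by the reciprocal of the individual's mean."""
    m = mean(values)
    return [v / m for v in values]


for name, values in (("low", low), ("high", high)):
    print(
        name,
        "raw:", round(variance(values), 4),
        "centred:", round(variance(centre(values)), 4),
        "scaled:", round(variance(scale(values)), 4),
    )
```

In this toy case the centred variances still differ by a factor of four between the two individuals, while the scaled variances come out equal. Which behaviour is wanted depends on whether the fluctuations are expected to grow with the level of the measurement.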

What would be interesting would be to subtract the standardised mean from the standardised data. That will be tomorrow's job!