# Not Statistically Significant.... or is it?

#### Stinky

##### New Member
I am comparing 2 groups to see if there are differences in GPA across 4 years of high school. Group A has a higher mean for all 4 years, BUT... the results did not meet significance in any individual year (Independent Samples t-test).

Question:
Does the fact that a fractionally higher GPA was achieved across all 4 years by Group A over Group B mean anything?

#### hlsmith

##### Not a robit
Let us see the parameters, sample sizes, etc.!

#### rogojel

##### TS Contributor
hi,
I just did a t-test for each year. The differences in the means are so small compared to the std devs that there is no reason to believe they are due to anything but chance, in each year (p values of roughly 0.5).

Now, the question could be, how can you get 4 years where each time group A is better than group B, by chance? The probability of that would be 1/16 ≈ 0.06, so that would not be significant either.

BTW it is a bit suspicious that you have EXACTLY the same number of participants in each group, each year. Is that really so?

regards

#### Karabiner

##### TS Contributor
how can you get 4 years where each time group A is better than group B, by chance?
Or rather, what is the chance that one group will be better than the other in each of 4 trials (i.e. either A > B throughout, or B > A throughout) (2/16, I suppose)?

With kind regards

K.

#### rogojel

##### TS Contributor
Or rather, what is the chance that one group will be better than the other in each of 4 trials (i.e. either A > B throughout, or B > A throughout) (2/16, I suppose)?

With kind regards

K.
Hi,
I think that would be a different question. As I actually observed group A being ahead of group B in all 4 years, I would ask what the chances are of seeing A ahead of B in the absence of a systematic effect. A bit like a one-sided vs. two-sided test?
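The two numbers being debated (1/16 one-sided, 2/16 two-sided) can be checked with a small sign-test sketch. It assumes, as the coin-toss argument in this thread does, that each year is an independent 50/50 event when there is no systematic effect; the function name `sign_test_p` is made up for illustration.

```python
from math import comb


def sign_test_p(k_ahead: int, n_years: int, two_sided: bool = False) -> float:
    """Probability of seeing A ahead in at least k_ahead of n_years.

    Assumes each year is an independent fair coin toss under the null
    (an assumption of this sketch, not something the data guarantees).
    """
    total = 2 ** n_years
    # Upper tail: A ahead in k_ahead or more years.
    upper = sum(comb(n_years, k) for k in range(k_ahead, n_years + 1)) / total
    if not two_sided:
        return upper
    # Lower tail: the mirror-image outcome, B ahead in k_ahead or more years.
    lower = sum(comb(n_years, k) for k in range(0, n_years - k_ahead + 1)) / total
    return min(1.0, upper + lower)


print(sign_test_p(4, 4))                  # one-sided: A ahead in all 4 years
print(sign_test_p(4, 4, two_sided=True))  # two-sided: either group ahead in all 4
```

Neither value falls below the conventional 0.05 threshold, and with only 4 years it never can for the two-sided version.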

regards

#### Stinky

##### New Member
hi,
...
BTW it is a bit suspicious that you have EXACTLY the same number of participants in each group, each year. Is that really so?

regards
No--- that was by design. The "Yes" Group A n was used. Group B was a random sample (generated by SPSS) to match the n of A, taken from about 1,000 participants. BTW--- comparing the A to all ~1,000 of B produces the same result, with a slightly wider margin of difference.

It feels very UNlikely that Group A would have a (minimally) better mean value each year. The tiny difference can be dismissed for each individual year (per the t-test), but for that result to happen 4 years in a row is interesting to me. I need to address that in my findings, but I do not want to over-reach in my interpretations.... LOL.

What can I say about the fact that Group A had that higher score 4 years in a row...??

#### rogojel

##### TS Contributor
It has the same chance as getting 4 heads in a row in a coin toss: a probability of 1/16 ≈ 0.06. Nothing to get really excited about.

#### hlsmith

##### Not a robit
Two important questions. First, were people randomly selected to be in A, and what was their comparative GPA prior to, I am presuming, receiving an intervention?

Second, if the two groups are slightly different in 2012, guess what, they are going to continue to be different; each year is not independent of the prior year.

You need a baseline measure for both groups, then plot their GPA across time with 95% confidence intervals to visualize the time series.

Something like the following. Also, I don't see any harm in using the full B sample, unless a priori you said you were going to use the subsample. A greater sample size will equal smaller standard errors. Just report that you ran it both ways, and the full sample was the post hoc test.

http://www.jimmunol.org/content/jimmunol/196/12/5047/F9.large.jpg?width=800&height=600&carousel=1
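The per-year means and 95% confidence intervals needed for such a plot can be sketched as follows. The GPA values are invented purely for illustration (the real data were not posted), the helper name `mean_ci` is hypothetical, and the interval uses a normal approximation, which is reasonable only for fairly large groups.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(values, level=0.95):
    """Return (mean, lower, upper) for a normal-approximation interval."""
    n = len(values)
    m = mean(values)
    se = stdev(values) / sqrt(n)  # standard error shrinks as n grows
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return m, m - z * se, m + z * se


# Hypothetical GPAs for one group; NOT the data from this thread.
gpa_by_year = {
    2012: [2.1, 2.8, 3.0, 1.9, 2.5, 2.4],
    2013: [2.2, 2.7, 3.1, 2.0, 2.4, 2.6],
}

for year, values in gpa_by_year.items():
    m, lo, hi = mean_ci(values)
    print(f"{year}: mean={m:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

Running this once per group and plotting the two series with error bars shows directly whether the intervals overlap in each year. For small groups, a t-based critical value would be the more careful choice.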

#### hlsmith

##### Not a robit
P.S., If these are the same students across time, there is obvious dependence, BUT you also have to account for the attrition. Does it represent systematic bias? So if you lose people in A, did you then just randomly remove people from B? If so, that seems like a bad idea for sure.

#### Stinky

##### New Member
P.S., If these are the same students across time, there is obvious dependence, BUT you also have to account for the attrition. Does it represent systematic bias? So if you lose people in A, did you then just randomly remove people from B? If so, that seems like a bad idea for sure.
Each year had to be its own thing, and this is not a pair-wise study tracking individual change. The data shows a somewhat transient nature to students-- coming and going during the four years. If I used the same students and only those who had data for all four years, my n would probably drop by 20% ~ 40% --- and the similarity of the two groups would likely increase even more-- both groups would be committed students who show up to school and work to get decent grades, producing even tighter means.

I wanted to compare all "B" group students (~1,000) to the n of the A students, but was told I couldn't for this study. Those comparison stats look very similar-- would not have made a difference.

#### Stinky

##### New Member
Two important questions. First, were people randomly selected to be in A, and what was their comparative GPA prior to, I am presuming, receiving an intervention?

Second, if the two groups are slightly different in 2012, guess what, they are going to continue to be different; each year is not independent of the prior year.

You need a baseline measure for both groups, then plot their GPA across time with 95% confidence intervals to visualize the time series.

Something like the following. Also, I don't see any harm in using the full B sample, unless a priori you said you were going to use the subsample. A greater sample size will equal smaller standard errors. Just report that you ran it both ways, and the full sample was the post hoc test.

http://www.jimmunol.org/content/jimmunol/196/12/5047/F9.large.jpg?width=800&height=600&carousel=1
Group A students had one of three 8th grade teachers identified as having certain characteristics. The study looked for differences based on having been a student of one of those teachers. The exact makeup or cases of the n each year for Group A does change, as some students dropped, moved/changed schools, etc., and then some also came back to finish high school. I agree with you about the full sample for B-- but I was told to do a random sample of the available B cases that matched the n of the A group. Hah--- just trying to get through this, so I just did as I was told. The whole-group stats are very similar anyway-- not a different result:

EDIT: Even the attrition rate is very similar, with 2015 numbers at 79% of 2012 for A, and 77% of 2012 for B. I think this study shows that other factors play a greater role than the characteristics looked at here.

#### hlsmith

##### Not a robit
If students left and came, shouldn't the n-value stay more constant across years? I would call out whoever said you had to match the same number each year. I get why people think this is a good idea, but if you have the data use it. You say it doesn't make a difference, but it gets you closer to the truth.

Quite a few rigors are needed to accurately model these data. However, I think a big question is what will these data be used for. If the results will actually impact decisions, you need to make sure you adhere to best practices. If the results are for a class homework, well you may be able to settle for less than perfect.

P.S., It is a little surprising all of your random samples are bigger than the mean.

#### Stinky

##### New Member
If students left and came, shouldn't the n-value stay more constant across years? I would call out whoever said you had to match the same number each year. I get why people think this is a good idea, but if you have the data use it. You say it doesn't make a difference, but it gets you closer to the truth.

Quite a few rigors are needed to accurately model these data. However, I think a big question is what will these data be used for. If the results will actually impact decisions, you need to make sure you adhere to best practices. If the results are for a class homework, well you may be able to settle for less than perfect.

P.S., It is a little surprising all of your random samples are bigger than the mean.
Clearly more left the district and did not return than the inverse of that-- and I agree with you on the full use of all cases in the data. I also was, as you were, surprised that the random sample generated by SPSS for each study year was actually a higher and closer mean. In the end, these groups are just too similar to find significant differences. I am very appreciative of the discussion here-- it confirmed the thinking I had in writing this up. I also looked at attendance--- and those numbers were even tighter than GPA-- a fraction of a day difference between groups over a 180 day school year.

This data is part of a dissertation-- I hope to have the draft writing done today. I have a lot to talk about in my limitations section! back to work...

#### hlsmith

##### Not a robit
Thanks for sharing. Good luck.

P.S., There is a newer commercial book, "Weapons of Math Destruction". If you are interested in education, it has a recurring example about teacher ratings.

#### Dason

P.S., It is a little surprising all of your random samples are bigger than the mean.
Not 2014

#### hlsmith

##### Not a robit
I was referencing that the random sample means were larger than full sample mean:

2012: 2.42 > 2.33
2013: 2.43 > 2.35
2014: 2.51 > 2.43
2015: 2.70 > 2.62

#### Stinky

##### New Member
I was referencing that the random sample means were larger than full sample mean:

2012: 2.42 > 2.33
2013: 2.43 > 2.35
2014: 2.51 > 2.43
2015: 2.70 > 2.62
That is the target group (Group A) versus ALL other students rather than an equal n sample

#### GretaGarbo

##### Human
First I thought that this is a typical homework. And I thought that it was supposed to be solved in the following way:

1) Do the t-test for the difference of the GPA scores for each of the years. (What rogojel did above.) Then the standard error will be known (or at least estimated) and thus the variances of these differences.

2) What is the sum of these differences over the 4 years? And what is the variance of that sum? (The variance of the sum is the sum of the variances, assuming the differences have zero correlation.)

3) What is the mean of that sum (divide it by 4) and what is the variance of that overall mean. Then you will have the t-test for the overall mean.
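The three steps above can be sketched as a short calculation. The yearly differences below come from the means posted earlier in the thread (Group A minus all other students); the standard errors are hypothetical placeholders, since the real ones were not posted, so the printed result says nothing about the actual study. The function name `pooled_mean_difference` is made up, and a normal approximation stands in for the t distribution.

```python
from math import sqrt
from statistics import NormalDist


def pooled_mean_difference(diffs, std_errors):
    """Average the yearly differences and test that average against zero.

    Assumes the yearly differences are uncorrelated, so the variance of
    their sum is the sum of their variances (step 2 above).
    """
    k = len(diffs)
    mean_diff = sum(diffs) / k                    # step 3: mean of the sum
    var_sum = sum(se ** 2 for se in std_errors)   # step 2: variance of the sum
    se_mean = sqrt(var_sum) / k                   # SE of the overall mean
    z = mean_diff / se_mean
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return mean_diff, se_mean, z, p_two_sided


# Differences from the posted means: 2.42-2.33, 2.43-2.35, 2.51-2.43, 2.70-2.62.
diffs = [0.09, 0.08, 0.08, 0.08]
# HYPOTHETICAL standard errors, chosen only to make the sketch runnable.
std_errors = [0.10, 0.10, 0.10, 0.10]

mean_diff, se_mean, z, p = pooled_mean_difference(diffs, std_errors)
print(f"mean diff={mean_diff:.4f}, SE={se_mean:.4f}, z={z:.2f}, p={p:.3f}")
```

If the same students appear in several years, the zero-correlation assumption fails and the true standard error of the overall mean will be larger than this calculation suggests.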

A little bit interesting is that I have no intuition if this will be significant or not. Although I have some experience as a statistician.

Show us the result!