Is this possible?

New Member
I've got a set of data split into age groups, with the number of people in each age group. I want to test whether the number of people in one of the age groups is statistically significantly larger than the mean. Is this possible, and what test do I use? I've looked at using a t-test, but it makes no sense with a sample size of one.

For example:

Age Range         Frequency
>=15 and <=20     100
>20 and <=25      256
>25 and <=30      278
>30 and <=35      313
>35 and <=40      356
>40 and <=45      489
>45 and <=50      510
>50 and <=55      567
>55 and <=60      620
>60 and <=65      677
>65 and <=70      712
>70 and <=75      707
>75 and <=80      512
>80 and <=85      310
>85 and <=90      178
>90 and <=95      32
>95 and <=100     0

I want to find out whether one of the frequencies is statistically significantly different from the mean. I've scoured online but I can't find any examples of this type of problem at all. Or, rather than running a test of any sort, would I just work out the standard deviation and, assuming I want a 95% confidence level, say that any frequency more than 1.96 standard deviations above or below the mean is statistically significantly different?
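The 1.96-standard-deviation idea described above can be sketched in a few lines. This is only an illustration of the rule of thumb, not a recommended test: it treats the 17 bin counts as if they were independent draws from a normal distribution, which is an assumption that age-bin frequencies are unlikely to satisfy. The variable names are made up for the example.

```python
import statistics

# Frequencies from the table above, in age-bin order.
freqs = [100, 256, 278, 313, 356, 489, 510, 567, 620,
         677, 712, 707, 512, 310, 178, 32, 0]

mean = statistics.mean(freqs)
sd = statistics.stdev(freqs)   # sample standard deviation (n - 1)
threshold = 1.96 * sd          # half-width of the "95%" band

# Bins whose count lies more than 1.96 SD from the mean of the counts.
flagged = [f for f in freqs if abs(f - mean) > threshold]

print(f"mean = {mean:.1f}, sd = {sd:.1f}, band = +/-{threshold:.1f}")
print("flagged bins:", flagged)
```

With these counts the band is wide enough that no bin falls outside it, which suggests the rule says little about any single bin here.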

Dason

I don't really understand what your actual question means here. What would it mean for the frequency in a bin to be different than "the mean"? What exactly is "the mean" referring to here? The average frequency? Would that really be all that interesting to know?

I think the better thing to do here is take a step back and tell us what research question you're actually trying to answer. Don't try and put it in terms of any test or anything like that. What is it that you're trying to figure out?

New Member
I guess what I'm trying to do is tell whether the increase in frequency we see between the >35 to <=40 bracket and the >40 to <=45 bracket is significant. It's for a report that has to be presented to a panel. I'm probably getting confused and the mean has nothing to do with it. I find stats very confusing, with lots of grey areas, and it's very rarely taught well compared to other subjects.

Karabiner

TS Contributor
Why that particular difference? For example, there are 4 absolute differences
between adjacent groups which are larger than that between [35-40] and [40-45].
And why do you want to perform a test of statistical significance here - what
are you trying to achieve in addition to the descriptive statistics?

With kind regards

Karabiner
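The count of larger differences mentioned in this reply can be checked directly from the table. A minimal sketch, assuming "differences" means absolute differences between adjacent age groups; the variable names are illustrative only.

```python
# Frequencies from the original table, in age-bin order.
freqs = [100, 256, 278, 313, 356, 489, 510, 567, 620,
         677, 712, 707, 512, 310, 178, 32, 0]

# Absolute difference between each pair of adjacent bins.
diffs = [abs(b - a) for a, b in zip(freqs, freqs[1:])]

# Index 4 is the step from (35, 40] to (40, 45]: 356 -> 489.
target = diffs[4]

# Adjacent differences that exceed the one being asked about.
larger = [d for d in diffs if d > target]

print("difference of interest:", target)
print("larger adjacent differences:", sorted(larger))
```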

New Member
I just wanted to know whether that particular difference was significant enough to comment on, and whether there was any test I could do to show that. I can explain the other, larger absolute differences, so I don't need to do any further analysis on those. They come from a combination of mortality rates at the upper end of the age ranges and a much smaller total population in the >=15 and <=20 range compared with the >20 and <=25 range. The total populations in the ranges from >20 and <=25 up to >70 and <=75 are roughly equal.

I've got a feeling that I'm just overthinking things here, and that I just need to do something simple rather than apply any test. I just want to be clear that I'm going about things the right way though and want to be as statistically sound as I can be.

Thanks and sorry for being unclear.
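One simple way to look at just the two brackets in question is an exact two-sided binomial test. This is a sketch under a stated assumption, not something proposed in the thread: if the underlying populations of the two brackets are roughly equal (as described above) and the rate is the same in both, then each of the 356 + 489 people is equally likely to fall in either bracket.

```python
from math import comb

count_35_40 = 356   # frequency in the >35 and <=40 bracket
count_40_45 = 489   # frequency in the >40 and <=45 bracket
n = count_35_40 + count_40_45

# Assumed null hypothesis: each person lands in either bracket with p = 0.5.
# Upper tail P(X >= 489) for X ~ Binomial(n, 0.5), using exact integers.
upper_tail = sum(comb(n, k) for k in range(count_40_45, n + 1)) / 2**n

# The distribution is symmetric for p = 0.5, so double the tail (capped at 1).
p_value = min(1.0, 2 * upper_tail)

print(f"n = {n}, two-sided p-value = {p_value:.3g}")
```

A small p-value here only says the split between the two brackets is unlikely under the equal-rate assumption; whether that is worth commenting on in the report is still a judgement call.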

CamilleJosion

CaJosion
Hi,
A simple chi-square goodness-of-fit test should do it. Compare the distribution you observe to the expected distribution, here a uniform distribution, though you might want to use frequencies from a population pyramid instead. You can find a tutorial at
https://help.xlstat.com/customer/en...al-goodness-of-fit-test-with-xlstat?b_id=9283
You can also do it with other software if you're not using XLSTAT, as this is a pretty common approach (the Monte Carlo option of XLSTAT is nice for small samples, though).
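For anyone not using XLSTAT, the goodness-of-fit test against a uniform distribution can be sketched in plain Python. The helper `chi2_sf_even_df` is a hypothetical name written for this example; it uses the closed-form upper tail of the chi-square distribution, which only exists for an even number of degrees of freedom (here 17 bins give 16). To test against a population pyramid instead, `expected` would be replaced by a list of expected counts that sums to the same total.

```python
import math

# Observed frequencies from the table in the first post.
observed = [100, 256, 278, 313, 356, 489, 510, 567, 620,
            677, 712, 707, 512, 310, 178, 32, 0]

total = sum(observed)
expected = total / len(observed)   # uniform: same expected count per bin
df = len(observed) - 1

# Pearson chi-square statistic: sum of (O - E)^2 / E over all bins.
chi2 = sum((o - expected) ** 2 / expected for o in observed)

def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable; even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term = 1.0      # (x/2)^k / k! for k = 0
    series = 1.0
    for k in range(1, df // 2):
        term *= half / k
        series += term
    return math.exp(-half) * series

p_value = chi2_sf_even_df(chi2, df)
print(f"chi2 = {chi2:.1f}, df = {df}, p = {p_value:.3g}")
```

Note that this tests whether the whole set of frequencies departs from the expected distribution, not whether one particular pair of adjacent brackets differs.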