Is this possible?

#1
I've got a set of data split into age groups, with the number of people in each age group. I want to test whether the number of people in one of the age groups is statistically significantly larger than the mean. Is this possible, and what test do I use? I've looked at using a t-test, but it makes no sense with a sample size of one.

For example :

Age Range          Frequency
>=15 and <=20      100
>20 and <=25       256
>25 and <=30       278
>30 and <=35       313
>35 and <=40       356
>40 and <=45       489
>45 and <=50       510
>50 and <=55       567
>55 and <=60       620
>60 and <=65       677
>65 and <=70       712
>70 and <=75       707
>75 and <=80       512
>80 and <=85       310
>85 and <=90       178
>90 and <=95       32
>95 and <=100      0

I want to find out whether one of the frequencies is statistically significantly different from the mean. I've scoured online but I can't find any examples of this type of problem at all. Or, rather than running a test of any sort, would I just work out the standard deviation and, assuming I want a 95% confidence level, say that anything more than 1.96 standard deviations above or below the mean is statistically significantly different?
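For reference, here is a minimal sketch of that 1.96 standard deviation rule of thumb in Python (standard library only; the variable names are purely illustrative). It treats the 17 bin counts as if they were 17 independent observations from one distribution, which is an assumption made for the sake of the example and not something the data justify:

```python
from statistics import mean, stdev

# Frequencies from the table above, youngest bracket first.
freqs = [100, 256, 278, 313, 356, 489, 510, 567, 620,
         677, 712, 707, 512, 310, 178, 32, 0]

m = mean(freqs)    # average bin frequency
s = stdev(freqs)   # sample standard deviation of the bin frequencies

# Band of +/- 1.96 standard deviations around the mean.
lower, upper = m - 1.96 * s, m + 1.96 * s

# Bins whose frequency falls outside the band.
flagged = [f for f in freqs if f < lower or f > upper]

print(f"mean={m:.1f}, sd={s:.1f}, band=({lower:.1f}, {upper:.1f})")
print("flagged:", flagged)
```

With these counts the band is wide enough that no bin falls outside it, so this rule on its own would not single out any age group.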

Thanks in advance
 
Dason

Ambassador to the humans
#2
I don't really understand what your actual question means here. What would it mean for the frequency in a bin to be different than "the mean"? What exactly is "the mean" referring to here? The average frequency? Would that really be all that interesting to know?

I think the better thing to do here is take a step back and tell us what research question you're actually trying to answer. Don't try and put it in terms of any test or anything like that. What is it that you're trying to figure out?
 
#3
I guess what I'm trying to do is to tell if the increase in frequency we see between the >35 to <=40 bracket and >40 to <=45 bracket is significant. It's for a report that has to be presented to a panel. I'm probably getting confused and the mean has nothing to do with it. I find stats very confusing with lots of grey areas and it's very rarely taught well compared to other subjects :(.
 

Karabiner

TS Contributor
#4
Why that particular difference? For example, there are 4 absolute differences which are larger than that between [35-40] and [40-45]. And why do you want to perform a test of statistical significance here? What are you trying to achieve in addition to the descriptive statistics?
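The count of larger differences can be checked with a short Python sketch (variable names are illustrative only). It takes the absolute difference between each pair of adjacent brackets and compares it with the jump between [35-40] and [40-45]:

```python
# Frequencies from post #1, youngest bracket first.
freqs = [100, 256, 278, 313, 356, 489, 510, 567, 620,
         677, 712, 707, 512, 310, 178, 32, 0]

# Absolute difference between each pair of adjacent brackets.
diffs = [abs(b - a) for a, b in zip(freqs, freqs[1:])]

# The jump in question: [35-40] (356) to [40-45] (489).
target = abs(489 - 356)

# Adjacent differences strictly larger than the jump in question.
larger = [d for d in diffs if d > target]

print("target difference:", target)
print("larger differences:", sorted(larger))
```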

With kind regards

Karabiner
 
#5
I just wanted to know whether that particular difference was significant enough to comment on or not, and whether there was any test I could do to show that it was. I can explain the other, larger, absolute differences, so I don't need to do any further analysis on those. They come from a combination of mortality rates at the upper end of the age ranges and a much smaller total population in the >=15 and <=20 range compared with the >20 and <=25 range. The total populations in the ranges from >20 and <=25 up to >70 and <=75 are roughly equal.

I've got a feeling that I'm just overthinking things here, and that I just need to do something simple rather than apply any test. I just want to be clear that I'm going about things the right way though and want to be as statistically sound as I can be.

Thanks and sorry for being unclear.
 
#6
Hi,
A simple Chi-square test should do it. Compare the distribution you observe to the expected distribution, here a uniform distribution, though you might want to use frequencies from a population pyramid. You can find a tutorial at
https://help.xlstat.com/customer/en...al-goodness-of-fit-test-with-xlstat?b_id=9283
You can also do it with other software if you're not using XLSTAT, as this is a pretty common approach (the Monte Carlo option of XLSTAT is nice for small samples, though).
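As a rough sketch of the same idea without XLSTAT, the goodness-of-fit statistic against a uniform expectation can be computed by hand in Python. The uniform expectation is the one suggested above; the 5% critical value for 16 degrees of freedom (26.296) is taken from standard chi-square tables, and the variable names are illustrative only. If SciPy is available, `scipy.stats.chisquare` should give the same statistic along with a p-value.

```python
# Observed frequencies from post #1, youngest bracket first.
observed = [100, 256, 278, 313, 356, 489, 510, 567, 620,
            677, 712, 707, 512, 310, 178, 32, 0]

n = sum(observed)   # total number of people
k = len(observed)   # number of age brackets
expected = n / k    # uniform expectation: the same count in every bracket

# Pearson chi-square statistic: sum of (O - E)^2 / E over all brackets.
chi2 = sum((o - expected) ** 2 / expected for o in observed)
df = k - 1          # degrees of freedom for a fully specified distribution

# Tabulated 5% critical value of the chi-square distribution with df = 16.
CRITICAL_05_DF16 = 26.296
reject_uniform = chi2 > CRITICAL_05_DF16

print(f"chi2={chi2:.1f}, df={df}, reject uniform at 5%: {reject_uniform}")
```

To test against a population pyramid instead, `expected` would become a list of per-bracket expected counts that sums to `n`.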
 
#7
Camille, thanks for the reply; that's brilliant. I haven't got XLSTAT but at least I know how to go about it now with some degree of statistical "soundness".

I always find Google searches for stats problems to be rubbish.
 

Dason

Ambassador to the humans
#8
I mean... it will tell you overall that your distribution doesn't follow a uniform distribution. I'm not sure that answers your question, though. I don't really see why you need a test. If you're just deciding whether or not to comment on it, it sounds like you're already convinced it's a larger jump than you would expect. But you haven't actually laid down any concrete notion of what it is you expect. I highly doubt you expected a uniform distribution.
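To illustrate what a concrete expectation could look like, here is one hypothetical version in Python. It assumes, following post #5, that the two brackets have roughly equal underlying populations, so that under the null a person in either bracket is equally likely to be in each. That assumption and the normal approximation to the binomial are choices made for this sketch, not something established in the thread.

```python
import math

# Counts in the two adjacent brackets from post #1.
lower_bracket = 356   # >35 and <=40
upper_bracket = 489   # >40 and <=45

n = lower_bracket + upper_bracket
p0 = 0.5   # assumed null: equal chance of falling in either bracket

# Normal approximation to the binomial: z-score of the upper bracket count.
z = (upper_bracket - n * p0) / math.sqrt(n * p0 * (1 - p0))

# Two-sided p-value from the standard normal distribution.
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"z={z:.2f}, two-sided p={p_two_sided:.2g}")
```

A different expectation, for example one based on population pyramid counts, would change `p0` and could change the conclusion.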