# Which test to use? Comparing population with subset from this population.

#### kire001

##### New Member
Hi, here is my example:

In the survey, I have 1000 answers, and 70% of respondents answered Yes. Now I take the answers for only a specific subset of this group, let's say men. For example, out of the 1000 people there are 400 men, and this subset has 60% positive answers. What I want to know is whether the answers of this subset are significantly different.
I tried to find a suitable test, but most of them require the samples to be independent...
I don't know if I should use some proportion test, or whether I can treat the 1000 people as the population, etc. I am really new to statistics, so I am probably asking bad questions, but I would appreciate being pointed in the right direction.

Thanks for any help.

#### hlsmith

##### Not a robit
Different from women or from the whole? The whole includes the men, so depending on how many men there are, the overall rate is just a weighted average of the men's rate and the rate for everyone else.
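To make the weighting concrete, here is a minimal sketch using the numbers from the original post. It assumes that everyone who is not a man forms a single complementary group (labelled "women" here); the post itself does not state that, so treat the variable names and the resulting rate as illustrative.

```python
# Counts taken from the original post: 1000 respondents, 70% Yes overall,
# 400 men with 60% Yes.
n_total, p_total = 1000, 0.70
n_men, p_men = 400, 0.60

# Assumption: all remaining respondents are treated as one group ("women").
n_women = n_total - n_men
w_men = n_men / n_total      # share of men in the whole sample
w_women = n_women / n_total  # share of the complementary group

# overall rate = w_men * p_men + w_women * p_women, solved for p_women
p_women = (p_total - w_men * p_men) / w_women

print(f"Implied Yes rate in the complementary group: {p_women:.4f}")
```

Because the overall 70% already contains the men's 60%, the gap between men and the whole is always smaller than the gap between men and the rest of the sample.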

If you are comparing the genders, a chi-squared or Fisher's exact test can be used, the latter for small samples. If this is just a binary comparison, the best bet would be to calculate a rate difference with a confidence interval.
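As a rough sketch of both suggestions, the snippet below compares men against the rest of the sample, which are two non-overlapping groups. The counts (240 of 400 men and 460 of 600 others answering Yes) are derived from the percentages in the original post, and the function name `compare_two_groups` is made up for illustration. It uses a Pearson chi-squared test without continuity correction and a simple Wald interval; other interval methods exist and may be preferable for small samples.

```python
import math


def compare_two_groups(yes_a, n_a, yes_b, n_b, z=1.96):
    """Pearson chi-squared test (1 df, no continuity correction) and a
    Wald confidence interval for the difference in Yes rates (a - b).
    Assumes every row and column total of the 2x2 table is non-zero."""
    a, b = yes_a, n_a - yes_a  # group A: yes / no
    c, d = yes_b, n_b - yes_b  # group B: yes / no
    n = n_a + n_b

    # Shortcut formula for the chi-squared statistic of a 2x2 table
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-squared variable with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))

    p_a, p_b = yes_a / n_a, yes_b / n_b
    diff = p_a - p_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return chi2, p_value, diff, (diff - z * se, diff + z * se)


# Men: 240 of 400 Yes; everyone else: 460 of 600 Yes (derived counts)
chi2, p_value, diff, ci = compare_two_groups(240, 400, 460, 600)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
print(f"rate difference = {diff:.3f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

The default `z=1.96` corresponds to a 95% interval; a confidence interval that excludes zero tells the same story as the test, but also shows how large the difference is.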

#### kire001

##### New Member
Men vs. the whole, and the whole includes the subsample.

#### hlsmith

##### Not a robit
Many people feel compelled to conduct formal tests for trivial/descriptive statistics. In this scenario, I am not sure a test is needed, since any difference is just based on the prevalence of males in the full population and the two groups are not independent. For all intents and purposes, just reporting percentages and counts for the group and subgroup should suffice. Why is a standardized test needed, and what does it actually say? Doing a whole bunch of unnecessary tests in a study usually just drowns out the actually important ones, and it also makes me personally think that false discovery could now come into play.

#### kire001

##### New Member
Thanks for the answer! Can we say that this subset is significantly different (with 95% probability) from the whole set? I have a comparison with percentages, counts, etc., but I wanted to mark the subsets that are significantly below average...