p-values from Monte-Carlo

I have data from two different groups, each with 19 subjects. I compute a statistical measure for each group and claim that one is greater than the other. But this difference could be due to a few outliers in either group. To check the robustness of the result I do the following: I randomly pick 10 subjects from each group and compute the statistical measure, and I repeat this for 100 different subsets of 10 subjects. Then I take the mean and SD of the 100 runs, do a two-sample t-test, and find the corresponding p-value. My question is: what degrees of freedom (n) do I use to find the t and p values? Even though I have 100 observations from the 100 runs, I don't think I can use 100, since they are not independent observations: many of the subjects are repeated across the subsets. Can anyone please help me find the right n?
Obviously with replacement. I have only 19 subjects and I sample 10 subjects at a time. If I sampled without replacement I could select only 2 subsets, one with 10 subjects and another with 9. Since I replace the subjects, I can draw 100 random subsets of 10 subjects.
I am not sure if I have answered your question about 'replacement'. The answer I gave is based on a layman's understanding of what replacement is.


TS Contributor
For each t-test, if the groups you are comparing each have n=10 subjects, then the degrees of freedom (dof) would be 10+10-2 = 18. The formula is:

(n1 - 1) + (n2 - 1) = n1 + n2 - 2
Thanks John for your response. But I have a question about your answer: are you saying that n depends only on the size of the subsample, and that it does not matter how large the population is or how many times I subsample it?


TS Contributor
I just re-read your approach and now I understand what you've done. However, I don't think it's a good approach.

If you want to check the robustness of your original result, you should take many (i.e., 100 or more) random subsets of n=10 from each group, and for each pair of random subsets do a t-test, then see how many of the t-tests have a significant p-value. In other words, do many t-tests of n=10 vs n=10.
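The repeated-subset check described above could be sketched roughly as follows. This is only an illustration, assuming each subject contributes a single measurement; the data are made up, the names `pooled_t` and `fraction_significant` are hypothetical, and the pooled-variance t statistic is compared against the tabulated two-sided 5% critical value for 18 degrees of freedom rather than computing an exact p-value.

```python
import math
import random
import statistics

N_SUB = 10           # subjects per random subset, as in the thread
T_CRIT_DF18 = 2.101  # two-sided 5% critical value of Student's t for 18 dof

def pooled_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, dof)."""
    n1, n2 = len(a), len(b)
    dof = n1 + n2 - 2
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / dof
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / se, dof

def fraction_significant(group_a, group_b, n_runs=100, seed=0):
    """Share of random 10-vs-10 subsets whose t-test is significant at 5%."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_runs):
        a = rng.sample(group_a, N_SUB)  # no subject repeated within a subset
        b = rng.sample(group_b, N_SUB)
        t, dof = pooled_t(a, b)         # dof is 10 + 10 - 2 = 18 here
        if abs(t) > T_CRIT_DF18:
            hits += 1
    return hits / n_runs

# Made-up measurements for 19 subjects per group (illustration only).
data_rng = random.Random(1)
group_a = [data_rng.gauss(1.0, 1.0) for _ in range(19)]
group_b = [data_rng.gauss(0.0, 1.0) for _ in range(19)]

frac = fraction_significant(group_a, group_b)
print("fraction of significant t-tests:", frac)
```

Because the subsets share subjects, the resulting fraction is a descriptive robustness summary and not itself a p-value.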
Thanks again for your reply, but I cannot do what you suggested. I am sorry that my email wasn't clear enough. For a particular subset of 10 subjects I find just one statistic (one number; it is basically a correlation between two features of this subset), not 10, and hence I do not have a standard deviation to find the t value. That's why I did it for 100 different subsets and found the mean and standard deviation of the correlation from these 100 runs. I am sorry if my method is confusing.

Is there another established way of testing for robustness that can be applied to my problem? Thanks.
I just find the correlation coefficient between the two features for the two groups separately and see that for one group it is higher than for the other, and I want to confirm that this result was not caused by a few outliers in one of the groups.
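For concreteness, the subsampling procedure described in this post might look something like the sketch below. It is only an illustration under stated assumptions: the data are made up, and `pearson_r`, `subset_correlations` and `make_group` are hypothetical helper names. It computes one correlation per random subset of 10 subjects, then the mean and SD over 100 runs for each group, plus the share of runs in which group A's correlation is the larger one.

```python
import math
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def subset_correlations(pairs, n_sub=10, n_runs=100, seed=0):
    """One correlation per random subset of n_sub subjects."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        subset = rng.sample(pairs, n_sub)  # no subject repeated within a subset
        x, y = zip(*subset)
        results.append(pearson_r(x, y))
    return results

def make_group(n, slope, seed):
    """Made-up (feature1, feature2) pairs for n subjects (illustration only)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    return [(x, slope * x + rng.gauss(0, 1)) for x in xs]

group_a = make_group(19, 2.0, seed=1)
group_b = make_group(19, 0.5, seed=2)

r_a = subset_correlations(group_a, seed=10)
r_b = subset_correlations(group_b, seed=20)

print("A: mean r = %.3f, SD = %.3f" % (statistics.mean(r_a), statistics.stdev(r_a)))
print("B: mean r = %.3f, SD = %.3f" % (statistics.mean(r_b), statistics.stdev(r_b)))

# Share of runs in which group A's subset correlation exceeds group B's.
share = sum(ra > rb for ra, rb in zip(r_a, r_b)) / len(r_a)
print("share of runs with r_A > r_B:", share)
```

Since the 100 subsets overlap heavily, the 100 correlations are not independent observations, so these numbers are best read as a description of how stable the result is and not as inputs to a t-test with 100 observations.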
If you sampled with replacement, then a particular data point could end up in a sample of 10 multiple times. Sampling without replacement would limit each of the 19 data points to appearing at most once in any given sample of 10.
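The distinction can be seen in a small sketch using Python's standard `random` module; the subject IDs 1 to 19 are hypothetical. Note that drawing each subset without replacement does not limit you to two subsets, because all 19 subjects are available again when the next subset is drawn.

```python
import math
import random

rng = random.Random(0)         # fixed seed so the sketch is repeatable
subjects = list(range(1, 20))  # hypothetical subject IDs 1..19

# Without replacement: each subject appears at most once in the subset.
without_repl = rng.sample(subjects, k=10)

# With replacement: the same subject may be drawn more than once.
with_repl = rng.choices(subjects, k=10)

print(sorted(without_repl), "distinct:", len(set(without_repl)))
print(sorted(with_repl), "distinct:", len(set(with_repl)))

# Number of different 10-subject subsets that can be formed from 19 subjects.
n_subsets = math.comb(19, 10)
print("possible subsets of 10 out of 19:", n_subsets)
```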

The way you estimate a population's variance from a sample depends on whether you sample with replacement or not. But I think this is a moot point, given what has been said above.