Standard deviation of an average of a number of results

#1
I have written a computer program that runs a Monte Carlo simulation of certain game-playing situations to determine how likely various outcomes are for a given decision. A typical result from 5,000 tests may look like this:

Outcome   0    1      2      3    4    5   6  7   Ave Outcome
Results  45  558  1,518  1,421  977  396  81  4   2.85


The outcome is a measure of success in the game, so the average outcome for a given decision is the key number to look at.

Calculating the standard deviation for each individual result is easy, but what I really want to know is how reliable the 2.85 average is, i.e. I want to calculate a standard deviation for it so I can say there is a 95% chance of the correct answer being 2.85 +/- 2*SD. The user can then adjust how many tests they run to get an average that is as accurate as they need.

How could I do that calculation?
 

Dason

Ambassador to the humans
#3
I mean... They have the data. They can just compute the percentiles directly. No need for the standard deviation if that's the goal here.
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
Yeah, @Dason is right here. In MC simulation studies you use the percentiles of interest. So order all of the outputs and find the 2.5th and 97.5th percentile values.
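A minimal sketch of that percentile lookup, using the counts from the first post. The helper name `outcome_percentile` is made up for illustration, and the "smallest outcome whose cumulative count reaches the target" rule is just one common percentile convention. Note that these percentiles describe the spread of individual outcomes, not the uncertainty in the 2.85 average.

```python
# Counts from the 5,000-test run in the first post: index = outcome, value = count.
counts = [45, 558, 1518, 1421, 977, 396, 81, 4]


def outcome_percentile(counts, pct):
    """Smallest outcome whose cumulative count reaches pct percent of all tests."""
    total = sum(counts)
    target = total * pct / 100.0
    running = 0
    for outcome, count in enumerate(counts):
        running += count
        if running >= target:
            return outcome
    return len(counts) - 1


low = outcome_percentile(counts, 2.5)
high = outcome_percentile(counts, 97.5)
print(low, high)  # 1 5
```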
 

katxt

Well-Known Member
#5
But what is asked for is a little more than that.
"I want to calculate a standard deviation for it so I can say there is a 95% chance of the correct answer being 2.85 +/-2*SD. Hence the user can adjust how many tests they do in order to get a result average that is as accurate as they need."
Find the SD of your sample of 5,000 outcomes. The "SD" in your formula above is really the SE, the standard error of the mean of these 5,000 outcomes. It can be calculated as
SE(sample of 5000) = SD(outcomes)/sqrt(5000)
To estimate the sample size needed to give a particular desired SE, scale the SE from your sample of 5,000:
SE(sample size n) = SE(sample size 5000)*sqrt(5000/n)
Solve this equation for n to find the sample size for a given desired SE.
 
#6
I did some tests to get a handle on the likely answer. Doing 20x1,000 tests gave 20 values of the 2.85 average shown in my first post, varying between 2.78 and 2.93, which suggests an SD (if that is the right term to use in this case) of about 0.035. I also did 5x5,000 tests and, as expected, got a much tighter clumping of results, which suggested an SD of about 0.02. However, different test conditions have wider or narrower sets of curves (and in real life the result range is 0-13 rather than 0-7, although the high end is very rare).
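The repeated-batch check described above can be imitated without the game program. This sketch resamples from the observed 5,000-test frequencies as a stand-in for the real simulation (an assumption, since the actual simulator is not shown), and `sd_of_batch_means` is a hypothetical helper name.

```python
import random
import statistics

# Observed frequencies from the first post, used as a stand-in distribution.
counts = [45, 558, 1518, 1421, 977, 396, 81, 4]
outcomes = list(range(len(counts)))


def sd_of_batch_means(n_batches, batch_size, seed=0):
    """Run n_batches batches of batch_size draws and return the SD of the batch averages."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_batches):
        batch = rng.choices(outcomes, weights=counts, k=batch_size)
        means.append(sum(batch) / batch_size)
    return statistics.stdev(means)


# 20 batches of 1,000 and 5 batches of 5,000, as in the post.
print(sd_of_batch_means(20, 1000))
print(sd_of_batch_means(5, 5000))
```

With only 20 or 5 batches the estimated SD is itself quite noisy, so values from a run like this can differ noticeably from one seed to the next.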

Note this is not an academic calculation - it is an attempt to provide a useful guideline for users to come to a real-life conclusion about strategy in a given situation in the game. Hence it does not have to be exact, just close enough for sensible conclusions to be drawn.

Having said that, repeatedly making a decision with an average result 0.05 worse than the optimum will almost always lead to defeat, and it is a game with imperfect information, so making decisions that go with the odds is important.

FYI the SD of the count for the most likely result (i.e. the outcome that occurred 1,518 times) is about 33 with 5,000 tests and about 15 (on a count of 300 or so) with 1,000 tests.
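Those two figures are close to what a binomial model gives for the count of a single outcome. The sketch below assumes each test independently lands on outcome 2 with probability 1518/5000; how the poster actually computed the numbers is not stated, and `count_sd` is a made-up name.

```python
import math


def count_sd(n_tests, p):
    """Binomial standard deviation of the number of tests that hit one outcome."""
    return math.sqrt(n_tests * p * (1 - p))


# Estimated probability of the most likely outcome, from the 5,000-test run.
p = 1518 / 5000

print(round(count_sd(5000, p), 1))  # 32.5
print(round(count_sd(1000, p), 1))  # 14.5
```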
 

katxt

Well-Known Member
#7
OK. These results are consistent with the theory above.
The SD of the sample data is about 1.2 so samples of 1000 will have a SE of about 1.2/sqrt(1000) or 0.038. This is more or less your value of 0.035.
If you have samples of 5000, you can expect the SE to be about 1.2/sqrt(5000) or 0.017. You got about 0.02.
So, to answer your original question, our best estimate for the CI is 2.85 +/-2*0.017.
If you want to know what would happen if you had taken a sample of 500, the SE of the mean would have been 1.2/sqrt(500) or 0.054.
 
#8
Thanks katxt for that. The details of your calculation allowed me to find my error.

Turns out my problem was in my coding, and I was getting results that I could see were wrong. The results array uses col 0 as a description of the row, so the results for outcomes 0-13 are actually in cols 1 to 14, and I was multiplying the sqrt(Ave-outcome) by the outcomes in the wrong column. Pretty stupid, but these things happen.