# [Excel] Determine errors that would have been encountered if required confidence for sample size was met (when it wasn't)

#### SerenityNetworks

##### New Member
In my case, I perform accuracy audits where, for example, my required accuracy is 99.90% and my required confidence level is 90.00%. The calculations are binomial; that is, a single test is either right or wrong, then I move to the next. There is no resampling. I use Excel to calculate minimum sample sizes, confidence, and allowed errors. When an audit reveals too many errors for the sample size and confidence to both exceed the required minimums, I need to calculate penalties based on the number of errors exceeding the minimum required accuracy. This is fine when the measured accuracy is below, for example, 99.90%. But I'm stumped as to how to proceed when the measured accuracy passes the requirement but the required confidence fails. For example:

I know that if I have a sample size of 3889, and I only have 1 error, then the formula "=1-BINOM.DIST(1,3889,1-0.999,TRUE)" will tell me my confidence level is 90.01%. My measured accuracy would be "=1-(1/3889)" or 99.97%. This meets both my measured accuracy and confidence level requirements.

But if I have 2 errors then my confidence level is 74.53% (failing), even though the measured accuracy is 99.95% (passing). So my accuracy is okay, but it's not at the required confidence level. The audit fails.
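(For anyone wanting to check these numbers outside Excel, here is a minimal Python sketch using only the standard library; `binom_cdf` is my own helper, written to match Excel's `BINOM.DIST(k, n, p, TRUE)`.)

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); equivalent to Excel's BINOM.DIST(k, n, p, TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n = 3889       # sample size
p = 1 - 0.999  # maximum allowed error rate (0.1%)

# Confidence that the true error rate is no worse than 0.1%,
# given the observed number of errors:
conf_1_error = 1 - binom_cdf(1, n, p)   # ~0.9001 (90.01%, passes 90%)
conf_2_errors = 1 - binom_cdf(2, n, p)  # ~0.7453 (74.53%, fails 90%)

acc_1_error = 1 - 1 / n                 # ~0.9997 (99.97% measured accuracy)
```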

Summarizing the first paragraph with the example: My issue is that I need to assign penalties for failing to meet the required accuracy and confidence, but the penalties are based only on the number of errors beyond the required accuracy. With a sample set of 3889 I have to get 4 errors before my measured accuracy drops below 99.90%, even though the required confidence fails with both 2 and 3 errors. I'm stumped on how to assign penalties in a case where the measured accuracy passes the requirement, but the required confidence fails.

Is it possible, and would it be correct, to calculate backwards to determine 'n' by saying something like, "With 3889 samples at 74.53% confidence, it's expected that 'n' errors would have occurred if the confidence were at 90.00%"? If so, how would this be done? Would there be a margin of error that could be calculated? If so, how? I'm stumped on how to move forward with this problem.

Note: I have a rudimentary understanding of statistics, but not nearly enough to feel confident that any answer I come up with on my own would be appropriate. I need guidance, both for a solution and to understand the solution well enough to explain it to others. I also posted this question on an Excel message board before I found this forum. I have not received any answers. I will keep an eye out everywhere so the efforts of those who may help aren't wasted.

I'd appreciate any experienced thoughts on this matter.
Andrew

#### katxt

##### Well-Known Member
Can you explain a bit more fully what you mean by "confidence" in this context please.

#### SerenityNetworks

##### New Member
In this case, I'm talking about what I understand to be confidence level. In my example, the required 90% confidence level says that if the audit were repeated over and over again (above and beyond the 3889 samples we audited), the results would match the results from the actual 3889 samples 90% of the time.

Using Excel, with 3889 samples, 0 errors would give me 100% measured accuracy (pass) with 97.96% confidence (pass) = audit pass.
Using Excel, with 3889 samples, 1 error would give me 99.97% measured accuracy (pass) with 90.01% confidence (pass) = audit pass.
Using Excel, with 3889 samples, 2 errors would give me 99.95% measured accuracy (pass) with 74.53% confidence (fail) = audit fail.
Using Excel, with 3889 samples, 3 errors would give me 99.92% measured accuracy (pass) with 54.46% confidence (fail) = audit fail.
Using Excel, with 3889 samples, 4 errors would give me 99.90% measured accuracy (pass) with 34.95% confidence (fail) = audit fail.
Using Excel, with 3889 samples, 5 errors would give me 99.87% measured accuracy (fail) with 19.77% confidence (fail) = audit fail.

For the audit with 5 errors, I know how to calculate penalties, since the measured accuracy is below 99.90%. But for the audits with 2, 3, and 4 errors that failed to meet the required confidence level, I cannot determine a way to calculate penalties, since the measured accuracy is above 99.90%.


#### katxt

##### Well-Known Member
OK, I see it now. The auditor is prepared to take up to a 10% risk of allowing a faulty batch to pass. I'm not sure what you mean by penalties for 2, 3 and 4 errors, but this resembles industrial attributes sampling. For single sampling, you would pass the 0 and 1 error cases and fail the rest. In double sampling you give 2, 3, and 4 errors a second chance, and I think that is what you have in mind here. In general you take some more samples, find the total errors and the total sampled, and repeat your Excel binomial trick. For example, to give the 2, 3 and 4 error cases a second chance, take another 4115 samples to get 8004. We are allowed a total of 4 errors to pass both parts of the test, so the 2-error case could afford 2 more errors, the 3-error case 1 more error, and the 4-error case must be error free to pass.
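(If I've read the double-sampling suggestion correctly, the arithmetic can be checked with a short Python sketch using only the standard library; the 4115, 8004, and 4-error figures are from the post above.)

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); Excel's BINOM.DIST(k, n, p, TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

p = 0.001  # maximum allowed error rate

# First stage: 3889 samples. Second stage: 4115 more, for 8004 total.
n_total = 3889 + 4115  # 8004

# With 4 total errors allowed across both stages, the combined test
# still achieves roughly the required 90% confidence:
conf_combined = 1 - binom_cdf(4, n_total, p)  # about 0.90
```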

An alternative is sequential sampling. Just keep sampling and record total samples and total errors. After each sample, do your binomial trick until you get either pass/pass or fail/fail. Set up Excel to do this automatically. In fact, you can do this from the very start of your audit, and often you will get a result before 3889 samples have been taken.
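(As an illustration of that sequential idea, here is a sketch of the stopping rule in Python; `audit_status` is my own name for the helper, and the pass/fail/continue logic is my reading of "pass/pass or fail/fail", not a tested procedure.)

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); Excel's BINOM.DIST(k, n, p, TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def audit_status(n, errors, target_acc=0.999, req_conf=0.90):
    """After n samples with `errors` failures, decide whether the audit
    is already settled or more sampling is needed."""
    acc = 1 - errors / n
    conf = 1 - binom_cdf(errors, n, 1 - target_acc)
    if acc >= target_acc and conf >= req_conf:
        return "pass"      # pass/pass: stop sampling
    if acc < target_acc and conf < req_conf:
        return "fail"      # fail/fail: stop sampling
    return "continue"      # mixed result: keep sampling

# Matches the figures earlier in the thread:
# audit_status(3889, 1) -> "pass"
# audit_status(3889, 3) -> "continue"
# audit_status(3889, 5) -> "fail"
```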

#### SerenityNetworks

##### New Member
No, sorry. I have not been clear. This is not industrial testing where double sampling or any kind of resampling is occurring. This is vehicle detection testing. A vehicle passes and it is either detected or it is not. In my example, 3889 vehicles passed the detection area. If 1 or fewer vehicles go undetected, then the measured accuracy is greater than 99.90% (a pass) and the confidence level is greater than 90.00% (a pass). So the audit of that vehicle detection area passes.

If 5 vehicles are not detected, then the measured accuracy is 99.87% (a fail) and the confidence level is less than 90.00% (also a fail). To keep it simple, I'll use some made-up penalty numbers. Let's say the penalty is $1,000 for every 1/100th of a percent below 99.90% measured accuracy. Therefore, if 5 vehicles are missed, then the measured accuracy is 99.87% and the penalty would be $3,000.
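(For concreteness, that made-up penalty rule could be sketched like this in Python; `contract_penalty` is my own name, and rounding measured accuracy to two decimal places of a percent is an assumption on my part.)

```python
def contract_penalty(n, errors, target_pct=99.90, dollars_per_hundredth=1000):
    """Penalty when measured accuracy falls below the target percentage.
    Accuracy is rounded to 2 decimal places of a percent (assumed)."""
    acc_pct = round((1 - errors / n) * 100, 2)
    hundredths_short = max(0, round((target_pct - acc_pct) / 0.01))
    return hundredths_short * dollars_per_hundredth

# contract_penalty(3889, 5) -> 3000 (99.87% is 3 hundredths below 99.90%)
# contract_penalty(3889, 4) -> 0    (rounds to 99.90%, meeting the target)
```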

My dilemma is when the vehicle detection area fails the audit because of a confidence level less than 90.00%, but the measured accuracy is >= 99.90%. How can I calculate a penalty when the basis for the penalty is a measured accuracy below 99.90%? This is the case with 2, 3, and 4 detection failures. In all cases the measured accuracy is greater than 99.90%. The reason for the audit failure is that the confidence level in all three cases is less than 90.00%.

I'm stumped. That's why I'm wondering if there is some way to calculate backwards. Perhaps something like, we had 3 errors, so the confidence level is 54.46%, but if we had a confidence of 90.00% then we would expect to have had 'n' errors instead of just 3. That would drop the measured accuracy below 99.90% and then we could calculate penalties. But that's just an idea. I have no idea if it is valid or not. I'm not a mathematician or statistician by any stretch of the imagination. I'm just looking for some sort of valid and fair solution to my dilemma.

Thanks again,
Andrew

#### katxt

##### Well-Known Member
So when you say penalties, do you mean financial penalties?
Perhaps something like, we had 3 errors, so the confidence level is 54.46%, but if we had a confidence of 90.00% then we would expect to have had 'n' errors instead of just 3. That would drop the measured accuracy below 99.90% and then we could calculate penalties. But that's just an idea. I have no idea if it is valid or not.
It seems you already have that. To get 90% confidence you can have at most 1 error, so 3 errors is 2 over the limit, so you let them through and penalize them 2 units. kat
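(That suggestion can be written out as a small Python sketch using only the standard library; `allowed_errors` and `over_limit_penalty_units` are my own names for the helpers.)

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p); Excel's BINOM.DIST(k, n, p, TRUE)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def allowed_errors(n, p=0.001, req_conf=0.90):
    """Largest error count that still meets the required confidence
    (returns -1 if even zero errors fails it)."""
    k, allowed = 0, -1
    while 1 - binom_cdf(k, n, p) >= req_conf:
        allowed = k
        k += 1
    return allowed

def over_limit_penalty_units(n, errors, p=0.001, req_conf=0.90):
    """Penalize one unit per error beyond the confidence limit."""
    return max(0, errors - allowed_errors(n, p, req_conf))

# allowed_errors(3889) -> 1, so 3 errors is 2 units over the limit.
```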

#### SerenityNetworks

##### New Member
Hmmm. That seemed a great idea at first glance, but it has some inconsistencies. With a smaller sample size (3889), the one who misses 5 would be penalized the same as the one who missed 4. For larger sample sizes (38890), the ones who miss 32 to 40 are penalized more heavily than the one who missed 41. I'm not seeing how the method will work. Is there something else I'm missing?

#### katxt

##### Well-Known Member
Why not just use the same formula for the blue ones? kat

#### SerenityNetworks

##### New Member
The blue formulas are the ones per contract. The yellow is a gap in the contract for which we are trying to help find a fair and accurate method for assessing penalties.

#### katxt

##### Well-Known Member
OK. All I can suggest is that you make a graded scale between no penalty and the contract penalty: 1×3/4, 2×3/4 and 3×3/4 for the first lot, and 1/11, 2/11, 3/11 etc. for the second.
I think the problem is one for management, not a statistician, so I'll bow out now. Good luck with the problem. Cheers, kat

#### SerenityNetworks

##### New Member
Thanks kat. That's the solution I was pondering this morning. I can't think of anything better, and it is consistent across all ranges of sample sets. I appreciate the confirmation. I agree that this will end up being a management decision. I just want to offer a well-thought-through suggestion.

Thanks again,
Andrew