# Testing 'black box' sampling claim - which test to use?

#### philpq

##### New Member
I have what could be called a 'black box system' that samples populations of known size n that have a known proportion of defects. The black box system claims that it adjusts sample sizes so that in 99% of all samples it will find one or more defective items.
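I have run tests with the system 13,000 times and get the following frequencies for the number of defective items in a sample:

| Defective items in sample (x) | Frequency (f) |
|---|---|
| 0 | 64 |
| 1 | 270 |
| 2 | 745 |
| 3 | 1429 |
| 4 | 2001 |
| 5 | 2169 |
| 6 | 1995 |
| 7 | 1635 |
| 8 | 1231 |
| 9 | 716 |
| 10 | 386 |
| 11 | 212 |
| 12 | 86 |
| 13 | 44 |
| 14 | 13 |
| 15 | 0 |
| 16 | 4 |
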
This gives a mean of 5.59 and SD of 2.37 for the number of defects found in a sample. I'm puzzled as to how to use this data to either accept or reject the 99% claim of the black box. What hypothesis test do I use? I thought of using the difference of two proportions, i.e. comparing the proportion of samples having one or more defective items, which is 0.99 in the black box claim and 0.995 in my testing. But because I don't know the sample sizes in the 'black box', I can't calculate the standard deviation.
How can I use this to prove or disprove the claim?
Can I use fewer tests?
Can I put a confidence interval on the 99% claim?
Thanks in advance for any enlightenment that can be offered.
PS: Population sizes are very large (hundreds of millions) if this makes any difference.
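
As a quick check, the summary figures quoted in this post can be reproduced from the frequency table. The following is a minimal sketch in Python using only the standard library; the variable names are illustrative and not part of the original post, and the SD is computed with the population (divide-by-n) formula, which is an assumption about how the original figure was obtained.

```python
import math

# Frequency table from the post: number of defective items -> number of samples
freq = {
    0: 64, 1: 270, 2: 745, 3: 1429, 4: 2001, 5: 2169, 6: 1995,
    7: 1635, 8: 1231, 9: 716, 10: 386, 11: 212, 12: 86, 13: 44,
    14: 13, 15: 0, 16: 4,
}

n_samples = sum(freq.values())

# Mean and (population) standard deviation of defects per sample
mean = sum(x * f for x, f in freq.items()) / n_samples
variance = sum(f * (x - mean) ** 2 for x, f in freq.items()) / n_samples
sd = math.sqrt(variance)

# Proportion of samples that found one or more defective items
n_with_defect = n_samples - freq[0]
prop_with_defect = n_with_defect / n_samples

print(f"samples: {n_samples}")
print(f"mean: {mean:.2f}, SD: {sd:.2f}")
print(f"samples with >= 1 defect: {n_with_defect} ({prop_with_defect:.4f})")
```

Only the zero-defect count (64 of 13,000) enters the proportion that the 99% claim is about; the rest of the distribution does not affect it.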

#### Dason

I don't think I understand your concerns completely. If the claim just relates to how many samples will contain at least one defective item - how does this relate to the sample size? The way I read that then all you need is the total number of samples and the number of samples that contain at least 1 defective item (which give 13000 and 12936). Clearly from this data you don't have any evidence to go against their claim.

If what you are actually interested in is something different then you'll need to be more specific.
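
To put a number on this, the two counts (13000 samples, 12936 with at least one defective item) are enough for a confidence interval on the underlying proportion. The sketch below uses the Wilson score interval, which is one common choice for a proportion close to 1; it is an assumption of this example rather than something proposed in the thread, and `wilson_interval` is a hypothetical helper name. It also assumes the 13000 runs are independent.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, trials, confidence=0.95):
    """Wilson score interval for a binomial proportion."""
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / trials
    denom = 1 + z ** 2 / trials
    centre = (p_hat + z ** 2 / (2 * trials)) / denom
    half_width = (
        z * sqrt(p_hat * (1 - p_hat) / trials + z ** 2 / (4 * trials ** 2)) / denom
    )
    return centre - half_width, centre + half_width


lo, hi = wilson_interval(12936, 13000)
print(f"95% Wilson interval: ({lo:.4f}, {hi:.4f})")
```

With these counts the interval comes out at roughly 0.9937 to 0.9961, so its lower limit sits above the claimed 0.99.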

#### hlsmith

##### Less is more. Stay pure. Stay poor.
I wonder whether a rate model or beta regression may be of interest; instead of a null comparison, you could compare against a constant (1%). As @Dason mentioned, though, it would help if you posted their claim verbatim.

#### philpq

##### New Member
Thanks Dason. I agree that it seems clear from the data that I have no evidence against the claim, but I am trying to make this more formal. Can I do a hypothesis test on the 99% claim?
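
One way to formalize this is a one-sample exact binomial test on the count of samples with no defective item. The sketch below is an illustration under stated assumptions, not a method given in the thread: it treats the 13000 runs as independent, takes the null hypothesis to be that the miss rate is at most 1% (the claim holds), and the helper names `binom_log_pmf` and `binom_cdf` are hypothetical. It uses only the Python standard library.

```python
import math


def binom_log_pmf(k, n, p):
    """Log of the binomial probability P(X = k) for n trials with success rate p."""
    return (
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log1p(-p)
    )


def binom_cdf(k, n, p):
    """P(X <= k), summed term by term in log space to avoid overflow."""
    return math.fsum(math.exp(binom_log_pmf(i, n, p)) for i in range(k + 1))


n_samples = 13000
misses = 64            # samples in which no defective item was found
claimed_miss_rate = 0.01

# H0: miss rate <= 0.01 (claim holds); H1: miss rate > 0.01 (claim fails).
# p-value = P(misses >= 64) when the miss rate is exactly 0.01.
p_value_claim_fails = min(1.0, 1.0 - binom_cdf(misses - 1, n_samples, claimed_miss_rate))

# Opposite direction: probability of seeing 64 or fewer misses at a 1% miss rate.
p_value_better_than_claim = binom_cdf(misses, n_samples, claimed_miss_rate)

print(f"P(misses >= {misses} | rate = 0.01) = {p_value_claim_fails:.6f}")
print(f"P(misses <= {misses} | rate = 0.01) = {p_value_better_than_claim:.3e}")
```

At a 1% miss rate about 130 misses would be expected in 13000 runs, so observing only 64 gives a p-value near 1 for the "claim fails" alternative, which matches the informal reading that the data give no evidence against the claim. The very small second probability suggests the system is finding a defect more often than 99% of the time.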