# Rare event probability

#### jansch

##### New Member
Hello,
I am stuck with a probability problem regarding rare events:

Let's say I want to automatically detect cars passing by with a camera. I suppose that works pretty well already, meaning that an error is a rare event (let's say 1 in 100).
I want to find out how much I have to drive around (or better: how many cars I have to detect) to make statements about my error rate. I would also like to comment on the significance of these statements. Simply put: how many samples (cars) do I need to establish an error rate at a given level of significance?

When I looked at the literature, I always came across hypothesis tests such as t-tests and the like. However, I think I need another approach, since I am not looking for a mean. If I do not find an error in 100 events, my error rate surely does not have to be 0.
I came across this paper (https://www.ling.upenn.edu/courses/cogs502/GoodTuring1953.pdf) clearly stating that r/N (r=events, N=total samples) doesn't make much sense if r is very unlikely.
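
To put a number on that intuition, here is a minimal sketch. It assumes every detection is independent and has the same error probability `p` (both are simplifying assumptions, and the function name is only illustrative). Under that model the chance of seeing no errors at all in `n` detections is `(1 - p)**n`, which for a true error rate of 1/100 and 100 cars is still about 37%.

```python
def prob_zero_errors(p: float, n: int) -> float:
    """Probability of observing no errors in n independent detections,
    each with error probability p (simple binomial model, an assumption)."""
    return (1.0 - p) ** n


if __name__ == "__main__":
    # True error rate of 1 in 100, as in the example above.
    for n in (100, 300, 500):
        print(f"n = {n}: P(no errors) = {prob_zero_errors(0.01, n):.3f}")
```

So an error-free run of 100 cars is quite compatible with a 1% error rate; only much longer error-free runs make such a rate implausible.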

I hope someone can help me out with ideas on how to tackle the problem.
Thank you.

#### fed2

##### Member
You have to use an 'exact test' like Fisher's exact test in this case.

#### Miner

##### TS Contributor
What is your desired level of confidence? What margin of error are you willing to accept?

#### hlsmith

##### Less is more. Stay pure. Stay poor.
There are two types of possible errors: false positives (slow cars ticketed) and false negatives (fast cars not ticketed). Have you thought about how this impacts your analyses?

#### jansch

##### New Member
Thank you for the replies.
I will have a look at Fisher's exact test; it is new to me, thanks for the hint.
The accepted margin of error is not fixed. Ideally I want to learn about the approach and then be able to determine, for example, "for a certain confidence level (let's say 90%) I need to detect X cars".

I have taken the two types of error into account. The general problem applies to both types of error, I would say. However, counting cars rather than time (or distance) makes more sense for false negatives, right? There were X cars and I missed Y of those.
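
For the "X cars, Y missed" framing, one possible approach is an exact (Clopper-Pearson style) one-sided upper bound on the miss rate. The sketch below is only an illustration under a binomial model with independent detections. The function names are made up for this example, the bound is found by simple bisection using only the standard library, and it is one-sided, which may differ from the two-sided interval a statistics package reports.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(at most k misses among n cars) under a binomial model."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def miss_rate_upper_bound(misses: int, cars: int, confidence: float = 0.90) -> float:
    """Exact one-sided upper bound on the miss rate.

    Finds the rate p at which seeing `misses` or fewer misses among `cars`
    would happen with probability 1 - confidence.
    """
    if misses >= cars:
        return 1.0
    alpha = 1.0 - confidence
    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection; binom_cdf decreases as p grows
        mid = (lo + hi) / 2.0
        if binom_cdf(misses, cars, mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical counts, purely for illustration.
    for misses in (0, 1, 3):
        bound = miss_rate_upper_bound(misses, 300, confidence=0.90)
        print(f"{misses} missed of 300 cars: miss rate below {bound:.4f} (90%)")
```

Running this for a range of `cars` values shows how quickly the bound tightens as more cars are observed, which is one way to pick a sample size for a target bound.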

#### katxt

##### Active Member
This may be useful. There is a statistical rule of thumb called the rule of three, which says that if you observe N successes without a failure, you can be 95% sure that the error rate is less than 3/N, that is, less than 1 in N/3. So, if you observe, say, 600 cars without a miss, you can be 95% sure that the error rate is less than 1 in 200. Conversely, if you want to be 95% sure that you are missing fewer than 1 car in 500, you need 1500 successes without a failure.
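
The rule of three can be checked against the exact calculation it approximates. With N error-free detections, the 95% upper bound solves `(1 - p)**N = 0.05`, and since `-ln(0.05)` is about 3, the bound is close to 3/N. The sketch below assumes independent detections, and the function names are only illustrative.

```python
from math import ceil, log


def rule_of_three_bound(n: int) -> float:
    """Approximate 95% upper bound on the error rate after n error-free trials."""
    return 3.0 / n


def exact_bound(n: int, confidence: float = 0.95) -> float:
    """Exact upper bound: the p that solves (1 - p)**n = 1 - confidence."""
    return 1.0 - (1.0 - confidence) ** (1.0 / n)


def cars_needed(max_rate: float, confidence: float = 0.95) -> int:
    """Error-free detections needed to claim error rate < max_rate."""
    return ceil(log(1.0 - confidence) / log(1.0 - max_rate))


if __name__ == "__main__":
    n = 600
    print(f"rule of three: {rule_of_three_bound(n):.5f}, exact: {exact_bound(n):.5f}")
    print("cars needed for < 1 in 500:", cars_needed(1 / 500))
```

The exact requirement comes out slightly below the rule-of-three figure of 1500, so the rule errs a little on the conservative side. Changing `confidence` gives the corresponding numbers for other levels, such as 90%.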