I work at a taxi company. Recently I discovered errors in the assignments, and they seem to show up randomly. There are more than 800k driving assignments per year, so there is no way I could check all of them. Since the assignments can be considered random, I was thinking of using my own assignments as the sample. Other drivers randomly experience the exact same error as well. Would it be possible to estimate an overall value (mean value) from my own assignments with a 95% confidence level and a ±3% margin of error, given a large enough sample of my own assignments?
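
As a rough sketch of the sample-size side of this question: the textbook formula for estimating a proportion is n = z^2 * p * (1 - p) / e^2, optionally with a finite population correction. The function name `required_sample_size` and its parameters are made up for illustration, and the calculation assumes the sampled assignments behave like a simple random sample of all assignments, which may not hold for one driver's shifts.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p=0.5, population=None):
    """Sample size needed to estimate a proportion within +/- margin.

    p is the anticipated proportion; 0.5 is the most conservative choice.
    If population is given, a finite population correction is applied.
    Assumes simple random sampling (an assumption, not a given here).
    """
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


if __name__ == "__main__":
    print(required_sample_size(0.03))                      # worst case, p = 0.5
    print(required_sample_size(0.03, population=800_000))  # with correction
    print(required_sample_size(0.03, p=0.12))              # if p is near 12%
```

With 800k assignments the finite population correction barely changes the answer. What matters far more is the anticipated error rate: the worst case (p = 0.5) needs roughly a thousand assignments, while an error rate near 12% needs fewer than half as many.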

Can you describe what an error is? For you to be a good sample of the population, your shifts, etc. would need to be comparable to those of other drivers if errors are not random across time and place. Otherwise you may need to upweight your times and locations so that they represent the population.

I presume I need to change the way I think. What I'm actually sampling is error versus no error: there are only two possible outcomes, assignments with or without errors over a period of time. My sample size last year was 581 assignments, of which 72 (about 12%) had errors. Assuming random assignments, can I calculate the confidence level for the 12% error rate, given a 5% tolerance level, based on my sample size n = 581? In other words, is my sample size enough to say anything about the overall number?
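
For what it's worth, here is a minimal sketch of how a 95% confidence interval for a binomial proportion could be computed from these numbers (72 errors in 581 assignments). It shows the simple normal-approximation (Wald) interval and the Wilson score interval. The function names are hypothetical, and the result is only meaningful under the assumption that one driver's assignments are representative of all assignments.

```python
import math
from statistics import NormalDist


def wald_interval(errors, n, confidence=0.95):
    """Normal-approximation (Wald) interval for a proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = errors / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


def wilson_interval(errors, n, confidence=0.95):
    """Wilson score interval; usually better behaved for small p or n."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p_hat = errors / n
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)) / denom
    return center - half, center + half


if __name__ == "__main__":
    # Figures from the post: 72 assignments with errors out of 581.
    for name, fn in [("Wald", wald_interval), ("Wilson", wilson_interval)]:
        low, high = fn(72, 581)
        print(f"{name}: {low:.3f} to {high:.3f}")
```

Both intervals come out at roughly 10% to 15%, that is, a margin of a little under 3 percentage points around the observed 12.4%. So n = 581 is large enough to say something, provided the representativeness assumption holds.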

