P charts based on binomial distribution - Statistical Process Control

#1
Hello all,

I have a question related to the use of p-chart in a process where there are defect rates measured.

The control limits for this type of chart are calculated as +/- 3 sigma around the mean (with mean = np and sd = sqrt(n*(1-p)*p)).

Since it uses +/- 3 sigma, it assumes that we can approximate the binomial distribution by a normal distribution, am I right?

NB: for a normally distributed process, control limits computed as mean +/- 3 sd give a 0.27% probability of a point falling outside the control limits.
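As a quick check of the 0.27% figure, here is a minimal sketch using only the Python standard library. The function name `two_sided_normal_tail` is made up for illustration.

```python
import math

def two_sided_normal_tail(k: float) -> float:
    """Return P(|Z| > k) for a standard normal variable Z."""
    # For a standard normal, P(|Z| > k) = erfc(k / sqrt(2)).
    return math.erfc(k / math.sqrt(2.0))

tail_3_sigma = two_sided_normal_tail(3.0)
print(f"P(|Z| > 3) = {tail_3_sigma:.4%}")
```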

For approximating the binomial with a normal distribution, it is commonly said that the binomial distribution must meet these criteria:

np >= 5 and n*(1-p) >= 5

Why is this condition never mentioned (as far as I know) in most statistical books dealing with SPC (such as Statistical Quality Control by Douglas C. Montgomery)?

In practice, for a binomial distribution with very small p, the control limits given by the formula (np +/- 3 * sqrt(n*(1-p)*p)) are not appropriate.

Has anyone dealt with a control chart monitoring defect rates with small values of p? For your information, n = 72000 and p = 0.000009
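To make the concern concrete, the sketch below plugs the n and p from the question into the 3-sigma formula and compares the upper-tail probability under the normal approximation with the exact binomial tail. It is only an illustration under the assumption that the counts really are binomial; the variable names are my own.

```python
import math

n = 72000
p = 0.000009

mean = n * p                      # expected number of defectives per sample
sd = math.sqrt(n * p * (1 - p))   # binomial standard deviation

# Common rule of thumb for the normal approximation.
rule_of_thumb_ok = n * p >= 5 and n * (1 - p) >= 5

ucl = mean + 3 * sd
lcl = max(0.0, mean - 3 * sd)     # a negative limit is clamped to zero

def binom_pmf(k: int, n: int, p: float) -> float:
    """Exact binomial probability P(X = k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

# Exact probability that the count exceeds the upper control limit.
k_max = math.floor(ucl)
p_above_ucl = 1.0 - sum(binom_pmf(k, n, p) for k in range(k_max + 1))

# Upper-tail probability the 3-sigma rule implies under normality.
normal_upper_tail = 0.5 * math.erfc(3 / math.sqrt(2.0))

print(f"np = {mean:.3f}, rule of thumb satisfied: {rule_of_thumb_ok}")
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
print(f"exact P(X > UCL)  = {p_above_ucl:.5f}")
print(f"normal P(Z > 3)   = {normal_upper_tail:.5f}")
```

With np well below 5, the lower limit collapses to zero, so the chart cannot signal an improvement, and the exact upper-tail probability differs noticeably from the normal value.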

Thank you.
 

Miner

TS Contributor
#2
Welcome CedricLVQ,

As a preface, you were fortunate to get the attention of one of a very small number of people here who would have a clue about a p-chart. There are two other forums that specialize in quality topics, including SPC. The largest one is the Elsmar Cove, and the other one is Quality Forum Online. Talkstats rarely gets into industrial statistics.

Regarding your question, I suspect that your real problem is the large sample size required to detect the small p. P-charts and the formulae used to calculate control limits never envisioned sample sizes as large as 72000. Sample sizes of this magnitude cause the control limits to become too tight, and normal variation in p will cause many false alarms. There is an adaptation to the traditional p-chart called the Laney p'-chart that compensates for large samples. See On the charts: A conversation with David Laney for an overview.
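For readers who want to see the mechanics, here is a rough sketch of the Laney p'-chart calculation as it is usually described: standardize each proportion with the binomial sigma, estimate the extra between-sample variation (sigma_z) from the average moving range of those z-scores, and widen the limits by that factor. The sample sizes and counts are hypothetical, not data from this thread.

```python
import math

# Hypothetical data (not from this thread): sample sizes and defective counts.
sizes = [72000, 70500, 73200, 71800, 72400, 69900, 72000, 73000, 71500, 72600]
counts = [1, 0, 2, 0, 1, 3, 0, 0, 1, 2]

p_bar = sum(counts) / sum(sizes)
props = [d / n for d, n in zip(counts, sizes)]

# Within-sample (binomial) sigma for each subgroup.
sigma_p = [math.sqrt(p_bar * (1 - p_bar) / n) for n in sizes]

# Standardize each proportion, then estimate between-sample variation from
# the average moving range of the z-scores (d2 = 1.128 for ranges of size 2).
z = [(p - p_bar) / s for p, s in zip(props, sigma_p)]
moving_ranges = [abs(b - a) for a, b in zip(z, z[1:])]
sigma_z = (sum(moving_ranges) / len(moving_ranges)) / 1.128

# Laney limits: the usual p-chart limits scaled by sigma_z.
laney_limits = [
    (max(0.0, p_bar - 3 * s * sigma_z), p_bar + 3 * s * sigma_z)
    for s in sigma_p
]

print(f"p_bar = {p_bar:.3e}, sigma_z = {sigma_z:.3f}")
for n, p, (lo, hi) in zip(sizes, props, laney_limits):
    status = "ok" if lo <= p <= hi else "OUT"
    print(f"n={n}  p={p:.3e}  LCL={lo:.3e}  UCL={hi:.3e}  {status}")
```

A sigma_z close to 1 means the data behave roughly as the binomial model predicts; values well above 1 indicate the overdispersion that makes ordinary p-chart limits too tight.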

You may also consider using an IMR chart instead of the p-chart. Dr. Wheeler has often recommended this approach. Another potential option would be to try a rare events G-chart. While created to look at the time between rare events, they may work for the number of units between rare defectives.
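For the IMR option, a minimal sketch of the usual individuals and moving-range limits follows. It treats each period's defect proportion as a single measurement and uses the standard constants 2.66 and 3.267 for moving ranges of size 2. The proportions are hypothetical.

```python
# Hypothetical defect proportions, one value per sampling period.
defect_props = [1.4e-5, 0.0, 2.7e-5, 0.0, 1.4e-5, 4.3e-5, 0.0, 0.0, 1.4e-5, 2.8e-5]

x_bar = sum(defect_props) / len(defect_props)

# Moving ranges between consecutive observations.
mr_values = [abs(b - a) for a, b in zip(defect_props, defect_props[1:])]
mr_bar = sum(mr_values) / len(mr_values)

# Individuals chart limits: mean +/- 2.66 * average moving range.
ucl_x = x_bar + 2.66 * mr_bar
lcl_x = max(0.0, x_bar - 2.66 * mr_bar)   # proportions cannot be negative

# Moving-range chart upper limit (its lower limit is zero).
ucl_mr = 3.267 * mr_bar

print(f"I chart:  LCL={lcl_x:.3e}  CL={x_bar:.3e}  UCL={ucl_x:.3e}")
print(f"MR chart: CL={mr_bar:.3e}  UCL={ucl_mr:.3e}")
```

The limits here come from the observed point-to-point variation rather than from a binomial formula, which is why this chart does not tighten as the sample size grows.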

If this doesn't solve your issue, try reposting in the other forums. There are people there that have much more experience with p-charts than I do. My experience has been with Xbar/R and IMR charts.
 

Miner

TS Contributor
#3
One final note that may help you understand the greater issue. This concept has been mistakenly taught for decades. Control charts are NOT statistical tests. They are EMPIRICAL tests. The control limits were empirically determined to strike an acceptable balance between the economic costs of missing a process signal (shift) vs. the economic costs of over-reacting to process noise (false alarms). Therefore, don't get hung up on whether normal approximations to the binomial distribution are valid.

W. Edwards Deming said the following:
  • “It is true that some books on the statistical control of quality and many training manuals for teaching control charts show a graph of the normal curve and proportions of area thereunder. Such tables and charts are misleading and derail effective study and use of control charts.”
 
#4
Thank you, Miner, for all this valuable information. I didn't know such forums existed; they are going to help me in the future for sure.

Your comments and the interview with David Laney about the misuse of p-charts confirmed my fears about using p-charts, even though I was more concerned about the normal approximation.

I am a bit skeptical about what W. Edwards Deming said. I am one of many who think that, before applying a control chart, one must know how the data at hand are distributed. That way one can apply the right chart and minimize the rate of false alarms.

In a case where the data are not normal at all (let's say strongly positively skewed), applying a normal-based control chart such as the X chart would generate a large number of points outside the control limits.

In my case, as I failed to identify a distribution, I opted for the method of quantiles: based on my empirical cumulative distribution function, I take the value closest to the 99.73% quantile.
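A minimal sketch of that quantile idea, using a nearest-rank rule on the empirical distribution. The helper `empirical_quantile` and the simulated right-skewed sample are illustrative assumptions, not the actual data.

```python
import math
import random

def empirical_quantile(data, q):
    """Nearest-rank quantile: the smallest observation whose ECDF is >= q."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(data)
    rank = math.ceil(q * len(ordered))
    return ordered[rank - 1]

# Simulated right-skewed data standing in for the real measurements.
random.seed(0)
sample = [random.expovariate(1.0) for _ in range(2000)]

upper_limit = empirical_quantile(sample, 0.9973)
covered = sum(x <= upper_limit for x in sample) / len(sample)

print(f"empirical 99.73% quantile: {upper_limit:.3f}")
print(f"share of data at or below it: {covered:.4f}")
```

One caveat with this approach: only about 1 point in 370 lies beyond the 99.73% quantile, so with a few hundred observations the estimate rests on a handful of extreme values and can be unstable.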

What do you think about this approach?

Thank you.