Calculating Ppk and Cpk. Am I doing this right?

#1
Hi all.

I am trying to make sure that I am doing this the right way. I hope you guys will look through my calculations.

First, I have a list of 42 measurements; they are listed at the bottom of this post. LSL = 4.070, USL = 4.090. I want to calculate Ppk from the data, so I take the mean and SD of the complete data set (n = 42):

Ppk_upper = (USL-Xbar)/(3*sd)
Ppk_upper = (4.090-4.079476)/(3*0.002640095)
Ppk_upper = 1.32879

Ppk_lower = (Xbar-LSL)/(3*sd)
Ppk_lower = (4.079476-4.070)/(3*0.002640095)
Ppk_lower = 1.19646

Ppk = 1.19646, as this is the lowest
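As a cross-check, the Ppk arithmetic above can be reproduced in a few lines of Python. This is just a sketch, not anyone's original code; note that `statistics.stdev` uses the sample (n - 1) standard deviation, so the result may differ slightly from the post's figure if a different SD convention was used there.

```python
import statistics

# The 42 measurements from the post, in order (decimal commas written as points)
data = [
    4.078, 4.079, 4.076, 4.076, 4.081, 4.079, 4.082, 4.084, 4.080, 4.080,
    4.081, 4.084, 4.077, 4.079, 4.078, 4.080, 4.079, 4.079, 4.077, 4.078,
    4.080, 4.081, 4.078, 4.078, 4.077, 4.078, 4.078, 4.077, 4.082, 4.080,
    4.075, 4.080, 4.081, 4.080, 4.083, 4.082, 4.087, 4.085, 4.077, 4.074,
    4.079, 4.079,
]

LSL, USL = 4.070, 4.090
xbar = statistics.mean(data)
sd = statistics.stdev(data)  # sample (n - 1) standard deviation

ppk_upper = (USL - xbar) / (3 * sd)
ppk_lower = (xbar - LSL) / (3 * sd)
ppk = min(ppk_upper, ppk_lower)
print(f"Xbar={xbar:.6f}  sd={sd:.6f}  Ppk={ppk:.4f}")
```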
-------------------------------------------------

Cpk_upper = (USL-Xbar)/(3*sigma)
Cpk_upper = (4.090-4.079476)/(3*0.001224)
Cpk_upper = 2.86601

Cpk_lower = (Xbar-LSL)/(3*sigma)
Cpk_lower = (4.079476-4.070)/(3*0.001224)
Cpk_lower = 2.58061

Cpk = 2.58061, as this is the lowest

For Cpk I group the values below into subgroups of two, as this is the sample size. I then take the mean of each pair, and then the mean of those means.

I estimate sigma for Cpk from the ranges: for each pair I take the difference between the two measurements, then average those ranges to get the mean range.

Sigma = mean of ranges/d2
Sigma = 0.001380952/1.128
Sigma = 0.001224
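The Rbar/d2 estimate can be checked the same way. This is a sketch, pairing consecutive measurements as described above, with d2 = 1.128 (the standard constant for subgroups of size 2). Computing the ranges directly from the data gives Rbar of about 0.001429 rather than 0.001381, because of the subgroup-20 range discussed later in the thread.

```python
import statistics

# The 42 measurements from the post, in order (decimal commas written as points)
data = [
    4.078, 4.079, 4.076, 4.076, 4.081, 4.079, 4.082, 4.084, 4.080, 4.080,
    4.081, 4.084, 4.077, 4.079, 4.078, 4.080, 4.079, 4.079, 4.077, 4.078,
    4.080, 4.081, 4.078, 4.078, 4.077, 4.078, 4.078, 4.077, 4.082, 4.080,
    4.075, 4.080, 4.081, 4.080, 4.083, 4.082, 4.087, 4.085, 4.077, 4.074,
    4.079, 4.079,
]

# Split into 21 consecutive subgroups of 2 and take each subgroup range
ranges = [abs(data[i] - data[i + 1]) for i in range(0, len(data), 2)]
rbar = statistics.mean(ranges)

d2 = 1.128  # bias-correction constant for subgroups of size n = 2
sigma_within = rbar / d2

LSL, USL = 4.070, 4.090
xbar = statistics.mean(data)  # equals X-double-bar when subgroups are equal-sized
cpk = min((USL - xbar) / (3 * sigma_within),
          (xbar - LSL) / (3 * sigma_within))
print(f"Rbar={rbar:.6f}  sigma={sigma_within:.6f}  Cpk={cpk:.4f}")
```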


4.078
4.079
4.076
4.076
4.081
4.079
4.082
4.084
4.080
4.080
4.081
4.084
4.077
4.079
4.078
4.080
4.079
4.079
4.077
4.078
4.080
4.081
4.078
4.078
4.077
4.078
4.078
4.077
4.082
4.080
4.075
4.080
4.081
4.080
4.083
4.082
4.087
4.085
4.077
4.074
4.079
4.079

Miner

TS Contributor
#3
Cabal,

Welcome to Talkstats. I am afraid that you may have been waiting a long time for a response. I am probably the only person that has a clue what you are asking. I can tell that you are using the correct formulas for Ppk and Cpk, but cannot confirm that you are using the correct standard deviations for each without the entire data set. You mentioned that you were using 42 data points, but there are only 32 shown, so I cannot duplicate your results. Provide the entire data set and I will confirm your calculations. I also noted that the subgroups provided were not in a state of control, so Cpk would not be valid.

For future reference, if you have quality specific questions, I recommend the following discussion forums:
Disclaimer: I am a moderator on both forums.
#5
Miner,

Thanks for your reply - I will check out the other forums, and thanks for agreeing to check my calculations. I have edited the first post; there should now be 42 measurements. I also attached a picture to visualize the groupings (samples of 2) and the ranges. I am very much looking forward to reading your findings!

hlsmith, yes - a lot of fancy acronyms for an old microbiologist such as myself :)

Thanks,
Nicholas

Miner

TS Contributor
#6
I found one error in your calculation. The correct range for subgroup 20 is 0.003. This makes the correct Rbar = 0.001429.
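The corrected range can be double-checked directly from the data posted above (a quick sketch; subgroup 20 is the 39th and 40th measurements, 4.077 and 4.074):

```python
# The 42 measurements from the first post (decimal commas written as points)
data = [
    4.078, 4.079, 4.076, 4.076, 4.081, 4.079, 4.082, 4.084, 4.080, 4.080,
    4.081, 4.084, 4.077, 4.079, 4.078, 4.080, 4.079, 4.079, 4.077, 4.078,
    4.080, 4.081, 4.078, 4.078, 4.077, 4.078, 4.078, 4.077, 4.082, 4.080,
    4.075, 4.080, 4.081, 4.080, 4.083, 4.082, 4.087, 4.085, 4.077, 4.074,
    4.079, 4.079,
]

# Ranges of the 21 consecutive pairs; subgroup 20 is index 19
ranges = [abs(data[i] - data[i + 1]) for i in range(0, len(data), 2)]
rbar = sum(ranges) / len(ranges)
print(ranges[19], rbar)
```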

I am including the results as calculated using Minitab for your comparison.

The Xbar/R control chart is showing indications of over-dispersion in the Xbar chart. This may be an indication that you are not using rational subgroups. If you can provide more information about this process, I may be able to determine the cause.

[Attached: Minitab process capability report and Xbar/R control chart]
#7
@Miner Thank you very much for this! I hope it is OK for me to ask some questions - I am trying to understand, not just rely on automation, which is where I believe we go wrong at my company. This is not my area; I just got hooked on trying to figure it out.

My target value is 4.080. Regarding Minitab, where do you tell it to use the sample groups? Your graph says 42 samples in all, but I cannot see where it is stated that they should be treated as 21 groups of 2 samples each.

Is the only difference between calculating Ppk and Cpk the SD/sigma? For a while I was under the impression that Ppk used the mean of the entire dataset (not of the subgroups), while Cpk used the mean of the subgroup means. I now believe that both Ppk and Cpk use the mean of the subgroup means. Am I correct?

Thanks,
Nicholas

Miner

TS Contributor
#8
Ask away.
  1. I entered a subgroup size of 2 in the screens below.
  2. That is correct. Both use XdoubleBar as the mean. The difference is in the standard deviations. Cpk uses the within subgroup variation, StDev (Within), which is estimated using Rbar. Ppk uses StDev (Overall). The formulas used by Minitab may be found here.
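To make that distinction concrete, here is a small sketch (same assumptions as the earlier calculations: pairs of consecutive measurements, d2 = 1.128) computing both standard deviations from the same data. The within-subgroup estimate is noticeably smaller, which is why Cpk comes out higher than Ppk for this data set.

```python
import statistics

# The 42 measurements from the first post (decimal commas written as points)
data = [
    4.078, 4.079, 4.076, 4.076, 4.081, 4.079, 4.082, 4.084, 4.080, 4.080,
    4.081, 4.084, 4.077, 4.079, 4.078, 4.080, 4.079, 4.079, 4.077, 4.078,
    4.080, 4.081, 4.078, 4.078, 4.077, 4.078, 4.078, 4.077, 4.082, 4.080,
    4.075, 4.080, 4.081, 4.080, 4.083, 4.082, 4.087, 4.085, 4.077, 4.074,
    4.079, 4.079,
]

# Overall (long-term) standard deviation -> used for Ppk
sd_overall = statistics.stdev(data)

# Within-subgroup (short-term) estimate via Rbar/d2 -> used for Cpk
ranges = [abs(data[i] - data[i + 1]) for i in range(0, len(data), 2)]
sd_within = statistics.mean(ranges) / 1.128

print(f"StDev(Within)={sd_within:.6f}  StDev(Overall)={sd_overall:.6f}")
```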

[Attached: screenshots of the Minitab dialogs showing the subgroup size entry]

noetsi

Fortran must die
#9
This reminds me of the wonderful years of Sick Sigma. My dissertation was on the implementation of TQM in government (not that this actually went anywhere in the public sector).

Good people like Miner actually use it. Too bad it is not used much in government.

Miner

TS Contributor
#10
Use it and teach it. I just finished teaching a green belt class yesterday, and will be teaching another green belt class, plus a black belt class next month. Been doing this for my company for over 12 years.

I don't know if they still practice it, but the US Army was big into Six Sigma, particularly in Logistics.

noetsi

Fortran must die
#11
They were in the nineties. Probably still are in logistics, but I would guess not elsewhere.

I took a black belt class but never got the certification that goes with it. We gave up on TQM because of political realities. There is not a lot of interest anymore in formal improvement efforts. Sadly, no one with influence pushes change, and we get the same amount regardless of how well we do. Without that, there are not going to be serious efforts to improve in service industries. At least not very often.
#12
Hi again

I was wondering about the UCL for the range chart - what is the reasoning behind that level? If I understand correctly, it is a calculated level, yes? Or should it be specified?

Thanks!

Miner

TS Contributor
#13
The upper control limit (UCLR) is a calculated value based on Rbar. This is a link to the formula.

The UCLR is the upper limit for expected within-subgroup variation. Variation above this limit indicates a change in within-subgroup process variation. There is typically no LCLR unless the subgroup size is large.

The UCLXbar and LCLXbar likewise set limits for the expected between-subgroup variation.

It is important to note that while these control limits were set at 3 standard deviations, they were established as an economic tradeoff between the expense of missing a true process change vs. the expense of chasing a false alarm. 95% limits (2 StDev) were initially considered, but were rejected because the cost of chasing false alarms was too high.
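For reference, those control-limit formulas can be sketched as follows, assuming the standard Shewhart constants for subgroups of n = 2: A2 = 1.880 and D4 = 3.267 (D3 = 0, which is why there is no LCL on the range chart):

```python
import statistics

# The 42 measurements from the first post (decimal commas written as points)
data = [
    4.078, 4.079, 4.076, 4.076, 4.081, 4.079, 4.082, 4.084, 4.080, 4.080,
    4.081, 4.084, 4.077, 4.079, 4.078, 4.080, 4.079, 4.079, 4.077, 4.078,
    4.080, 4.081, 4.078, 4.078, 4.077, 4.078, 4.078, 4.077, 4.082, 4.080,
    4.075, 4.080, 4.081, 4.080, 4.083, 4.082, 4.087, 4.085, 4.077, 4.074,
    4.079, 4.079,
]

subgroups = [data[i:i + 2] for i in range(0, len(data), 2)]
xdbar = statistics.mean(statistics.mean(g) for g in subgroups)  # X-double-bar
rbar = statistics.mean(max(g) - min(g) for g in subgroups)      # mean range

A2, D4 = 1.880, 3.267  # Shewhart chart constants for n = 2
ucl_r = D4 * rbar              # range chart: UCL_R = D4 * Rbar
ucl_x = xdbar + A2 * rbar      # Xbar chart: Xdbar +/- A2 * Rbar
lcl_x = xdbar - A2 * rbar
print(f"UCL_R={ucl_r:.5f}  LCL_X={lcl_x:.5f}  UCL_X={ucl_x:.5f}")
```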

Miner

TS Contributor
#15
> that is really interesting miner.
Industrial statistics is all about reducing costs. Variation causes rejects, and rejects cost money. Therefore, reduce variation. Control charts identify the unusual variation (worthy of expending resources to reduce) from background variation. If the background variation is still too great, that is where DOE and six sigma projects come into play.

noetsi

Fortran must die
#16
Many years ago I did a lot of research on TQM (it is what my dissertation was in). I always thought a higher quality product was what industrial statistics was about. :)

Miner

TS Contributor
#19
Unfortunately, you are correct, but manufacturers are slowly coming around to realize that it is true. Trying to inspect quality in does increase costs, but designing quality in and taking variation out of the process does reduce costs.

noetsi

Fortran must die
#20
I think the logic the US business community uses is to create new markets and exploit them rather than to create high-quality products. My own view, and I am certainly not an engineer, is that US culture is generally not well suited to pursuing quality, or for that matter industrial engineering. This was particularly obvious in the TQM phenomenon.