Applying the Q test twice to the same data set (Why not?)

#1
Hello Everyone.

Scenario

You apply the Q test to a data set and find that you may indeed remove a particular datum from the data set.

Question

Why can't you apply the Q test a second time to the same data set, given that there are still plenty of degrees of freedom remaining?


Thank you for your attention to this question.

Billy Wayne McCann
 
#3
Hello John.

Thank you for your response.

JohnM said:
I assume this is a Q test to detect outliers?
Yes. Sorry I wasn't clearer.

JohnM said:
Here's a link that describes why it's unwise to use it more than once on the same set of data:

http://mathforum.org/library/drmath/view/52720.html

theoretically, you could run out of data....
While I appreciate your input, John, I'm still not finding a mathematically rigorous reason why the Q test could not be applied repeatedly to the same data set.

If you do, in fact, "run out of data", that would indicate to me that your data points are inconsistent and that you need to redo your experiment and/or measurements.

In other words, if you perform your measurements and/or experiment carefully, you ought not to get outliers at all (as is most often the case, I've found).

If you do get one, and the Q test confirms that it is an actual outlier, then you could treat the data set minus the outlier as the true data set and apply the Q test again.

Basically, you could pretend that the outlying datum never existed - that you never even performed that particular measurement (as it was a confirmed error) - and then apply the Q test to the data set as if it were the first time you performed it.
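
To make the procedure described above concrete, here is a minimal Python sketch of what "apply the Q test, drop the outlier, and apply it again" would look like. It assumes the usual Dixon's Q statistic (gap to the nearest neighbour divided by the range) and the commonly tabulated two-tailed 90% critical values for n = 3 to 10; the thread itself does not specify a confidence level, and the names q_test and repeated_q_test are made up for illustration. It only shows the mechanics and does not settle whether repeating the test is justified.

```python
# Two-tailed 90% critical values for Dixon's Q, n = 3..10.
# Assumption: the confidence level is chosen for illustration only.
Q_CRIT_90 = {3: 0.941, 4: 0.765, 5: 0.642, 6: 0.560,
             7: 0.507, 8: 0.468, 9: 0.437, 10: 0.412}


def q_test(data, q_crit=Q_CRIT_90):
    """Return the suspect value if the Q test rejects it, else None."""
    n = len(data)
    if n not in q_crit:
        raise ValueError("no critical value tabulated for n = %d" % n)
    s = sorted(data)
    spread = s[-1] - s[0]
    if spread == 0:
        return None  # all values identical, nothing to reject
    # Q = gap between the suspect and its nearest neighbour, over the range.
    q_low = (s[1] - s[0]) / spread
    q_high = (s[-1] - s[-2]) / spread
    # Test whichever extreme has the larger gap.
    suspect, q = (s[0], q_low) if q_low > q_high else (s[-1], q_high)
    return suspect if q > q_crit[n] else None


def repeated_q_test(data):
    """Re-apply the Q test to the reduced data set until nothing is rejected."""
    remaining = list(data)
    removed = []
    while len(remaining) >= 3:
        suspect = q_test(remaining)
        if suspect is None:
            break
        remaining.remove(suspect)
        removed.append(suspect)
    return remaining, removed


readings = [10.0, 10.1, 10.2, 10.1, 10.0, 12.0, 15.0]
remaining, removed = repeated_q_test(readings)
print("removed:", removed)
print("remaining:", remaining)
```

With these made-up readings the first pass rejects 15.0 and the second pass rejects 12.0. Note that each pass reuses the single-test critical value, so the 90% figure describes one pass only, not the sequence of passes as a whole.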

But, to my mind, if you keep getting outliers, then most likely you've been sloppy with your procedure.

Keep in mind, though, that these are merely the thoughts of an undergraduate chemistry student, with limited experience in a laboratory. Real world "stuff" may be different. Just some thoughts though.


Billy Wayne
 

JohnM

TS Contributor
#4
Actually, there's nothing rigorous about outlier detection and elimination - it's a matter of professional judgment, for the most part. You can use any test you want, but you also need a valid reason for removing the data point.