Wilcoxon rank sum test on Cross Validation Results

Hi Guys,

I would like to compare the recognition times of two different object detectors. I have, let's say, 1000 labeled samples.

Then I do a 10-fold cross validation (900 for training, 100 for testing). In each fold I get 100 detection times (samples) from each method. Therefore, at the end of the 10-fold cross validation I have 1000 sample times for each detection method.

I would like to know if it is OK to do a Wilcoxon rank sum test on these 2 x 1000 sample times to compare the performance?
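For concreteness, the test in question could be run roughly as in the sketch below. It assumes SciPy is available, and the two arrays are simulated placeholders (the lognormal parameters are made up), standing in for the real detection times collected over the 10 test folds.

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(0)

# Placeholder data: 1000 simulated detection times (seconds) per detector.
# Real measurements from the 10 test folds would replace these arrays.
times_a = rng.lognormal(mean=-1.0, sigma=0.4, size=1000)
times_b = rng.lognormal(mean=-0.9, sigma=0.4, size=1000)

# Wilcoxon rank sum test: treats the two samples as independent of each other.
result = ranksums(times_a, times_b)
print(f"statistic = {result.statistic:.3f}, p-value = {result.pvalue:.4g}")
```

Note that this test ignores any pairing between the two arrays, so it does not use the fact that both detectors may have been timed on the same test samples.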

Thank you for your help.


TS Contributor
Hi gajan,

Why would you use a nonparametric test? With your sample size you could use an ordinary t-test (a paired test would probably be more useful). Unless your data are extremely non-normal, there shouldn't be any problem. By the way, the kappa statistic for inter-rater agreement may also be valuable for you.


Hope this helps.
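
A rough sketch of the paired approach, assuming SciPy and assuming both detectors were timed on the same test samples so that entry i in each array refers to the same sample. The data below are simulated placeholders, not real measurements; the Wilcoxon signed-rank test is included only as the nonparametric counterpart of the paired t-test.

```python
import numpy as np
from scipy.stats import ttest_rel, wilcoxon

rng = np.random.default_rng(1)

# Placeholder data: each index is one test sample timed under both detectors.
# A shared per-sample cost makes the two measurements correlated.
sample_cost = rng.lognormal(mean=-1.0, sigma=0.4, size=1000)
times_a = sample_cost * rng.lognormal(mean=0.0, sigma=0.05, size=1000)
times_b = sample_cost * 1.05 * rng.lognormal(mean=0.0, sigma=0.05, size=1000)

# Paired t-test on the per-sample differences.
paired_t = ttest_rel(times_a, times_b)

# Wilcoxon signed-rank test: nonparametric test on the same paired differences.
signed_rank = wilcoxon(times_a, times_b)

print(f"paired t-test:        p = {paired_t.pvalue:.4g}")
print(f"Wilcoxon signed-rank: p = {signed_rank.pvalue:.4g}")
```
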
Hi terzi,

Thank you for your answer.

The 1000 labelled samples were just an example. Normally I get around 400 labelled samples. Maybe even this size is still enough for a paired t-test?

But what I really want to make sure of is this: can I consider samples generated from a cross validation as independent samples/observations and use them in a t-test or Wilcoxon rank sum test?
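
One part of the independence question can be checked mechanically: how much the folds overlap. The sketch below assumes a plain shuffled 10-fold split of 1000 indices (an illustrative split, not necessarily the one used for the detectors) and counts how many training samples two folds have in common.

```python
import numpy as np

n_samples, n_folds = 1000, 10

# Illustrative split: shuffle the indices and cut them into 10 test folds.
indices = np.random.default_rng(2).permutation(n_samples)
test_folds = np.array_split(indices, n_folds)

# Each sample is tested exactly once across the 10 folds.
all_test = np.concatenate(test_folds)
n_unique_test = np.unique(all_test).size

# Training sets of two different folds share most of their samples.
train_0 = np.setdiff1d(indices, test_folds[0])
train_1 = np.setdiff1d(indices, test_folds[1])
shared = np.intersect1d(train_0, train_1).size

print(f"unique test samples: {n_unique_test}")
print(f"train sizes: {train_0.size}, {train_1.size}; shared: {shared}")
```

So the 1000 test times come from distinct samples, but the 10 detectors that produced them were trained on heavily overlapping data (800 of 900 training samples in common for any two folds), which is the part that may make full independence questionable.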

Thank you for your time.


P.S.: Thank you for referring me to Cohen's kappa. I am planning to use it in my other tests.