I would like to compare the recognition times of two different object detectors. I have, let's say, 1000 labeled samples.

Then I do 10-fold cross-validation (900 samples for training, 100 for testing in each fold). In each fold I get 100 detection times (samples) from each method. Therefore, at the end of the 10-fold cross-validation I have 1000 detection times for each method.

I would like to know whether it is OK to run a Wilcoxon rank-sum test on these two sets of 1000 detection times to compare the performance of the two detectors.
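To make the setup concrete, here is a minimal sketch of the test I have in mind, written in plain Python. The timings are synthetic placeholders drawn from normal distributions (not real measurements), the names `rank_sum_test`, `times_a` and `times_b` are made up for this illustration, and the p-value uses the normal approximation with a tie correction but no continuity correction. In practice I would probably just call `scipy.stats.ranksums` on the two lists.

```python
import math
import random


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum (Mann-Whitney U) test, normal approximation.

    Returns (u, z, p) where u is the U statistic of sample x.
    Assumes the pooled data are not all identical (variance would be zero).
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool both samples, remembering which group each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y], key=lambda t: t[0])

    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < n and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1

    # Rank sum of sample x, converted to the U statistic.
    w = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u = w - n1 * (n1 + 1) / 2.0

    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p-value
    return u, z, p


# Synthetic placeholder data: 1000 detection times (seconds) per detector,
# standing in for the times pooled over the 10 folds.
random.seed(0)
times_a = [random.gauss(0.050, 0.005) for _ in range(1000)]
times_b = [random.gauss(0.055, 0.005) for _ in range(1000)]

u, z, p = rank_sum_test(times_a, times_b)
print(f"U = {u:.1f}, z = {z:.3f}, p = {p:.3g}")
```

Note that this treats the two sets of 1000 times as independent samples, which is part of what I am unsure about.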

