I wrote a scientific paper measuring ovarian volumes after two different surgical procedures performed on either ovary.

Volumes were measured at three time-points (1, 3 and 6 months). The number of patients evaluated differed across the three time-points, since some of them became pregnant, after which ovarian volumes would no longer be accurate. I used a paired t-test at each time-point.

No significant difference was found at the first two time-points, whereas at the third (with a smaller number of patients, 40 instead of the initial 51) p was 0.04.

The referee suggests that I should use a significance threshold of 0.01 instead of 0.05:

"with many comparisons the effect on the type I error rate is to dilute it, hence it is common to consider 0.01 instead of 0.05". May I answer that I agree, but that we chose to keep the 0.05 threshold because it is not correct to change methods AFTER the results have been analyzed, and that we will only add a note of caution about a possible type I error in the Discussion?
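
For context, here is a minimal sketch of the arithmetic behind the referee's point as I understand it. The only values taken from my data are the three time-points and p = 0.04. The function names are hypothetical, and the family-wise figure assumes the three tests are independent, which repeated measurements on the same patients are not, so it is only a rough guide. The Bonferroni threshold itself does not rely on independence.

```python
# Rough sketch of the multiple-comparison arithmetic for three paired t-tests.
# Assumption (illustration only): the three tests are treated as independent
# when computing the family-wise error rate.

def familywise_error(alpha: float, m: int) -> float:
    """Chance of at least one false positive across m independent tests."""
    return 1.0 - (1.0 - alpha) ** m

def bonferroni_threshold(alpha: float, m: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / m

m = 3                 # time-points: 1, 3 and 6 months
alpha = 0.05          # threshold used in the paper
p_six_months = 0.04   # reported p-value at the third time-point

fwer = familywise_error(alpha, m)
threshold = bonferroni_threshold(alpha, m)

print(f"Family-wise error rate at alpha={alpha}: {fwer:.3f}")
print(f"Bonferroni per-test threshold: {threshold:.4f}")
print(f"p = {p_six_months} significant after Bonferroni: {p_six_months < threshold}")
```

With three tests the Bonferroni threshold is 0.05 / 3, roughly 0.017, which is less strict than the referee's 0.01 but still below my p of 0.04.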

Thank you
