Multiple testing correction - What constitutes a test?

I am not sure how to handle multiple testing correction in my experimental setup:
I have three performance measures (m1, m2, m3) that I collected from 20 individuals. On top of that, I have two measures of some anatomical feature (a1, a2). I correlated each anatomical measure with all of the performance measures: a1 with m1, m2, m3 and a2 with m1, m2, m3.
How many tests does that result in? I assumed that it is 2*3 = 6 tests, so I would use a Bonferroni threshold of 0.05/6. But in a couple of papers on the same topic, I read that people only correct per category, i.e. 3 tests for anatomical feature a1 and 3 tests for anatomical feature a2, which gives a Bonferroni threshold of 0.05/3 applied to each p-value.
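To make the two options concrete, here is a minimal Python sketch of the arithmetic. The helper `bonferroni_threshold` and the constant names are made up for illustration; the only inputs are the 0.05 significance level and the test counts described above.

```python
# Sketch of the two candidate Bonferroni thresholds described above.
# All names here are illustrative, not taken from any library.
ALPHA = 0.05
N_ANATOMICAL = 2   # a1, a2
N_PERFORMANCE = 3  # m1, m2, m3


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Option 1: treat all correlations as one family of 2 * 3 = 6 tests.
threshold_all = bonferroni_threshold(ALPHA, N_ANATOMICAL * N_PERFORMANCE)

# Option 2: treat each anatomical feature as its own family of 3 tests.
threshold_per_feature = bonferroni_threshold(ALPHA, N_PERFORMANCE)

print(f"all six tests: {threshold_all:.4f}")
print(f"per feature:   {threshold_per_feature:.4f}")
```

The per-category threshold (about 0.0167) is twice as large as the six-test threshold (about 0.0083), so a p-value falling between the two would count as significant under one scheme but not the other.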
What is the correct procedure?

Thank you very much in advance