Comparison of psychophysical methods for sensitivity to small changes

Hi all

I have two different psychophysical clinical test methods I can use to measure a particular quality. For the sake of explanation, let's say I want to measure hearing quality.

Consider the following:
- Both tests are designed to quantify 'hearing quality', but they do so in different ways. Test method A measures hearing sensitivity at a slightly higher frequency using one method, whereas test method B measures hearing sensitivity at a slightly lower frequency using a different method. Each test method produces a real-numbered result; however, the results are on different scales (slightly higher frequency sensitivity vs slightly lower frequency sensitivity), so they are not directly comparable.
- Both test methods have some 'noise' involved. Of course I can average repeated measures to get a better estimate of the 'true' hearing sensitivity; however, subjects become fatigued over time (and fatigue itself affects their sensitivity), so I do not want to repeat the measurement too many times (say, no more than 3 repeats).

I want to run a clinical investigation to determine which test method is 'better', meaning which is more powerful at discriminating between two different levels of hearing quality.

How would I do so? What would be the optimal design for such a study? Is there some standardized design for testing methods against each other like this? Ultimately I want to be able to say 'test method A is better than B since it is more sensitive to small changes in hearing quality'.

Note: I can artificially reduce hearing quality (for example, by adding ear plugs). So perhaps I should test subjects with and without earplugs using both methods in randomized order, then see which method shows the greater statistical power in discriminating the two conditions?
Would a gauge R&R type design / analysis be appropriate?
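To make the earplug idea concrete, here is a toy simulation of what I have in mind. Everything numerical is made up (true degradation, noise SDs, 30 subjects, 3 repeats averaged per condition): because the two methods are on different scales, I compare them via a unit-free standardized effect size (Cohen's d on the paired differences) rather than the raw scores.

```python
import numpy as np

rng = np.random.default_rng(0)

n_subjects = 30
n_repeats = 3  # cap on repeats before fatigue becomes a problem

def simulate_method(true_shift, noise_sd):
    """Simulate one method's with/without-earplug scores.

    Each subject is measured n_repeats times per condition and the
    repeats are averaged. Returns per-subject paired differences
    (plugged minus unplugged), in which the subject baseline cancels.
    """
    base = rng.normal(0.0, 1.0, n_subjects)  # subject-level baseline
    unplugged = base[:, None] + rng.normal(0.0, noise_sd, (n_subjects, n_repeats))
    plugged = (base[:, None] + true_shift
               + rng.normal(0.0, noise_sd, (n_subjects, n_repeats)))
    return plugged.mean(axis=1) - unplugged.mean(axis=1)

def paired_cohens_d(diffs):
    """Standardized effect size: mean difference / SD of differences."""
    return diffs.mean() / diffs.std(ddof=1)

# Hypothetical numbers: both methods see the same true degradation
# (in their own units this would differ, but Cohen's d is unit-free),
# with method A assumed less noisy than method B.
d_A = paired_cohens_d(simulate_method(true_shift=-1.0, noise_sd=0.5))
d_B = paired_cohens_d(simulate_method(true_shift=-1.0, noise_sd=1.5))
print(f"method A effect size: {d_A:.2f}")
print(f"method B effect size: {d_B:.2f}")
```

Under these assumed parameters, method A yields the larger absolute effect size, i.e. more power to discriminate the two conditions at any given sample size. Is comparing standardized effect sizes like this a legitimate way to declare one method 'better'?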

Perhaps if I can create a linear transform from one set of test method results to the other (assuming they correlate linearly), I can then directly compare them?
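By a linear transform I mean something like the following sketch, where both methods are run on the same subjects and a least-squares line maps method B's results onto method A's scale. The data here are fabricated purely for illustration (B assumed noisily linear in A):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical paired results from running both methods on 40 subjects,
# assuming method B is (noisily) linear in method A.
a_scores = rng.normal(50.0, 10.0, 40)
b_scores = 2.0 * a_scores + 5.0 + rng.normal(0.0, 4.0, 40)

# Least-squares fit of the linear map from B's scale onto A's scale.
slope, intercept = np.polyfit(b_scores, a_scores, deg=1)
b_on_a_scale = slope * b_scores + intercept

r = np.corrcoef(a_scores, b_on_a_scale)[0, 1]
print(f"fitted map: A ≈ {slope:.2f} * B + {intercept:.2f}")
print(f"correlation after mapping: {r:.2f}")
```

Though I suspect rescaling like this changes the units but not the relative noise of each method, which is partly why I am unsure it helps with the comparison.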