When comparing a diagnostic test (e.g. a questionnaire) against a reference standard that classifies people as having the disease or not, you can calculate sensitivity and specificity.
For example, if the reference says 10 people have the disease and 90 do not, and the test you are evaluating says 9 people have the disease and 91 do not, this seems to give your test a very high sensitivity and specificity.

What I am wondering is this: the analysis says nothing about whether the 9 patients flagged by your test are among the 10 patients identified by the reference.
It could be that 5 of the 9 patients your test identified are not among those 10.
So despite an apparently high sensitivity and specificity, your test is not that good.
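
To make the scenario concrete, here is a minimal sketch in Python. It assumes patient-level results are available from both the reference and the test; the labels are made up purely for illustration (4 of the 10 reference-positive patients flagged, plus 5 others). It contrasts the overall totals with a patient-by-patient cross-tabulation of the two results.

```python
# Hypothetical patient-level labels for 100 patients (1 = disease, 0 = no disease).
# Reference: the first 10 patients have the disease, the other 90 do not.
reference = [1] * 10 + [0] * 90
# Test under evaluation: flags 4 of those 10 plus 5 other patients (9 in total).
test = [1] * 4 + [0] * 6 + [1] * 5 + [0] * 85

# Overall totals: 10 positives by the reference, 9 by the test.
marginals = {"reference_positive": sum(reference), "test_positive": sum(test)}

# Patient-by-patient cross-tabulation of the two results.
pairs = list(zip(reference, test))
table = {
    "both_positive": pairs.count((1, 1)),
    "reference_only": pairs.count((1, 0)),
    "test_only": pairs.count((0, 1)),
    "both_negative": pairs.count((0, 0)),
}

print(marginals)
print(table)
```

The totals look almost identical (10 versus 9), while the cross-tabulation shows that only 4 patients are positive on both.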

Can this be tested, and if so, how? And how is it reported?