Dealing with variation in subjective qualitative data

I am helping conduct a review of scholarship applications.

There were approximately 275 applicants, who were reviewed by various individuals.
There seems to be a lot of variation in scoring among the reviewers.

The issue is that not every reviewer scored every applicant, so applicants with high scores were most likely reviewed by "easy" scorers, while those with low scores were most likely reviewed by "harsh" scorers.

There are four scoring categories, one of which is weighted at 50%.
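
To make the setup concrete, here is a minimal Python sketch of how I understand the scoring, with made-up numbers. Only the 50% weight is real; the category names, the even split of the remaining 50%, the 0-10 scale, and the reviewer and applicant labels are all assumptions for illustration. It combines the category scores into a weighted total and then compares the average total each reviewer hands out, which is where the "easy" versus "harsh" gap shows up.

```python
from statistics import mean

# Only the 0.50 weight is from the actual rubric. Splitting the remaining
# 50% evenly across the other three categories is an assumption.
WEIGHTS = {
    "cat_a": 0.50,
    "cat_b": 0.50 / 3,
    "cat_c": 0.50 / 3,
    "cat_d": 0.50 / 3,
}


def weighted_total(scores):
    """Combine the four category scores into one weighted total."""
    return sum(WEIGHTS[cat] * scores[cat] for cat in WEIGHTS)


def reviewer_means(reviews):
    """Average weighted total given by each reviewer."""
    totals = {}
    for reviewer, _applicant, scores in reviews:
        totals.setdefault(reviewer, []).append(weighted_total(scores))
    return {reviewer: mean(values) for reviewer, values in totals.items()}


# Made-up reviews: (reviewer, applicant, category scores on a 0-10 scale).
reviews = [
    ("easy", "app1", {"cat_a": 9, "cat_b": 9, "cat_c": 9, "cat_d": 9}),
    ("easy", "app2", {"cat_a": 8, "cat_b": 8, "cat_c": 8, "cat_d": 8}),
    ("harsh", "app3", {"cat_a": 5, "cat_b": 5, "cat_c": 5, "cat_d": 5}),
    ("harsh", "app4", {"cat_a": 4, "cat_b": 4, "cat_c": 4, "cat_d": 4}),
]

means = reviewer_means(reviews)
for reviewer, avg in means.items():
    print(f"{reviewer}: average weighted total {avg:.2f}")
```

In this toy example no applicant was seen by both reviewers, so the raw totals cannot tell me whether app1 is really stronger than app3 or was simply scored by the more generous reviewer.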

What is the best way to account for this variation?

We obviously want the most deserving applicants to end up with the highest scores, which the current results do not seem to reflect, and we would like to achieve this without having to recalculate everyone's application scores.