I have a dataset that I think of as a matrix. The rows are individual competitors and the columns are races. Each cell is an actual race result (the number of seconds to complete the race). My goal is to scale the race results so that they can be compared across races. I am assuming that the results within each race are normally distributed across the competitors involved. Results vary from race to race because of distance, course difficulty, weather, etc. Ideally, I would just compare the means of the different races and adjust the results accordingly. I don't believe I can do this, because each race has a different combination of competitors, so the strength of the field has a large effect on the mean.

The way I am currently thinking about this is that I want to "solve" for the following: assign each competitor a rating, and assign each race a regression formula whose input is the competitor rating and whose output is the expected race time. I want to find the combination of competitor ratings and race formulas that minimizes the variance of the difference between the expected and the actual race times across the whole matrix.
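
To make the idea concrete, here is a rough sketch of one way such a fit could be set up. It assumes each race formula is a straight line (expected time = intercept + slope * rating), which is only one possible choice, and it alternates between fitting the race lines with the ratings held fixed and fitting the ratings with the race lines held fixed. The function name `fit_ratings`, the dict layout of `results`, and the toy numbers are all made up for illustration; only cells that actually have a result are used.

```python
from collections import defaultdict


def fit_ratings(results, n_iter=200):
    """Alternating least squares for time ~ intercept[race] + slope[race] * rating[competitor].

    results: dict mapping (competitor, race) -> seconds, observed cells only.
    Returns (rating, coef) where coef[race] = (intercept, slope).
    """
    by_race = defaultdict(list)        # race -> [(competitor, seconds)]
    by_competitor = defaultdict(list)  # competitor -> [(race, seconds)]
    for (comp, race), secs in results.items():
        by_race[race].append((comp, secs))
        by_competitor[comp].append((race, secs))

    def normalize(rating):
        # Ratings are only defined up to shift and scale, so pin them to mean 0, sd 1.
        mean = sum(rating.values()) / len(rating)
        sd = (sum((v - mean) ** 2 for v in rating.values()) / len(rating)) ** 0.5
        return {c: (v - mean) / sd if sd > 0 else 0.0 for c, v in rating.items()}

    def fit_races(rating):
        # Ordinary least-squares line per race, using the current ratings as x.
        coef = {}
        for race, rows in by_race.items():
            xs = [rating[c] for c, _ in rows]
            ys = [secs for _, secs in rows]
            mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
            sxx = sum((x - mx) ** 2 for x in xs)
            sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            slope = sxy / sxx if sxx > 0 else 0.0
            coef[race] = (my - slope * mx, slope)
        return coef

    # Starting guess: each competitor's average deviation from the race mean.
    race_mean = {r: sum(s for _, s in rows) / len(rows) for r, rows in by_race.items()}
    rating = normalize({
        c: sum(s - race_mean[r] for r, s in rows) / len(rows)
        for c, rows in by_competitor.items()
    })

    for _ in range(n_iter):
        coef = fit_races(rating)
        for c, rows in by_competitor.items():
            # Least-squares rating for this competitor with the race lines held fixed.
            num = sum(coef[r][1] * (s - coef[r][0]) for r, s in rows)
            den = sum(coef[r][1] ** 2 for r, _ in rows)
            if den > 0:
                rating[c] = num / den
        rating = normalize(rating)

    return rating, fit_races(rating)


# Made-up toy data: not every competitor ran every race.
results = {
    ("A", "X"): 90.0, ("B", "X"): 100.0, ("C", "X"): 110.0,
    ("B", "Y"): 200.0, ("C", "Y"): 220.0, ("D", "Y"): 240.0,
    ("A", "Z"): 45.0, ("C", "Z"): 55.0, ("D", "Z"): 60.0,
}
rating, coef = fit_ratings(results)
```

With this starting guess a larger rating tends to correspond to a slower competitor, and `coef[race][0] + coef[race][1] * rating[competitor]` gives the expected time for any competitor/race pair. I don't know whether alternating like this is guaranteed to reach the best overall fit when many cells are missing, or whether a straight line per race is the right form.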

Can anyone point me in a good direction to start thinking about/implementing this?