Hi all, hope someone can point me in the right direction with the problem below!
The Data
Let's assume that the dataset I have is solely set-level win/loss data for every single professional tennis match in the past X years.
PlayerA, PlayerB, Set#, playerA_win
Nadal, Federer, 1, 1
Nadal, Federer, 2, 1
Nadal, Federer, 3, 0
Apologies, wasn't sure how to stick in a proper table!
The Problem
I am looking to come up with a predictive factor based on two players' historical performances. However, I am not keen on using something like a basic 'win%': we can't assume that each player has faced opponents of the same quality, so raw win%s aren't particularly predictive when used as ML features.
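To make the baseline concrete, here is a minimal sketch of the raw win% I'm trying to avoid, computed from rows shaped like the sample above. The function name raw_set_win_pct and the tuple layout are just assumptions for illustration, not anything from a library.

```python
from collections import defaultdict

# Rows mirror the layout in the post: (PlayerA, PlayerB, Set#, playerA_win).
rows = [
    ("Nadal", "Federer", 1, 1),
    ("Nadal", "Federer", 2, 1),
    ("Nadal", "Federer", 3, 0),
]

def raw_set_win_pct(rows):
    """Return {player: fraction of sets won}, ignoring opponent quality."""
    won = defaultdict(int)
    played = defaultdict(int)
    for player_a, player_b, _set_no, a_win in rows:
        played[player_a] += 1
        played[player_b] += 1
        # Credit the set to whichever player won it.
        if a_win:
            won[player_a] += 1
        else:
            won[player_b] += 1
    return {player: won[player] / played[player] for player in played}

win_pct = raw_set_win_pct(rows)
```

Every set counts the same here regardless of who was on the other side of the net, which is exactly the weakness described above.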
Another strategy I have previously tried is the 'common opponents' method, which looks at each player's (PlayerA & PlayerB) performance against opponents that both have played. These results are then compared and used to estimate the difference in quality between A & B. For example, if PlayerA beats PlayerC 60% of the time and PlayerB beats PlayerC 75% of the time, then we can estimate A vs B on the assumption that performance is transitive between players. However, I have found that this ignores performances against players that the other player hasn't faced. For example, if PlayerA has played PlayerC/D/E and gets battered by D & E, but PlayerB has only played PlayerC, then the results against D & E are thrown away and we are left with large gaps in the predictive quality of our final number.
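Roughly, the common opponents idea can be sketched as below. This is a simplified version under my own assumptions: the "edge" is just the average difference in set win rate against shared opponents, and the names head_to_head and common_opponent_edge, plus the made-up set counts, are purely illustrative.

```python
from collections import defaultdict

def head_to_head(rows):
    """Return {(player, opponent): [sets_won, sets_played]}."""
    record = defaultdict(lambda: [0, 0])
    for player_a, player_b, _set_no, a_win in rows:
        record[(player_a, player_b)][0] += a_win
        record[(player_a, player_b)][1] += 1
        record[(player_b, player_a)][0] += 1 - a_win
        record[(player_b, player_a)][1] += 1
    return record

def common_opponent_edge(rows, a, b):
    """Average win-rate difference (a minus b) over shared opponents.

    Also returns the opponents of `a` that get ignored because `b`
    never played them. Returns (None, ignored) if nothing is shared.
    """
    record = head_to_head(rows)
    opponents_of = defaultdict(set)
    for player, opponent in record:
        opponents_of[player].add(opponent)
    common = (opponents_of[a] & opponents_of[b]) - {a, b}
    ignored = opponents_of[a] - common - {b}
    if not common:
        return None, ignored
    diffs = []
    for c in common:
        rate_a = record[(a, c)][0] / record[(a, c)][1]
        rate_b = record[(b, c)][0] / record[(b, c)][1]
        diffs.append(rate_a - rate_b)
    return sum(diffs) / len(diffs), ignored

# Made-up data matching the example: A wins 60% vs C, B wins 75% vs C,
# and A loses every set to D and E, whom B has never played.
rows = []
def add_sets(p, q, p_wins, n):
    rows.extend((p, q, i + 1, 1 if i < p_wins else 0) for i in range(n))

add_sets("PlayerA", "PlayerC", 12, 20)
add_sets("PlayerB", "PlayerC", 15, 20)
add_sets("PlayerA", "PlayerD", 0, 5)
add_sets("PlayerA", "PlayerE", 0, 5)

edge, ignored_for_a = common_opponent_edge(rows, "PlayerA", "PlayerB")
```

The edge only reflects the results against PlayerC; the ten lost sets against D and E end up in ignored_for_a and never influence the number, which is the gap I'm describing.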
Appreciate this is long-winded, but I have been racking my brains for days and am in desperate need of extra thoughts!
Thanks