Converting exposure values with log for weighted average method

#1
Hello,

I have several transaction types, each with a risk score on a scale from 0 to 10 [0-10]:
Code:
Trans_type_1_risk = 8 and Nbr_of_trans_type_1 = 1.000.000 (~0.04% of Total_nbr_of_trans)
Trans_type_2_risk = 4 and Nbr_of_trans_type_2 = 1.000.000.000 (~40% of Total_nbr_of_trans)
Trans_type_3_risk = 9 and Nbr_of_trans_type_3 = 2.000.000 (~0.08% of Total_nbr_of_trans)
Trans_type_4_risk = 5 and Nbr_of_trans_type_4 = 1.500.000.000 (~60% of Total_nbr_of_trans)
I want to calculate an overall risk score using a weighted average or a similar method.
Code:
Overall_risk = (Trans_type_1_risk * Nbr_of_trans_type_1 + Trans_type_2_risk * Nbr_of_trans_type_2 + Trans_type_3_risk * Nbr_of_trans_type_3 + Trans_type_4_risk * Nbr_of_trans_type_4) / Total_nbr_of_trans
However, when I use the number of transactions as the weight, the weights of the riskiest transaction types (type_1 and type_3) become insignificantly low (~0.04% and ~0.08% respectively).
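For reference, here is a minimal Python sketch of the count-weighted average described above. The numbers are taken from this post; the variable names are only illustrative.

```python
# Risk scores and transaction counts copied from the post above.
risks = {"type_1": 8, "type_2": 4, "type_3": 9, "type_4": 5}
counts = {
    "type_1": 1_000_000,
    "type_2": 1_000_000_000,
    "type_3": 2_000_000,
    "type_4": 1_500_000_000,
}

total = sum(counts.values())

# Each type's weight is its share of all transactions.
count_weights = {name: n / total for name, n in counts.items()}

# Count-weighted average of the risk scores.
count_weighted_risk = sum(risks[name] * count_weights[name] for name in risks)

for name, w in count_weights.items():
    print(f"{name}: weight = {w:.4%}")
print(f"Overall risk (count-weighted): {count_weighted_risk:.3f}")
```

With these inputs the overall score comes out at roughly 4.6, which is driven almost entirely by types 2 and 4.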

I don't want to underestimate the riskiest transaction types 1 and 3, so I came up with the idea of using the base-10 logarithm of the number of transactions as the weight. Thus:
Code:
log(Nbr_of_trans_type_1) = log(1.000.000) = 6 which is ~19.7% of (6+9+6.3+9.2) so the new Weight_1 becomes 19.7%
log(Nbr_of_trans_type_2) = log(1.000.000.000) = 9 which is ~29.4% of (6+9+6.3+9.2) so the new Weight_2 becomes 29.4%
log(Nbr_of_trans_type_3) = log(2.000.000) = 6.3 which is ~20.7% of (6+9+6.3+9.2) so the new Weight_3 becomes 20.7%
log(Nbr_of_trans_type_4) = log(1.500.000.000) = 9.2 which is ~30.2% of (6+9+6.3+9.2) so the new Weight_4 becomes 30.2%
With this log conversion and the new weights, the risky trans_types 1 and 3 contribute significantly to the result (19.7% and 20.7%), while trans_types 2 and 4 still contribute the most (29.4% and 30.2%) because of their number of transactions. So everything looks perfect with this method.

Code:
Overall_risk = 8 * 19.7% + 4 * 29.4% + 9 * 20.7% + 5 * 30.2% = 6.125
However, I have difficulty justifying the use of the log conversion in the weighted-average calculation on statistical grounds. Does my method make any sense statistically?
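The log-weighted calculation above can be reproduced with the short Python sketch below. It assumes base-10 logarithms (which is what log(1.000.000) = 6 implies) and uses unrounded values, so the result differs slightly from the hand-rounded figure.

```python
import math

# Risk scores and transaction counts for types 1 to 4, from the post above.
risks = [8, 4, 9, 5]
counts = [1_000_000, 1_000_000_000, 2_000_000, 1_500_000_000]

# Use log10(count) instead of the raw count as the weight.
log_counts = [math.log10(n) for n in counts]
log_total = sum(log_counts)

# Normalise so that the weights sum to 1.
log_weights = [v / log_total for v in log_counts]

# Weighted average of the risk scores with the log-based weights.
log_weighted_risk = sum(r * w for r, w in zip(risks, log_weights))

for i, w in enumerate(log_weights, start=1):
    print(f"type_{i}: weight = {w:.1%}")
print(f"Overall risk (log-weighted): {log_weighted_risk:.3f}")
```

With unrounded logarithms the score is about 6.12, in line with the 6.125 obtained from the rounded percentages.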
 
#3
What do you intend to do with the calculated overall risk?
Are there penalties associated with each type of risk?
Hello @katxt ,
The main point here is that I want to calculate an overall risk score that represents the risk of all transactions fairly. As I mentioned previously, the weighted average method seems fair at first sight. But with that calculation, the impact of the high-risk transaction types 1 and 3 is almost zero because their volume is very small compared to the other transaction types. That's why I used the log function to get more evenly distributed weights. My question is whether this new weight distribution makes any sense statistically.
 

katxt

Active Member
#4
Does this new weight distribution make any sense statistically?
It seems too artificial to me, I'm afraid. A statistical justification needs more information, like a combination of risk, frequency and cost, and a purpose for the weighted risks.
when I use the number of transactions as weight, then the weights of the most risky transaction types (type_1 and type_3) become insignificantly low (~0.04% and ~0.08% respectively).
Probably they are insignificantly low, unless they have very severe consequences.
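One way to see why log weights can look artificial is that they depend on the unit used for counting, which is not true of plain count weights. The sketch below is only an illustration using the numbers from post #1. The helper `log_weighted_risk` is a hypothetical name, and counting in thousands is an assumed alternative unit, not something proposed in the thread.

```python
import math

# Risk scores and transaction counts for types 1 to 4, from post #1.
risks = [8, 4, 9, 5]
counts = [1_000_000, 1_000_000_000, 2_000_000, 1_500_000_000]


def log_weighted_risk(risks, counts, log=math.log10):
    """Weighted average of risks, using log(count) as the weight."""
    logs = [log(n) for n in counts]
    return sum(r * v for r, v in zip(risks, logs)) / sum(logs)


# Same data, counted in single transactions and in thousands of transactions.
score_units = log_weighted_risk(risks, counts)
score_thousands = log_weighted_risk(risks, [n / 1000 for n in counts])

# Changing the base of the logarithm does not change the normalised weights.
score_natural_log = log_weighted_risk(risks, counts, log=math.log)

print(f"log10 weights, counts in units:     {score_units:.3f}")
print(f"log10 weights, counts in thousands: {score_thousands:.3f}")
print(f"ln weights, counts in units:        {score_natural_log:.3f}")
```

The base of the logarithm cancels out in the normalisation, but rescaling the counts shifts the score (from about 6.12 to about 5.88 here), whereas a plain count-weighted average would be unchanged by such a rescaling.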