Orthopaedic calculator for lower limb injuries

Dear all. I am a long-time follower, first-time poster. I would be grateful for your advice on our study of compound leg fractures. We have outcome data (the dependent variables below) from 151 patients managed in a specialist trauma centre, and we are trying to relate those outcomes to a score developed at the same hospital. The components of the score are listed as independent variables below. Until now, the total score has been validated only for distinguishing patients who need an amputation from those whose limbs can be salvaged. We would now like to take it one step further and build a model in which the score accurately predicts these dependent variables, so that it can be used by other centres that see these injuries less frequently.
Note that a total score of, e.g., 10 does not necessarily lead to the same outcome for each dependent variable, because the independent variables are not thought to be equally important. For example, for the functional score, the muscle injury score is more likely to be a determining factor than the bone or skin scores. We would therefore like to create a predictive model for each dependent variable, with an individual weight for each independent variable. Ideally, we want a calculator into which the individual bone, skin, muscle and medical conditions scores can be entered and which then shows, for example, the expected number of surgeries or the expected functional score for that patient. We would like one such calculator for each of the dependent variables below.
The scatterplots of the skin, bone and medical conditions scores against the dependent variables are roughly linear. The muscle score seems to follow an inverted parabola. None of the dependent variables is normally distributed, even after log transformation. I am not sure how to determine which distribution they follow using SPSS.
Spearman’s correlation shows a significant association between every pairing of independent and dependent variable, except for the functional score, which is not significantly associated with any of the independent variables. Kruskal-Wallis tests showed significant differences across some, but not all, of the categories of the independent variables for each of the dependent variables (again except the functional score).
To build our calculators, can we use some form of regression analysis, or Cox proportional hazards regression? Is it possible to assign weights to the variables and, if so, how do I do that and how do I use the weighted variables afterwards? I am using SPSS and Excel and have run many models, but I am not sure which is the right one. I have found that, for a given dependent variable, only skin scores of 2 and 4 are significant, but not the other levels. I have also tried to use the gradients of the scatterplots to compare the independent variables manually, and tried to create my own model in Excel, but this yielded only an approximately 30% success rate. Please see the variable outline below. Thank you.

Dependent variables (all skewed to the right even with log transformation):
  • Number of inpatient days
  • Number of surgeries
  • Bony union time in months
  • Total length of treatment in months
  • Functional score
Independent variables (ordinal/categorical):
  • Bone injury score (1 to 5)
  • Skin injury score (1 to 5)
  • Muscle injury score (1 to 5)
  • Other medical conditions score (0 to 8), 2 points for each condition
Confounding factors:
  • Age
  • Sex


TS Contributor
As a general remark, you only have 151 observations, and you will base your scoring on impressions from graphical displays and descriptive statistics, many statistical significance tests, and a lot of trials (different weighting methods, transformations of the dependent variables, etc.) to produce the model that fits the data best. All this will probably result in overfitting; the resulting "best" scoring method may not generalize well beyond the n=151 patients in your sample.
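
To make the overfitting concern concrete, here is a minimal Python sketch (standard library only) that compares the in-sample error of a straight-line fit with its leave-one-out error. Everything in it is an assumption for illustration: the function names are hypothetical, the numbers are made up and are not the study data, and a one-predictor line stands in for whatever model is finally chosen.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        # All x values identical: fall back to predicting the mean.
        return 0.0, mean_y
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def in_sample_mae(xs, ys):
    """Mean absolute error when the line is fitted and scored on the same data."""
    slope, intercept = fit_line(xs, ys)
    return sum(abs(y - (intercept + slope * x)) for x, y in zip(xs, ys)) / len(xs)


def leave_one_out_mae(xs, ys):
    """Mean absolute error when each patient is predicted by a line
    fitted to all the other patients."""
    errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        errors.append(abs(ys[i] - (intercept + slope * xs[i])))
    return sum(errors) / len(errors)


# Made-up illustration data (NOT the study data):
# a total injury score and the number of inpatient days.
toy_scores = [5, 7, 8, 10, 12, 14, 15]
toy_days = [4, 9, 7, 15, 14, 30, 22]

print("in-sample MAE:    ", in_sample_mae(toy_scores, toy_days))
print("leave-one-out MAE:", leave_one_out_mae(toy_scores, toy_days))
```

The leave-one-out figure is the more honest estimate of how the score would perform on new patients. For it to stay honest, every choice made while looking at the data (transformations, weights, which categories to merge) would have to be repeated inside the loop, which is why many trials on 151 patients are risky.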

There's also the question of why tests of significance should be important here.

Any weighting scheme derived from the n=151 sample data will suffer from considerable standard errors in the individual parameter estimates. Therefore, I'd personally follow Jacob Cohen's advice (given in "Things I have learned (so far)") to keep things simple and use unit weights. This would mean starting with a simple summation of the 4 independent variables (maybe "other conditions" should be adjusted somehow to a 1-5 range) and checking the validity / predictive value of that sum score. That holds as long as there are no serious pre-existing reasons for assuming largely different weights between the independent variables, and provided the independent variables aren't redundant, of course.
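
A minimal sketch of what that could look like, in Python with the standard library only. The rescaling of "other conditions" (0, 2, 4, 6, 8 mapped to 1 to 5) is my own assumption about how the adjustment might be done, the function names are hypothetical, and the patient rows are made up for illustration rather than taken from the study.

```python
def rescale_conditions(score):
    """Map the 0-8 'other medical conditions' score (2 points per condition)
    onto 1-5. This particular mapping is an assumption, not part of the score."""
    if score not in (0, 2, 4, 6, 8):
        raise ValueError("conditions score must be 0, 2, 4, 6 or 8")
    return 1 + score // 2


def unit_weight_score(bone, skin, muscle, conditions):
    """Unit-weighted sum of the four components, each on a 1-5 scale."""
    for name, value in (("bone", bone), ("skin", skin), ("muscle", muscle)):
        if value not in (1, 2, 3, 4, 5):
            raise ValueError(f"{name} score must be an integer from 1 to 5")
    return bone + skin + muscle + rescale_conditions(conditions)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up rows (NOT the study data): bone, skin, muscle, conditions, surgeries.
toy_patients = [
    (1, 2, 1, 0, 1),
    (2, 2, 3, 2, 2),
    (3, 4, 2, 0, 4),
    (4, 3, 4, 4, 3),
    (5, 5, 4, 6, 7),
]
sum_scores = [unit_weight_score(b, s, m, c) for b, s, m, c, _ in toy_patients]
surgeries = [row[4] for row in toy_patients]
print("sum scores:", sum_scores)
print("Spearman rho vs. number of surgeries:", spearman(sum_scores, surgeries))
```

A rank correlation is used here only because the outcomes are described as skewed; any other validity check on the sum score would slot in the same way.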

Just my 2pence