Using/converting KM survival data

#1
Hi,

I’m having difficulty with the format of some survivorship data for a patient safety study using registry data…

In one dataset I have typical individual patient data with survival time, which is all good for generating K-M curves and comparing each intervention.
However, in the other dataset (same interventions, different data source), I only have a detailed summary survival table, which includes number at risk, time to failure, cumulative failures, the KM function estimate and 95% CIs for each intervention.

Whilst the individual patient dataset is straightforward to work with, what I’m struggling with is how best to make use of the second dataset. Can anybody suggest how I might be able to go about reconstructing pseudo-individual patient data from the latter, with a view to putting the data in the same format to then allow KM curve generation and combined dataset analyses? I suspect (hope!) there may well be a straightforward solution, but coming from a medical background (non-statistician!) this is not readily apparent!

Any thoughts much appreciated,

Dan
 

Miner

TS Contributor
#3
Focus on the "Years Since Intervention" and "Cumulative Failures" columns. Difference the "Cumulative Failures" column (each row minus the row before it) and match the differences with the corresponding "Years Since Intervention" rows. This will give you each time matched with the number of patients who failed at that time, that is, who survived exactly that long.
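As a rough illustration of the differencing step, here is a minimal Python sketch. The numbers are invented for the example and are not from the registry table; `failures_per_time` is a hypothetical helper name. It assumes the cumulative failures are listed in increasing time order for a single intervention.

```python
# Hypothetical summary rows for one intervention (values invented for illustration).
years_since_intervention = [1.0, 2.0, 3.0, 5.0]
cumulative_failures = [2, 5, 5, 9]


def failures_per_time(cumulative):
    """Difference a cumulative-failure column to get the failures at each time."""
    previous = 0
    differences = []
    for value in cumulative:
        differences.append(value - previous)
        previous = value
    return differences


failures = failures_per_time(cumulative_failures)

# Expand to one pseudo-patient row per failure: (time, event) with event = 1.
failure_rows = [
    (t, 1)
    for t, d in zip(years_since_intervention, failures)
    for _ in range(d)
]
```

This only recovers the patients who failed; patients whose follow-up ended without failure (censored) need the "number at risk" column as well.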
 
#4
Thanks Miner. I may not be following exactly, but doesn't differencing successive rows of the 'cumulative failures' column simply give the 'number of failures' at each respective time point? And does the time, matched with the number of patients who survived that long, equate to the 'number at risk'?
To try to get back to individual patient data, if I take the difference between the number at risk and the number of failures, I'm thinking this will get me close enough to individual patient data for those who did not experience failure but whose follow-up had otherwise ceased by the time stated?
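One way to make this concrete is the sketch below, in Python with invented numbers. It rests on assumptions that are not confirmed anywhere in the thread: that "number at risk" is the count just before the failures at that row's time, that anyone who leaves the risk set without failing is censored at that row's time (the true censoring times are unknown), and that everyone still at risk after the last row is censored there. Under those assumptions the censored count for a row is the number at risk, minus the failures, minus the next row's number at risk. `reconstruct_ipd` and `km_estimate` are hypothetical helper names.

```python
def reconstruct_ipd(times, at_risk, cum_failures):
    """Build pseudo-individual rows (time, event) from a summary table.

    event = 1 is a failure, event = 0 is a censored patient. Censoring is
    placed at the row's own time, which is an assumption.
    """
    rows = []
    previous_cum = 0
    for i, (t, n, cum) in enumerate(zip(times, at_risk, cum_failures)):
        d = cum - previous_cum  # failures at this time
        previous_cum = cum
        n_next = at_risk[i + 1] if i + 1 < len(at_risk) else 0
        c = n - d - n_next  # left the risk set without failing
        if d < 0 or c < 0:
            raise ValueError(f"inconsistent table at time {t}")
        rows.extend([(t, 1)] * d)
        rows.extend([(t, 0)] * c)
    return rows


def km_estimate(rows):
    """Kaplan-Meier estimate at each failure time, for checking the rebuild."""
    survival = 1.0
    estimate = {}
    for t in sorted({u for u, e in rows if e == 1}):
        n = sum(1 for u, _ in rows if u >= t)  # at risk just before t
        d = sum(1 for u, e in rows if u == t and e == 1)
        survival *= 1 - d / n
        estimate[t] = survival
    return estimate


# Hypothetical table for one intervention (values invented for illustration).
times = [1.0, 2.0, 3.0, 5.0]
number_at_risk = [20, 17, 12, 10]
cumulative_failures = [2, 5, 5, 9]

ipd = reconstruct_ipd(times, number_at_risk, cumulative_failures)
km = km_estimate(ipd)
```

A useful sanity check is to compare `km` against the KM function estimate column in the summary table: if the two agree at every failure time, the reconstructed rows are consistent with the published curve, even though the exact censoring times remain approximations.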