Cox Proportional Hazard Analysis

#1
I have a dataset of 8000 clients linked to HIV treatment. There are 7 variables of interest, and we are assessing the relationship between loss to follow-up (LTFU) and missing data. We plan to use Cox proportional hazards analysis, with loss to follow-up as the event of interest. Each of the 7 variables is missing for a proportion of the clients in the study. We have also created an 8th variable that categorizes the clients into those with and without missing data. I intend to use SPSS for the analysis.
As a note, all clients missing any of the 7 variables will be excluded before the variables are included in the model.
My question is: can I develop a valid Cox model with LTFU as the dependent variable and the 7 variables as predictors, and also include the 8th variable, missing status, as a predictor?
 

fed2

Active Member
#2
So the '8th variable' indicates that one of the other 7 variables is missing? Or what is the 'missing data' it indicates?
 
#3
Thank you for your response.
A proportion of the 8000 clients have at least one missing value among the 7 variables. The 8th variable, called missing status, categorizes the 8000 clients into those with no missing data and those with at least one missing value (so 2 categories: missing and not missing).
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
If you are doing listwise deletion, won't the 8th variable just be all 0s?

Can a client die? If so, how do you treat that competing event? How much missingness is there? Show it in a table of the missingness combinations. And what is the cause of the missingness?
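
To make the two points above concrete, here is a minimal sketch in plain Python (the thread itself uses SPSS). The records and variable names are hypothetical, with `None` standing in for a missing value. It tabulates how many clients fall into each combination of missing variables, then shows that the missing-status flag is constant once incomplete cases are dropped.

```python
from collections import Counter

# Hypothetical records: None marks a missing value. Variable names are
# illustrative only, not taken from the actual dataset.
clients = [
    {"gender": "F", "age": 34, "weight": 61.0, "viral_load": None},
    {"gender": "M", "age": 41, "weight": None, "viral_load": None},
    {"gender": "F", "age": 29, "weight": 55.5, "viral_load": 400},
    {"gender": "M", "age": 52, "weight": 70.2, "viral_load": None},
]

def missing_pattern(record):
    """Return the sorted tuple of variable names that are missing in a record."""
    return tuple(sorted(name for name, value in record.items() if value is None))

# Table of missingness combinations: pattern -> number of clients.
pattern_counts = Counter(missing_pattern(c) for c in clients)
for pattern, n in pattern_counts.most_common():
    label = " + ".join(pattern) if pattern else "(complete)"
    print(f"{label:25s} {n}")

# Missing-status flag (the "8th variable"): 1 if any variable is missing.
flags = [int(bool(missing_pattern(c))) for c in clients]

# Listwise deletion keeps only complete cases, so the flag has no variation left.
complete_flags = [f for c, f in zip(clients, flags) if not missing_pattern(c)]
print(set(complete_flags))  # {0}
```

A flag with a single value cannot be estimated as a predictor, which is why it only makes sense in a model that keeps the incomplete cases.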
 
#5
Thank you for your response. Yes, you are perfectly correct: with listwise deletion, the 8th variable will be all 0s. I apologize for the lack of clarity; my initial note was incomplete. The initial analysis was a Kaplan-Meier analysis in which all clients missing any of the 7 variables were excluded, and the survival distributions of the categories of each variable were compared using a log-rank test. The 8th variable, missing status, was also analyzed with Kaplan-Meier, comparing clients with and without missing data. Following that, we wanted to run a Cox analysis to determine the probabilities of survival (the longer a client stays in the treatment program, the less likely the client is to be lost to follow-up) with respect to missing data. So, in developing the Cox model, would it be valid to include the predictor variables with "missing" as one of their categories? Or should we include just the 8th variable, missing status, and control for gender and age?
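
The two model specifications being asked about can be sketched as data-preparation steps. This is only an illustration in plain Python with hypothetical records, variable names (`who_stage`, `viral_load_cat`) and category labels; it shows how the covariates would be coded and does not settle whether either coding is statistically appropriate.

```python
# Hypothetical records; None marks a missing value. Names and categories
# are assumptions for illustration only.
clients = [
    {"gender": "F", "age": 34, "who_stage": "I", "viral_load_cat": None},
    {"gender": "M", "age": 41, "who_stage": None, "viral_load_cat": "suppressed"},
    {"gender": "F", "age": 29, "who_stage": "II", "viral_load_cat": "unsuppressed"},
]
ALWAYS_OBSERVED = ("gender", "age")  # no missing values, per the thread

def with_missing_category(record):
    """Option A: keep every client and recode missing values as their own level."""
    return {k: ("missing" if v is None else v) for k, v in record.items()}

def with_missing_flag(record):
    """Option B: keep only gender, age, and a single missing-status flag."""
    row = {k: record[k] for k in ALWAYS_OBSERVED}
    row["missing_status"] = int(any(v is None for v in record.values()))
    return row

option_a = [with_missing_category(c) for c in clients]
option_b = [with_missing_flag(c) for c in clients]
```

Both options keep all clients in the model, unlike listwise deletion. Option A needs every partly missing variable to be categorical, so a continuous variable such as weight would have to be grouped first.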

Yes, a client can die. If the center is officially notified, the client is left-censored; clients who are officially transferred are also left-censored, and those who remained in the treatment program as of the end of the study were right-censored. However, clients who died without notification are regarded as lost to follow-up, as are those who left or stopped treatment for reasons unknown to the center.
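
One way to encode these outcome rules for a survival analysis is sketched below in plain Python. The status labels and dates are hypothetical, and the sketch assumes each client has a program start date and a last-known date. Every censored client gets event = 0 at the date they left observation, and only loss to follow-up gets event = 1.

```python
from datetime import date

# Hypothetical exit statuses; the labels are assumptions, not the study's codes.
CENSORED = {"died_notified", "transferred", "in_care_at_study_end"}
EVENT = {"lost_to_follow_up"}  # includes unnotified deaths and unexplained stops

def survival_record(start, end, status):
    """Return (days in program, event flag) for one client."""
    if status not in CENSORED | EVENT:
        raise ValueError(f"unknown status: {status}")
    days = (end - start).days
    return days, int(status in EVENT)

# A transferred client is censored; a client lost to follow-up has the event.
transferred = survival_record(date(2015, 1, 10), date(2015, 7, 9), "transferred")
lost = survival_record(date(2015, 1, 10), date(2016, 1, 10), "lost_to_follow_up")
print(transferred, lost)
```

Under this coding a notified death is treated like any other censoring, so it is not handled as a separate competing event.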

Missingness is generally below 10% for all variables, except gender and age, which have no missing values, and weight and viral load count, where missingness is 19% and 51%, respectively.
The causes of missingness vary from non-collection to non-documentation, whether through negligence or by intention.