I have two data sets (training and validation) for building and validating a Cox model.
Using the training data set, I fitted a Cox model with a stepwise selection method.
Only the variables that were significant in the training model were included in the validation model. Is this the right approach?
Secondly, while validating the model I found that these variables are not significant in the validation model, and that the assumptions of the Cox model do not hold (I checked the assumptions on the validation data). Should I ignore the fact that the variables are not significant and go ahead with correcting the assumption violations in the validation data?
Thirdly, in both the training and validation data I have a variable 'treatment' with three groups. In the training data the groups are Standard, New drug and Mixture, while in the validation data the groups are Standard, New drug and X (a treatment that is different from Mixture in the training data). Is it right to include this variable in both models as it is, or should I drop the groups that do not match (Mixture from the training data and X from the validation data)? I am not sure how this affects my analysis.
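To make the mismatch concrete, here is a minimal sketch that only compares the treatment levels in the two data sets. The variable names are hypothetical and the level labels are just the ones described above; it is not part of my actual analysis code.

```python
# Treatment levels as described in the question.
# The variable names are hypothetical, chosen only for illustration.
training_levels = {"Standard", "New drug", "Mixture"}
validation_levels = {"Standard", "New drug", "X"}

# Levels present in both data sets.
shared = training_levels & validation_levels

# Levels that appear in only one of the two data sets.
only_training = training_levels - validation_levels
only_validation = validation_levels - training_levels

print("shared:", sorted(shared))
print("only in training:", sorted(only_training))
print("only in validation:", sorted(only_validation))
```

So two of the three groups line up, and one group in each data set has no counterpart in the other.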
Thanks for your responses.