Multivariate analysis for Medicine Thesis


I am writing my medical thesis on postoperative complications of a disease (IBD). The dependent variable is whether the patient had a postoperative complication within 30 days of the operation (binary). The sample size for my study is n = 62.

I have collected 32 independent variables (age, sex, various lab values, time of operation, etc.); some are continuous and others are categorical.

So far I have done a univariate analysis (Mann-Whitney for the continuous variables and Fisher's exact test for the categorical ones).

I now wish to do some form of multivariate analysis and am considering multiple logistic regression as an option, but I am struggling to find a suitable test and to interpret the output when it is computed in RStudio.

Any help or guidance would be greatly appreciated.

Thank you in advance :)


Less is more. Stay pure. Stay poor.
Why did you not consider survival analyses?

You likely have enough data to power zero predictors beyond an intercept, or at best an underpowered model with a single predictor and an intercept.

One way to think about it: say you had the following variables:
comorbidity y/n
treatment y/n
...29 other variables.

You have more predictors than events. Do you know how sparse these data are? You will likely have no patients who are, say, female, old, comorbid, and treated, so how can you get an estimate for such a patient or draw conclusions about her? Your standard errors will be huge. Also, anything you do find will likely be spurious and will not generalize to another sample. Have you had any coursework in regression modeling? Initial bivariate examination of the covariates can be a spurious pursuit as well: it does not account for mediated or moderated effects between variables, or for collinearity.
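To put rough numbers on this, here is a small sketch (in Python purely for illustration; the arithmetic is the same in R). The number of complications is not stated in this thread, so n_events = 15 is a made-up placeholder, and 10 events per predictor is only a common rule of thumb, not a hard limit. The function names are mine, and the last function assumes the 32 univariate tests are independent, which they will not exactly be.

```python
# Back-of-the-envelope checks for a small binary-outcome dataset.
# ASSUMPTIONS: n_events is hypothetical (not given in the thread);
# events_per_variable=10 is a rule of thumb; tests are treated as independent.

def max_predictors(n_events: int, n_nonevents: int, events_per_variable: int = 10) -> int:
    """Predictors the smaller outcome group can support under the rule of thumb."""
    limiting = min(n_events, n_nonevents)
    return limiting // events_per_variable


def covariate_patterns(n_binary: int) -> int:
    """Number of distinct yes/no combinations of n_binary binary predictors."""
    return 2 ** n_binary


def prob_any_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one 'significant' result when no predictor truly matters."""
    return 1 - (1 - alpha) ** n_tests


n_patients = 62
n_events = 15  # hypothetical placeholder

print("predictors supported:", max_predictors(n_events, n_patients - n_events))

# Sparsity: average number of patients per combination of binary predictors.
for k in (2, 4, 6, 8):
    patterns = covariate_patterns(k)
    print(k, "binary predictors:", patterns, "patterns,",
          round(n_patients / patterns, 2), "patients per pattern on average")

print("P(at least one false positive in 32 tests):",
      round(prob_any_false_positive(32), 2))
```

With only six yes/no predictors there are already more possible patient profiles (64) than patients (62), which is the sparsity problem in a nutshell.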
Thank you for your prompt and detailed response.

Regarding survival analysis, I was under the impression that its use was more for observational studies where the same variables are collected at multiple time points for each subject. Correct me if I'm wrong. In this case, as mentioned, it's a retrospective study: I simply looked at patient documents to see which patients had a particular surgery and how many of them ended up having complications, without repeated collection of any variables over time.

With regard to the variables, I am well aware there are far too many to draw any conclusions from (and, with the small sample size, only two came out significant in the univariate analysis: the age of the patient, and whether the patient had severe anaemia after the index surgery). However, because my university provides few courses on medical statistics, I lack a certain understanding of the whole procedure. For instance, how would one choose which variables to evaluate in a multivariate analysis when the total number of variables is too large, without introducing some form of bias?
By bivariate analysis, would you be referring to, for instance, a two-sided ANOVA?

Sorry for my ignorance. I have been researching and reading about these things for the past few weeks without much success, which is why I am looking for guidance here on a forum.

Thank you so much for the help.
Sorry, just to add: the time from operation until complication was collected in the database. However, I simply wanted to study predictive factors (if any) of having a complication after this specific surgery, regardless of when in the 30-day period after surgery it occurs.


Less is more. Stay pure. Stay poor.
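Bivariate means two variables: one independent and one dependent. Univariate is one variable, like a mean. Multivariate means multiple dependent variables; a model with one outcome and several predictors, like yours, is usually called multivariable.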

For survival analysis you can use fixed variables like the ones you have; you then just need time to event, which can be discrete or continuous (days to event, or time set at, say, 30 days). When you ignore time you lose information.
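As a rough illustration of what keeping the time buys you, here is a minimal Kaplan-Meier sketch in plain Python. The ten patients below are made up, not taken from your data, and I am assuming that patients without a complication are treated as censored at day 30. In R, the survival package (survfit, coxph) does this properly; this is only meant to show the idea.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier estimate at each distinct event time.

    times  -- days to complication, or to end of follow-up if none occurred
    events -- 1 if a complication occurred at that time, 0 if censored
    Returns a list of (day, estimated probability of being complication-free).
    """
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    surviving = 1.0
    curve = []
    for t in sorted(event_counts):
        at_risk = sum(1 for u in times if u >= t)  # still under observation at t
        surviving *= 1 - event_counts[t] / at_risk
        curve.append((t, surviving))
    return curve


# HYPOTHETICAL data: 4 complications, 6 patients censored at day 30.
times = [3, 7, 7, 12, 30, 30, 30, 30, 30, 30]
events = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

for day, prob in kaplan_meier(times, events):
    print("day", day, "complication-free:", round(prob, 2))
```

A yes/no outcome would only tell you that 4 of these 10 patients had a complication; the curve also shows that all of them happened within the first 12 days.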