Logistic Regression as Data Transform?


New Member
Hello everyone, new here, glad to be here!

I have what might be an unorthodox question. I use the MATLAB function 'mnrfit' to compute a logistic regression model from input data. The MATLAB function 'mnrval' then goes the other way: given the model from mnrfit and a set of data, it outputs predicted probabilities.
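
For reference, the fit-then-predict workflow described above might look like the sketch below. The data are synthetic placeholders (X, y, and the coefficients used to simulate them are made up for illustration, not taken from the real dataset), and it assumes the Statistics and Machine Learning Toolbox is installed. mnrfit expects category labels as positive integers, so the 0/1 outcome is shifted to 1/2 here.

```matlab
% Synthetic stand-in for the real data: 46 cases, 4 predictors (hypothetical values).
rng(1);                                   % reproducible random numbers
n = 46;
X = randn(n, 4);                          % predictor matrix, one row per case
eta = 0.8*X(:,1) - 0.6*X(:,2);            % made-up linear predictor, used only to simulate y
y = double(rand(n, 1) < 1 ./ (1 + exp(-eta)));   % binary outcome coded 0/1

% mnrfit wants categories as positive integers, so recode 0/1 as 1/2.
[B, dev, stats] = mnrfit(X, y + 1);

% stats.p holds one p-value for the intercept and one per predictor.
disp(stats.p);

% mnrval returns one column of fitted probabilities per category;
% with the 1/2 coding above, column 2 is the fitted probability that y == 1.
pihat = mnrval(B, X);
pHat1 = pihat(:, 2);
```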

Recently I ran logistic regression on a work colleague's dataset. It is a small sample: only 46 cases with 4 predictor variables. The statistics MATLAB returned for the model fit were not significant, even though a scatterplot shows clear effects of two of the variables.

So here's what I did. I ran mnrval on the same data, which effectively used the best-fitting logistic regression model to *transform* the data from 0's and 1's to probabilities. The resulting probabilities were approximately normally distributed, though with such a small sample the distribution was not perfectly normal.

However, running ANOVA on these probabilities gives statistically significant main effects for the two variables that were non-significant in the mnrfit output.
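
To make the procedure concrete, here is a minimal sketch of the transform-then-ANOVA steps. It assumes the two variables of interest are two-level factors (g1 and g2 are hypothetical names), uses synthetic data in place of the real 46 cases, and uses anovan as one possible ANOVA function; the original analysis may have used a different one.

```matlab
rng(3);                                   % reproducible random numbers
n = 46;
g1 = randi([0 1], n, 1);                  % hypothetical two-level factor
g2 = randi([0 1], n, 1);                  % hypothetical two-level factor
X = [g1, g2, randn(n, 2)];                % 4 predictors in total
eta = 1.0*g1 - 1.0*g2;                    % made-up effects, used only to simulate y
y = double(rand(n, 1) < 1 ./ (1 + exp(-eta)));   % binary outcome coded 0/1

% Step 1: fit the logistic model (categories recoded from 0/1 to 1/2).
B = mnrfit(X, y + 1);

% Step 2: "transform" the 0/1 outcome into fitted probabilities of y == 1.
pihat = mnrval(B, X);
pHat1 = pihat(:, 2);

% Step 3: two-way ANOVA on the fitted probabilities, with the interaction term.
[pAnova, tbl] = anovan(pHat1, {g1, g2}, ...
    'model', 'interaction', ...
    'varnames', {'g1', 'g2'}, ...
    'display', 'off');
disp(tbl);                                % rows: g1, g2, g1*g2, Error, Total
```

Note that pHat1 is computed entirely from X and the fitted coefficients B, so it is a deterministic function of the predictors rather than an independent measurement.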

My question: how awful is it to convert 0's and 1's to a continuous variable (probability) using logistic regression, then use that for parametric statistics?

Thanks in advance for any and all comments. Hope this was not too wordy!



P.S. I should add that MATLAB's ANOVA function has an 'interactions' setting, while its mnrfit function does not. Running ANOVA without interactions also fails to reach significance. So mnrfit and ANOVA return similar statistics when neither includes interactions, but ANOVA is easier: there, interactions are just a setting, while computing an interaction term manually from mnrfit output was going to be a slog.
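
Adding an interaction to the logistic model by hand may be less work than it sounds. One option is to append the elementwise product of the two predictors as an extra column before calling mnrfit. The sketch below uses synthetic placeholder data and assumes the interaction of interest is between the first two predictors; the variable names and simulated coefficients are illustrative only.

```matlab
rng(2);                                   % reproducible random numbers
n = 46;
X = randn(n, 4);                          % hypothetical predictors
eta = 0.8*X(:,1) - 0.6*X(:,2) + 0.5*X(:,1).*X(:,2);   % made-up effects for simulation
y = double(rand(n, 1) < 1 ./ (1 + exp(-eta)));        % binary outcome coded 0/1

% Append the product of predictors 1 and 2 as a fifth column.
Xint = [X, X(:,1) .* X(:,2)];

% Fit as before (categories recoded from 0/1 to 1/2).
[Bint, devInt, statsInt] = mnrfit(Xint, y + 1);

% statsInt.p is ordered: intercept, predictors 1-4, then the interaction column.
pInteraction = statsInt.p(end);
fprintf('p-value for the x1*x2 interaction: %.3f\n', pInteraction);
```

With the 1/2 coding, mnrfit treats the last category (y == 1) as the reference, so coefficient signs are relative to that category; the p-values are unaffected by which category is the reference.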