Estimating Covariate Effects on Likert-Scale Response Data

#1
I am working on a problem where we elicit medical residents' perceptions of the importance of specific core medical competencies (professionalism, collaboration, medical expertise, etc.) on a 5-point Likert scale. For example: "In your current residency role, how important is professionalism in your day-to-day activities?" We elicited responses from two schools (School1, School2) and from residents in 5 types of post-graduate residency programs (Family Medicine, Psych, Surgery, Internal, Other). We have data on approximately 800 respondents. Our hypothesis is that the culture surrounding the specific schools in the sample, and the specific residency programs in question, may influence the self-reported importance of these roles.

Obviously, the response data live on 5 points. In our case, the responses are heavily concentrated at the upper end of the scale (strictly speaking a left skew, with the tail toward the low values), with many residents rating these core roles as important.

I have graphically investigated the relationship between Perception (5-point Likert scale) and my two independent variables (School, Residency Program), using mean plots and mosaic plots. I have a sense of what is going on qualitatively, but I want estimated covariate effects, p-values, etc. for the impact of School, Residency Program, and their interaction on self-reported importance.

Ideally, I could run this as a 2-way ANOVA (or linear regression model). However, diagnostic investigation of residuals suggests an issue of non-normality.

I considered the proportional odds logistic regression model next; however, in many instances the assumption of "proportional odds" was not satisfied.
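One informal way to see where the proportional odds assumption breaks down is to compare empirical cumulative logits between two groups at each cutpoint. Under proportional odds, the difference in logit P(Y <= k) between groups should be roughly the same for every k. The sketch below is only an illustration under assumptions: the function names and the toy responses are hypothetical (not the survey data), and the 0.5 added to each count is a common continuity adjustment chosen here so that empty cells do not produce infinite logits. It is a descriptive check, not a formal test.

```python
import math

def cumulative_logits(responses, levels=(1, 2, 3, 4, 5)):
    """Empirical logit of P(Y <= k) for every cutpoint k below the top level."""
    n = len(responses)
    logits = {}
    for k in levels[:-1]:
        at_or_below = sum(1 for y in responses if y <= k)
        # Adding 0.5 to both counts keeps the logit finite when a cell is empty.
        logits[k] = math.log((at_or_below + 0.5) / (n - at_or_below + 0.5))
    return logits

def logit_differences(group_a, group_b, levels=(1, 2, 3, 4, 5)):
    """Difference in cumulative logits (group_a minus group_b) at each cutpoint."""
    logits_a = cumulative_logits(group_a, levels)
    logits_b = cumulative_logits(group_b, levels)
    return {k: logits_a[k] - logits_b[k] for k in levels[:-1]}

# Hypothetical toy responses, heavily concentrated at 4 and 5.
school1 = [1] * 1 + [2] * 2 + [3] * 7 + [4] * 20 + [5] * 30
school2 = [1] * 2 + [2] * 3 + [3] * 15 + [4] * 25 + [5] * 15

diffs = logit_differences(school1, school2)
for k, d in diffs.items():
    print(f"cutpoint Y<={k}: logit difference = {d:+.2f}")
```

If the printed differences are similar across cutpoints, a single odds ratio per covariate is plausible; if they drift or change sign, that suggests which cutpoints are responsible for the violation. With few responses at levels 1 and 2, the differences at the lowest cutpoints will be noisy.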

From there I moved on to the multinomial logistic regression model; however, given the distribution of the response data (concentrated at the high end of the scale), very few observations fall in response levels 1 and 2, so covariate effects relative to these levels are estimated with little precision.
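To see how thin the data are at the low response levels, a simple cross-tabulation that flags small cells can help. This is a minimal sketch: the function name, the toy rows, and the cutoff of 5 observations per cell are all assumptions made for illustration, not a rule taken from this thread.

```python
from collections import Counter

def sparse_cells(rows, levels=(1, 2, 3, 4, 5), min_count=5):
    """rows: (group, response) tuples.

    Returns {(group, level): count} for every cell holding fewer than
    min_count observations, including cells that are completely empty.
    """
    counts = Counter(rows)
    groups = sorted({group for group, _ in rows})
    return {
        (group, level): counts[(group, level)]
        for group in groups
        for level in levels
        if counts[(group, level)] < min_count
    }

# Hypothetical toy data for a single residency program.
rows = [("Surgery", 5)] * 10 + [("Surgery", 4)] * 6 + [("Surgery", 1)] * 1

flagged = sparse_cells(rows)
for (group, level), count in flagged.items():
    print(f"{group}, response {level}: {count} observation(s)")
```

Cells flagged this way are the ones where a multinomial model has little to work with. Running the same tabulation on School by Program combinations would show whether the interaction terms are affected as well.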

Next, I considered non-parametric tests, specifically extensions of the Kruskal-Wallis test from one-way layouts to two-way (and higher-order) layouts. I found the following: http://www.tandfonline.com/doi/abs/10.1207/s15327906mbr1503_4#.Ud1uCdpzY3E However, I could not find a statistical package that implements such a method. Does anyone know of one? I wouldn't expect these tests to quantify the size of the covariate effect; however, they might at least yield a valid overall p-value (something like the global F-test in regression/ANOVA), or, perhaps even more valuable, LRT-style p-values for each covariate effect (i.e. School, Program, School*Program).
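In the absence of a packaged implementation, one rank-based alternative is a stratified permutation test. Note that this is a swapped-in technique, not the method from the linked paper: it tests the School effect only, by shuffling school labels within each residency program and comparing a rank-sum statistic against its permutation distribution. It does not test the School*Program interaction. The function names, the toy data, and the choice of statistic are assumptions made for illustration.

```python
import random
from collections import defaultdict

def centered_midranks(values):
    """Midranks (tied values share their average rank), shifted to mean zero."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    mean_rank = (len(values) + 1) / 2
    return [r - mean_rank for r in ranks]

def stratified_permutation_test(rows, n_perm=2000, seed=0):
    """rows: (school, program, response) tuples.

    Returns (observed statistic, two-sided permutation p-value) for the
    School effect, permuting school labels within each program.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for school, program, response in rows:
        strata[program].append((school, response))

    prepared = []  # one (labels, centered ranks) pair per program
    for members in strata.values():
        labels = [1 if school == "School1" else 0 for school, _ in members]
        scores = centered_midranks([response for _, response in members])
        prepared.append((labels, scores))

    def statistic(label_sets):
        # Sum of centered ranks belonging to School1, added over programs.
        total = 0.0
        for labels, (_, scores) in zip(label_sets, prepared):
            total += sum(s for lab, s in zip(labels, scores) if lab)
        return total

    observed = statistic([labels for labels, _ in prepared])
    extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for labels, _ in prepared:
            permuted = labels[:]
            rng.shuffle(permuted)
            shuffled.append(permuted)
        if abs(statistic(shuffled)) >= abs(observed) - 1e-12:
            extreme += 1
    # Counting the observed arrangement itself keeps the p-value above zero.
    return observed, (extreme + 1) / (n_perm + 1)

# Hypothetical toy data: School1 rates higher in both programs.
toy = []
for program in ("Surgery", "Psych"):
    toy += [("School1", program, 5)] * 10 + [("School2", program, 3)] * 10

observed, p_value = stratified_permutation_test(toy)
print(f"observed statistic = {observed:.1f}, permutation p-value = {p_value:.4f}")
```

Swapping the roles of School and Program in the same function would give a test of the Program effect within schools, using a different statistic if more than two programs are compared. Because the statistic uses ranks only, it respects the ordinal nature of the Likert responses without assuming equal spacing between levels.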

I imagine this is a fairly common problem in applied research settings. What have other people done in the past?

Thanks, Chris
 

noetsi

Fortran must die
#2
Logistic regression does not assume normality of any kind as far as I know (including in the dependent variable). I have never heard that it would influence either the parameter estimates or standard errors.

I believe vacant cells will weaken your power, and of course too little variation can lead to partial or complete data separation, although it would have to be extreme for this to be an issue.

If this is a fairly common problem in the applied research setting I have never seen it mentioned (and I read a lot of logistic regression analysis these days). I have never seen anyone even raise skew as an issue in logistic regression (as compared to linear regression).

Likert scale data is normally considered ordinal. Why did you choose multinomial logistic regression rather than ordered logistic regression?