Ordinal Regression: Parallel Lines Test

#1
Hi All,

I'm analyzing survey results with many responses coming in the form of Likert scales. In this case, I actually have two Likert scale predictors (in the model as nominal variables) predicting my ordinal dependent variable. My two predictors each have five categories (levels on the scale). My dependent variable also has 5 levels.

My full model is failing the parallel lines test (i.e. the proportional odds assumption is rejected), so I've done two things to explore further. First, I've run individual logistic regressions on each of the cutpoints in my dependent variable. These did not yield significant relationships for most IV-DV combinations (cells).

I also ran the parallel lines test for each category of my predictor variables (entered as dummies) on my ordinal dependent variable. In most of these categories, the parallel lines test is passed (i.e. the proportional odds assumption is upheld). However, in a couple of these categories, I have no observations (nobody responded Very Poor or Poor on one of my predictor Likert scales). Thus I cannot get a parallel lines p-value for these categories.

I guess that these categories are what's causing my full model to fail the parallel lines test. My question is whether, since nobody responded Very Poor or Poor, I can report odds ratios from my full model without fear that these predictor categories will throw off the results.

Many thanks to whoever might take the time to answer this question. I would appreciate it tremendously. Have a nice day All!
 

maartenbuis

TS Contributor
#2
I think you are on the right track with your two checks. You would want to look at the estimates of an ordered logit model without the proportional odds constraint, a so-called generalized ordered logit model. I suspect that this is what you tried with the separate logistic regressions. However, in that case you should not use the categories of your dependent variable directly, but create a new dependent variable for each logistic regression indicating whether the value of y is less than or equal to the outcome category of interest. In fact, the Brant test uses exactly that method of approximating the generalized ordered logit model. The test of interest is whether or not the coefficients of the explanatory variables are equal across the logit models, not whether these coefficients are themselves significant. Much of the work behind the Brant test is not so much getting the estimates as getting the variance-covariance matrix right.
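As an illustration of the recoding described above, here is a minimal sketch in plain Python (not Stata or SPSS). The data and the function name `cumulative_indicators` are made up for the example; it only builds the binary outcomes, it does not fit any model or compute the Brant test.

```python
# Hypothetical 5-level ordinal outcome, coded 1 (lowest) to 5 (highest).
y = [1, 3, 5, 2, 4, 3, 5, 5, 2, 4]
levels = [1, 2, 3, 4, 5]


def cumulative_indicators(values, levels):
    """Return {k: [1 if v <= k else 0, ...]} for each cutpoint k.

    The top level is skipped because y <= top level is true for everyone,
    so that indicator would be constant.
    """
    cutpoints = levels[:-1]
    return {k: [1 if v <= k else 0 for v in values] for k in cutpoints}


indicators = cumulative_indicators(y, levels)
for k, column in indicators.items():
    print(f"y <= {k}: {column}")
```

Each of the four columns would then serve as the dependent variable of one binary logistic regression with the same set of predictors, and the question is whether the coefficients agree across those four regressions.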

You are also on the right track in wanting to see the test for each variable separately. However, you should not do this by estimating ordered logit models with only one variable at a time. Instead you should use the Brant test to also report these separate tests based on the full model. In Stata you would use the -brant- command with the -detail- option, which will output both the estimates of the generalized ordered logit model and the tests for each variable separately. The -brant- command is part of the spost package.
 
#3
Hi maartenbuis,

Thanks so much for the response.

The two things you mentioned are actually what I did. My description was probably unclear then, which led to the confusion.

What I've done to arrive at these results is to run separate logistic regressions with new dependent variables, each indicating whether Y is below a given DV category (excluding the bottom one). So my DVs were level 2 or above vs. not; level 3 or above vs. not; etc.

On the second point, I've also done what I believe you said. I estimated ordinal regression models, not logit ones, for my dependent variable (a scale item). My predictors (also scale items) were coded as individual dummies for each level on my predictor scales. So: a 1 for level 2 and a 0 for all other levels; a 1 for level 3 and a 0 for all other levels; etc.

My question is this: if it is the bottom categories of the predictor variable that are causing the parallel lines test to fail, and the reason is that there are no observations in these categories (levels), can I still use the overall odds ratios from my full model?

Thanks again for your help. And have a nice day.
 

maartenbuis

TS Contributor
#4
I don't think that the empty categories in your predictor variable are a problem; it just means that the indicator variables for those categories will automatically drop out of the model and thus cannot cause any problems with the proportional odds assumption.
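To see why such indicators carry no information, here is a small sketch in plain Python with made-up data (the function name `dummy_columns` is hypothetical). An indicator for a level that nobody chose is 0 for every respondent, so it cannot be distinguished from the constant and software will typically drop it.

```python
# Hypothetical 5-level predictor in which nobody chose level 1 or level 2.
x = [3, 4, 5, 3, 5, 4, 4, 5, 3, 5]
levels = [1, 2, 3, 4, 5]


def dummy_columns(values, levels):
    """Return one 0/1 indicator column per level of the predictor."""
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels}


dummies = dummy_columns(x, levels)

# A column with a single distinct value has no variation to estimate from.
constant = [lvl for lvl, col in dummies.items() if len(set(col)) == 1]
print(constant)  # -> [1, 2]
```

Only the indicators for levels 3, 4 and 5 vary across respondents, so only those can contribute coefficients (one of them usually acting as the reference category).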

Let me make sure I got what you did in the second point. Say you have two predictor variables, x1 and x2. Then you estimated one ordered logit model with both x1 and x2 as predictors and used the Brant test to test whether the proportional odds assumption was true for x1 and x2 separately. You did not estimate two ordered logit models, one with x1 and one with x2. If this is a correct summary of what you did, then we are in agreement.
 
#5
On the second point, that's not quite what I did. I have two predictor variables, each with 5 categories. I created a dummy for each category in my predictor variables (10 dummies total). I estimated 10 separate ordinal regression models with a single dummy as the only predictor in each model. My dependent variable, of course, was an ordinal variable (a Likert scale). In each one of these 10 models, I included a parallel lines test (I am using SPSS, which performs this test with a simple check of a box). The results of these parallel lines tests showed that the proportional odds assumption was upheld (no problem) for all of my dummy predictors for which there were observations. But for the dummies which had no observations, obviously the test of parallel lines did not return a p-value.

What I think you described in your last post is what I consider to be my full model. In other words, both predictors, x1 and x2, each with 5 categories, predicting an ordinal dependent variable (also with 5 categories). Just like my predictor variables, the dependent variable has very few observations in the bottom categories. In fact, when I run the regression, it says that 47.5% of cells have frequencies of 0. Yet all my coefficients are significant, the overall model fit (-2 log likelihood) is significant at .000, and the odds ratios (the exponentiated form of my coefficients) all seem reasonable. The model looks like a good one other than these cells with frequencies of 0.
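For readers wondering where a figure like "47.5% of cells have frequencies of 0" comes from, here is a rough sketch in plain Python. It assumes the cells are the dependent variable levels crossed with the observed combinations of predictor values, which is my understanding of the SPSS warning but is an assumption here. The data and the function name `zero_cell_share` are made up.

```python
from collections import Counter

# Hypothetical (x1, x2, y) triples, one per respondent.
rows = [
    (3, 4, 4), (3, 4, 5), (4, 4, 4), (4, 5, 5),
    (5, 5, 5), (5, 5, 5), (3, 3, 3), (4, 4, 3),
]
y_levels = [1, 2, 3, 4, 5]


def zero_cell_share(rows, y_levels):
    """Count empty cells among (observed predictor pattern) x (DV level)."""
    counts = Counter(rows)
    patterns = sorted({(x1, x2) for x1, x2, _ in rows})
    cells = [(x1, x2, y) for x1, x2 in patterns for y in y_levels]
    empty = [cell for cell in cells if counts[cell] == 0]
    return len(empty), len(cells)


empty, total = zero_cell_share(rows, y_levels)
print(f"{empty} of {total} cells are empty ({100 * empty / total:.1f}%)")
# -> 18 of 25 cells are empty (72.0%)
```

With two 5-level predictors and a 5-level outcome, many such cells are empty as soon as the bottom categories are rarely used, so a high percentage is not surprising in this kind of data.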