Categorical regression question

#1
Hi, everyone

Hope the experts here can help me out with a question.

I have a database with several respondents providing data on multiple-choice questions. One of those questions is whether the respondents volunteer or not.

I got a request to perform a regression analysis to see whether the answers to some of the other questions could be used to predict whether a respondent is more or less likely to volunteer.

So it looks like both the independent and dependent variables are categorical in their nature, which brings me to my question.

Since both variables are categorical, would regression analysis really be the best tool to use here? I suggested using a t-test to compare the frequencies for each question between volunteers and non-volunteers. Would that be a better option to see if any questions can "predict" volunteering?

Please let me know if any clarification is required.
Any advice is much appreciated!
 
#2
The t-test (or ordinary linear regression) is not appropriate here. Use logistic regression. It will answer your question directly, and you can report the result as an odds ratio or as a predicted probability.
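To illustrate how an odds ratio and a probability relate, here is a minimal sketch. The 20% baseline rate and the odds ratio of 1.5 are made-up numbers for illustration only, not results from any data in this thread.

```python
import math

def probability_from_log_odds(log_odds):
    """Inverse logit: turn a log-odds value into a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))

def odds_from_probability(p):
    """Odds are p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical numbers: 20% of the baseline group volunteers,
# and the model reports an odds ratio of 1.5 for another group.
baseline_p = 0.20
odds_ratio = 1.5

new_odds = odds_from_probability(baseline_p) * odds_ratio
new_p = new_odds / (1.0 + new_odds)   # back to a probability

print(f"baseline probability: {baseline_p:.3f}")
print(f"comparison probability: {new_p:.3f}")
print(f"ratio of probabilities: {new_p / baseline_p:.3f}")
```

Note that an odds ratio of 1.5 means the odds are 50% higher, which is not quite the same as the probability being 50% higher.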
 
#3
Thank you very much for your response.

From what I read regarding logistic regression, it works when the independent variable is continuous and the dependent variable is categorical, while in my example both variables are categorical. Am I missing something?

I am looking for a type of regression that would allow me to make a statement like "those who answered 'E' on this question are 50% more likely to volunteer". Although I'm not sure 50% more likely than what, which brings me back to the pairwise comparisons.

Any guidance you (or anyone else) could provide is much appreciated.
 

Link

#4
Quote: "From what I read regarding logistic regression, it works when independent variable is continuous and the dependent variable is categorical. While in my example both variables are categorical. Am I missing something?"

Your understanding is incorrect. Logistic regression does work with categorical independent variables. What you'll need to do is create indicator variables and add them to the model as predictors.
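As a rough sketch of what creating indicator variables means, the snippet below turns one multiple-choice answer column into 0/1 columns. The function name `make_indicators`, the answer letters, and the choice of "A" as the baseline are all hypothetical and only there for illustration.

```python
def make_indicators(responses, baseline):
    """Return one 0/1 list per answer level, skipping the baseline level.

    The baseline gets no column: a respondent with all zeros
    is understood to have given the baseline answer.
    """
    levels = sorted(set(responses) - {baseline})
    return {
        level: [1 if r == level else 0 for r in responses]
        for level in levels
    }

# Hypothetical answers to one multiple-choice question
answers = ["A", "E", "B", "E", "A"]
indicators = make_indicators(answers, baseline="A")

for level, column in indicators.items():
    print(level, column)
```

Each of these 0/1 columns would then go into the logistic model as its own predictor.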
 
#5
Thank you, Link.

Could you kindly confirm for me that in my case logistic regression is the way to go? (as opposed to linear regression)

By indicator variables, do you mean "yes/no" variables denoting each of the potential responses to the multiple-choice categorical questions?

(My apologies for the multitude of questions, but Google yields conflicting information, and an explanation from a live person is much easier to understand.)
 

Link

#6
I think logistic would be fine in your scenario.

As for your question about yes/no, let's say that you have a variable for a question that asked "how far did you get in school?" The possible values are: high school dropout or less, high school grad, some college, college grad, and master's degree or above.

What you could do is create indicator variables to compare the different levels of school to a baseline (I'd probably use the high school dropouts as the base). If there's limited data, I'd collapse the groups even more. So the indicators you'd create are yes/no for high school grad, yes/no for some college, etc. The baseline group gets no indicator of its own. Then you'd use those indicators in your logistic model.

That should tell you how the odds of volunteering for, say, a college graduate compare to the odds for a high school dropout (an odds ratio).
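To make the baseline comparison concrete, here is a small sketch using made-up counts (not real data). When the model contains only this one categorical question, the coefficient that logistic regression fits for each indicator is the log of the odds ratio computed directly from the counts, so the arithmetic can be checked by hand. The group names and the helper `odds_ratios` are assumptions for illustration.

```python
import math

# Hypothetical counts per education level: (volunteers, non-volunteers)
counts = {
    "dropout": (10, 40),
    "hs_grad": (20, 40),
    "college_grad": (30, 30),
}

def odds(volunteers, non_volunteers):
    """Odds of volunteering within one group."""
    return volunteers / non_volunteers

def odds_ratios(counts, baseline):
    """Odds of each group divided by the odds of the baseline group."""
    base = odds(*counts[baseline])
    return {
        level: odds(*c) / base
        for level, c in counts.items()
        if level != baseline
    }

ratios = odds_ratios(counts, baseline="dropout")
for level, ratio in ratios.items():
    # log(ratio) is what the indicator's coefficient would be
    # in a logistic model with this single predictor.
    print(f"{level}: odds ratio {ratio:.2f}, log-odds {math.log(ratio):.3f}")
```

With these invented counts, college grads have four times the odds of volunteering that dropouts have. With several questions in the model at once, you would fit the model in a statistics package rather than compute the ratios by hand.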

Hope that helps