Interpreting logistic regressions: odds ratios vs. probabilities

#1
Hi everyone!


I am doing some analysis looking at various negative outcomes for children (child labour, begging, etc.) and comparing them across a range of household characteristics (child-headed, large, elderly-headed households, etc.).

The results show some interesting findings. For instance, regressing the binary dependent variable 'begging' (or not) on child-headed status gives me an odds ratio of 2.5 for child-headed households. For the same regression, the margins command shows probabilities of 0.57 for child-headed households and 0.35 for adult-headed households.

I am at a loss as to the best way to interpret these findings in a manner which makes sense for the data...

Is it more intuitive to say child-headed households are 2.5 times more likely to beg than adult-headed households (or is this not a correct interpretation), or is the better interpretation that child-headed households are 1.6 times more likely to beg than adult-headed households (0.57 v 0.35)?


Can anyone assist?
 

obh

Active Member
#2
Hi Robbie,

I don't understand exactly what your results are.
Which formula did you get?

Generally, logistic regression predicts the probabilities of events based on the IVs.

You may also look at the coefficients to understand how a change in an IV will influence the probabilities.

Example:
t1 = 1.9333 - 0.8755 X1
Exp(-0.8755)=0.4167

Increasing X1 by 1 will decrease the odds of 1 in comparison to 0 by 58% (the odds are multiplied by 0.4167)
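
The arithmetic in this example can be checked with a short sketch. It uses only the coefficient -0.8755 quoted above; the variable names are illustrative.

```python
import math

# Coefficient on X1 from the example above (log-odds scale).
b1 = -0.8755

# Exponentiating a logit coefficient gives the factor by which the odds
# are multiplied for a one-unit increase in X1 (the odds ratio).
odds_ratio = math.exp(b1)

# The same factor expressed as a percentage change in the odds.
pct_change_in_odds = (odds_ratio - 1) * 100

print(f"odds ratio: {odds_ratio:.4f}")
print(f"change in odds: {pct_change_in_odds:.1f}%")
```

Note that the 58% refers to the odds, not to the probability itself.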
 
#3
Thanks for your help...

The logistic output and margins output are below... my issue is understanding which is best to use given my dataset and the work I am doing... is it better to say...

child-headed households are 2.5 times more likely to beg than adult-headed households (is this a correct interpretation?), or is the better interpretation that the probability of a child-headed household begging is 1.6 times greater than that of an adult-headed household (0.57 v 0.35)?

I have always struggled with logistic regressions and their interpretation. Does one of the above, given the nature of the data and what I am trying to say, make more sense? Are my interpretations in italics above correct?


. logistic begging i.child_head

Logistic regression                             Number of obs     =      1,288
                                                LR chi2(1)        =       3.95
                                                Prob > chi2       =     0.0470
Log likelihood = -836.89941                     Pseudo R2         =     0.0024

------------------------------------------------------------------------------
     begging | Odds Ratio   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1.child_head |    2.51981   1.180173     1.97   0.048     1.006238    6.310081
       _cons |    .545676    .032052   -10.31   0.000     .4863365    .6122557
------------------------------------------------------------------------------
Note: _cons estimates baseline odds.


. margin child_head

Adjusted predictions                            Number of obs     =      1,288
Model VCE    : OIM

Expression   : Pr(begging), predict()

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
  child_head |
          0  |   .3530339   .0134158    26.31   0.000     .3267393    .3793285
          1  |   .5789474    .113269     5.11   0.000     .3569443    .8009505
------------------------------------------------------------------------------
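
As a cross-check on the two tables, the sketch below recomputes the odds ratio from the two predicted probabilities and sets it beside their ratio and their difference. It assumes only the margins shown above; the helper `odds` and the variable names are illustrative and are not part of the Stata output.

```python
# Predicted probabilities copied from the margins output above.
p_adult = 0.3530339   # child_head = 0
p_child = 0.5789474   # child_head = 1


def odds(p):
    """Convert a probability p into odds, p / (1 - p)."""
    return p / (1 - p)


odds_ratio = odds(p_child) / odds(p_adult)   # ratio of the two odds
risk_ratio = p_child / p_adult               # ratio of the two probabilities
risk_difference = p_child - p_adult          # gap between the two probabilities

print(f"odds, adult-headed: {odds(p_adult):.4f}")
print(f"odds, child-headed: {odds(p_child):.4f}")
print(f"odds ratio:         {odds_ratio:.4f}")
print(f"risk ratio:         {risk_ratio:.4f}")
print(f"risk difference:    {risk_difference:.4f}")
```

The odds for adult-headed households match the _cons row (.545676) and the odds ratio matches the 1.child_head row (2.51981). Both figures describe the same fitted model; they differ because one is a ratio of odds and the other a ratio of probabilities.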
 

obh

Active Member
#4
Hi Robbie,

Yes, logistic regression is simple but confusing. I don't like odds; I prefer probabilities :)
What software do you use?

I assume your Y is begging, coded 1 or 0.

Usually, you get the coefficient b and Exp(b).
I assume that your software skips the coefficient and directly gives you exp(b) as the odds ratio.

So if I understand your results correctly: the odds for a child-headed household equal the odds for an adult-headed household * 2.5
If, for example (not your data), the odds of an adult begging are 2 : 9,000,
then the odds of a child begging will be 5 : 9,000
2*2.5=5
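
The 2 : 9,000 illustration can be worked through in a few lines. This is a minimal sketch using the made-up odds from the example (not the begging data); the function name `odds_to_prob` is illustrative. It converts both odds to probabilities to show how the ratio of probabilities compares with the odds ratio of 2.5 when the event is this rare.

```python
from fractions import Fraction


def odds_to_prob(o):
    """Convert odds o into a probability, o / (1 + o)."""
    return o / (1 + o)


adult_odds = Fraction(2, 9000)          # illustrative odds, 2 : 9,000
odds_ratio = Fraction(5, 2)             # odds ratio of 2.5
child_odds = adult_odds * odds_ratio    # 5 : 9,000

p_adult = odds_to_prob(adult_odds)
p_child = odds_to_prob(child_odds)
prob_ratio = p_child / p_adult          # ratio of the two probabilities

print(f"child odds:             {child_odds}")
print(f"adult probability:      {float(p_adult):.6f}")
print(f"child probability:      {float(p_child):.6f}")
print(f"ratio of probabilities: {float(prob_ratio):.4f}")
```

With odds this small, odds and probabilities are nearly identical, so the ratio of probabilities comes out very close to 2.5. When the probabilities are not small, the two ratios can differ considerably.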

In your case, the IV is categorical (not continuous), so "increasing by one" means changing from an adult-headed household (0) to a child-headed household (1).

You may use the following online calculator to get a minimal interpretation.
The result may not be exactly the same, as in logistic regression the log-likelihood is maximized iteratively, so different software may differ slightly in the last digits.
The first Y in the entered data will be the baseline (usually use 0, not 1)
http://www.statskingdom.com/430logistic_regression.html

P.S. The pseudo R2 is small and the confidence interval is wide...
There are several methods to calculate the pseudo R-squared; I am not sure which method your software used.
 