# Beta Regression Coefficients

#### Martin Marko

##### Member
In a simple model, x is a continuous (normally distributed) variable predicting y. Since the y values are proportions ranging from 0 to 1 (0%-100%), simple linear regression may give out-of-bounds estimates for some predicted values (i.e., lower than 0 or higher than 1).

Therefore, I decided to use beta regression, which keeps predictions between 0 and 1 (I used the betareg() function from the betareg R package; the software is not important, however). While the unstandardized coefficient from a linear model is easy to interpret (see the linear model output below: B = 0.126, i.e., y increases by 0.126, or 12.6 percentage points, when x rises by 1), I am not sure how to understand, transform, or use the parameters from the betareg model to get a meaningful interpretation of the coefficient (see the beta regression output below).

Output for linear regression model: lmMod = lm(formula = y ~ x)
Code:
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.57936    0.10849  -5.340 9.57e-07 ***
x            0.12591    0.01354   9.296 4.07e-14 ***
Output for beta regression model: betaMod = betareg(formula = y ~ x)
Code:
Coefficients (mean model with logit link):
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -4.85712    0.52580  -9.238   <2e-16 ***
x            0.56796    0.06498   8.740   <2e-16 ***

Phi coefficients (precision model with identity link):
      Estimate Std. Error z value Pr(>|z|)
(phi)    7.686      1.184   6.491 8.54e-11 ***
How can I interpret the parameter 0.568 in the beta regression output (together with the intercept)? Is there a way to use 0.568 to get the absolute increase in y (i.e., if x increases by 1, y increases by XX; since y is in %, that interpretation would be easy)?
Thank you! M.


#### hlsmith

##### Less is more. Stay pure. Stay poor.
I had a similar pursuit about 6 months ago. I believe I came across a SAS tech paper from a SAS user group that gave a good description. Sorry, I am not at my computer right now, though I will see if I can find it.

#### hlsmith

##### Less is more. Stay pure. Stay poor.
It might have been Paper: 335:2011. Looks like they take on a logistic style interpretation.
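To make the logistic-style reading concrete, here is a minimal sketch in base R using the estimates from the beta regression output posted above. It assumes the logit link shown in that output, under which exp(beta) is a multiplicative change in the odds mu/(1 - mu) of the fitted mean proportion per unit of x, not an additive change in y. The values x = 8 and x = 9 are chosen only for illustration.

```r
# Estimates copied from the beta regression output above (logit link)
alpha <- -4.85712
beta  <-  0.56796

# Odds ratio for a one-unit increase in x (about 1.76)
odds_ratio <- exp(beta)
odds_ratio

# Check with the fitted means at two illustrative x values
mu   <- plogis(alpha + beta * c(8, 9))  # plogis() is the inverse logit
odds <- mu / (1 - mu)
odds[2] / odds[1]                       # same value as exp(beta)
```

The ratio is the same whichever pair of adjacent x values is used, which is why the coefficient is constant on the odds scale but not on the proportion scale.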

#### Martin Marko

##### Member
Thank you a lot for helping,
logistic interpretation means B1 is on the log-odds scale, right? So I can use exp(B1) to get the "odds" (≈ 1.765), which I don't understand at all.

Perhaps another way to go: I am considering using the above-mentioned simple linear regression and then defining the "meaningful" range of its application (i.e., use the linear regression equation to compute the value of x that would predict y = 0, and likewise the upper-bound value of x that would predict y = 1). Does this make any sense? I am not sure, but I really need to know how an increase in X changes the value of Y (in %).
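The range idea can be sketched in base R with the linear estimates posted above. This only locates where the fitted line crosses y = 0 and y = 1; whether the line is a good description of the data between those points is a separate question that the sketch does not address. The names x_lower and x_upper are introduced here for illustration.

```r
# Estimates copied from the linear regression output above
a <- -0.57936   # intercept
b <-  0.12591   # slope

# Solve a + b * x = 0 and a + b * x = 1 for x
x_lower <- (0 - a) / b   # x at which the fitted line reaches y = 0
x_upper <- (1 - a) / b   # x at which the fitted line reaches y = 1

c(x_lower = x_lower, x_upper = x_upper)
```

Outside this interval the linear model predicts proportions below 0 or above 1.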

BTW, the relationship Y ~ X looks approximately linear.
Thank you,

#### GretaGarbo

##### Human
The logit model:

log(p/(1-p)) = beta*x

can be solved to:

p = exp(beta*x)/(1+exp(beta*x))

or

p = 1/(1 + exp(-(beta*x)))

It gives these numbers:
Code:
# the linear regression model parameter estimates
a <-   -0.57936
b <-    0.12591

a + b*8
#  0.42792
#seems reasonable

a + b*9
#  0.55383

# the beta-regression model with logit link:
alpha <-  -4.85712
beta  <-   0.56796

# log(p/(1-p)) = alpha + beta*x gives

# p =  1/(1+exp(-(alpha + beta*x)))

p0 =  1/(1+exp(-(alpha + beta*8)))
p0
#  0.4222753

p1 =  1/(1+exp(-(alpha + beta*9)))
p1
#  0.5632887

p1 - p0
#  0.1410134   changing from x=8 to x=9

# compare with the above linear model
0.55383  -  0.42792
# 0.12591

# they are two different models so they don't give exactly the same result
# but similar results
But if your original data were 0/1 success/failure then maybe it would be more natural to do the usual logit.
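Because the logit link is nonlinear, the change in the fitted proportion per unit of x is not a single number: it depends on where x starts. A minimal sketch in base R, using the beta regression estimates posted above and a grid of x values chosen only for illustration, compares the one-unit step with the instantaneous slope beta * mu * (1 - mu). That slope can never exceed beta/4 (about 0.142 here), which it approaches where the fitted mean is near 0.5.

```r
# Estimates copied from the beta regression output above (logit link)
alpha <- -4.85712
beta  <-  0.56796

# Fitted mean proportion as a function of x
mu <- function(x) plogis(alpha + beta * x)

# Hypothetical grid of x values, for illustration only
x_grid <- 5:12

# Change in the fitted mean when x goes from x to x + 1
step_change <- mu(x_grid + 1) - mu(x_grid)

# Instantaneous slope of the fitted mean: d mu / d x = beta * mu * (1 - mu)
slope <- beta * mu(x_grid) * (1 - mu(x_grid))

round(data.frame(x = x_grid, mu = mu(x_grid), step_change, slope), 3)
```

The row for x = 8 reproduces the p1 - p0 difference computed above; rows further from the middle of the range show smaller changes.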

#### Martin Marko

##### Member
Many thanks for the transformation,

Best regards,

#### hlsmith

##### Less is more. Stay pure. Stay poor.
Can you post a histogram of your dependent variable values? Linear regression is acceptable provided the bulk of the values fall near 0.5 with minimal dispersion.

#### Martin Marko

##### Member
Sure,
just to mention that each data point represents the difficulty parameter of a test item, estimated from ~200 individuals.
The purpose of the linear/beta regression was to model how the theoretical complexity of an item (given by its construction) relates to its empirical difficulty.

M