Hello,
I am trying to perform a negative binomial regression in RStudio on a dataset but I want to specify a formula for overdispersion (shape parameter). I am able to match almost exactly what Stata produces but I am getting the negative of what Stata produces.
My negative binomial model form is: y=a*ADT^b+L^c and the shape parameter is theta=const*L^b1
I don't think I can use glm.nb because it doesn't have an option for a variable dispersion (as far as I know).
My R code is:
Notice the coefficients for lnAlpha (dispersion): in R const=1.2440 and in Stata cons=-1.244108
Is this a bug in R (Stata's results seem more intuitive)?
My R output is:
Call:
gnlr(y, dist = "negative binomial", mu = ~exp(a + b * lnADT +
c * lnL), shape = ~(const + b1 * lnL), pmu = list(a = -1,
b = 0.4, c = 0.7), pshape = c(0, 0))
negative binomial distribution
Response: y
Log likelihood function:
{
m <- mu1(p)
t <- sh1(p)
s <- exp(t)
-sum(wt * (lgamma(y + s) - lgamma(s) + s * t + y * log(m) -
(y + s) * log(s + m)))
}
Location function:
~exp(a + b * lnADT + c * lnL)
Log shape function:
~(const + b1 * lnL)
-Log likelihood 3910.424
Degrees of freedom 1933
AIC 3915.424
Iterations 34
Location parameters:
estimate se
a -1.0433 0.21282
b 0.3638 0.02558
c 0.6717 0.04123
Shape parameters:
estimate se
const 1.2440 0.2091
b1 0.3937 0.1216
Correlations:
1 2 3 4 5
1 1.00000 -0.94912 0.09922 0.04912 0.05547
2 -0.94912 1.00000 0.20187 -0.03999 -0.04530
3 0.09922 0.20187 1.00000 0.03079 0.03404
4 0.04912 -0.03999 0.03079 1.00000 0.95371
5 0.05547 -0.04530 0.03404 0.95371 1.00000
>
My Stata Output is:
Generalized negative binomial regression Number of obs = 1938
Wald chi2(2) = 400.28
Prob > chi2 = 0.0000
Log pseudolikelihood = -3910.4238 Pseudo R2 = 0.0432
(Std. Err. adjusted for 1938 clusters in RoadInventory_ID)
------------------------------------------------------------------------------
| Robust
Total | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Total |
lnlength | .6718231 .0405363 16.57 0.000 .5923734 .7512728
lnADT | .3639789 .0268313 13.57 0.000 .3113905 .4165673
_cons | -1.044298 .228067 -4.58 0.000 -1.491301 -.597295
-------------+----------------------------------------------------------------
lnalpha |
lnlength | -.3937177 .1193873 -3.30 0.001 -.6277125 -.159723
_cons | -1.244108 .2067863 -6.02 0.000 -1.649402 -.8388147
I am trying to perform a negative binomial regression in RStudio on a dataset but I want to specify a formula for overdispersion (shape parameter). I am able to match almost exactly what Stata produces but I am getting the negative of what Stata produces.
My negative binomial model form is: y=a*ADT^b+L^c and the shape parameter is theta=const*L^b1
I don't think I can use glm.nb because it doesn't have an option for a variable dispersion (as far as I know).
My R code is:
Code:
library(MASS)
library(gnlm)
CSVpath = "MYPATH/Urb2Undiv_D2.csv"
data=read.csv(CSVpath,header=T)
#Point to variables
y <- c(data$Total)
lnADT <- c(log(data$ADT))
lnL <- c(log(data$Length))
Lx <- c(data$Length)
ADT <- c(data$ADT)
SPF = gnlr(y, dist="negative binomial", mu=~exp(a+b*lnADT+c*lnL), shape=~(const+b1*lnL), pmu=list(a=-1,b=.4,c=.7), pshape=c(0,0))
Is this a bug in R (Stata's results seem more intuitive)?
My R output is:
Call:
gnlr(y, dist = "negative binomial", mu = ~exp(a + b * lnADT +
c * lnL), shape = ~(const + b1 * lnL), pmu = list(a = -1,
b = 0.4, c = 0.7), pshape = c(0, 0))
negative binomial distribution
Response: y
Log likelihood function:
{
m <- mu1(p)
t <- sh1(p)
s <- exp(t)
-sum(wt * (lgamma(y + s) - lgamma(s) + s * t + y * log(m) -
(y + s) * log(s + m)))
}
Location function:
~exp(a + b * lnADT + c * lnL)
Log shape function:
~(const + b1 * lnL)
-Log likelihood 3910.424
Degrees of freedom 1933
AIC 3915.424
Iterations 34
Location parameters:
estimate se
a -1.0433 0.21282
b 0.3638 0.02558
c 0.6717 0.04123
Shape parameters:
estimate se
const 1.2440 0.2091
b1 0.3937 0.1216
Correlations:
1 2 3 4 5
1 1.00000 -0.94912 0.09922 0.04912 0.05547
2 -0.94912 1.00000 0.20187 -0.03999 -0.04530
3 0.09922 0.20187 1.00000 0.03079 0.03404
4 0.04912 -0.03999 0.03079 1.00000 0.95371
5 0.05547 -0.04530 0.03404 0.95371 1.00000
>
My Stata Output is:
Generalized negative binomial regression Number of obs = 1938
Wald chi2(2) = 400.28
Prob > chi2 = 0.0000
Log pseudolikelihood = -3910.4238 Pseudo R2 = 0.0432
(Std. Err. adjusted for 1938 clusters in RoadInventory_ID)
------------------------------------------------------------------------------
| Robust
Total | Coef. Std. Err. z P>|z| [95% Conf. Interval]
-------------+----------------------------------------------------------------
Total |
lnlength | .6718231 .0405363 16.57 0.000 .5923734 .7512728
lnADT | .3639789 .0268313 13.57 0.000 .3113905 .4165673
_cons | -1.044298 .228067 -4.58 0.000 -1.491301 -.597295
-------------+----------------------------------------------------------------
lnalpha |
lnlength | -.3937177 .1193873 -3.30 0.001 -.6277125 -.159723
_cons | -1.244108 .2067863 -6.02 0.000 -1.649402 -.8388147