How were the p values calculated in the attached paper?

[Attached image: Screen Shot 2020-04-10 at 8.02.30 AM.png]
This paper has recently gained traction in the media, and I am attempting to recalculate the p values reported in the regression analysis in Table 2 of this paper. However, I am obtaining different results. I am assuming that the outcome variable is the number of hospitalizations (e.g., for those under 18 years old, the values would be 74, 21, 117, 62, and 335) and that the predictor variable is the year (2014, 2015, 2016, 2017, 2018). When I run the analysis in R, I get a p value of 0.172 for the regression coefficient. The other p values appear to be off as well, but perhaps I am simply setting up the regression incorrectly?
My question is, how were these p values calculated? The paper is a bit vague. I've been asking around several forums and haven't been able to receive an answer. Therefore, if someone can help me out, it would be much appreciated.
Hmmm, I'll agree that it does seem a little gauche just looking at the table. What does the stats section of the article say?
"Methods | The National Electronic Injury Surveillance System (NEISS) provides national estimates of injuries that present to emergency departments across the United States. We queried NEISS for injuries related to powered scooters (code 5042) from 2014 to 2018, with keyword scooter in the description (n = 1037). We excluded non–e-scooter injuries (n = 49). We used NEISS complex sampling design to obtain US population projections of injuries and hospital admissions. Population estimates from the US Census Bureau were used for the direct method of age adjustment. The data source was public, deidentified, and was exempt from the University of California, San Francisco, institutional review board approval. Owing to the use of deidentified data, patient consent was not obtained. We applied linear regression to determine trends of injuries and admissions. We used Stata, version 15 (StataCorp); 2-sided P values less than .05 were considered significant."

In the table that I posted (towards the bottom), it states that they predicted the number of injuries or hospital admissions from the year, so I'm assuming that they ran two linear regressions per age group. I ran the simple linear regression in R and ended up with different p values for the regression coefficient of the year predictor. I'm not sure what other p value they could be reporting from a linear regression.

If it helps, here's the code I used for the under-18 age group.

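# Under-18 group: injuries by year
Year <- c(2014, 2015, 2016, 2017, 2018)
Injuries <- c(2304, 3404, 3245, 3298, 4843)

test <- data.frame(Year, Injuries)

# Simple linear regression of injuries on year
test_model <- lm(Injuries ~ Year, data = test)
summary(test_model)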

Call:
lm(formula = Injuries ~ Year, data = test)

Residuals:
     1      2      3      4      5
-120.4  482.4 -173.8 -618.0  429.8

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -998936.4   338156.5  -2.954   0.0598 .
Year            497.2      167.7   2.964   0.0593 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 530.4 on 3 degrees of freedom
Multiple R-squared: 0.7455, Adjusted R-squared: 0.6606
F-statistic: 8.786 on 1 and 3 DF, p-value: 0.05934
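
For comparison, the same setup can be applied to the hospital-admission counts quoted in the first post (74, 21, 117, 62, 335). This is only a sketch of the poster's approach, not the authors' analysis; the names Admissions, admit_model, and p_year are made up for illustration.

```r
# Under-18 hospital admissions by year, as quoted in the first post
Year <- 2014:2018
Admissions <- c(74, 21, 117, 62, 335)

# Simple linear regression of admissions on year (illustrative object name)
admit_model <- lm(Admissions ~ Year)

# Two-sided p value for the Year slope
p_year <- summary(admit_model)$coefficients["Year", "Pr(>|t|)"]
round(p_year, 3)  # about 0.172, the value reported in the first post
```

With only five yearly totals there are 3 residual degrees of freedom, so the slope would need a t statistic above roughly 3.18 to reach p < .05.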
Yes, what you have is linear regression as it is commonly understood. I would recommend just writing to the corresponding authors directly to obtain clarification and/or code files; that's usually a lot easier than trying to guess the methods after the fact. Since they are surgeons, they probably won't respond. My first feeling on looking at this was that rates/proportions like these are usually analyzed with binomial and/or Poisson models or tests of one type or another, but I am open to the world of possible analyses. Good luck on your quest.
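
As a rough illustration of the count-model idea mentioned above, here is a minimal sketch of a Poisson regression on the same five yearly injury totals. Nothing in the paper's methods says the authors did this; the centered predictor YearC and the object names are assumptions made for the example, and the quasi-Poisson fit is included only to show how allowing for overdispersion changes the standard error.

```r
# Under-18 injuries by year, as entered in the original post
Year <- 2014:2018
Injuries <- c(2304, 3404, 3245, 3298, 4843)

# Center the year so the intercept is the log expected count in 2016
YearC <- Year - 2016

# Poisson log-linear trend (assumes variance equals the mean)
pois_fit <- glm(Injuries ~ YearC, family = poisson(link = "log"))
p_pois <- summary(pois_fit)$coefficients["YearC", "Pr(>|z|)"]

# Quasi-Poisson: same point estimates, standard errors scaled by the
# estimated dispersion, and a t test instead of a z test
quasi_fit <- glm(Injuries ~ YearC, family = quasipoisson(link = "log"))
p_quasi <- summary(quasi_fit)$coefficients["YearC", "Pr(>|t|)"]

c(poisson = p_pois, quasipoisson = p_quasi)
```

With counts in the thousands, the plain Poisson fit will give a very small p value for the trend, while the quasi-Poisson p value is larger because the yearly totals scatter more than a Poisson model allows. Which of these, if any, matches the published table is something only the authors can confirm.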