"Quartile" Regression

Hello dear forum members,

I am conducting a replication study based on the paper by Gao, G. G., Greenwood, B. N., Agarwal, R., & McCullough, J. S. (2015). Vocal minority and silent majority: How do online ratings reflect population perceptions of quality. MIS Quarterly: Management Information Systems, 39(3), 565-589. DOI: 10.25300/MISQ/2015/39.3.03
(Since the original paper PDF is too large to be uploaded, I include relevant screenshots in the attachment below)

My study, however, is different in terms of the (A) level of analysis (organizational vs. individual), and (B) measures of the outcome.

On page 574, they say:

"The above estimation assumes that the relationship between physician quality and the likelihood of being rated online [Equation 1, p. 572] is constant over the distribution of physician quality. To examine the robustness of our results, we employ another specification allowing the probability of receiving an online rating to vary across the lower, middle two (the excluded category), and upper quartiles of physician quality. Results are reported in Column 2 of Table 4."

I know what a quartile is and how to break a variable into quartiles. But that basically implies creating a categorical variable that assigns each observation to one of the quartiles. As such, I cannot wrap my mind around how they estimate Equation (1) while allowing the probability of being rated to vary across the lower and upper quartiles, with the sample size staying the same (?!) (Table 4, Column 2, page 575). That is, they evidently neither restrict the sample with an -if- condition nor use quantile regression.

Can anyone give me a hint here, please?


Less is more. Stay pure. Stay poor.
I will acknowledge up front that I only skimmed your text and ignored the reference. Did they do a piecewise regression, where you can model different portions of the dependent variable? I have seen this with time series, and I did something like it in lieu of a quadratic model (but didn't allow nonlinearity), where you can model a "v" instead of a parabola, or a "v" with a flat bottom. I think there are ways to do it using a single model and indicator terms.
hlsmith, I appreciate your response. Let me look into piecewise regression. It seems that a single model with indicators could be the way to go.
hlsmith, thank you for the hint. I believe I was able to accomplish the task. In particular, I created two dummies that equal 1 if an observation falls in the lowest or the highest quartile, respectively (the middle two quartiles are the excluded category). I then included both dummies as predictors in the model, estimated on the full sample. The estimated coefficients, their signs, and significance levels are consistent with those obtained by Gao et al.
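
For anyone who finds this thread later, here is a minimal sketch of the dummy construction in Python. It is not the code from the paper or from my replication: the data are simulated, the names `quality`, `rated`, `low`, and `high` are made up for illustration, and the controls from Equation (1) are left out. With only the two dummies and an intercept, a linear probability model reduces to differences in group means, which is what the sketch computes; in a real replication the two dummies would simply be added to whatever estimator is used for Equation (1).

```python
import random
import statistics

random.seed(0)

# Hypothetical data: a continuous quality score and a 0/1 "rated online" outcome.
n = 1000
quality = [random.gauss(0, 1) for _ in range(n)]
rated = [1 if random.random() < 0.3 + 0.1 * (q > 0.5) else 0 for q in quality]

# Quartile cut points (25th, 50th, and 75th percentiles of quality).
q1, _median, q3 = statistics.quantiles(quality, n=4)

# Two indicators; the middle two quartiles are the excluded (reference) category.
low = [1 if q <= q1 else 0 for q in quality]
high = [1 if q > q3 else 0 for q in quality]
middle = [1 - lo - hi for lo, hi in zip(low, high)]


def mean_where(values, mask):
    """Mean of values over the observations where mask equals 1."""
    selected = [v for v, m in zip(values, mask) if m]
    return sum(selected) / len(selected)


# With only an intercept and the two dummies, the linear probability model
# coefficients are the reference-group mean and the differences from it.
intercept = mean_where(rated, middle)
b_low = mean_where(rated, low) - intercept
b_high = mean_where(rated, high) - intercept

# Every observation is used: the dummies shift the intercept by group
# rather than restricting the sample.
print("n =", len(rated), "low =", sum(low), "middle =", sum(middle), "high =", sum(high))
print("intercept =", round(intercept, 3), "b_low =", round(b_low, 3), "b_high =", round(b_high, 3))
```

This also shows why the sample size in Column 2 does not change: no observation is dropped, and each coefficient is read as a difference relative to the middle two quartiles.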