Two factor factorial and regression models

I'm making up this scenario for illustration of my question.
Let's say I have factor A: low level 10, high level 30
and factor B: low level 5, high level 15.
I have a quantitative response of interest. I can compare means in the typical design setting, or I can construct two regression models to study the relationship between the factors and the response. My question: what is the difference between treating these levels as categorical vs. quantitative?

first model: y = b0 + b1*Xa + b2*Xb + b3*XaXb
where Xa=1 if low level A and 0 otherwise, Xb=1 if low level B and 0 otherwise, and XaXb=1 if low level A and low level B (0 otherwise).

second model: y = b0 + b1*Xa + b2*Xb + b3*XaXb
where Xa and Xb are quantitative predictors

I understand that if I have more than two levels per factor, the second model will save me some degrees of freedom. But when I have only two levels, the benefit isn't as clear to me.


Those models are the same; only the units of Xa and Xb differ. In both models, b1, b2, and b3 are the coefficients on a 1-unit change in Xa, Xb, and Xa*Xb, respectively. In the first model, a 1-unit change represents a change from the high level of the associated variable to the low level (since the indicator is 1 at the low level), whereas in the second model, a 1-unit change is in whatever units Xa and Xb are measured.
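To make "the same model" concrete, here is a minimal sketch. The four cell means are made-up numbers, and the helper names (solve, indicator_row, quantitative_row) are just for this illustration. With two levels per factor, both codings have four parameters for four cells, so each one reproduces the cell means exactly; only the coefficient values change with the units.

```python
from fractions import Fraction

def solve(A, y):
    """Solve the square linear system A x = y by Gauss-Jordan elimination."""
    n = len(A)
    # Augmented matrix [A | y] in exact rational arithmetic.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, y)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

# Hypothetical cell means of y, keyed by the actual (A, B) levels.
cell_means = {(10, 5): 20, (30, 5): 26, (10, 15): 23, (30, 15): 35}

def indicator_row(a, b):
    # First model: indicators equal 1 at the LOW level, as in the question.
    xa, xb = int(a == 10), int(b == 5)
    return [1, xa, xb, xa * xb]

def quantitative_row(a, b):
    # Second model: the levels themselves are the predictors.
    return [1, a, b, a * b]

cells = list(cell_means)
y = [cell_means[c] for c in cells]

b_ind = solve([indicator_row(*c) for c in cells], y)
b_quant = solve([quantitative_row(*c) for c in cells], y)

def fitted(row_fn, coefs):
    return [sum(x * b for x, b in zip(row_fn(*c), coefs)) for c in cells]

fitted_ind = fitted(indicator_row, b_ind)
fitted_quant = fitted(quantitative_row, b_quant)

print("indicator coefficients:   ", [str(b) for b in b_ind])
print("quantitative coefficients:", [str(b) for b in b_quant])
print("same fitted values:", fitted_ind == fitted_quant)
```

With replicated data you would fit by least squares instead of solving a 4x4 system, but the fitted cell means would still agree between the two codings.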

If Xa and Xb each have only two levels of interest, then the first model has the advantage of being very easy to interpret: b0 is the mean of y at (Xa,Xb) = (0,0), b0 + b1 is the mean at (1,0), b0 + b2 is the mean at (0,1), and b0 + b1 + b2 + b3 is the mean at (1,1). Note that with the coding in the question, (0,0) is the high level of both factors and (1,1) is the low level of both. On the other hand, if interest lies in predicting the value of y for values of Xa and Xb other than those present in the data set, then the second model would be easier to use, because it is defined for any numeric Xa and Xb and not just the two coded levels.
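A small sketch of that trade-off, using made-up coefficients (both coefficient sets below are hypothetical and chosen so that the two models describe the same four cell means). The first model reads off cell means as sums of coefficients; the second gives the same means at the design points and can also be evaluated in between, e.g. at A=20, B=10.

```python
from fractions import Fraction

# Hypothetical fitted coefficients (b0, b1, b2, b3), not from real data.
b_indicator = (35, -12, -9, 6)  # Xa, Xb are 0/1 indicators of the low level
b_quantitative = (17, Fraction(3, 20), 0, Fraction(3, 100))  # Xa, Xb are the actual levels

def predict(coefs, xa, xb):
    """Evaluate y = b0 + b1*Xa + b2*Xb + b3*Xa*Xb."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * xa + b2 * xb + b3 * xa * xb

# First model: each cell mean is a sum of coefficients.
indicator_means = {
    ("high A", "high B"): predict(b_indicator, 0, 0),  # b0
    ("low A", "high B"): predict(b_indicator, 1, 0),   # b0 + b1
    ("high A", "low B"): predict(b_indicator, 0, 1),   # b0 + b2
    ("low A", "low B"): predict(b_indicator, 1, 1),    # b0 + b1 + b2 + b3
}

# Second model: plug in the actual levels (A: 10 or 30, B: 5 or 15).
quantitative_means = {
    ("high A", "high B"): predict(b_quantitative, 30, 15),
    ("low A", "high B"): predict(b_quantitative, 10, 15),
    ("high A", "low B"): predict(b_quantitative, 30, 5),
    ("low A", "low B"): predict(b_quantitative, 10, 5),
}

# Only the second model has an obvious meaning between the design points.
midpoint = predict(b_quantitative, 20, 10)

print("cell means agree:", indicator_means == quantitative_means)
print("prediction at A=20, B=10:", midpoint)
```

Keep in mind that with only two levels per factor, a prediction between the levels relies entirely on the assumed linear-plus-interaction form; the data cannot check it.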