restriction of range

noetsi

No cake for spunky
#1
I am running a regression where the predictor, which is interval (formally, anyhow; see below), is how many times a customer had a new counselor. The dependent variable is how much income they earned, and it has a wide range of values. But there are relatively few levels of the predictor (in practice; in theory they are not limited). The median is two, and there are relatively few cases beyond six.

My question is: can this relatively narrow range of distinct levels (there are about 5,000 cases) reduce the slope coefficient artificially? The number I arrived at, about 130 dollars more a quarter for each new counselor, was smaller than I would have guessed. The histogram of the predictor is below (I know this does not violate the assumptions of the model; I am wondering if it could still reduce the coefficient in practice).

1663683117895.png
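One way to probe the question is a small simulation. The sketch below is only an illustration under stated assumptions: it assumes the true relationship is exactly linear, borrows 130 as the true slope purely for convenience, and invents everything else (the intercept, the noise level, the skewed distribution of the predictor, and the function names). It fits the same model when the predictor spans many levels and when it is capped at very few levels, then compares the average fitted slope and its spread across repeated samples.

```python
import random
import statistics


def ols_slope(x, y):
    """Least-squares slope of y on x for a one-predictor regression."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def simulate_slope(max_level, seed, n=2000, true_slope=130.0, noise_sd=2000.0):
    """Fit one simulated sample whose predictor is capped at max_level.

    All numbers here are assumptions made for illustration, not values
    taken from the real data.
    """
    rng = random.Random(seed)
    # Right-skewed count predictor with most cases at low levels.
    x = [min(int(rng.expovariate(0.4)), max_level) for _ in range(n)]
    # Income is exactly linear in the observed predictor, plus noise.
    y = [1000.0 + true_slope * xi + rng.gauss(0.0, noise_sd) for xi in x]
    return ols_slope(x, y)


def slope_summary(max_level, reps=100):
    """Mean and standard deviation of the fitted slope over repeated samples."""
    slopes = [simulate_slope(max_level, seed) for seed in range(reps)]
    return statistics.fmean(slopes), statistics.stdev(slopes)


wide_mean, wide_sd = slope_summary(max_level=20)    # many distinct levels
narrow_mean, narrow_sd = slope_summary(max_level=2)  # only levels 0, 1, 2

print(f"wide range:   mean slope {wide_mean:.1f}, sd {wide_sd:.1f}")
print(f"narrow range: mean slope {narrow_mean:.1f}, sd {narrow_sd:.1f}")
```

Under these assumptions the average slope stays near the true value in both cases, while the spread of the estimate is clearly larger when the predictor has few levels. That would suggest a narrow predictor range costs precision rather than shrinking the coefficient, although this does not cover a relationship that is not actually linear or a sample selected on the outcome.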
 

noetsi

No cake for spunky
#2
I now have a formal test of whether you are an idiot. Soon to be called the Hellein test.

1) Run a linear regression with a single predictor.
2) Request a VIF/Tolerance test. To be safer, add a test of collinearity.

If you puzzle for more than five minutes, as I did, over why the VIF/Tolerance value is strange, and/or look up the strange result, you are officially a statistical idiot. :p
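For anyone who has not hit this before, presumably the "strange" value is a VIF (and tolerance) of exactly 1. The VIF for a predictor is 1 / (1 - R²), where R² comes from regressing that predictor on all the other predictors. With a single predictor there are no others, so the auxiliary regression has only an intercept and its R² is 0. The sketch below works through that arithmetic on made-up data; the helper names and the second predictor `x2` are hypothetical and only there for contrast.

```python
import random
import statistics


def r_squared(y, fitted):
    """R-squared of fitted values against observed values."""
    mean_y = statistics.fmean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def vif(r2):
    """Variance inflation factor from an auxiliary R-squared."""
    return 1.0 / (1.0 - r2)


def fitted_line(x, y):
    """Fitted values from a one-predictor least-squares regression of y on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return [mean_y + slope * (xi - mean_x) for xi in x]


rng = random.Random(1)
x1 = [float(rng.randint(0, 6)) for _ in range(500)]  # made-up predictor

# Single-predictor model: nothing to regress x1 on except the intercept,
# so the fitted value is just the mean of x1 and R-squared is 0.
fitted_single = [statistics.fmean(x1)] * len(x1)
vif_single = vif(r_squared(x1, fitted_single))

# Contrast: add a hypothetical second predictor correlated with x1.
x2 = [xi + rng.gauss(0.0, 1.0) for xi in x1]
vif_pair = vif(r_squared(x1, fitted_line(x2, x1)))

print(f"VIF with one predictor:  {vif_single:.2f}")
print(f"VIF with two predictors: {vif_pair:.2f}")
```

So the test is uninformative by construction when there is only one predictor: the result is 1 no matter what the data look like.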

Ok, the single predictor is effectively ordinal (not in theory, but in practice). So I guess I don't have to worry about linearity. But I can't figure out if this suggests heteroskedasticity or not (I have 5,000-plus points so it probably does not matter, but I would still like to know if this suggests heteroskedasticity, which I often puzzle over).

1663697188066.png
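Because the predictor takes only a handful of values, one way to go beyond eyeballing the plot is to compare the residual spread within each level and to run a Breusch-Pagan style test. The sketch below is a rough illustration on simulated data, not on the data from the plot: the level weights, the noise pattern, and all function names are assumptions. It uses the studentized (Koenker) form of the test, where the statistic is n times the R² from regressing the squared residuals on the predictor, compared against a chi-square distribution with one degree of freedom.

```python
import math
import random
import statistics
from collections import defaultdict


def fit_line(x, y):
    """Intercept and slope of a one-predictor least-squares fit."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    intercept, slope = fit_line(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def breusch_pagan(x, y):
    """Studentized Breusch-Pagan statistic and p-value for one predictor."""
    squared = [e ** 2 for e in residuals(x, y)]
    aux_resid = residuals(x, squared)  # regress squared residuals on x
    mean_sq = statistics.fmean(squared)
    ss_res = sum(e ** 2 for e in aux_resid)
    ss_tot = sum((s - mean_sq) ** 2 for s in squared)
    lm = max(len(x) * (1.0 - ss_res / ss_tot), 0.0)
    # Chi-square survival function with 1 degree of freedom.
    return lm, math.erfc(math.sqrt(lm / 2.0))


def residual_sd_by_level(x, y):
    """Standard deviation of the residuals within each predictor level."""
    groups = defaultdict(list)
    for xi, ei in zip(x, residuals(x, y)):
        groups[xi].append(ei)
    return {lvl: statistics.stdev(e) for lvl, e in sorted(groups.items()) if len(e) > 1}


# Simulated example (an assumption): noise grows with the predictor level.
rng = random.Random(2)
levels = [0, 1, 2, 3, 4, 5, 6]
x = rng.choices(levels, weights=[30, 25, 20, 10, 8, 4, 3], k=5000)
y = [1000.0 + 130.0 * xi + rng.gauss(0.0, 500.0 * (1 + xi)) for xi in x]

sd_by_level = residual_sd_by_level(x, y)
lm_stat, p_value = breusch_pagan(x, y)

for lvl, sd in sd_by_level.items():
    print(f"level {lvl}: residual sd {sd:.0f}")
print(f"LM = {lm_stat:.1f}, p = {p_value:.3g}")
```

If the per-level standard deviations are roughly equal and the p-value is not small, the residual plot is probably not showing much heteroskedasticity; a steady rise or fall in spread across levels points the other way.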