I have an ordinal scale that ranks the counties in my state (there are 67) from highest to lowest median income. I am trying to decide how to use that measure. One possibility is to just use the 67 rank values, but the regression will then treat this as an interval measure when it is not (here it is clearly ordinal, unlike in my other question). Another is to build a dummy or a set of dummies, but I have never seen a discussion of how to build dummies in this case. Do I split the counties into bottom half and top half (one dummy)? Top ten percent versus bottom ninety percent?
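To make the dummy option concrete, here is a minimal Python sketch of one way it could be done: cut the 67 ranks into near-equal groups and code each group as a 0/1 indicator against a reference group. The function names `rank_to_group` and `dummy_code`, the choice of four groups, and using the top-income group as the reference are all assumptions for illustration, not a recommendation from any source.

```python
def rank_to_group(rank, n_units=67, n_groups=4):
    """Map a rank in 1..n_units to a group index in 0..n_groups-1.

    Groups are as close to equal size as integer division allows.
    Group 0 holds the best ranks (highest median income).
    """
    if not 1 <= rank <= n_units:
        raise ValueError("rank must be between 1 and n_units")
    return (rank - 1) * n_groups // n_units


def dummy_code(group, n_groups=4, reference=0):
    """Return one 0/1 indicator per non-reference group.

    The reference group is coded as all zeros, so its effect is
    absorbed by the regression intercept.
    """
    return [1 if group == g else 0 for g in range(n_groups) if g != reference]


if __name__ == "__main__":
    # Show the coding for a few example ranks.
    for rank in (1, 18, 35, 67):
        group = rank_to_group(rank)
        print(rank, group, dummy_code(group))
```

Setting `n_groups=2` gives the top half versus bottom half split with a single dummy. More groups let the regression pick up a non-monotonic pattern, at the cost of more coefficients to interpret.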

Nothing I have seen in the Vocational Administration literature (or in anything else I have read, actually) addresses how you should split the data if you build dummies.

Miner, this is not for publication. It is for work. Thanks for your comments about building useful models with this. I have 20,000 or so data points, so I doubt it will have a huge impact on p values. Technically I have the whole population, not a sample (although I guess one can argue that it is a subsample of what could occur in the future). It is doubtful that p values even apply in this analysis, although I will use White standard errors in any case, given your comments. I look at the residuals and run other tests for violations of the assumptions.

But with so much data, non-linearity is generally the only assumption I really worry about.
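For reference, this is roughly what the White (heteroskedasticity-consistent) standard error mentioned above amounts to in the simplest case of one predictor plus an intercept. It is a minimal sketch of the basic HC0 form with no small-sample correction. The function name `ols_with_white_se` is made up for illustration, and a real analysis with several predictors would use a statistics package rather than this hand-rolled version.

```python
import math


def ols_with_white_se(x, y):
    """Fit y = a + b*x by OLS and return (slope, intercept, white_se).

    white_se is the HC0 standard error of the slope:
        sqrt( sum((x_i - xbar)^2 * e_i^2) / (sum((x_i - xbar)^2))^2 )
    where e_i are the OLS residuals.
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length, at least 3")
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x has no variation")
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    residuals = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Each squared residual is weighted by that point's squared leverage term.
    hc0_var = sum(((xi - xbar) ** 2) * (e ** 2) for xi, e in zip(x, residuals)) / sxx ** 2
    return slope, intercept, math.sqrt(hc0_var)


if __name__ == "__main__":
    slope, intercept, se = ols_with_white_se([0, 1, 2], [0, 1, 0])
    print(slope, intercept, se)
```

Unlike the classical formula, this one does not assume a constant error variance, which is why it is a reasonable default when the residual spread changes across the range of the predictor.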