Predicting a discrete variable (age) from continuous data, and transformations

Hi, I have data on age (in years) and a continuous variable (weight, for example). I want to predict age using linear regression. However, I understand that age is discrete and therefore may not be directly usable. So I log-transformed both age and weight, but when I checked the residuals they were not normally distributed, whereas when I regress without the transformation they are normally distributed. Can you please help me with what I should do about predicting a discrete variable from a continuous variable?

Thank you.


New Member
There is no need to log-transform age. I have never heard of log-transforming a variable just because it is discrete! Rather, according to Wooldridge, variables measured in years are used in their level form.


Fortran must die
Age is normally considered interval unless you have transformed it somehow to make it categorical or ordinal. It works fine with linear regression unless you have problems like skewed data (which is tied to your data's distribution, not to the scale it is measured on).

Normally you log-transform data if some barrier (like one end of the data being bounded, as with a proportion) creates heteroskedasticity.

Thank you so much for the replies; I was really worried about this issue. So I will continue to use age in the regression WITHOUT transformation (as the residuals are normal, like I said). Any papers to support this would be welcome. Thank you again.


Fortran must die
I can't think of any papers offhand, but the norm for data like age is that (barring problems in the residuals) it is fine with OLS. I have never seen anyone raise an issue over this.
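
For anyone who wants to run the same comparison on their own data, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the data are simulated (ages in whole years generated from a made-up linear relationship with weight), and the helper names `ols_fit`, `residuals` and `skewness` are hypothetical, not from any package. It fits age on weight once in levels and once in logs, then prints the skewness of each set of residuals as a rough symmetry check. It is not a formal normality test.

```python
import math
import random


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def residuals(x, y, intercept, slope):
    """Observed minus fitted values."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def skewness(values):
    """Moment-based sample skewness; close to 0 for a symmetric sample."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Simulated data (assumed, not the original poster's data).
random.seed(0)
weight = [random.uniform(10.0, 60.0) for _ in range(500)]
# Age recorded in whole years, so it is discrete; floor at 1 so log() is defined.
age = [max(1, round(2.0 + 0.3 * w + random.gauss(0.0, 1.5))) for w in weight]

# Regression in levels: age on weight.
a, b = ols_fit(weight, age)
res_level = residuals(weight, age, a, b)

# Regression in logs: log(age) on log(weight).
log_w = [math.log(w) for w in weight]
log_age = [math.log(v) for v in age]
a_log, b_log = ols_fit(log_w, log_age)
res_log = residuals(log_w, log_age, a_log, b_log)

print(f"levels: slope={b:.3f}, residual skewness={skewness(res_level):.3f}")
print(f"logs:   slope={b_log:.3f}, residual skewness={skewness(res_log):.3f}")
```

With real data you would replace the simulated `weight` and `age` lists with your own columns and look at a residual plot or Q-Q plot as well; whichever specification gives well-behaved residuals is the one to keep, which matches the advice above.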