Let's say I am running a regression y=B1 x1+B2 x2.

After examining a histogram of y, I decided to transform it to ln(y) to bring its distribution closer to normal. To deal with a number of observations where y=0, I added 1 to y before logging it: y=y+1, then lny=ln(y).

After this I added 1 to both x1 and x2 variables.

Now I am trying to decide whether to log or square x1 and x2 for the regression, and I am examining scatterplots. Should I be looking at (scatter y x1) or (scatter lny x1) to determine whether a transformation is necessary?

When I do (scatter y x1), the relationship looks linear and x1 seems fine left as is, but when I do (scatter lny x1), it looks like x1 should be logged.
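
For reference, the steps described so far look roughly like this. It is a sketch that assumes Stata syntax and the variable names used in this post, not my exact do-file:

```stata
* Shift y by 1 so observations with y = 0 can be logged
replace y = y + 1
generate lny = ln(y)

* Shift both predictors by 1 as well
replace x1 = x1 + 1
replace x2 = x2 + 1

* The two scatterplots being compared
scatter y x1
scatter lny x1
```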

Any idea what I should do?