# Regression Analysis with Monetary Variables

#### Buckeye

##### Member
I have a general question about running ordinary or logistic regression with monetary variables. Suppose my dependent variable is an amount paid, which can take any value in [0, infinity). I suspect I would use some GLM rather than OLS regression. Can you suggest an option? I've read that in some cases the transformation log(y+1) is used. My question: how would I interpret log(y+1) or log(predictor+1) if, after considering the alternatives, this turned out to be the right approach?

Has anyone worked with this kind of data that can point me to resources?
Thanks.

#### Dason

Do you have 0s in your response variable?

#### Buckeye

##### Member
> Do you have 0s in your response variable?

Yes, it's a right-skewed continuous distribution whose values can reach into the tens of thousands.

#### Dason

What proportion of the responses take the value 0?

#### Dason

Just so we are clear... By "contain 0" do you mean that 15% of the responses are exactly equal to 0? Or do you mean that 15% have a 0 somewhere in them (so like 10, 100, 1000, ...)?

#### Buckeye

##### Member
> Just so we are clear... By "contain 0" do you mean that 15% of the responses are exactly equal to 0? Or do you mean that 15% have a 0 somewhere in them (so like 10, 100, 1000, ...)?

15% are exactly equal to 0.

#### Dason

What does a scatterplot look like if you plot something like x against log(y) after removing the 0s from y? I'm wondering whether some sort of zero-inflated model might be useful for you.
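To make the suggestion concrete, here is a minimal sketch of a two-part look at this kind of data: the share of exact zeros is summarized on its own, and log(y) is regressed on x using only the positive responses. This is a simple stand-in for the zero-inflated idea, not a full zero-inflated model. The data are simulated, and every number in the simulation (15% zeros, the intercept of 6, the slope of 1, the noise level) is an assumption chosen for illustration, as are the helper names `simulate` and `ols_fit`.

```python
import math
import random


def ols_fit(xs, ys):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def simulate(n=2000, seed=1):
    """Hypothetical data: about 15% exact zeros, otherwise log-normal amounts."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.random()
        if rng.random() < 0.15:
            y = 0.0  # nothing was paid
        else:
            y = math.exp(6.0 + 1.0 * x + rng.gauss(0.0, 0.5))
        data.append((x, y))
    return data


data = simulate()

# Part 1: how often is the response exactly zero?
zero_share = sum(1 for _, y in data if y == 0.0) / len(data)

# Part 2: drop the zeros and work with (x, log y); these are the points
# the scatterplot would show.
positives = [(x, math.log(y)) for x, y in data if y > 0.0]
intercept, slope = ols_fit([x for x, _ in positives],
                           [log_y for _, log_y in positives])

print(f"share of zeros: {zero_share:.3f}")
print(f"log(y) on x, positives only: intercept={intercept:.2f}, slope={slope:.2f}")
```

In this setup the slope has the usual log-scale reading for the positive amounts, and the zeros are handled by a separate quantity rather than by shifting the response.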

I personally am not a big fan of the log(x + c) transformation, where c is some number (typically 1, sometimes something like 0.001) added to make it possible to take the log of 0. As you mention in your first post, it's not straightforward to interpret the output from that kind of model.
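One way to see the interpretation problem is to fit the same line to log(y + c) for two choices of c and compare the slopes. The sketch below does that on simulated data. The simulation is an assumption made purely for illustration: the chance of an exact zero is set to fall as x rises (averaging about 15%), and the positive amounts are log-normal. The names `simulate` and `ols_slope` are hypothetical helpers.

```python
import math
import random


def ols_slope(xs, ys):
    """Slope of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def simulate(n=2000, seed=42):
    """Hypothetical data: zeros are more likely at low x (about 15% overall)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.random()
        if rng.random() < 0.3 * (1.0 - x):
            y = 0.0
        else:
            y = math.exp(6.0 + 1.0 * x + rng.gauss(0.0, 0.5))
        xs.append(x)
        ys.append(y)
    return xs, ys


xs, ys = simulate()
zero_share = sum(1 for y in ys if y == 0.0) / len(ys)

# Same data, same model form, two different shift constants.
slopes = {c: ols_slope(xs, [math.log(y + c) for y in ys]) for c in (1.0, 0.001)}

print(f"share of zeros: {zero_share:.3f}")
for c, b in slopes.items():
    print(f"c = {c}: slope of log(y + c) on x = {b:.2f}")
```

Each zero is mapped to log(c), so a smaller c pushes the zeros further below the positive values and changes the fitted slope. The coefficient then reflects the arbitrary constant as well as the data, and exp(slope) - 1 describes a proportional change in y + c rather than in y.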

#### noetsi

##### Fortran must die
I think the correct approach is to consider the theory behind the transformation. Many economic models use logs, but they have a theoretical reason to do so. Transforming a variable changes the relationship you are modeling.

For example, in time series where the variance changes over time and stationarity is required, logging is used to stabilize the variance (or at least to reduce how much it varies).
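A small simulation can illustrate the variance point. The sketch below builds a series whose period-to-period changes are proportional to its level, then compares the spread of the changes in the second half of the series with the spread in the first half, before and after taking logs. The growth and volatility values, the starting level of 100, and the helper names `simulate_series` and `half_spread_ratio` are all assumptions for illustration.

```python
import math
import random
import statistics


def simulate_series(n=400, drift=0.01, vol=0.02, seed=7):
    """Hypothetical series with multiplicative (percentage) shocks."""
    rng = random.Random(seed)
    level = 100.0
    series = [level]
    for _ in range(n):
        level *= math.exp(rng.gauss(drift, vol))
        series.append(level)
    return series


def half_spread_ratio(values):
    """Std. dev. of first differences: second half divided by first half."""
    diffs = [b - a for a, b in zip(values, values[1:])]
    mid = len(diffs) // 2
    return statistics.stdev(diffs[mid:]) / statistics.stdev(diffs[:mid])


series = simulate_series()
raw_ratio = half_spread_ratio(series)
log_ratio = half_spread_ratio([math.log(v) for v in series])

print(f"raw series: spread ratio (2nd half / 1st half) = {raw_ratio:.2f}")
print(f"log series: spread ratio (2nd half / 1st half) = {log_ratio:.2f}")
```

A ratio well above 1 for the raw series means the changes grow as the level grows; a ratio near 1 for the logged series means the spread is roughly constant over time. This only works because the shocks here are multiplicative by construction, which is the kind of theoretical justification mentioned above.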

#### Buckeye

##### Member
I appreciate the responses. A zero-inflated model makes sense. I will keep digging. I suspect I will encounter these questions again in the future, since I work with financial/economic data on a day-to-day basis. Thanks, noetsi.

#### noetsi

##### Fortran must die
You are welcome. I read about it all the time; thankfully, I rarely have to do it. At least in economics you have theory to build on. In most disciplines there is not, from my reading anyhow, anything like the level of theory that suggests whether logging, for example, is useful.