I'm writing my thesis on the following question: is ruminative thinking an independent risk factor for marijuana use, controlling for sex and depression? The study has 300 participants, of whom 60 used marijuana. To investigate this relationship I wanted to run a multiple regression analysis. However, after analyzing the data I found that the following assumptions of multiple regression are violated:

Linear relationship between the dependent variable and each of the independent variables.

Homoscedasticity

Normal distribution of residuals (errors).

One of the problems is probably the way marijuana use (the DV) is measured: "On how many days during the last 30 days did you use marijuana?" (0-30 days). Because the study included both users and non-users, only 60 of the 300 participants reported any use. Consequently, 240 participants (80%) answered 0 days, which is why the DV is far from normally distributed. Should I transform the data in some way, should I use a non-parametric test, or is there another option?
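
To make the problem concrete, here is a minimal sketch in Python using made-up data with the same structure as mine: 240 zeros and 60 non-zero values. The non-zero values are randomly generated (not my real data), and the names `days` and `share_at_mode` are only for illustration. It shows why I doubt that a transformation alone would help: a monotone transformation such as log(1 + x) or a square root still leaves 80% of the observations at a single value.

```python
import math
import random

random.seed(0)  # fixed seed so the made-up sample is reproducible

N_TOTAL = 300  # participants in the study
N_USERS = 60   # participants who reported any use

# Hypothetical days-of-use values (1..30) for the 60 users.
# Assumption: the real distribution among users is not reproduced here.
user_days = [random.randint(1, 30) for _ in range(N_USERS)]
days = [0] * (N_TOTAL - N_USERS) + user_days


def share_at_mode(values):
    """Return the fraction of observations at the single most common value."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return max(counts.values()) / len(values)


# Two common monotone transformations for skewed, non-negative data.
log_days = [math.log1p(d) for d in days]
sqrt_days = [math.sqrt(d) for d in days]

print(share_at_mode(days))       # 0.8
print(share_at_mode(log_days))   # 0.8
print(share_at_mode(sqrt_days))  # 0.8
```

Whatever the transformation, the spike of 240 identical values stays a spike, so the shape problem does not go away.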

Thank you very much for any response.

PS: I hope I have given enough information to answer the question. If not, please tell me and I will provide it.