- Thread starter MSem

Follow-up: does your knowledge of the subject matter support a non-linear relationship? Could transforming the dependent variable help as well?

You can likely use the linear-term model, but I would also present the residual plots for both models as supplementary items so the reader/audience can judge for themselves. The plots would also show readers where the linear model over- or under-predicts the dependent variable.
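As a rough sketch of that comparison, the snippet below fits a linear-term model and a model with an added squared term to simulated data, then summarises each set of residuals by bins of the predictor. The data, the coefficients, and the helper names (`ols_residuals`, `binned_mean_residuals`) are all made up for illustration. The binned means are just a numeric stand-in for eyeballing a residual plot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data (hypothetical): y depends on x with mild curvature.
n = 1000
x = rng.uniform(0, 10, n)
y = 2.0 + 1.5 * x - 0.1 * x**2 + rng.normal(0, 1, n)

def ols_residuals(X, y):
    """Fit least squares and return the residuals y - X @ beta."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

ones = np.ones(n)
res_linear = ols_residuals(np.column_stack([ones, x]), y)
res_quad = ols_residuals(np.column_stack([ones, x, x**2]), y)

def binned_mean_residuals(x, res, n_bins=5):
    """Mean residual within equal-width bins of x (a numeric residual plot)."""
    edges = np.linspace(x.min(), x.max(), n_bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, n_bins - 1)
    return np.array([res[idx == b].mean() for b in range(n_bins)])

lin_bins = binned_mean_residuals(x, res_linear)
quad_bins = binned_mean_residuals(x, res_quad)
print("linear   :", np.round(lin_bins, 2))
print("quadratic:", np.round(quad_bins, 2))
```

Bins where the linear model's mean residual is clearly negative are regions where it over-predicts, and clearly positive bins are where it under-predicts. If the quadratic model's binned means sit near zero everywhere, the curvature term has absorbed the pattern.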

P.S. Feel free to upload those images so we can provide feedback. P.P.S. What sample size are we talking about here? I could see sampling variability coming into play for small samples; if the sample is very large, you would expect the true relationship to show itself more clearly.

My sample size is large (more than 1000 observations). You can see the plots of my multiple regression model's residuals against the independent continuous variables. I clearly see a pattern in the first one, but the rest seem less obvious...

Do you mean taking the log of the dependent variable? No, I haven't.

Do you have any categorical variables that may be important to add to the model that generated these residuals? If so, try generating these plots (separately) for each categorical variable using the variable as a plotting symbol. In other words, if you have an independent variable with 2 levels, A and B, create these plots but use a different symbol for A cases and B cases. Maybe you have group A with the upward slope and group B with the downward sloping part of the residual plots.
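To make the plotting-symbol idea concrete, here is a minimal sketch on simulated data where group A slopes up and group B slopes down. The data, the pooled model, and the helper names are assumptions for illustration only. The plotting helper uses matplotlib and is only defined here, not called; the numeric check runs with numpy alone.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: group A slopes up in x, group B slopes down.
n = 1000
x = rng.uniform(0, 10, n)
group = rng.choice(["A", "B"], size=n)
slope = np.where(group == "A", 1.0, -1.0)
y = 5.0 + slope * x + rng.normal(0, 1, n)

# Pooled model that ignores the grouping variable: y ~ 1 + x.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

def residual_slope_by_group(x, resid, group):
    """Slope of residuals on x within each level of the grouping variable."""
    return {str(g): np.polyfit(x[group == g], resid[group == g], 1)[0]
            for g in np.unique(group)}

slopes = residual_slope_by_group(x, resid, group)
print(slopes)

def plot_residuals_by_group(x, resid, group, markers=("o", "x")):
    """Scatter residuals against x with one marker per group (needs matplotlib)."""
    import matplotlib.pyplot as plt
    fig, ax = plt.subplots()
    for g, m in zip(np.unique(group), markers):
        mask = group == g
        ax.scatter(x[mask], resid[mask], marker=m, s=10, label=f"group {g}")
    ax.axhline(0, color="gray", linewidth=1)
    ax.set_xlabel("x")
    ax.set_ylabel("residual")
    ax.legend()
    return fig
```

Calling `plot_residuals_by_group(x, resid, group)` would draw the two groups with different markers. Opposite-signed residual slopes per group, as in this simulation, are the kind of pattern that points to a missing group term or interaction.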

Also, can you better describe your variables and how they are measured?

If plotting with different symbols for the categorical variables doesn't show much and theory says the relationship is linear, then you may want to "ignore" the first two plots. You can try adding a curvature term, refitting the model, saving the new residuals, and plotting them to see how much the plot changes. Look at the main quantities of interest in the model and see whether the conclusions change much. If not, it's probably okay to pick the linear model; the assumption in that case can be considered "reasonably" satisfied. If there is a big change, then maybe you can consider something else.
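A small sketch of that refit-and-compare check, on made-up data: `x1` has a curved effect, `x2` plays the role of the predictor of main interest, and the coefficient on `x2` is compared with and without a squared `x1` term. The variable names, coefficients, and the `ols` helper are assumptions, not anything from the actual model in this thread.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: x1 has a curved effect, x2 is the predictor of main interest.
n = 1000
x1 = rng.uniform(0, 10, n)
x2 = rng.normal(0, 1, n)
y = 1.0 + 1.2 * x1 - 0.08 * x1**2 + 0.5 * x2 + rng.normal(0, 1, n)

def ols(X, y):
    """Return coefficients and their conventional standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

ones = np.ones(n)
b_lin, se_lin = ols(np.column_stack([ones, x1, x2]), y)
b_cur, se_cur = ols(np.column_stack([ones, x1, x1**2, x2]), y)

# Coefficient of interest (x2) under each specification.
est_lin, est_cur = b_lin[2], b_cur[3]
print(f"x2 linear model:    {est_lin:.3f} (SE {se_lin[2]:.3f})")
print(f"x2 curvature model: {est_cur:.3f} (SE {se_cur[3]:.3f})")
```

If the estimate and its standard error barely move, as they should here because `x2` is simulated independently of `x1`, the curvature is unlikely to matter for the conclusion about `x2`. With correlated predictors the shift could be larger, which is exactly what the comparison is meant to reveal.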

The third plot may also be explained by two different groups, but it would still leave you with an unequal error variance. Again, try to fix the problem, and reexamine the model diagnostics and conclusions (from whatever tests you did) after the fix. If the changes are immaterial, there may be a violation of the strict assumption, but it may be satisfied in a looser sense.
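The post does not say which fix to use, so the following is only one possibility, sketched on simulated data: estimate the residual spread within each group, then refit by weighted least squares with weights inversely proportional to the group variance. The two-group setup, the numbers, and the choice of weighted least squares are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical data: two groups share the same line but differ in error spread.
n = 1000
x = rng.uniform(0, 10, n)
g = rng.integers(0, 2, n)                 # group label 0 or 1
sd = np.where(g == 0, 1.0, 3.0)           # group 1 is three times noisier
y = 2.0 + 0.7 * x + rng.normal(0, 1, n) * sd

X = np.column_stack([np.ones(n), x])
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_ols

# Residual standard deviation within each group.
sd_hat = np.array([resid[g == k].std(ddof=1) for k in (0, 1)])
print("residual SD by group:", np.round(sd_hat, 2))

# One possible fix (an assumption): weighted least squares with weights
# 1 / variance_g, implemented by scaling each row by 1 / sd_g before fitting.
w = 1.0 / sd_hat[g]
beta_wls, *_ = np.linalg.lstsq(X * w[:, None], y * w, rcond=None)
print("OLS slope:", round(beta_ols[1], 3), "WLS slope:", round(beta_wls[1], 3))
```

If the slope is essentially the same under both fits, the unequal variance is probably immaterial for the point estimate, though the standard errors and tests may still deserve a second look.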

You can also run this as a rank regression and see how your qualitative conclusions change. Again, if the conclusions are largely the same, it suggests the violations are not so bad in practical terms.
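"Rank regression" can mean more than one procedure. The sketch below takes one common reading, which is an assumption here: replace the response and the continuous predictors with their ranks and rerun ordinary least squares. The simulated data, the heavy-tailed errors, and the helper names are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical data with heavy-tailed errors.
n = 1000
x1 = rng.uniform(0, 10, n)
x2 = rng.normal(0, 1, n)            # no true effect on y
y = 1.0 + 0.8 * x1 + rng.standard_t(df=3, size=n)

def ranks(v):
    """Ranks 1..n; assumes no ties, which holds for continuous simulated data."""
    r = np.empty(len(v))
    r[np.argsort(v)] = np.arange(1, len(v) + 1)
    return r

def t_stats(X, y):
    """OLS t statistics (coefficient / conventional standard error)."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta / se

ones = np.ones(n)
t_raw = t_stats(np.column_stack([ones, x1, x2]), y)
t_rank = t_stats(np.column_stack([ones, ranks(x1), ranks(x2)]), ranks(y))

print("raw  t (x1, x2):", np.round(t_raw[1:], 2))
print("rank t (x1, x2):", np.round(t_rank[1:], 2))
```

The coefficients are on different scales, so compare the qualitative picture instead: which predictors have clearly non-zero effects, and in which direction. Real data with ties would need average ranks, which this simple `ranks` helper does not handle.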

Thank you very much! I'll try including an interaction and see how it looks graphically. Indeed, it could be the reason.