### Outliers Parametric model

To conclude, I tried numerous transformations, and the one that worked best was the Box-Cox approach. See https://www.kaggle.com/rtatman/data-cleaning-challenge-scale-and-normalize-data for an example implementation. I'd like to thank everyone for their feedback.
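For anyone landing on this thread later, here is a minimal sketch of what a Box-Cox transform can look like with SciPy. It is not the code from the linked page. The `funding` array is synthetic stand-in data (not the Kickstarter file), and the `+ 1` shift is an assumption made here because `scipy.stats.boxcox` needs strictly positive input while some campaigns raised $0.

```python
import numpy as np
from scipy import stats
from scipy.special import inv_boxcox

rng = np.random.default_rng(0)

# Synthetic stand-in for the funding column (NOT the real data):
# roughly 15% of campaigns raise $0-$10, the rest are heavily right-skewed.
funding = np.concatenate([
    rng.uniform(0, 10, size=150),
    rng.lognormal(mean=8, sigma=1.5, size=850),
])

# Box-Cox requires strictly positive values, so shift by 1 (an assumption).
shifted = funding + 1.0

# With no lambda supplied, boxcox estimates it by maximum likelihood
# and returns the transformed data together with that lambda.
transformed, lam = stats.boxcox(shifted)

skew_before = stats.skew(funding)
skew_after = stats.skew(transformed)
print(f"lambda = {lam:.3f}")
print(f"skewness before = {skew_before:.2f}, after = {skew_after:.2f}")

# Predictions made on the transformed scale can be mapped back to dollars.
recovered = inv_boxcox(transformed, lam) - 1.0
```

If the model is fitted on the transformed target, the same `lam` has to be kept around so that predictions can be passed through `inv_boxcox` and shifted back.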
I can only upload this in txt format, and it's quite a small sample. The full dataset is available as a CSV here: https://ufile.io/vgxm1 The goal is to construct a regression model that predicts campaign funding from the other input variables. Even some pointers for non-parametric...
That's a great observation, Dason. I didn't think of that. Unfortunately for me, this puts trimming out of the question. Any ideas on how to proceed? Thanks.
Hi, thanks for replying. That was my impression as well. Here is the link to the data: https://ufile.io/vgxm1 It has 4 features of crowdfunding campaigns on Kickstarter: ID, Backers, Funding, Goal (numeric). The data is raw, so I can post a Jupyter notebook tomorrow if that helps. However it should...
I'm struggling to decide how to normalise my data for modeling. I am dealing with crowdfunded projects, and a huge chunk of them (15%) raised only $0-$10 and therefore failed. Those produce a very strong positive skew that I have not been able to normalise (I tried log, z-score, and cubic transformations). And will...
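To illustrate why the attempts above fall short, here is a small sketch on synthetic data (again not the real Kickstarter file, and all variable names are made up for the example). Z-scoring is a linear rescaling, so it leaves skewness unchanged, and a monotone transform such as `log1p` keeps the same share of campaigns in the low cluster, so the cluster near zero does not go away.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic stand-in: about 15% of campaigns raise $0-$10, the rest are right-skewed.
funding = np.concatenate([
    rng.uniform(0, 10, size=150),
    rng.lognormal(mean=8, sigma=1.5, size=850),
])

# Z-scores only shift and rescale, so the shape (and skewness) is identical.
z = (funding - funding.mean()) / funding.std()
skew_raw = stats.skew(funding)
skew_z = stats.skew(z)

# log1p compresses the right tail, but it preserves ordering, so the
# fraction of observations in the $0-$10 cluster stays exactly the same.
log_funding = np.log1p(funding)
low_share_raw = np.mean(funding <= 10)
low_share_log = np.mean(log_funding <= np.log1p(10))

print(f"skewness raw = {skew_raw:.2f}, z-scored = {skew_z:.2f}")
print(f"share in low cluster: raw = {low_share_raw:.2%}, log1p = {low_share_log:.2%}")
```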