    Why do you need oversampling/undersampling?

    Assume the original data contains 1000 goods and 1 bad. I build a logistic regression and use the model to score the bad, and I get probability = 0.00001. Then I use oversampling/undersampling to increase/decrease the original data, so now I have 1000 goods and 1000 bads if I use oversampling. Then...
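
To make the setup so far concrete, here is a minimal sketch of it in plain Python. Everything specific in it is my own assumption and not part of the question: a single made-up feature `x`, goods drawn from a standard normal, the one bad placed at `x = 2.0`, an unregularized logistic regression fitted with Newton's method, and oversampling done by simply repeating the bad 1000 times. The exact scores will not match the 0.00001 quoted above; the point is only to compare the score of the same bad before and after oversampling.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(xs, ys, iterations=50):
    """Fit P(bad | x) = sigmoid(a + b*x) by Newton's method (no regularization)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        ga = gb = 0.0          # gradient of the log-likelihood
        haa = hab = hbb = 0.0  # entries of X' W X (negative Hessian)
        for x, y in zip(xs, ys):
            p = sigmoid(a + b * x)
            w = p * (1.0 - p)
            ga += y - p
            gb += (y - p) * x
            haa += w
            hab += w * x
            hbb += w * x * x
        det = haa * hbb - hab * hab
        # Newton update: solve the 2x2 system (X' W X) * step = gradient.
        a += (hbb * ga - hab * gb) / det
        b += (haa * gb - hab * ga) / det
    return a, b


# Hypothetical data: 1000 goods (label 0) and 1 bad (label 1).
rng = random.Random(0)
good_x = [rng.gauss(0.0, 1.0) for _ in range(1000)]
bad_x = 2.0  # assumed feature value of the single bad

# Original, imbalanced data: 1000 goods, 1 bad.
xs_orig = good_x + [bad_x]
ys_orig = [0] * 1000 + [1]
a0, b0 = fit_logistic(xs_orig, ys_orig)
p_before = sigmoid(a0 + b0 * bad_x)

# Oversampled data: the same bad repeated 1000 times, so 1000 goods, 1000 bads.
xs_over = good_x + [bad_x] * 1000
ys_over = [0] * 1000 + [1] * 1000
a1, b1 = fit_logistic(xs_over, ys_over)
p_after = sigmoid(a1 + b1 * bad_x)

print("score of the bad, original data:   ", p_before)
print("score of the bad, oversampled data:", p_after)
```

In this sketch the bad gets a much higher score after oversampling, but that score is a probability relative to the artificial 50/50 class mix, not relative to the original 1000-to-1 population.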