
I have not seen a good description of potential criteria for selecting between candidate models beyond the lowest error and, say, parsimony. However, I feel that this approach does not fully address the dispersion around the mean errors, unless you are able to run a larger number of folds. Perhaps you could compare the lower bounds of the 95% confidence intervals?
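A minimal sketch of that comparison, assuming you already have per-fold error values for each candidate model (the fold errors and the normal-approximation CI here are illustrative assumptions, not a prescribed method):

```python
import numpy as np

def cv_summary(fold_errors, z=1.96):
    """Mean CV error with an approximate 95% CI (normal approximation).

    With few folds, a t-based interval would be more defensible; z=1.96
    is used here only to keep the sketch dependency-free."""
    e = np.asarray(fold_errors, dtype=float)
    mean = e.mean()
    se = e.std(ddof=1) / np.sqrt(len(e))  # standard error of the fold mean
    return mean, mean - z * se, mean + z * se

# Hypothetical 10-fold RMSE values for two candidate models:
# model B has a slightly lower mean but much higher spread across folds.
model_a = [3.1, 2.9, 3.3, 3.0, 3.2, 2.8, 3.1, 3.0, 3.2, 2.9]
model_b = [2.7, 3.4, 2.5, 3.6, 2.6, 3.5, 2.8, 3.3, 2.6, 3.4]

for name, errs in [("A", model_a), ("B", model_b)]:
    mean, lo, hi = cv_summary(errs)
    print(f"model {name}: mean={mean:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

If the intervals overlap heavily, the "lowest mean error" winner may not be meaningfully better, which is where parsimony or context can break the tie.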

To me the whole thing just seems like another subjective/contextual statistical decision.

I'm not sure what KS you're referring to.

There are typically two CV charts you might want to explore: training vs. test error across an increasing number of parameters (predictors), and training vs. test error across sample size (as n increases). These will help you understand the bias vs. variance trade-off, and whether more data might improve your model fit (or when it won't help) vs. when you may need a more complex model with more parameters (e.g., a larger k in k-means).
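Both charts are straightforward to produce with scikit-learn's `learning_curve` and `validation_curve`. This is just a sketch on synthetic data, with `Ridge` and its `alpha` parameter standing in as an arbitrary example of a complexity knob (the original poster's actual model and tuning parameter would replace these):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import learning_curve, validation_curve

# Synthetic regression problem purely for illustration
X, y = make_regression(n_samples=200, n_features=10, noise=10.0, random_state=0)

# Chart 1: training vs. test error as n grows (learning curve)
sizes, train_scores, test_scores = learning_curve(
    Ridge(alpha=1.0), X, y, cv=5,
    train_sizes=np.linspace(0.2, 1.0, 5),
    scoring="neg_mean_squared_error",
)
for n, tr, te in zip(sizes, -train_scores.mean(axis=1), -test_scores.mean(axis=1)):
    print(f"n={n:4d}  train MSE={tr:8.1f}  test MSE={te:8.1f}")

# Chart 2: training vs. test error across model complexity
# (here, Ridge regularization strength alpha as the "complexity" axis)
alphas = np.logspace(-3, 3, 7)
train_c, test_c = validation_curve(
    Ridge(), X, y, param_name="alpha", param_range=alphas, cv=5,
    scoring="neg_mean_squared_error",
)
for a, tr, te in zip(alphas, -train_c.mean(axis=1), -test_c.mean(axis=1)):
    print(f"alpha={a:8.3f}  train MSE={tr:8.1f}  test MSE={te:8.1f}")
```

A large, persistent gap between the training and test curves suggests variance (more data may help); two curves that converge at a high error suggest bias (a more complex model may help).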