How to test the significance of two machine learning models with one dataset

#1
Hi all, I am conducting research to show that a neural network is more accurate than logistic regression, and then to combine it with a genetic algorithm (GA).

So far I have the accuracy rate, AUC, R2 and RMSE for both the neural network and logistic regression.
The neural network scores higher, but only by about 4%, so I want to know which statistical test to use to check whether that difference is significant on the same dataset.


Then I will apply the GA, which produces a new model with a reduced set of attributes from the original dataset, and rerun the neural network and logistic regression on it. How do I test the significance for both methods before and after the GA?


In sum: with ONE dataset and both the neural network and logistic regression, how do I test the significance of the difference between them?

With TWO datasets, before the genetic algorithm (full dataset) and after the GA (reduced dataset), both with the neural network and logistic regression, how do I test the significance?
 

hlsmith

Less is more. Stay pure. Stay poor.
#2
Never heard of GA offhand, but I will probably hear about it constantly now. Could you run each model with the bootstrap, say 100 samples and 100 fitted models each, then conduct a basic two-sample t-test and/or draw boxplots to contrast them?
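A rough sketch of the bootstrap idea, in Python rather than RapidMiner. It is a paired variation on the suggestion, not the two-sample t-test itself: both models are scored on the same resampled rows and the spread of the accuracy difference is summarized with a percentile interval. The function names and the 0/1 "was this row predicted correctly" vectors are made up for illustration; real vectors would come from each model's predictions on the same held-out rows.

```python
import random


def bootstrap_accuracy_diff(correct_a, correct_b, n_boot=100, seed=0):
    """Resample rows with replacement; return accuracy differences (A - B)."""
    if len(correct_a) != len(correct_b):
        raise ValueError("both models must be scored on the same rows")
    rng = random.Random(seed)
    n = len(correct_a)
    diffs = []
    for _ in range(n_boot):
        # The same resampled row indices are used for both models (paired).
        idx = [rng.randrange(n) for _ in range(n)]
        acc_a = sum(correct_a[i] for i in idx) / n
        acc_b = sum(correct_b[i] for i in idx) / n
        diffs.append(acc_a - acc_b)
    return diffs


def percentile_interval(values, level=0.95):
    """Simple percentile interval from the sorted bootstrap values."""
    s = sorted(values)
    lo = int((1 - level) / 2 * len(s))
    hi = int((1 + level) / 2 * len(s)) - 1
    return s[lo], s[hi]


# Hypothetical data: 1 = row classified correctly, 0 = misclassified.
correct_nn = [1] * 85 + [0] * 15
correct_lr = [1] * 80 + [0] * 20

diffs = bootstrap_accuracy_diff(correct_nn, correct_lr, n_boot=100)
low, high = percentile_interval(diffs)
print("95% interval for accuracy difference:", low, high)
```

If the interval excludes zero, that is some evidence the difference is not just resampling noise. The list `diffs` can also be fed straight into a boxplot.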


Could you take the parameters from the neural model, insert them into the logistic model, and then compare parameters based on two logistic models?


Both of these are just random suggestions!
 
#3

First of all, thanks for the reply.
I am doing cross-validation for both models, each with its own parameters, since neural network parameters cannot be used in logistic regression.

I tried a paired t-test for the two models, which I guess may be right, but do I do that only for the accuracy, or for each model's full set of predicted outcomes?
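For reference, a minimal sketch of a paired comparison on per-fold scores, written in Python rather than RapidMiner. It computes the paired t statistic and, because the standard library has no t distribution, gets a p-value from an exact sign-flip permutation test instead. The fold accuracies are invented for illustration, and the same functions would accept per-fold AUC or RMSE. One caveat: cross-validation folds share training data, so the fold scores are not fully independent and the p-value should be read as approximate.

```python
import itertools
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic computed on the per-fold differences (A - B)."""
    d = [a - b for a, b in zip(scores_a, scores_b)]
    return statistics.mean(d) / (statistics.stdev(d) / math.sqrt(len(d)))


def sign_flip_p_value(scores_a, scores_b):
    """Exact two-sided sign-flip permutation p-value for paired scores."""
    d = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(d))
    hits = 0
    total = 0
    # Enumerate every +/- assignment of the differences (2**k of them).
    for signs in itertools.product((1, -1), repeat=len(d)):
        total += 1
        if abs(sum(s * x for s, x in zip(signs, d))) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical 10-fold accuracies, same folds for both models.
acc_nn = [0.86, 0.84, 0.88, 0.85, 0.87, 0.83, 0.86, 0.88, 0.84, 0.85]
acc_lr = [0.82, 0.81, 0.83, 0.82, 0.84, 0.80, 0.81, 0.85, 0.80, 0.82]

t_stat = paired_t_statistic(acc_nn, acc_lr)
p_value = sign_flip_p_value(acc_nn, acc_lr)
print("t =", t_stat, "p =", p_value)
```

The pairing only makes sense if both models are evaluated on exactly the same fold splits.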

Also, what test should I use to compare the original dataset with the reduced dataset, using the same model, so as to test the significance of the treatment (the genetic algorithm)?
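One candidate for this before/after comparison is McNemar's test, which works on the predicted outcomes rather than on a summary metric. It looks only at the rows where the two versions of the model disagree. Below is a minimal Python sketch of the exact (binomial) version, assuming the model trained on the full attributes and the model trained on the GA-reduced attributes are both scored on the same held-out rows. The function name and the 0/1 correctness vectors are hypothetical.

```python
from math import comb


def mcnemar_exact(correct_full, correct_reduced):
    """Exact two-sided McNemar test on paired 0/1 correctness vectors.

    Returns (b, c, p) where b counts rows only the full-data model got
    right and c counts rows only the reduced-data model got right.
    """
    pairs = list(zip(correct_full, correct_reduced))
    b = sum(1 for f, r in pairs if f and not r)
    c = sum(1 for f, r in pairs if r and not f)
    n = b + c
    if n == 0:
        return b, c, 1.0  # the two models never disagree
    k = min(b, c)
    # Under H0 each disagreement is equally likely to favor either model.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical results on 100 held-out rows (1 = correct, 0 = wrong).
full = [1] * 70 + [1] * 2 + [0] * 8 + [0] * 20
reduced = [1] * 70 + [0] * 2 + [1] * 8 + [0] * 20

b, c, p = mcnemar_exact(full, reduced)
print("only full correct:", b, "only reduced correct:", c, "p =", p)
```

The same function can compare the neural network against logistic regression on one dataset, since that is also a paired comparison on the same rows.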

Is it enough just to compare the accuracy rate, AUC, R2 and RMSE?
By the way, I am using RapidMiner.
 