Measuring Neural Network Performance

#1
Hi all,

I have a set of matrices in which the last column is the RESPONSE VECTOR.

I have trained a backprop NN (neural network) to perform a nonlinear regression on this set of matrices.

Thus, in the end, the NN will spit out the PREDICTED RESPONSE VECTOR.

I then ran a one-way ANOVA to obtain the F-ratio and p-value using the ANOVA1 function in MATLAB. The input was a matrix consisting of both the RESPONSE VECTOR and the PREDICTED RESPONSE VECTOR, concatenated column-wise.

In this case, would the F-ratio and p-value be valid in determining whether the yielded NN model is good or not?

Is using ANOVA1 in any way valid in this case?

Thank you very much.
 

DAV

New Member
#2
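You essentially want a goodness-of-fit measurement. R^2 is frequently used. My personal opinion is that any p-value you get from that ANOVA will be useless: a one-way ANOVA on the two columns only tests whether their means differ, not whether each prediction is close to its response.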

http://en.wikipedia.org/wiki/Coefficient_of_determination
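As a rough illustration of the R^2 measure linked above, here is a minimal Python sketch that computes it as 1 - SS_res / SS_tot. The function name `r_squared` and the sample numbers are made up for the example; they are not from the original poster's data.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: how far each prediction is from its response.
    ss_res = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    # Total sum of squares: spread of the responses around their own mean.
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Hypothetical data: a response column and two candidate prediction columns.
response = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]  # tracks the response closely
shuffled = [5.0, 1.0, 4.0, 2.0, 3.0]   # same values as response, wrong order

print(r_squared(response, predicted))
print(r_squared(response, shuffled))
```

The `shuffled` column has exactly the same mean as `response`, so a comparison of column means cannot tell it apart from a perfect prediction, while R^2 for it comes out negative.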

Presumably, you are trying to measure the model's predictive usefulness. Before jumping to a particular measurement you might want to consider your methodology.

You didn't say what is being used for testing the NN, although it sounds like it is the original training data. Testing the fit of the model to the training data is often hopelessly optimistic. The accepted method is to gather fresh data when possible and evaluate the performance of the model against that, or to use some holdout method. These range from a training-validation-test split (i.e., three disjoint data sets) to a comprehensive hold-one-out scheme averaged over all items. As most NNs are costly to train, the training-validation-test method would likely be the most convenient.
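A training-validation-test split can be sketched in a few lines of Python. This is only an illustration: the function name, the 60/20/20 proportions, and the toy rows are assumptions, not something prescribed in this thread.

```python
import random


def train_validation_test_split(rows, val_fraction=0.2, test_fraction=0.2, seed=0):
    """Shuffle the rows and cut them into three disjoint sets."""
    if not 0 < val_fraction + test_fraction < 1:
        raise ValueError("validation + test fractions must leave room for training")
    shuffled = list(rows)
    # Fixed seed so the split is reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_val = int(len(shuffled) * val_fraction)
    test = shuffled[:n_test]
    validation = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, validation, test


# Hypothetical data matrix: each row is (input, response).
rows = [(x, 2.0 * x + 1.0) for x in range(20)]
train, validation, test = train_validation_test_split(rows)
print(len(train), len(validation), len(test))
```

The usual arrangement is to fit the network on `train`, use `validation` for choices such as when to stop training, and score the final model once on `test`.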

I can tell you from experience, though, that the holdout method can be just as optimistic but, still, it is one of the better approaches.

This applies to any model -- not just NNs. Testing models against their training data seems to be a rampant practice in the soft sciences. It's as if they've never heard the term "over-fit".

You might want to look at the following book. It's mostly in English which makes it a relatively easy read and it covers things like model testing. There is also an interesting anecdote about a model that learned a very extraneous feature very well and when the feature was removed the model performance dropped dramatically. Something to keep in mind.

http://www.amazon.com/Practical-Neural-Network-Recipes-C/dp/0124790402

There are similar books as well.
 

maryo

New Member
#3
hello DAV,

Your post is really useful.

I too have trouble finding a practical method to test and validate a neural net model; I didn't know how to go about it.

Are there any free books like the one on amazon.com? I can't buy books, so I need free ones.

thanks
 

DAV

New Member
#4
The best discussions would be in a textbook. Offhand, I'm unaware of any free ones. You could try searching the subject on the web but most of the relevant pages may be too terse to help.

The basic idea is to use fresh data when possible or some hold-out method. Hold-out means keeping some part of the data out of the training process. The hold-out methods give an idea of the performance of the model when trained with all of the data (theoretically, anyway). The idea is to predict a test set (i.e., the held-back data) in some way. For nominal result variables, a confusion/contingency matrix is often used; for continuous or ordered result variables, regression testing techniques are common.

Basic algorithm: several iterations are made, each generating a <training, holdout> dataset pair; a model is built from the training dataset and evaluated (i.e., some measure value is determined) against the holdout set; finally, all holdout set measurements are averaged to give some 'general population' performance.
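The basic algorithm above might look roughly like this in Python. To keep the sketch short and runnable, a least-squares straight line stands in for the neural network and mean squared error is the measure; both are assumptions for illustration, as are all the function names and the synthetic data.

```python
import random


def fit_line(pairs):
    """Least-squares line y = a + b*x; a cheap stand-in for training the NN."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def mean_squared_error(model, pairs):
    """Average squared prediction error of the model on the given pairs."""
    intercept, slope = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in pairs) / len(pairs)


def repeated_holdout(pairs, iterations=20, holdout_fraction=0.25, seed=0):
    """Average the holdout measure over several random <training, holdout> pairs."""
    rng = random.Random(seed)
    n_holdout = max(1, int(len(pairs) * holdout_fraction))
    scores = []
    for _ in range(iterations):
        shuffled = list(pairs)
        rng.shuffle(shuffled)
        holdout, training = shuffled[:n_holdout], shuffled[n_holdout:]
        model = fit_line(training)                         # build from training only
        scores.append(mean_squared_error(model, holdout))  # evaluate on held-back data
    return sum(scores) / len(scores)


# Hypothetical noisy data: y = 3x + 2 plus Gaussian noise.
noise = random.Random(1)
data = [(float(x), 3.0 * x + 2.0 + noise.gauss(0.0, 1.0)) for x in range(40)]
print(repeated_holdout(data))
```

The model never sees the holdout items during fitting, which is the whole point; swapping in a real network would only change `fit_line` and the prediction inside `mean_squared_error`.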

The ultimate is to iterate through all of the data, holding out one item each time and using the remaining N-1 items for training. This is usually called leave-one-out cross-validation; it is closely related to the jackknife (and, more loosely, to the bootstrap), so those terms might help your search. Note: as always, if your dataset is not representative of the general population, you will get biased performance estimates.
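Holding out one item at a time can be sketched as below. The `fit` and `predict` arguments are hypothetical hooks for whatever model is being tested; the demonstration uses a deliberately trivial model that predicts the mean training response, purely so the example runs without a neural network library.

```python
def leave_one_out(rows, fit, predict):
    """Hold out each row once, train on the other N-1, return the mean squared error."""
    squared_errors = []
    for i, (inputs, target) in enumerate(rows):
        training = rows[:i] + rows[i + 1:]  # every row except the i-th
        model = fit(training)
        squared_errors.append((target - predict(model, inputs)) ** 2)
    return sum(squared_errors) / len(squared_errors)


# Trivial stand-in model (an assumption for the example): always predict
# the mean of the training responses, ignoring the inputs.
def fit_mean(training):
    return sum(target for _, target in training) / len(training)


def predict_mean(model, inputs):
    return model


# Hypothetical (input, response) rows.
rows = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 6.0)]
print(leave_one_out(rows, fit_mean, predict_mean))
```

Note that this trains N models, one per item, which is why it tends to be impractical for networks that are costly to train.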