# Computing an average for a data set that contains NULL values

#### Dnelson

##### New Member
Hi all!
We are working on some software that shows the mean for a set of students. Currently, the calculation is being done in a manner that includes the students that haven't yet been graded. I have questioned the logic that our developers used and they have asked me to validate the methodology that should be used.

For example, we have 5 students and only 1 has been graded. On a 1-5 scale, the student received a 5.

The current average is calculated as the sum of the scores (5) divided by the number of students (5), so 5/5 = 1.

My understanding is that this should be the sum of the scores (5) divided by the number of scores (1), so 5/1 = 5.
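
To make the two calculations concrete, here is a minimal Python sketch. It assumes ungraded students are stored as `None`; the variable names are hypothetical and not taken from our actual software.

```python
# Hypothetical data: five students, only one graded so far.
# None marks a student who has not yet been graded.
scores = [5, None, None, None, None]

# Keep only the scores that actually exist.
graded = [s for s in scores if s is not None]

# Current approach: divide by the number of students.
mean_all_students = sum(graded) / len(scores)

# Proposed approach: divide by the number of recorded scores.
# If nobody has been graded yet, there is no mean to report.
mean_graded_only = sum(graded) / len(graded) if graded else None

print(mean_all_students)  # 1.0
print(mean_graded_only)   # 5.0
```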

Is there a "right" way to handle this? If so, can anyone point me to an article that explains this so that I can talk further with our development team?

Thanks!
Dan

#### Dnelson

##### New Member
Does anyone have an answer to this? It would be greatly appreciated...

#### noetsi

##### Fortran must die
The answer depends on how you define your research question. A mean is the sum of the scores divided by the number of units that contributed a score. If your research question is the mean of those who have been graded, then you would add up the scores of those graded (not those who took the test) and divide by the number graded. I can't think of any research question that would be validly addressed by adding up the scores of those graded and dividing by the number who took the test.

Essentially, if you do what you described above, you are assigning zeros to people who took the test but were not graded. That does not seem to get at what you want to know (although of course I don't know what your research question is). Many of these students may well score above zero when they are graded.
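
As a rough illustration of that point, the Python sketch below checks that dividing by the number of students gives the same result as replacing every missing score with zero. It assumes missing scores are stored as `None` and uses the numbers from the example in the question; the variable names are made up for illustration.

```python
from statistics import mean

# The example from the question: five students, one graded with a 5.
# None marks a student who has not been graded (an assumption about storage).
scores = [5, None, None, None, None]

graded = [s for s in scores if s is not None]

# Explicitly treat every ungraded student as a zero.
zero_filled = [0 if s is None else s for s in scores]

# Dividing the graded total by the number of students...
divide_by_everyone = sum(graded) / len(scores)

# ...is the same as averaging the zero-filled scores.
assert divide_by_everyone == mean(zero_filled)

# The mean of the students who were actually graded is a different number.
print(divide_by_everyone, mean(graded))
```

If the question is about the students graded so far, the second number is the one to report, ideally alongside how many students it is based on.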