Calculate Standard Error over averaged responses or raw responses?

#1
Hi,

I would like to calculate the standard error of my dependent variable. However, I have psycholinguistic data with multiple participants (ppts) and multiple items. Thus, there are two ways of doing this: either by calculating the SE over averaged responses (Example 1) or by calculating the SE over raw responses (Example 2).

1. Average over items by participant (as in Table 1). Then calculate mean and SE from this table.
Table 1.
1537366371896.png
...which gives mean = 62.67 and SE = 16.63.

2. List each observation on its own row (as opposed to one averaged observation per ppt), as in Table 2.
Table 2:
1537366407105.png

Because N is greater in Example 2, the SE will be different (mean = 62.67; SE = 17.3). The example I give is very simple, but the fact that the two methods give different SEs matters, especially if I were using the SE to compare means in two different conditions. My question is: which method is best for calculating the standard error?
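To make the two methods concrete, here is a minimal sketch in Python. The rows are hypothetical values chosen for illustration, not the data in Tables 1 and 2, and the helper name standard_error is my own. It assumes the SE is the sample standard deviation (n - 1 denominator) divided by the square root of n.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Hypothetical (participant, item, response) rows; NOT the data from Tables 1 and 2.
rows = [
    ("PPT1", "item1", 40), ("PPT1", "item2", 50),
    ("PPT2", "item1", 60), ("PPT2", "item2", 70),
    ("PPT3", "item1", 80), ("PPT3", "item2", 90),
]


def standard_error(values):
    """Sample SD (n - 1 denominator) divided by sqrt(n)."""
    return stdev(values) / sqrt(len(values))


# Example 1: average over items within each participant,
# then compute the SE over those participant means (n = number of participants).
by_ppt = defaultdict(list)
for ppt, _item, response in rows:
    by_ppt[ppt].append(response)
ppt_means = [mean(responses) for responses in by_ppt.values()]
se_by_participant = standard_error(ppt_means)

# Example 2: compute the SE over every raw observation (n = number of rows).
raw = [response for _ppt, _item, response in rows]
se_raw = standard_error(raw)

print("by participant:", mean(ppt_means), se_by_participant)
print("raw responses: ", mean(raw), se_raw)
```

With a balanced design like this one (every participant sees the same number of items), the two means agree while the two SEs do not.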

Many thanks! Although this is a simple question, I really appreciate any insights.
Ryan
 

#2
Hi,

First, if this is sample data, you need to calculate the sample standard deviation using (n-1) instead of n.

What is PPT? Anyway, if you want to calculate the standard deviation of a dependent variable, you need to use all the data (Table 2).
1537441918726.png
Mean: 62.66667, S: 18.371173
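As a small illustration of the (n-1) versus n point, here is a sketch using Python's standard library. The values are illustrative only and are not the data from the tables.

```python
from statistics import pstdev, stdev

# Illustrative values only; not the data from Table 1 or Table 2.
sample = [33, 31, 35, 33, 31, 35]

sd_sample = stdev(sample)       # sample SD: divides by n - 1
sd_population = pstdev(sample)  # population SD: divides by n

print(sd_sample, sd_population)
```

The sample version is always the larger of the two, and the gap shrinks as n grows.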
 
#3
Thank you for your response, it makes sense! PPT refers to participant (i.e., PPT 1 is data observed from the first participant, whereas PPT 2 is data from the second participant, and so on). I expected this was the answer, but it is interesting because a lot of standard textbooks don't discuss this issue: they simply present summary data with one averaged response per participant (i.e., they have already averaged over items) and calculate the SE from there, as in Example 1 above. I understand that would be the wrong approach and that the approach used in Example 2 is best practice. It would be good if such textbooks provided a footnote to explain this.

Many thanks,
Ryan
 
#6
OK, so it's Example 1? Or is there even a 'correct' choice? Example 2 might be more appropriate because it doesn't throw any data away, i.e., it calculates the variance of every observation from the average. On the other hand, if we aren't allowed to use multiple responses from the same participant, then Example 1 would be more appropriate.
 
#7
Okay, now I understand your question ...

Let's look at the two extremes:

Example A. If this is a repeated sample (the same PPT measured on the same item several times):
PPT1, condition A, item1, 33
PPT1, condition A, item1, 31
PPT1, condition A, item1, 35

In this case, it makes sense to take the averages first and then calculate the standard deviation of the averages (Table 1).

Example B. If this were a different random sample from the population, with different PPTs, it would make more sense to use Table 2.

PPT1, condition A, item1, 33
PPT2, condition A, item2, 31
PPT3, condition A, item3, 35
PPT4, condition A, item4, 33
PPT5, condition A, item5, 31
PPT6, condition A, item6, 35

In your example, I assume both options are statistically correct.
Now the question is: what do you want to show?

If you expect a similar result for any item and condition, and the variance is mainly due to the person, it is more like Example A.
Then the standard deviation will be by PPT.

If you expect a different result for each item or condition, it is more like Example B.
 
#8
I updated my answer; is it clear now?
Table 1 doesn't throw any data away; it just uses it in a different way.

Both options are "correct"; the question is what you want to show.

Another simple example, using test results:

subject,#students, mark
history,1000,80
Math,10,90

The average over all tests taken at the school is (1000*80+10*90)/(1000+10)=80.099
The average of the subject marks is (80+90)/2=85

Both averages are correct; it is only a question of what you want to show.
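The two averages can be checked with a short Python sketch. The variable names are my own; the numbers are the ones from the example above. The first average weights each subject by its number of students (dividing the weighted sum by the total number of students), and the second treats each subject equally.

```python
# (subject, number of students, mean mark), taken from the example above.
subjects = [("history", 1000, 80), ("Math", 10, 90)]

total_students = sum(n for _subject, n, _mark in subjects)

# Average over every test taken: each subject is weighted by its student count.
per_test_average = sum(n * mark for _subject, n, mark in subjects) / total_students

# Average of the subject marks: each subject counts once.
per_subject_average = sum(mark for _subject, _n, mark in subjects) / len(subjects)

print(round(per_test_average, 3))  # 80.099
print(per_subject_average)         # 85.0
```

This mirrors the SE question: averaging per participant first gives every participant equal weight, while using raw rows gives every observation equal weight.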
 