Writing a mean loop.. (literally)

#1
So my test data looks like this:

Code:
structure(list(day = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L
), Left = c(0.25, 0.33, 0, 0, 0.25, 0.33, 0.5, 0.33, 0.5, 0), 
    Left1 = c(NA, NA, 0, 0.5, 0.25, 0.33, 0.1, 0.33, 0.5, 0), 
    Middle = c(0, 0, 0.3, 0, 0.25, 0, 0.3, 0.33, 0, 0), Right = c(0.25, 
    0.33, 0.3, 0.5, 0.25, 0.33, 0.1, 0, 0, 0.25), Right1 = c(0.5, 
    0.33, 0.3, 0, 0, 0, 0, 0, 0, 0.75), Side = structure(c(2L, 
    2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("L", "R"), class = "factor")), .Names = c("day", 
"Left", "Left1", "Middle", "Right", "Right1", "Side"), class = "data.frame", row.names = c(NA, 
-10L))

or this:

Code:
day Left Left1 Middle Right Right1 Side
   1 0.25    NA   0.00  0.25   0.50    R
   1 0.33    NA   0.00  0.33   0.33    R
   2 0.00  0.00   0.30  0.30   0.30    R
   2 0.00  0.50   0.00  0.50   0.00    R
   2 0.25  0.25   0.25  0.25   0.00    L
   3 0.33  0.33   0.00  0.33   0.00    L

I would like to write a loop to find the standard error and average value for each day on the chosen side. For example, on day 2 this value would be $Right + $Right1 for the first two entries and $Left + $Left1 for the third. The mean I want would be the average of these three numbers. Does that make sense?

Ok.. So far I have this code:

Code:
td<-read.csv('test data.csv')

IDs<-unique(td$day)  

se<-function(x) sqrt(var(x)/length(x))

for (i in 1:length (IDs)) {


day.i<-which(td$day==IDs[i])   
td.i<-td[day.i,]

if(td$Side=='L'){ 
side.i<-cbind(td.i$Left + td.i$Left1)
}else{
side.i<-cbind(td.i$Right + td.i$Right1)
}

mean(side.i)
se(side.i)

print(mean)
print(se)

}

But I am getting error messages like this

Code:
Error: unexpected '}' in "}"
Obviously, I am also not getting the print out of means for each day.. Does anyone know why?
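
For reference, here is a minimal sketch of one way the loop could be repaired. It does not diagnose the `unexpected '}'` message; it only addresses three things visible in the code above: `if()` expects a single TRUE/FALSE rather than the whole `td$Side` column, the results of `mean(side.i)` and `se(side.i)` are never stored, and `print(mean)` prints the function itself. The `result` data frame is a name introduced here for illustration, and the data are rebuilt from the dput() output instead of being read from the CSV.

```r
# Rebuild the test data from the dput() output above.
td <- data.frame(
  day    = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
  Left   = c(0.25, 0.33, 0, 0, 0.25, 0.33, 0.5, 0.33, 0.5, 0),
  Left1  = c(NA, NA, 0, 0.5, 0.25, 0.33, 0.1, 0.33, 0.5, 0),
  Middle = c(0, 0, 0.3, 0, 0.25, 0, 0.3, 0.33, 0, 0),
  Right  = c(0.25, 0.33, 0.3, 0.5, 0.25, 0.33, 0.1, 0, 0, 0.25),
  Right1 = c(0.5, 0.33, 0.3, 0, 0, 0, 0, 0, 0, 0.75),
  Side   = factor(c("R", "R", "R", "R", "L", "L", "L", "L", "L", "R"))
)

se <- function(x) sqrt(var(x) / length(x))

IDs <- unique(td$day)
# Hypothetical results table: one row per day, filled in by the loop.
result <- data.frame(day = IDs, mean = NA_real_, se = NA_real_)

for (i in seq_along(IDs)) {
  td.i <- td[td$day == IDs[i], ]
  # ifelse() is vectorised, so each row picks the columns for its own side.
  side.i <- ifelse(td.i$Side == "L",
                   td.i$Left + td.i$Left1,
                   td.i$Right + td.i$Right1)
  # Store the values; nothing is auto-printed inside a for loop.
  result$mean[i] <- mean(side.i)
  result$se[i]   <- se(side.i)
}

print(result)
```

Collecting the values in a data frame and printing once after the loop gives a single table of day, mean and se, which matches the shape of output asked for in the question.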
 

trinker

ggplot2orBust
#3
No, this doesn't make sense.

Try providing an example output of what you want instead. That makes it easier to help you figure out the R way to do this rather than fixing a broken for loop.
 
#5
I can't get the output to work, so it is difficult for me to come up with an example. But I am just looking for a list of means and SEs for each day.

And I do appreciate the help with the 'R' way. I felt like I was on a sort of exponential curve to start, but now I am stuck. I don't want to just fix this loop, but if I can get the loop to work I can re-work through it to understand what is happening.

The list might look like this:
Code:
day mean  se
   1  0.3 0.3
   2  0.4 0.2
   3  0.5 0.5
   4  0.6 0.4
   5  0.9 0.3
 
#6
So for day 2, the mean would be an average of three numbers. The first day-2 entry would be $Right + $Right1, the second would be the same, but the third, because $Side == 'L', would be $Left + $Left1. This is why I was trying to use the cbind() and if() functions.
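
The day-2 arithmetic described here can be worked through directly. This is only a sketch: `day2` and `chosen` are names made up for the illustration, and the three rows are copied from the test data in the first post.

```r
# The three day-2 rows from the test data (Middle omitted, it is not used).
day2 <- data.frame(
  Left   = c(0, 0, 0.25),
  Left1  = c(0, 0.5, 0.25),
  Right  = c(0.3, 0.5, 0.25),
  Right1 = c(0.3, 0, 0),
  Side   = c("R", "R", "L")
)

# Right + Right1 on the "R" rows, Left + Left1 on the "L" row.
chosen <- with(day2, ifelse(Side == "L", Left + Left1, Right + Right1))

day2_mean <- mean(chosen)                        # (0.6 + 0.5 + 0.5) / 3
day2_se   <- sqrt(var(chosen) / length(chosen))  # standard error of the mean

print(chosen)
print(c(mean = day2_mean, se = day2_se))
```

So the three values being averaged are 0.6, 0.5 and 0.5, giving a mean of about 0.533 and a standard error of about 0.033.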
 

trinker

ggplot2orBust
#7
OK, now I get it. The desired output is really helpful in explaining what you want. Generally, if you cross-post like this between sites, link the threads. So go back to the Stack Overflow question and link to here.

The reason you didn't get anywhere is that you weren't clear. I suggest you always provide an example of your desired output so that it is very clear to people what you want.

I'll fix the function I gave you before to do what you want. As for cbind(), it does not add things together; it merely binds them side by side as columns of a matrix. Use rowSums() (or in this case a simple + operator) to do what you want.
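
A small sketch of the difference, using made-up vectors (`right` and `right1` here are illustrative, not the columns of `td`, and the NA is added only to show how each approach treats it):

```r
right  <- c(0.25, 0.33, 0.30)
right1 <- c(0.50, 0.33, NA)     # the NA is made up for illustration

bound   <- cbind(right, right1)          # 3 x 2 matrix; nothing is added
added   <- right + right1                # element-wise sums; the NA propagates
wrapped <- cbind(right + right1)         # the same sums, as a 3 x 1 matrix
summed  <- rowSums(bound, na.rm = TRUE)  # row sums with the NA skipped

print(bound)
print(added)
print(wrapped)
print(summed)
```

In `cbind(right + right1)` the addition is done by `+` before cbind() ever sees the data, so cbind() only reshapes the result into a one-column matrix. rowSums() with `na.rm = TRUE` is the variant that still returns a number for a row containing an NA.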
 
#8
Hmmm, interesting about cbind(). I was using it before and it worked. I used it like this though, cbind(A + B), so maybe it did add them? But I could probably just create an object like you suggested, like this: both <- (right + right1), and that would add them.

Anyway, I think if this is helpful I will stick to this site. I am used to forum boards, so the format seems clearer here. Thanks!
 

trinker

ggplot2orBust
#9
@deangelr

I am still unsure of the process (mathematical) you're using to arrive at:

Code:
day mean  se
   1  0.3 0.3
   2  0.4 0.2
   3  0.5 0.5
   4  0.6 0.4
   5  0.9 0.3
Could you post the code you used to make this?

As I see it you can only go up to day three (from your original data set) yet your output has up to 5 days.

Code:
day Left Left1 Middle Right Right1 Side
   1 0.25    NA   0.00  0.25   0.50    R
   1 0.33    NA   0.00  0.33   0.33    R
   2 0.00  0.00   0.30  0.30   0.30    R
   2 0.00  0.50   0.00  0.50   0.00    R
   2 0.25  0.25   0.25  0.25   0.00    L
   3 0.33  0.33   0.00  0.33   0.00    L

Additionally, if you rounded to get the values, could you round to two places rather than one? It makes figuring out what you're after much easier.
 

Dason

Ambassador to the humans
#10
I'm not sure they were giving the exact desired output as much as an example of what the structure of the output would look like. But I definitely think the exact desired output for the given data should be given.
 

trinker

ggplot2orBust
#11
If Dason's correct this will work, but there are likely better approaches (I got lazy and used aggregate):

Code:
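FUN <- function(dat, side = "L") {
    # Keep only the rows recorded for the requested side
    DF <- split(dat, dat$Side)[[side]]
    # Columns 2:3 are Left/Left1, columns 5:6 are Right/Right1
    ind <- if(side=="L") 2:3 else 5:6
    # Named std_err so it does not mask base::stderr()
    std_err <- function(x) sqrt(var(x)/length(x))
    meanNse <- function(x) c(mean=mean(x), se=std_err(x))
    # Row sums (NA counted as 0), then mean and SE per day
    OUT <- aggregate(rowSums(DF[, ind], na.rm=TRUE), list(DF[, 1]),  meanNse)
    # aggregate() returns the two statistics as a matrix column; flatten it
    OUT <- data.frame(OUT[, 1], data.frame(OUT[, 2] ))
    colnames(OUT)[1] <- "day"
    return(OUT)
}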


FUN(td)
FUN(td, "R")
Yielding:

Code:
> FUN(td)
  day mean   se
1   2 0.50   NA
2   3 0.63 0.03
3   4 0.83 0.17

> FUN(td, "R")
  day  mean    se
1   1 0.705 0.045
2   2 0.550 0.050
3   4 1.000    NA
 
#12
> Could you post the code you used to make this?

I did not use code to make that example output, nor is the math correct. I just made it up to show roughly what I would expect the output to look like.

> As I see it you can only go up to day three (from your original data set) yet your output has up to 5 days.

The original data has 5 days. If you look at the data as presented by dput() there are 5 days. I just used the head() function to also show the structure of the dataset. I think it makes it easier to understand.

> Additionally if you rounded to get the values could you round two places rather than 1 as it makes figuring out what you're after much easier.

This is just made-up data, but in the future I will use two places.
 
#13
Here is also another way, although I don't really understand it. I will spend some time reviewing both today. This one finds the mean; I haven't put in the SE part yet.


Code:
td<-read.csv('test data.csv')

se<-function(x) sqrt(var(x)/length(x))   # defined for later, not used yet

# Split the rows by side and add up the two columns for that side
x<-td[(td$Side=='R'),]
y<-td[(td$Side=='L'),]
time<-c((x[,5]+x[,6]),(y[,2]+y[,3]))
daylist<-c(x[,1],y[,1])
new<-data.frame(day=daylist, time=time)

# One row per day: the day and the mean of its side sums
IDs<-sort(unique(new$day))
final<-data.frame(day=IDs, mean=NA_real_)
for(j in seq_along(IDs)){
day<-new[(new$day==IDs[j]),]
final[j,2]<-mean(day$time)
}
final
 
#14
Trinker, is your FUN code from post #11 finding the mean and SE of the right and left sides?
 

Dason

Ambassador to the humans
#15
trinker's code is finding the mean and se for each day for a given side. If you don't specify a side it defaults to "L".
 

trinker

ggplot2orBust
#16
Personally, I would not have viewed the data this way (specifying either L or R); instead I'd have printed out a separate data frame for each of L and R, but I gave the OP what they requested.

This may make it easier to see which side is being printed (I threw a cat() in there):
Code:
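FUN <- function(dat, side = "L") {
    # Keep only the rows recorded for the requested side
    DF <- split(dat, dat$Side)[[side]]
    # Columns 2:3 are Left/Left1, columns 5:6 are Right/Right1
    ind <- if(side=="L") 2:3 else 5:6
    # Named std_err so it does not mask base::stderr()
    std_err <- function(x) sqrt(var(x)/length(x))
    meanNse <- function(x) c(mean=mean(x), se=std_err(x))
    # Row sums (NA counted as 0), then mean and SE per day
    OUT <- aggregate(rowSums(DF[, ind], na.rm=TRUE), list(DF[, 1]),  meanNse)
    # aggregate() returns the two statistics as a matrix column; flatten it
    OUT <- data.frame(OUT[, 1], data.frame(OUT[, 2] ))
    colnames(OUT)[1] <- "day"
    # Announce which side the table belongs to before returning it
    cat(paste(ifelse(side=="L", "left", "right"), "side\n\n"))
    return(OUT)
}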


FUN(td)
FUN(td, "L")   #notice same as above
FUN(td, "R")
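
The alternative mentioned above, a separate data frame for each of L and R, could be sketched roughly as follows. This is an illustration, not trinker's code: `side_summary`, `std_err` and `both` are hypothetical names, columns are selected by name instead of position, and tapply() is swapped in for aggregate(). The data are rebuilt from the dput() output in the first post.

```r
td <- data.frame(
  day    = c(1L, 1L, 2L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
  Left   = c(0.25, 0.33, 0, 0, 0.25, 0.33, 0.5, 0.33, 0.5, 0),
  Left1  = c(NA, NA, 0, 0.5, 0.25, 0.33, 0.1, 0.33, 0.5, 0),
  Middle = c(0, 0, 0.3, 0, 0.25, 0, 0.3, 0.33, 0, 0),
  Right  = c(0.25, 0.33, 0.3, 0.5, 0.25, 0.33, 0.1, 0, 0, 0.25),
  Right1 = c(0.5, 0.33, 0.3, 0, 0, 0, 0, 0, 0, 0.75),
  Side   = factor(c("R", "R", "R", "R", "L", "L", "L", "L", "L", "R"))
)

std_err <- function(x) sqrt(var(x) / length(x))

# Hypothetical helper: mean and SE per day for the rows of one side.
side_summary <- function(dat, side) {
  cols  <- if (side == "L") c("Left", "Left1") else c("Right", "Right1")
  DF    <- dat[dat$Side == side, ]
  total <- rowSums(DF[, cols], na.rm = TRUE)   # NA counted as 0, as in FUN
  data.frame(
    day  = sort(unique(DF$day)),
    mean = as.vector(tapply(total, DF$day, mean)),
    se   = as.vector(tapply(total, DF$day, std_err))
  )
}

# One data frame per side, kept together in a named list.
both <- lapply(c(L = "L", R = "R"), side_summary, dat = td)

print(both$L)
print(both$R)
```

Returning a named list keeps the two tables apart without a cat() side effect, and the se is NA wherever a day has only one row for that side, as in the output of FUN.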