A little help with looping..

#1
Below is a data set similar to the one I am working with. I am struggling to write a loop that goes through the data and adds a new column 'Rank' based on the other two columns.

So, for example, if
'Color' = 'red' and measure is between 0-2, then it would have a rank of 1
'Color' = 'red' and measure is between 3-5, then it would have a rank of 2

'Color' = 'blue' and measure is between 20-50, then it would get a rank of 1
'Color' = 'blue' and measure is between 51-61, then it would get a rank of 2.

And so on..


Any help is greatly appreciated!!

Code:
structure(list(Color = structure(c(3L, 3L, 3L, 3L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L), .Label = c("blue", "green", "red"), class = "factor"), 
    Measure = c(1L, 1L, 5L, 5L, 45L, 45L, 60L, 60L, 100L, 100L, 
    200L, 200L)), .Names = c("Color", "Measure"), class = "data.frame", row.names = c(NA, 
-12L))
 

JesperHP

TS Contributor
#2
I'm sorry, I don't understand your description of exactly how you want to recode. How is a reader supposed to know what values you want to assign to green?
However, it looks like you want to apply a function to the rows of a data.frame, for which you can use apply().
 
#3
Yeah.. if it were green, it would get a value of 1 if the measure was between 50-150 and a value of 2 if between 151-250.

I'll explore apply().

Thanks!
 
#4
I think I could do it like this..

Code:
ddply(data, .(Color, Measure), transform, newcolumn = myfunction(Measure))
But I am having trouble writing the function to sort the data...
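One way such a function could look, as a minimal base R sketch. The names `rank_breaks` and `rank_measure` are hypothetical, the ranges are taken from posts #1 and #3, and the second blue range is assumed to mean rank 2. Since the rule depends on the colour as well as the measure, the function takes both.

```r
# Data from post #1, rebuilt by hand so the block is self-contained.
mydata <- data.frame(
  Color   = factor(rep(c("red", "blue", "green"), each = 4)),
  Measure = c(1L, 1L, 5L, 5L, 45L, 45L, 60L, 60L, 100L, 100L, 200L, 200L)
)

# Hypothetical lower bounds of each rank, per colour.
# red: [0,3) -> 1, [3,6) -> 2; anything outside gets NA.
rank_breaks <- list(
  red   = c(0, 3, 6),
  blue  = c(20, 51, 62),
  green = c(50, 151, 251)
)

# Rank all measures belonging to ONE colour (vectorised over measure).
rank_measure <- function(color, measure) {
  breaks <- rank_breaks[[as.character(color[1])]]
  rank <- findInterval(measure, breaks)
  rank[rank == 0 | rank == length(breaks)] <- NA  # below or above all ranges
  rank
}

# Split by colour, rank each piece, then put the pieces back in row order.
pieces <- split(mydata, mydata$Color)
ranks  <- lapply(pieces, function(d) rank_measure(d$Color, d$Measure))
mydata$Rank <- unsplit(ranks, mydata$Color)

# With plyr loaded, the same helper should also fit a call of the form
#   ddply(mydata, .(Color), transform, Rank = rank_measure(Color, Measure))
```

Keeping the break points in one list means a new colour or range is a data change rather than another `if` branch.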
 
#5
I was originally trying to write something like this.. probably not the best way to get it to work, especially since I don't fully understand the syntax

Code:
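data$Group1 <- NA	# create the column before filling it row by row
for (i in 1:nrow(data)) {	# nrow(), because length(data) counts columns
	# R has no chained comparisons: write a < x & x < b, not a < x < b
	if (data$Color[i] == 'red' & 0 < data$Measure[i] & data$Measure[i] < 5) {
		data$Group1[i] <- 1
	}
	if (data$Color[i] == 'red' & 4 < data$Measure[i] & data$Measure[i] < 6) {
		data$Group1[i] <- 2
	}
	if (data$Color[i] == 'blue' & 44 < data$Measure[i] & data$Measure[i] < 50) {
		data$Group1[i] <- 1
	}
	if (data$Color[i] == 'blue' & 50 < data$Measure[i] & data$Measure[i] < 61) {
		data$Group1[i] <- 2
	}
	if (data$Color[i] == 'green' & 90 < data$Measure[i] & data$Measure[i] < 101) {
		data$Group1[i] <- 1
	}
	if (data$Color[i] == 'green' & 101 < data$Measure[i] & data$Measure[i] < 201) {
		data$Group1[i] <- 2
	}
}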
 

JesperHP

TS Contributor
#6
Ignoring such standard comments as "try to avoid loops because they are slow" and "remember to vectorize" it seems to me you are on the right track.


The way I understand it, you want to make a new variable X for observations i=1,...,N such that in each case
\(X_i = f(Y_{1i},...,Y_{Ki})\), where the important thing is that X_i depends only on the values of observation i.

Then maybe something like this is what you need:
Code:
mydata=structure(list(Color = structure(c(3L, 3L, 3L, 3L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L), .Label = c("blue", "green", "red"), class = "factor"), 
    Measure = c(1L, 1L, 5L, 5L, 45L, 45L, 60L, 60L, 100L, 100L, 
    200L, 200L)), .Names = c("Color", "Measure"), class = "data.frame", row.names = c(NA, 
-12L))

mydata$X=NA

for (i in 1:nrow(mydata)) {
	if (mydata$Color[i]=="red") {
		if (mydata$Measure[i]>1) {
			mydata$X[i]="kowabunga"
		} else {
			mydata$X[i]="kaslaam"
		}
	}

	if (mydata$Color[i]=="blue") {
		if (mydata$Measure[i]>45) {
			mydata$X[i]="dobooo"
		} else {
			mydata$X[i]="dabii"
		}
	}

	if (mydata$Color[i]=="green") {
		if (mydata$Measure[i]>100) {
			mydata$X[i]="wichiiiiinga"
		} else {
			mydata$X[i]="wichuuuunga"
		}
	}
}
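For comparison, the "remember to vectorize" advice could look roughly like the sketch below. It is meant to give the same X as the loop in this post, without the explicit loop. The lookup vectors `cutoff`, `above` and `below` are hypothetical names, filled with the thresholds and labels used in that loop.

```r
# Same data as in post #1, rebuilt by hand so the block is self-contained.
mydata <- data.frame(
  Color   = factor(rep(c("red", "blue", "green"), each = 4)),
  Measure = c(1L, 1L, 5L, 5L, 45L, 45L, 60L, 60L, 100L, 100L, 200L, 200L)
)

# Per-colour cut-off from the loop (red > 1, blue > 45, green > 100).
cutoff <- c(red = 1, blue = 45, green = 100)

# Labels for "above the cut-off" and "not above", indexed by colour name.
above <- c(red = "kowabunga", blue = "dobooo", green = "wichiiiiinga")
below <- c(red = "kaslaam",   blue = "dabii",  green = "wichuuuunga")

# Convert the factor to character first: indexing a named vector with a
# factor would use the integer level codes instead of the names.
color <- as.character(mydata$Color)

# One comparison over the whole column; unname() drops the colour names
# that the lookups carry along.
mydata$X <- unname(ifelse(mydata$Measure > cutoff[color],
                          above[color], below[color]))
```

The `as.character()` step is the part that tends to trip people up when the grouping column is a factor.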
 
#7
Jesper, thank you!! :wave:

Although I can't tell.. are you being sarcastic, and should I avoid loops?

Or do you believe loops are worthy? :tup:
 
#8
So I restructured the code to fit my needs more specifically.. The actual dataset is a lot more complicated than this..

I guess I should listen to the coders!

Code:
mydata<-structure(list(Color = structure(c(3L, 3L, 3L, 3L, 3L, 3L, 1L, 
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("blue", 
"green", "red"), class = "factor"), Measure = c(1L, 1L, 5L, 5L, 
9L, 9L, 45L, 45L, 60L, 60L, 80L, 80L, 300L, 300L, 100L, 100L, 
200L, 200L)), .Names = c("Color", "Measure"), class = "data.frame", row.names = c(NA, 
-18L))

My new loop..
Code:
mydata$NaimansAverageSizedHands <- NA	# create the column before filling it

for (i in 1:nrow(mydata)) {
	if (mydata$Color[i] == "red") {
		if (mydata$Measure[i] > 0 & mydata$Measure[i] < 2) {
			mydata$NaimansAverageSizedHands[i] <- "A"
		}
		if (mydata$Measure[i] > 2 & mydata$Measure[i] < 6) {
			mydata$NaimansAverageSizedHands[i] <- "B"
		}
		if (mydata$Measure[i] > 5 & mydata$Measure[i] < 10) {
			mydata$NaimansAverageSizedHands[i] <- "C"
		}
	}

	if (mydata$Color[i] == "blue") {
		if (mydata$Measure[i] > 44 & mydata$Measure[i] < 60) {
			mydata$NaimansAverageSizedHands[i] <- "A"
		}
		if (mydata$Measure[i] > 45 & mydata$Measure[i] < 61) {
			mydata$NaimansAverageSizedHands[i] <- "B"
		}
		if (mydata$Measure[i] > 60 & mydata$Measure[i] < 81) {
			mydata$NaimansAverageSizedHands[i] <- "C"
		}
	}

	if (mydata$Color[i] == "green") {
		if (mydata$Measure[i] > 99 & mydata$Measure[i] < 101) {
			mydata$NaimansAverageSizedHands[i] <- "A"
		}
		if (mydata$Measure[i] > 101 & mydata$Measure[i] < 201) {
			mydata$NaimansAverageSizedHands[i] <- "B"
		}
		if (mydata$Measure[i] > 200 & mydata$Measure[i] < 301) {
			mydata$NaimansAverageSizedHands[i] <- "C"
		}
	}
}



I've been trying to figure out apply() with little luck.. all the examples seem to use numerical values and not factors..

I'd love to learn if you have some advice..

Sincere thank you for your help already!
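
On the apply() question, a hedged sketch of one possible route. apply() first converts a data frame to a matrix, and with a factor column present that matrix is character, so every Measure arrives as a string. mapply() walks several columns in parallel and keeps each one's type. The helper `label_one` and its break points are assumptions, chosen so that a shortened version of the data above gets the same A/B/C labels as the loop.

```r
# Shortened version of the data in this post (one row per distinct value).
mydata <- data.frame(
  Color   = factor(rep(c("red", "blue", "green"), each = 3)),
  Measure = c(1L, 5L, 9L, 45L, 60L, 80L, 100L, 200L, 300L)
)

# Hypothetical scalar rule: one colour and one measure in, one label out.
# cut() uses intervals that are open on the left and closed on the right,
# e.g. for red: (0,2] -> "A", (2,5] -> "B", (5,10] -> "C".
label_one <- function(color, measure) {
  breaks <- switch(color,
    red   = c(0, 2, 5, 10),
    blue  = c(44, 45, 60, 81),
    green = c(99, 100, 200, 301)
  )
  if (is.null(breaks)) return(NA_character_)  # unknown colour
  as.character(cut(measure, breaks = breaks, labels = c("A", "B", "C")))
}

# Pass the factor as character so the function sees "red", not a level code.
mydata$Label <- mapply(label_one,
                       as.character(mydata$Color),
                       mydata$Measure,
                       USE.NAMES = FALSE)
```

Measures outside every interval, or colours without a rule, come back as NA instead of silently keeping an old value.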