Iterating in R

#1
Hi all,

I want to perform a set of data management tasks that involves the following:

-Input a data frame with an arbitrary number of numeric columns, with arbitrary column names.
-Check each value (i.e. entry) to see if it meets a condition.
-If it does, replace the value.

so as an example:

the input:
df_test <- data.frame(id=c(1,2), x1=c(1,6), x2=c(9,6), xt=c(10,12))

the condition:
if any entry in the data frame is between 1 and 5 (inclusive), replace that value with another, say 9999999 (I've chosen this because it's so large that it can't occur in my actual data)

the return:
return a data frame with the numbers recoded if they meet the condition



I'm currently working through a really basic purrr tutorial. I'm not sure it's the best way to do this operation, and it may take me a while to get through, so I'm wondering if it's a good starting point.

Thanks!


Edit: I've been able to get the logic kind of working, but not in the format / workflow I'll ultimately need. Any help or direction is still greatly appreciated.

Code:
library(purrr)

inp <- 1:10

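# Return 99999 for the values 1 through 5; otherwise return the value unchanged.
checksmall <- function(.x) {
  if (.x %in% c(1,2,3,4,5)) {
    99999
  } else {
    .x
  }
}

# map() returns a list, so unlist() flattens it back into a vector.
inp2 <- unlist(map(inp,checksmall))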
Code:
> inp
[1]  1  2  3  4  5  6  7  8  9 10
> inp2
[1] 99999 99999 99999 99999 99999     6     7     8     9    10
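
One possible way to move the logic above into a data frame workflow is to make the check vectorised and apply it column by column in base R. This is only a sketch: recode_small is a hypothetical helper name, and it assumes every column is numeric and that the id column should be left untouched (the question does not say either way).

```r
df_test <- data.frame(id = c(1, 2), x1 = c(1, 6), x2 = c(9, 6), xt = c(10, 12))

# Hypothetical helper: a vectorised version of checksmall().
# Replaces values between 1 and 5 (inclusive) and leaves the rest alone.
recode_small <- function(x, replacement = 99999) {
  ifelse(x >= 1 & x <= 5, replacement, x)
}

# Assumption: the id column should not be recoded.
value_cols <- setdiff(names(df_test), "id")

# Apply the helper to each value column and write the results back.
df_out <- df_test
df_out[value_cols] <- lapply(df_test[value_cols], recode_small)
df_out
```

Because ifelse() works on a whole vector at once, no element-by-element iteration is needed; lapply() only loops over the columns.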
 

Dason

Ambassador to the humans
#2
purrr is way too advanced for the task you described. But I'd also suggest using NA instead of a large indicator value for the recode value. You could do the recode on all columns you want simultaneously very easily using base R.
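
A small illustration of why NA may be the safer recode value, using made-up numbers rather than anything from the original data: a large sentinel stays numeric and silently takes part in every calculation, while NA can be skipped explicitly.

```r
x <- c(1, 9, 10, 3, 43)  # made-up example values

# Recoding with a large sentinel keeps the entries numeric,
# so they are included in any summary computed afterwards.
x_sentinel <- ifelse(x >= 1 & x <= 5, 9999999, x)
mean(x_sentinel)

# Recoding with NA marks the entries as missing; summaries can skip them.
x_na <- ifelse(x >= 1 & x <= 5, NA, x)
mean(x_na, na.rm = TRUE)
```

The first mean is dominated by the sentinel, while the second is computed from the three remaining values only.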
 
#3
Thanks Dason! You're absolutely right it was way easier in base R. Thanks for the perspective.


Code:
df <- data.frame(x1=c(1,9,10), x2=c(11,43,3))
df[df==1|df==2|df==3|df==4|df==5] <- NA

> df
  x1 x2
1 NA 11
2  9 43
3 10 NA

Any insight as to why the following doesn't work?

Code:
df[df %in% c(1,2,3,4,5)] <- NA
I realize I can do the following for the initial question I asked and get a good result.

Code:
df[df < 6] <- NA
In the future, I may need to recode arbitrary values (e.g. 1, 2, 3, 4, 5, 18, 77, 100).
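
A likely explanation for the %in% attempt: %in% is built on match(), which treats a data frame as a list of columns, so df %in% c(1,2,3,4,5) returns one value per column rather than one per cell, and nothing gets selected. One possible workaround, sketched below, is to run %in% on each column so that the result is a logical matrix with the same shape as df. The name recode_vals is made up for this example.

```r
df <- data.frame(x1 = c(1, 9, 10), x2 = c(11, 43, 3))
recode_vals <- c(1, 2, 3, 4, 5, 18, 77, 100)  # hypothetical set of values to recode

# One logical per column, not per cell, so this cannot index individual entries.
per_column <- df %in% recode_vals
length(per_column)

# Apply %in% to each column instead; sapply() binds the results into a
# logical matrix with one row per observation and one column per variable.
mask <- sapply(df, function(col) col %in% recode_vals)

# A logical matrix with the same dimensions as df can index its cells.
df[mask] <- NA
df
```

This sketch assumes the data frame has at least two rows; with a single row, sapply() returns a plain vector instead of a matrix and the indexing step would need adjusting.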

best,