I decided to improve my search function to make it more versatile for searching through large data sets (similar to the search button in Microsoft Excel). I changed grep to agrep and set ignore.case = TRUE, so the function is no longer case sensitive and accepts approximate matches. I also added a variation argument (passed to agrep's max.distance) with a default of .02. Set it to 0 to narrow the results, or raise it to broaden them.
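To make the effect of the variation setting concrete, here is a minimal sketch on a made-up vector (islands_demo is only for illustration and is not the data set used later). As I read the agrep documentation, a fractional max.distance is scaled by the pattern length and rounded up to a whole number of edits, so even .02 allows one edit on a short pattern.

```r
# Made-up sample vector, for illustration only
islands_demo <- c("Cuba", "Honshu", "Borneo", "Java")

# max.distance = 0: only exact (case-insensitive) substring matches
exact <- agrep("ho", islands_demo, ignore.case = TRUE,
               max.distance = 0, value = TRUE)

# max.distance = 0.02: a small fraction still permits an approximate match,
# so the result can only stay the same or grow compared with the exact search
approx <- agrep("ho", islands_demo, ignore.case = TRUE,
                max.distance = 0.02, value = TRUE)

print(exact)
print(approx)
```

Comparing the two printed vectors shows how much the default loosens a short search term.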
Currently the function works on one specified column of a data frame. I want to make it work on a whole data frame and return every row in which any column contains the search term. I thought about using the apply function and then unique to eliminate duplicates. Unfortunately, I can't seem to get this to work. Any ideas?
Search Function Code
Code:
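Search <- function(term, dataframe, column.name, variation = .02) {
  # Capture term and column.name unevaluated so they can be typed without quotes
  te <- as.character(substitute(term))
  cn <- as.character(substitute(column.name))
  # Row indices where the column approximately matches the term, ignoring case
  HUNT <- agrep(te, dataframe[, cn], ignore.case = TRUE, max.distance = variation)
  # Return the matching rows with all columns
  dataframe[HUNT, ]
}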
EXAMPLE
Code:
# CREATING A FAKE DATA SET
# (mtcars has only 32 rows, so rows 33 to 48 of the mtcars columns are NA)
SampDF <- data.frame("islands" = names(islands), mtcars[1:48, ])
# EXAMPLES
Search(cuba, SampDF, islands)
Search(New, SampDF, islands)
Search(ho, SampDF, islands)  # too much variation
Search(ho, SampDF, islands, variation = 0)
Search("Axel Hbeierg", SampDF, islands)  # not enough variation
Search("Axel Hbeierg", SampDF, islands, variation = 2)
Search(19, SampDF, mpg, 0)
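One possible direction for the whole-data-frame version, offered as a rough sketch and not a finished solution: loop over the columns with lapply (apply would first coerce the data frame to a matrix), collect the matching row numbers from each column, and use unique to drop rows that matched in more than one column. The names SearchAll and demo are hypothetical and exist only for this illustration.

```r
# Hypothetical helper: searches every column instead of one named column
SearchAll <- function(term, dataframe, variation = .02) {
  # Capture the term unevaluated so it can be typed without quotes
  te <- as.character(substitute(term))
  # For each column, find the row indices that approximately match the term
  hits <- lapply(dataframe, function(col) {
    agrep(te, as.character(col), ignore.case = TRUE, max.distance = variation)
  })
  # Pool the indices from all columns, drop duplicates, keep original row order
  rows <- sort(unique(unlist(hits)))
  dataframe[rows, , drop = FALSE]
}

# Small made-up data frame to try it on
demo <- data.frame(
  name = c("Cuba", "Honshu", "Borneo"),
  note = c("warm", "near Cuba?", "large"),
  stringsAsFactors = FALSE
)

# Exact search: row 1 matches in name, row 2 matches in note
res <- SearchAll(cuba, demo, variation = 0)
print(res)
```

Because the row numbers are pooled before subsetting, a row that matches in several columns is still returned only once.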