# How can I report common strings in array/dataframe

#### jfca283

##### Member
Hello,
I have no idea how to report, in a vector, the strings that are common to the columns of a data.frame or matrix.
I have a data set with many columns. In this example I show four columns, named d1 to d4.

So, how do I create a vector with the common strings? Something like this:

It looks simple, but unfortunately I can't decipher it.
Thanks for your time and interest.

#### Dason

You'll need to be more specific. Are you looking to do this on a per-column basis? Do you want to ignore the columns and treat all of the data combined? Maybe you're interested in unique rows? You didn't provide enough info in your post to know for sure.

#### jfca283

##### Member
I just need a vector reporting only the "strings" that are common to all the columns of the data frame or matrix.
I don't care which row a "string" appears in. Only its existence in every column is relevant. The example I provided with the images tries to explain that. Sorry if I didn't make myself clear.

#### Dason

Ok - I see what you're doing now. You want to find the common intersection of elements over all the columns.

Code:
> # making some fake data
> dat <- data.frame(a = letters[1:5], b = letters[2:6], c = letters[3:7])
> dat
a b c
1 a b c
2 b c d
3 c d e
4 d e f
5 e f g
> # for a single pair intersect gets us what we want
> intersect(dat$a, dat$b)
[1] "b" "c" "d" "e"
> # it doesn't work for more than two columns
> intersect(dat$a, dat$b, dat$c)
Error in intersect(dat$a, dat$b, dat$c) : unused argument (dat$c)
> # but the Reduce function allows us to iterate the function over all the columns.
> Reduce(intersect, dat)
[1] "c" "d" "e"

#### jfca283

##### Member
Your code worked great.
One final question: what do I need to edit in order to obtain the opposite of the intersection? I mean, the values that are not shared by every column.
Thanks for your syntax, Dason. Very simple and fast.

#### Dason

##### Ambassador to the humans
Can you provide an example of what you want?

#### jfca283

##### Member
Of course,

Code:
dat <- data.frame(a = letters[1:5], b = letters[2:6], c = letters[3:7])

I need some function that retrieves the non-common letters across the three columns, so that I can export them to a vector or column such as:

Code:
vector<-a, f, g

I already know that with the setdiff function I can obtain the non-common letters as I want, but only for two columns or vectors from dat.

Code:
c(setdiff(dat$a,dat$b),setdiff(dat$b,dat$a))
[1] "a" "f"
The problem is that I need to apply this to several columns.

#### Dason

By non-common letters you just mean that you want the letters that aren't present in all of the columns, right? I ask because if that's the case I would have expected your example to give c("a", "b", "f", "g") and not just c("a", "f", "g") as you wrote.

Code:
> dat <- data.frame(a = letters[1:5], b = letters[2:6], c = letters[3:7], stringsAsFactors = FALSE)
> common <- Reduce(intersect, dat)
> all.options <- unique(unlist(dat))
> setdiff(all.options, common)
[1] "a" "b" "f" "g"
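
If you need this for several data frames, the two steps above could be wrapped in small helpers. This is only a sketch: the names `common_values` and `non_common_values` are made up for illustration, and the `as.character` conversion is an assumption added so that factor columns are compared by their labels.

```r
# Hypothetical helpers; the function names are illustrative only.
common_values <- function(df) {
  # Treat every column as character so factor columns compare by label
  cols <- lapply(df, as.character)
  # Fold intersect() over the columns, two at a time
  Reduce(intersect, cols)
}

non_common_values <- function(df) {
  cols <- lapply(df, as.character)
  # Every distinct value, minus the ones present in all columns
  setdiff(unique(unlist(cols)), Reduce(intersect, cols))
}

dat <- data.frame(a = letters[1:5], b = letters[2:6], c = letters[3:7],
                  stringsAsFactors = FALSE)
common_values(dat)      # "c" "d" "e"
non_common_values(dat)  # "a" "b" "f" "g"
```

Both functions work for any number of columns, since `Reduce` and `unlist` operate on the whole list of columns at once.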

#### jfca283

##### Member
Thanks, Dason.
You are very kind.
I think this thread is solved.