# Removing SNP rows where the minor allele proportion is < 5%

#### Marwah Soliman

##### New Member
Hello,

I know this should be easy, but I can't work it out.
Below is part of my data; the full data set is 900,000 rows by 1,000 columns.

Code:
           V1 V2 V3 V4 V5 V6 V7 V8 V9 V10
1    rs987435  C  G  0  2  1  2  2  1   1
2    rs345783  C  G  0  0  0  0  0  0   0
3    rs955894  G  T  2  1  2  2  2  2   2
4   rs6088791  A  G  0  1  1  0  1  2   2
5  rs11180435  C  T  1  1  1  1  0  2   0
6  rs17571465  A  T  2  2  2  2  2  2   2
7  rs17011450  C  T  2  2  2  2  2  2   2
8   rs6919430  A  C  2  2  2  2  2  2   2
9   rs2342723  C  T  0  0  0  0  0  0   1
10 rs11992567  C  T  2  2  2  2  2  2   2
I would like to create another data frame that keeps only the SNP rows with minor allele frequency (the frequency of the value 2 in the data) > 5% and removes the ones with MAF < 5%.

Basically, I want output like the following, for example (the counts of 0, 1 and 2 for each SNP):

Code:
              0      1       2
rs2342723      20     10     1
rs11992567     35    20     10
and then remove rs2342723, since its proportion of 2 is 1/31, about 0.03.

I used the following code. Is this correct?

Code:
data <- datasnp[rowSums(datasnp==2)/ncol(datasnp) > 0.05, ]
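For comparison, here is a minimal sketch of the count-and-filter step on a few rows copied from the sample above. It assumes the layout shown in the post (V1 is the SNP id, V2 and V3 are the alleles, and every later column is a genotype coded 0/1/2) and that there are no missing genotypes. The names `geno`, `counts`, `prop2` and `filtered` are made up for illustration. One thing it does differently from the line above: the proportion is divided by the number of genotype columns only, because `ncol(datasnp)` would also count V1 to V3.

```r
# Toy data: four rows copied from the sample in the post.
datasnp <- data.frame(
  V1 = c("rs987435", "rs345783", "rs955894", "rs2342723"),
  V2 = c("C", "C", "G", "C"),
  V3 = c("G", "G", "T", "T"),
  V4 = c(0, 0, 2, 0), V5 = c(2, 0, 1, 0), V6 = c(1, 0, 2, 0),
  V7 = c(2, 0, 2, 0), V8 = c(2, 0, 2, 0), V9 = c(1, 0, 2, 0),
  V10 = c(1, 0, 2, 1),
  stringsAsFactors = FALSE
)

# Assumption: columns 1 to 3 are annotation, the rest are genotypes.
geno <- as.matrix(datasnp[, -(1:3)])
rownames(geno) <- datasnp$V1

# Counts of 0, 1 and 2 per SNP (one row per SNP, as in the desired output).
counts <- cbind(`0` = rowSums(geno == 0),
                `1` = rowSums(geno == 1),
                `2` = rowSums(geno == 2))

# Proportion of genotype code 2, using only the genotype columns as denominator.
prop2 <- counts[, "2"] / ncol(geno)

# Keep the SNP rows whose proportion of 2 is above 5%.
filtered <- datasnp[prop2 > 0.05, ]
```

This follows the definition used in the post (the proportion of samples coded 2). A minor allele frequency computed from allele counts would be a different calculation.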
