I have the following data set of SNP positions and IDs, and a gene reference file (both shown below).
I'm trying to match genes to SNPs by SNP location, so that the result includes only the SNPs that have
POS >= txstart and POS <= txend
For example, I want a data set with the columns listed at the end of this post.
How can I do this match in R?
Code:
POS ID
78599583 rs987435
33395779 rs345783
189807684 rs955894
33907909 rs6088791
75664046 rs11180435
218890658 rs17571465
127630276 rs17011450
90919465 rs6919430
Code:
genename name chrom strand txstart txend
CDK1 NM_001786 chr10 + 62208217 62224616
CALB2 NM_001740 chr16 + 69950116 69981843
STK38 NM_007271 chr6 - 36569637 36623271
YWHAE NM_006761 chr17 - 1194583 1250306
SYT1 NM_005639 chr12 + 77782579 78369919
ARHGAP22 NM_001347736 chr10 - 49452323 49534316
PRMT2 NM_001535 chr21 + 46879934 46909464
CELSR3 NM_001407 chr3 - 48648899 48675352
The match condition is POS >= txstart and POS <= txend.
For example, I want a data set that has the following columns:
Code:
genename SNPID chrom position txstart txend
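
One possible way to do this in base R is sketched below: pair every SNP with every gene, then keep the pairs where the position falls inside the transcript. This is only a sketch under a few assumptions. The object names `snps`, `genes`, `pairs`, `hits` and `result` are made up for illustration, the two tables are read inline from the rows shown above, and because the SNP table has no chromosome column the match uses position alone. If the real SNP data carries a chromosome, the filter should probably also require that it equals `chrom`.

```r
# SNP table from the post (note: no chromosome column is given)
snps <- read.table(text = "
POS ID
78599583 rs987435
33395779 rs345783
189807684 rs955894
33907909 rs6088791
75664046 rs11180435
218890658 rs17571465
127630276 rs17011450
90919465 rs6919430
", header = TRUE, stringsAsFactors = FALSE)

# Gene reference table from the post
genes <- read.table(text = "
genename name chrom strand txstart txend
CDK1 NM_001786 chr10 + 62208217 62224616
CALB2 NM_001740 chr16 + 69950116 69981843
STK38 NM_007271 chr6 - 36569637 36623271
YWHAE NM_006761 chr17 - 1194583 1250306
SYT1 NM_005639 chr12 + 77782579 78369919
ARHGAP22 NM_001347736 chr10 - 49452323 49534316
PRMT2 NM_001535 chr21 + 46879934 46909464
CELSR3 NM_001407 chr3 - 48648899 48675352
", header = TRUE, stringsAsFactors = FALSE)

# Cross join: by = NULL makes merge() return every SNP/gene combination
pairs <- merge(snps, genes, by = NULL)

# Keep only the pairs where the SNP lies inside the transcript interval
hits <- pairs[pairs$POS >= pairs$txstart & pairs$POS <= pairs$txend, ]

# Arrange and rename the columns as requested in the post
result <- data.frame(
  genename = hits$genename,
  SNPID    = hits$ID,
  chrom    = hits$chrom,
  position = hits$POS,
  txstart  = hits$txstart,
  txend    = hits$txend,
  stringsAsFactors = FALSE
)
```

With only the sample rows shown here, none of the eight positions lies inside any of the eight transcripts, so `result` comes back with the right columns but no rows. The cross join grows as (number of SNPs) x (number of genes), so it is only practical for small tables. For genome-scale data an interval-join tool would likely be a better fit.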