# First script - the beginner

#### Juanhijuan

##### New Member
To begin with, I'd like to say "Hello" to everyone. I just started to learn R, but the theory is so complicated that I stopped reading books and want to write my first script.
It would be great if you could tell me how I can manage it.

My first question is how to select just those sequences which have the letter "K" in them. It doesn't matter if a sequence has 1, 2 or more; I just want to select those with "K". There are about 1000 sequences in this column.

Can someone give me a helping hand?

#### trinker

##### ggplot2orBust
Hello. Welcome to Talkstats and R!! Love em both.

Let's start by learning how to format a question so that it is most likely to get the response you're looking for. The question you wrote is pretty clear about your desired output; however, with computer programming most find it easiest to solve and teach the problem interactively. That means what you post should be something others can interact with. You posted an image, and R doesn't recognize this as a standard data type. So I'd suggest you post the data from the image here instead, using code tags.

When you're posting code, data frames or computer output, it's helpful to wrap this information in code tags by either:
1. clicking the pound (#) sign icon, or
2. wrapping it in tags like this:
[CODE]some code[/CODE]

which produces:
Code:
some code
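
As an illustration of posting data that others can interact with, base R's `dput()` prints an object as R code that can be pasted straight into a session. This is only a sketch: the data frame `dat` below is a made-up two-row example, not the poster's actual data.

```r
# Hypothetical toy data frame, used only to illustrate dput()
dat <- data.frame(
  Sequence = c("AAFTKLDQVWGSE", "AAIELRE"),
  modifications = c("[5] Acetyl (K)", ""),
  stringsAsFactors = FALSE
)

# dput() prints a text representation that recreates the object;
# this printed text is what you would paste between code tags
dput(dat)

# Round trip: capture the dput() text and rebuild the object from it
txt <- paste(capture.output(dput(dat)), collapse = "\n")
dat2 <- eval(parse(text = txt))
all.equal(dat, dat2)
```

For a large table, something like `dput(head(dat, 20))` keeps the post short while still giving readers real rows to work with.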

#### Juanhijuan

##### New Member
Thanks for the reply. Hopefully I will soon learn how to use the forum properly.

I put the whole data set which I need to analyse on speedyshare. The most important for me now are the 2 columns which I pasted below.
I believe it won't take much code to analyse it, but I just started learning R and I need to prepare that script in a week... That's very important for me.

Code:
http://speedy.sh/Yrbsb/20130830-Report-SUMO-Acetyl-Biotin-2nd-sample-runs.csv
Code:
Sequence	modifications
AAAAGAAAVANQGKK	[14] Acetyl (K)|[15] Acetyl (K)
AAAAGAAAVANQGKK	[14] Acetyl (K)|[15] Acetyl (K)
AAFTKLDQVWGSE	[5] Acetyl (K)
AAIELRE
AAIKFIKFINPKINDGE	[4] Acetyl (K)|[7] Acetyl (K)|[12] Acetyl (K)
AAIKFIKFINPKINDGE	[4] Acetyl (K)|[7] Acetyl (K)|[12] Acetyl (K)
AAIKFIKFINPKINDGE	[7] Acetyl (K)|[12] Acetyl (K)
AAIKFIKFINPKINDGE	[4] Acetyl (K)|[7] Acetyl (K)
AAIYKLLKSHFRNE	[5] Biotin (K)|[8] Acetyl (K)
AAKKFEE	[3] Acetyl (K)|[4] Acetyl (K)
AAKYFRE	[3] Acetyl (K)
AANVKKTLVE	[5] Acetyl (K)|[6] Acetyl (K)
AARAGELLKE
AARAGELLKE
AARDSKSPIILQTSNGGAAYFAGKGISNE	[6] Acetyl (K)|[24] Acetyl (K)
AARDSKSPIILQTSNGGAAYFAGKGISNE	[24] Acetyl (K)
AAVKARVASIDE	[4] Acetyl (K)
AAVKASAPGSVILLE	[4] Acetyl (K)
AAVKASAPGSVILLE
AAVKASAPGSVILLE	[4] Acetyl (K)
AEKLKAE	[3] Acetyl (K)|[5] Acetyl (K)
AEQVKKE	[5] Acetyl (K)|[6] Acetyl (K)
AFAKRQGKE	[4] Acetyl (K)|[8] Acetyl (K)
AFGSGTAAVVSPIKE	[14] Acetyl (K)
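
One possible way to keep only the sequences containing a "K" is a logical filter built with base R's `grepl()`. The sketch below reads a few rows copied from the sample above; the object names `tbl`, `has_k` and `with_k` are illustrative assumptions, and the real file may need different import settings.

```r
# A few rows copied from the sample above (tab-separated)
txt <- "Sequence\tmodifications
AAAAGAAAVANQGKK\t[14] Acetyl (K)|[15] Acetyl (K)
AAFTKLDQVWGSE\t[5] Acetyl (K)
AAIELRE\t
AARAGELLKE\t
AAVKASAPGSVILLE\t[4] Acetyl (K)"

# read.delim() expects tab-separated text with a header row
tbl <- read.delim(text = txt, stringsAsFactors = FALSE)

# TRUE for every sequence that contains at least one "K"
has_k <- grepl("K", tbl$Sequence, fixed = TRUE)

# Keep only the matching rows
with_k <- tbl[has_k, ]
with_k$Sequence
```

Because `grepl()` only reports whether a match exists, a sequence with one "K" and a sequence with several are treated the same, which is what the question asks for.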

#### Juanhijuan

##### New Member
While my post is waiting for approval, I want to ask about one more thing. I want to make a vector from my column "V" in the Excel file (it's column number 22), so I wrote this:

Code:
vec_seq <- tbl_all[, 22]
It's not working and I have no idea why. This is what R says: "Error in `[.data.frame`(tbl_all, , 22) : undefined columns selected"
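
This error means the data frame has fewer columns than the index that was requested. A quick check is to look at how many columns R actually sees. The sketch below uses a hypothetical three-column stand-in for `tbl_all`, since the real import code is not shown in the post.

```r
# Hypothetical stand-in for tbl_all: a data frame with only 3 columns
tbl_all <- data.frame(a = 1:2, b = 3:4, c = c("x", "y"))

ncol(tbl_all)    # how many columns R actually sees
names(tbl_all)   # their names

# Guard the index before extracting, so the failure is explicit
col_idx <- 22
if (col_idx <= ncol(tbl_all)) {
  vec_seq <- tbl_all[, col_idx]
} else {
  message("tbl_all has only ", ncol(tbl_all),
          " columns; column ", col_idx, " does not exist")
}

# Requesting a missing column reproduces the error message from the post
res <- tryCatch(tbl_all[, 22], error = function(e) conditionMessage(e))
res
```

If `ncol()` reports far fewer columns than the spreadsheet shows, one possible cause (an assumption here, not something confirmed in the thread) is that the file was read with the wrong separator, so several spreadsheet columns ended up in a single R column.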