# how to split and cut (big data)

#### natus vincere

##### New Member
Hello,

I have a table with 3 columns and over 300k rows.

Col1 - ID (numeric ID, e.g. 123456)
Col2 - Numb (numeric, from 1 to 25, e.g. 1)
Col3 - Yes/no (0 or 1)

There are 443 unique IDs. Each ID has more than 400 rows, and almost all of them have a different number of rows.

How can I reshape the initial table into a wide table like this:
Col1 - ID1
Col2 - ID2
..
Col443 - ID443

where each column is filled with that ID's yes/no values from the original table ('Numb' is not important for the moment),

and the number of rows is 400, or the minimum number of rows found for any ID?
So how do I split it and then cut the excess?

#### trinker

##### ggplot2orBust
This is very difficult to follow. Can you please make a minimal example of your data, say with 3 IDs and maybe 10 rows, and then show the desired output for that minimal working example? Please use code tags to share the data and expected output: http://www.talkstats.com/showthread.php/29338-How-to-use-Code-tags

We should be able to just cut and paste your data into R.

#### natus vincere

##### New Member
I figured that Col2 ('Numb') is not important at the moment, so there are 2 columns and 10 rows:

Code:
id <- c("23456","23456","23456","12321","12321","12321","12321","33333","33333","33333")
yn <- c(1,0,1,1,0,1,0,0,1,0)
dat <- data.frame(ID = id, YN = yn)
As you can see, there are 4 rows for "12321". This covers the situation where I need to cut the excess (in this example, the excess is the last row of "12321").

output:
Code:
id23456 <- c(1,0,1)
id12321 <- c(0,0,1)
id33333 <- c(1,1,0)
output <- data.frame(id23456,id12321,id33333)

#### bryangoodrich

##### Probably A Mammal
I have no idea how you derived your output. You want column id33333 to be (1, 1, 0) when we observe it in the data as (0, 1, 0) for the only three rows it has. It is entirely unclear how you want to "trim" off the extra observation from 12321 unless your logic is simply to say "only keep the first 3," in this case.
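
If the rule really is just "keep the first n rows of each ID", where n is the smallest number of rows any ID has, something like the following might do it. This is only a sketch under that assumption, in base R. The names `groups`, `n_keep`, `wide` and the cap `max_rows` are made up for illustration (the cap of 400 comes from the original question). On the example data it gives (1, 0, 1), (1, 0, 1) and (0, 1, 0), which is not the output posted above.

```r
# Example data from the thread (ID kept as character)
id <- c("23456","23456","23456","12321","12321","12321","12321","33333","33333","33333")
yn <- c(1,0,1,1,0,1,0,0,1,0)
dat <- data.frame(ID = id, YN = yn, stringsAsFactors = FALSE)

# Split YN by ID, keeping the IDs in the order they first appear
groups <- split(dat$YN, factor(dat$ID, levels = unique(dat$ID)))

# Rows to keep: the smallest group size, capped at max_rows
# (max_rows is a hypothetical parameter; 400 is taken from the question)
max_rows <- 400
n_keep <- min(max_rows, min(lengths(groups)))

# Keep the first n_keep values of each ID and bind them as columns
wide <- as.data.frame(lapply(groups, head, n = n_keep))
names(wide) <- paste0("id", names(groups))

wide
```

Because every column is cut to the same length before binding, the same code should work unchanged on the full 443-ID table, as long as "first rows" means the order in which rows appear in `dat`.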