Recent content by Marwah Soliman

  1. M

    How to vectorize the following loop code?

    I have the following code: for(i in 1:400) { if (mydata$quad[i]==1){ mydata$z[i]=rnorm(n=sum(mydata$quad[i]==1),mean=1,1) } else if (mydata$quad[i]==2){ mydata$z[i]=rnorm(n=sum(mydata$quad[i]==2),mean=9,1) } else if...
  2. M

    code gives more variables than I have

    Hello, I have the following data: V1 V2 1-Dec -0.0539597263407143, -0.0511951479332808, 1-Mar 2.91833246778835, 0.549735256023383 1-Sep -0.60152265093355, 0.596134547512546 The values are all in one column, V2. I would like to separate the column V2...
  3. M

    what is wrong with my code.

    Hello, there is a variable named quad that takes values from 1 to 9 (400 points). I'm trying to simulate a z variable as follows: for example, if I have 30 points with quad=1, I simulate 30 Z values with mean = 0; then, if I have 15 points with quad=2, I simulate 15 Z values with mean = 2, and so on...
  4. M

    what is wrong with my code

    I'm generating random numbers x, y between -10 and 10, then I use an if statement to classify each pair (x,y) into one of 4 quadrants, but the column quad gives me 4 for every row. Why? x <- runif(n=400, min = -10, max = +10) y <- runif(n=400, min = -10, max = +10)...
  5. M

    how to generate (x,y) values from the xy plane in R

    Hello, I would like to generate random points on the x-y plane. For example, let's set -10 < X < 10, -10 < Y < 10, and n = 400. How can I do that?
  6. M

    how to repeat the same row for two gene names

    I have the following row: TMEM186;PMM2 cg00011459 0.92614347 0.950500413 0.941842608 0.942410506 0.948078755 0.872005565 0.934310136 0.906705645 0.948436255 I have many of these, so please make your code general. I want first to separate the gene names into two rows and repeat the same row...
  7. M

    how to split one column into multiple columns in R

    Hi, I have the following data: A2M -0.313797611 A2M 0.74108697 A2M -0.364874718 A2M 0.296491814 A2M -0.387285569 A2M 0.58645548 A2M -0.602393199 A2M -1.028994508 A2M -0.639690915 A2M -0.200383697 A2M 0.716309005 A2M -0.918101251 A2M 0.338130889 A2M 0.794278207 A2M 0.340621333...
  8. M

    how to extract the F value in anova from a permutation test

    Hi, I did a permutation test using the following code: perm=replicate(1000,anova(lm(sample(genenormal)~as.factor(snptest1)))) Then I got the following (this is just one column; I have more than that): V1 Df c(2,1092) Sum sq c(48.96,9782.66) Mean sq...
  9. M

    help, where is the error?

    I have the following code, not written by me: new_tum <- as.matrix(clinical[,ind_keep]) new_tum_collapsed <- c() for (i in 1:dim(new_tum)[1]){[i,])) < dim(new_tum)[2]){ m <- min(new_tum[i,],na.rm=T) new_tum_collapsed <- c(new_tum_collapsed,m) } else {...
  10. M

    how to put labels on a ggplot2 graph

    I have a data set and I ran PCA on it using the following code: i.pca=prcomp(tmydf) scores <-$x) qplot(x = PC1, y = PC2, data = scores, geom = "point",col=race) I need to put the rownames of tmydf as labels on the graph. How can I do that?
  11. M

    how to give column names to a data set in R

    Hello, I have a file with 1117 columns, and I have the names of these columns. How can I assign the names to those columns? It is not efficient to write colnames(data)=c("name1", "name2",.....) because there are 1117 names. Any help is appreciated.
  12. M

    removing SNP rows with minor allele proportion < 5%

    Hello, I know this should be easy but my brain can't get it. Here is part of my data (the full data is 900,000 rows * 1000 columns): V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 1 rs987435 C G 0 2 1 2 2 1 1 2 rs345783 C G 0 0 0 0 0 0 0 3 rs955894 G T 2 1 2 2 2...
  13. M

    replace -inf with 0 in data

    Hello, I have a gene expression data set and discovered some -inf values in it that I need to replace with zero. My data is 390989 rows * 1000 columns. Here is part of one row: A1BG 0 0 5.604860 4.487620 5.545700 -inf I need to get A1BG 0 0 5.604860...
  14. M

    efficient R code

    I have the following R code to create the .ped and .map files used in Plink ("open-source whole genome association analysis toolset"). I need more efficient R code, since my data has 905460 rows and 1120 columns and it takes forever to run; I didn't get results for 2 days...
  15. M

    Map SNPs into a ref gene file using R

    I have the following data set about the SNPs: ID POS ID 78599583 rs987435 33395779 rs345783 189807684 rs955894 33907909 rs6088791 75664046 rs11180435 218890658 rs17571465 127630276 rs17011450 90919465 rs6919430 and a gene...