Question: how to build the contingency table

#1
Hi all,

I have two lists of genes as follows:

  • DEList has 282 gene names
  • AllList has 32805 gene names
  • DEList is a subset of AllList.

In both lists I've looked for genes which have a specific parameter (e.g. a binding site for pol3 or a binding site for pol2). The results of this search are in the list below.

  • pol3DE 4
  • pol2DE 190
  • pol3all 85
  • pol2all 12365

pol3DE and pol2DE are both subsets of the list DEList with the specific parameter binding site for pol3 and binding site for pol2 respectively. pol3all and pol2all are both subsets of the list AllList with the same specific parameters as above.

I would like to calculate a p-value to see whether a higher proportion of pol2-specific genes is in DEList than in AllList, and likewise for the pol3-specific genes in DEList relative to AllList.

If I understand it correctly I have six different parameters:

  • AllList - 32805
  • DEList - 282
  • pol3DE - 4
  • pol2DE - 190
  • pol3all - 85
  • pol2all - 12365

How do I create the contingency table for the test in this case?
I have tried and created this one (see attachment).
It doesn't look right, as the total of DEList doesn't appear anywhere in the table. But does it have to?
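
For comparison, here is a minimal sketch of a layout in which the DEList total does appear, as a row margin. The row/column arrangement (DE vs. not DE, pol3 site vs. no pol3 site) and the variable names are assumptions for illustration, not the table from the attachment; the counts are the ones listed above. Fisher's exact test is shown only as one common choice for a 2x2 table of this kind, and whether it fits this question is also an assumption.

```r
# Counts taken from the post; the table layout is an assumption.
n_all    <- 32805  # genes in AllList
n_de     <- 282    # genes in DEList (a subset of AllList)
pol3_de  <- 4      # DEList genes with a pol3 binding site
pol3_all <- 85     # AllList genes with a pol3 binding site

# Rows: DE vs. not-DE genes; columns: pol3 site present vs. absent.
pol3_tab <- matrix(
  c(pol3_de,            n_de - pol3_de,
    pol3_all - pol3_de, (n_all - n_de) - (pol3_all - pol3_de)),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("DE", "notDE"), pol3 = c("Yes", "No"))
)

addmargins(pol3_tab)  # the "DE" row sums to 282, the grand total to 32805

# One-sided test: is a pol3 site more frequent among DE genes?
fisher.test(pol3_tab, alternative = "greater")
```

The same construction would apply to pol2 by swapping in 190 and 12365.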

As far as I understand it, I need to run the McNemar test for this data, as the two groups (=lists of genes) are dependent.

What I would like to know is not the significance of each group on its own, but whether or not the proportion of pol2 in the comparison DEList relative to AllList is higher than the proportion of pol3 for the same comparison.
So just running two McNemar tests wouldn't really solve the problem, unless I can compare the two p-values.
I have run the two tests and both p-values are very low, i.e. supposedly significant, but how do the two results compare against each other? Is there a way to test this?

Code:
pol3 = as.table(rbind(c(32720, 0), 
                     c( 81, 4) ))
colnames(pol3) <- rownames(pol3) <- c("No", "Yes")
names(dimnames(pol3)) = c("all", "DE")
pol3
mcnemar.test(pol3, correct=FALSE)

	McNemar's Chi-squared test

data:  pol3
McNemar's chi-squared = 81, df = 1, p-value < 2.2e-16

pol2 = as.table(rbind(c(20440, 0),      # 32805 - 12365 genes without a pol2 site
                     c( 12175, 190) ))
colnames(pol2) <- rownames(pol2) <- c("No", "Yes")
names(dimnames(pol2)) = c("all", "DE")
pol2
mcnemar.test(pol2, correct=FALSE)

	McNemar's Chi-squared test

data:  pol2
McNemar's chi-squared = 12175, df = 1, p-value < 2.2e-16
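
A sketch of one possible way to put pol2 and pol3 into a single table instead of comparing two p-values. It assumes that no gene is counted under both pol2 and pol3 (not stated above), and the layout and names are illustrative only. The test then asks whether the pol2:pol3 ratio differs between DE genes and the remaining genes.

```r
# Rows: DE vs. not-DE genes; columns: genes with a pol2 vs. a pol3 site.
# Assumption: the pol2 and pol3 gene sets do not overlap.
pol_tab <- matrix(
  c(190,         4,
    12365 - 190, 85 - 4),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("DE", "notDE"), site = c("pol2", "pol3"))
)
pol_tab

# Two-sided Fisher's exact test on the pol2:pol3 odds ratio.
res <- fisher.test(pol_tab)
res$estimate  # conditional estimate of the odds ratio
res$p.value
```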
Thanks

Assa

PS.
I have also asked this question in a different forum, but I still have difficulty understanding this topic.
 