Hello all.
I would like to perform some exact tests for Hardy-Weinberg equilibrium on a genotype data set, but I am not a very
experienced SAS user and I am definitely no statistician, so I can't seem to find the right way of doing this in SAS.
An example of observed values for one of the bi-allelic loci that I would like to test is:
pp 289
pq 86
qq 3
Calculating this manually according to H-W yields expected values (rounded to whole numbers) of:
pp 292
pq 81
qq 6
Consequently, the chi-square value becomes 1.563 with 1 degree of freedom, corresponding to an asymptotic p-value of 0.21.
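For reference, the manual calculation above can be sketched as a small DATA step. This is only a minimal sketch of how I understand the arithmetic, assuming the allele frequency p is estimated from the observed genotype counts and the expected counts are left unrounded. The data set name (hwe) and the variable names are my own and purely illustrative.

```sas
/* Manual Hardy-Weinberg chi-square for one bi-allelic locus.     */
/* Data set and variable names are illustrative assumptions only. */
data hwe;
  /* observed genotype counts */
  n_pp = 289;
  n_pq = 86;
  n_qq = 3;
  n = n_pp + n_pq + n_qq;

  /* allele frequencies estimated from the observed counts */
  p = (2*n_pp + n_pq) / (2*n);
  q = 1 - p;

  /* expected genotype counts under H-W (not rounded) */
  e_pp = n * p**2;
  e_pq = 2 * n * p * q;
  e_qq = n * q**2;

  /* Pearson chi-square statistic */
  chisq = (n_pp - e_pp)**2 / e_pp
        + (n_pq - e_pq)**2 / e_pq
        + (n_qq - e_qq)**2 / e_qq;

  /* one parameter (p) was estimated from the data: 3 - 1 - 1 = 1 df */
  df = 1;
  p_value = 1 - probchi(chisq, df);

  /* write the results to the log */
  put e_pp= e_pq= e_qq= chisq= df= p_value=;
run;
```

If the arithmetic is right, the log should show values close to the ones quoted above (a chi-square of about 1.563 and an asymptotic p-value of about 0.21).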
I believe that this is correct, but when I try to make SAS calculate the exact p-value on the same data using the code listed
below, I get a different chi-square value (1.8395), different degrees of freedom (2 rather than 1?) and of course ultimately
a different p-value (an asymptotic p-value of 0.3986 and an exact p-value of 1.0486).
Also, I am probably missing something here, but how can an exact p-value ever exceed 1.00?
------------
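/* observed genotype counts, one record per genotype */
data test;
  input genotype $ status $ count @@;
  cards;
pp Observed 289
pq Observed 86
qq Observed 3
;

/* one-way chi-square test against the H-W expected counts */
proc freq order=data data=test;
  weight count;
  tables genotype / testf=(292 81 6);
  exact chisq;
run;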
------------
I hope that someone here can help me solve this problem.
Thanks in advance!
Best regards
Christian