I'm looking for some assistance in statistical analysis with R (ideally), but also some general stats advice. This follows from a review which identified the need for me to adjust for clustering of relatives within family groups in my data set.
I am investigating cardiac phenotypes (I'm a cardiologist) in blood relatives (individuals) of sequential cases of premature sudden death. The cases of sudden death are categorised into 2 groups: 1. explained sudden death; 2. unexplained sudden death.
The groups are unmatched (sequential cases). Within each group, individuals are clustered in family subgroups / strata (of between 1 and 10 individuals).
All individuals / relatives are investigated for evidence of cardiac disease and categorised as "affected" or "unaffected."
I want to report the difference (or not) in proportion of blood relatives who are "affected" between the 2 groups.
For example:
Group 1 consists of 157 individuals in 41 family clusters
Group 2 consists of 463 individuals in 163 family clusters
Proportion "affected" in Group 1 = 22.9%
Proportion "affected" in Group 2 = 24.6%
I had initially used a simple Fisher / chi-squared test of proportions (group vs affected status in a 2x2 contingency table). However, it is clear I need to adjust for the clustering of relatives within family groups.
What test is most appropriate in this circumstance, and which package in R provides the easiest way to account for this?
Having looked around (Google etc), I have found:
- Ratio estimate chi-square test
- Generalized estimating equation
However, I have no experience in this at all.
I believe that the Donner (1989) or Rao & Scott (1992) modifications of chi-squared may be appropriate. I have found the aod package, which includes the functions donner() and raoscott().
I would certainly appreciate a second opinion on which (if either) to use, and what options are appropriate. I'm currently leaning towards Donner, given its prior use in clinical medical / vet / dental research.
My current plan:
The data "matrix" will have 1 row per family / case, with 4 columns: ID, group, n (number of relatives in family), y (number affected in family).
Code:
library(aod)
donner(cbind(y, n - y) ~ group, data = matrix)
raoscott(cbind(y, n - y) ~ group, data = matrix)
I am very grateful for any advice.
Many thanks.
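P.S. In case it helps anyone answering: a minimal sketch of how I think the one-line-per-family table (ID, group, n, y) could be built from individual-level records using base R only. The toy data and the names relatives, family_id, affected and families are made up for illustration and are not my real data set, so please treat this as an assumption about the layout rather than a tested workflow.

```r
# Hypothetical individual-level data: one row per relative.
# Column names and values are invented purely for illustration.
relatives <- data.frame(
  family_id = c(1, 1, 1, 2, 2, 3, 4, 4, 4, 4),
  group     = c(rep("explained", 5), rep("unexplained", 5)),
  affected  = c(1, 0, 0, 0, 0, 1, 0, 1, 0, 0)  # 1 = affected, 0 = unaffected
)

# Collapse to one row per family:
#   n = number of relatives examined, y = number of relatives affected.
n_tab <- aggregate(affected ~ family_id + group, data = relatives, FUN = length)
y_tab <- aggregate(affected ~ family_id + group, data = relatives, FUN = sum)
names(n_tab)[3] <- "n"
names(y_tab)[3] <- "y"

families <- merge(n_tab, y_tab, by = c("family_id", "group"))
names(families)[1] <- "ID"
families$group <- factor(families$group)  # grouping variable as a factor

print(families)

# Naive pooled proportion affected per group (ignores clustering);
# this is the quantity the unadjusted 2x2 test compares.
pooled <- with(families, tapply(y, group, sum) / tapply(n, group, sum))
print(pooled)
```

The resulting table has the four columns described in my plan, so it should be usable as the data argument in the two calls I listed, if I have understood the cbind(y, n - y) ~ group interface correctly.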