Hi All! Thanks for reading.
I want to test whether month of sample collection (April, May, June, or July) is related to presence/absence data (coded as binary dummy variables). I ran a simple logistic regression, but realized that the coefficient table (p-values, etc.) only reports May, June, and July each compared to April. It seems May is the only month significantly different from April... but is there a way to compare all months to each other?
Code and initial results are posted below for anyone interested in helping.
Thanks!
## OPEN FILES ##
data <- read.table(file=file.choose(), header=TRUE, sep="\t")
## CHECKING DATA ##
dim(data)
colnames(data)
## CALCULATING FREQUENCIES PER MONTH ##
freq <- table(data$Month, data$Present.Absent)
## reorder rows from alphabetical (April, July, June, May) to calendar order
freq <- freq[c(1,4,3,2),]
## CALCULATING PREVALENCES PER MONTH ##
prev <- freq[,2]/rowSums(freq)
## PLOTTING PREVALENCES ##
barplot(height=prev)
help(barplot)
## LOGISTIC REGRESSION FOR MONTH ##
glm.month <- glm(data$Present.Absent ~ data$Month, family=binomial())
anova(glm.month, test="Chisq")
summary(glm.month)
________________________________________________________________
Output =
           Df Deviance Resid. Df Resid. Dev Pr(>Chi)
NULL                         273     180.77
data$Month  3   15.255       270     165.51 0.001611 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

> summary(glm.month)

Call:
glm(formula = data$Present.Absent ~ data$Month, family = binomial())

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.6081   0.2604   0.3632   0.4474   0.6980

Coefficients:
               Estimate Std. Error z value Pr(>|z|)
(Intercept)      3.3673     0.5872   5.734 9.79e-09 ***
data$MonthJuly  -0.6817     0.8372  -0.814  0.41548
data$MonthJune  -1.1160     0.7273  -1.534  0.12495
data$MonthMay   -2.0794     0.6516  -3.191  0.00142 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 180.77  on 273  degrees of freedom
Residual deviance: 165.51  on 270  degrees of freedom
AIC: 173.51

Number of Fisher Scoring iterations: 6
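
A minimal sketch of one way to compare every pair of months from a single model like the one above, using only base R. Each pairwise difference in log-odds is a linear contrast of the fitted coefficients, tested with a Wald z-test, and the six p-values are then Holm-adjusted for multiple comparisons. The data frame `dat`, the column name `Present`, and all counts are hypothetical stand-ins (the real file is not available here), and Holm is just one reasonable choice of adjustment.

```r
## Hypothetical counts per month (NOT the real data): total samples and positives
n_total   <- c(April = 60, May = 80, June = 70, July = 64)
n_present <- c(April = 55, May = 50, June = 56, July = 54)
lev <- names(n_total)

## One row per sample; Month is a factor in calendar order, April as reference
dat <- data.frame(
  Month   = factor(rep(lev, times = n_total), levels = lev),
  Present = unlist(lapply(lev, function(m) {
    rep(c(1, 0), times = c(n_present[[m]], n_total[[m]] - n_present[[m]]))
  }))
)

## Same kind of model as in the post, written with the data= argument
fit <- glm(Present ~ Month, data = dat, family = binomial())

## Design row for each month, so that X %*% coef gives that month's log-odds
X <- model.matrix(~ Month, data = data.frame(Month = factor(lev, levels = lev)))
b <- coef(fit)
V <- vcov(fit)

## Wald z-test for every pair of months (difference in log-odds)
pairs <- combn(seq_along(lev), 2)
res <- do.call(rbind, lapply(seq_len(ncol(pairs)), function(k) {
  i <- pairs[1, k]
  j <- pairs[2, k]
  cvec <- X[j, ] - X[i, ]                    # contrast: month j minus month i
  est  <- sum(cvec * b)
  se   <- sqrt(drop(t(cvec) %*% V %*% cvec))
  z    <- est / se
  data.frame(contrast  = paste(lev[j], "-", lev[i]),
             estimate  = est,
             std.error = se,
             z         = z,
             p.value   = 2 * pnorm(-abs(z)))
}))

## Adjust the six p-values for multiple comparisons (Holm's method)
res$p.holm <- p.adjust(res$p.value, method = "holm")
print(res)
```

With the real data, the same steps should apply after fitting `glm(Present.Absent ~ Month, data = data, family = binomial())`, provided `Month` is a factor. Packages such as multcomp (`glht` with Tukey contrasts) or emmeans offer comparable pairwise comparisons if installing a package is an option.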