Classification model according to multivariate presence/absence dataset

Hi there!

I am currently working on the analysis of a two-part experiment looking at bacterial communities in diatoms under different temperatures. It is quite a complicated one, and I was hoping I could get some input on how to analyze it. First, however, let me explain how it was designed:

Part 1: Monocultures.
Here, 7 single species of diatoms were cultured in artificial sea water under 4 different temperatures (16, 19, 22, and 25 C). This was done in triplicate. We analyzed the bacterial communities in each species, and this resulted in a presence-absence (1's and 0's) vector of bacteria for each sample of diatom.

Part 2: Metacommunities

Here, we assembled identical diatom communities from those 7 initial species. These were kept at 16, 19, or 22 C and sampled. The temperature was then raised by 3 degrees and the communities were sampled again. Finally, they were allowed to recover and were sampled one last time.

The problem is that we do not have replicates for this part of the experiment.

What I would like to do is as follows:
1) make a hypothetical 'null' bacterial community for the basic diatom metacommunity at each initial temperature. This would be done by looking at which bacteria are present in each diatom monoculture at a given temperature and adding them.

2) Compare the distribution of bacteria in each metacommunity to the null models, and provide a metric that shows how "deviated" the bacterial community is from the null expectation.
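
To make step 1 concrete, here is a minimal Python sketch. It assumes each monoculture sample is a list of 0/1 values over the same ordered set of bacterial taxa, and it reads "adding" as a logical OR: a taxon is expected in the null community if it occurs in any monoculture sample at that temperature. The function name and the toy data are hypothetical, not taken from the experiment.

```python
def null_community(samples):
    """Union (logical OR) of equal-length presence/absence vectors."""
    if not samples:
        raise ValueError("need at least one sample")
    n_taxa = len(samples[0])
    if any(len(s) != n_taxa for s in samples):
        raise ValueError("all samples must cover the same taxa")
    # A taxon is present in the null community if any sample has it.
    return [int(any(s[j] for s in samples)) for j in range(n_taxa)]


# Toy data (hypothetical): three monoculture samples at one
# temperature, scored for five bacterial taxa.
monocultures_16C = [
    [1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 0],
]
null_16C = null_community(monocultures_16C)
print(null_16C)
```

A stricter null (for example, requiring a taxon in all three replicates of a species before counting it) would only change the rule inside the list comprehension.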

I am not sure what methods I should use. I am fluent in Matlab... any suggestions?


Super Moderator

For your first question - as I see it - you have two options.

1) create a factor called species and analyse all species together with a temperature factor, then do appropriate post hoc tests

2) analyse your species separately using temperature as your factor (4 groups)

In each case I would be inclined to use a GLM with a binomial error distribution (logit link).

The metacommunities analysis is probably more straightforward (if you have access to PRIMER v6).

This analysis suits ANOSIM.

What I would do is:

>Create your NULL community alongside your species columns.

>Create a dissimilarity matrix (Sørensen, probably, in this case for P/A data).

>Run ANOSIM to test for no difference between communities. You can also specify pairwise comparisons if you like.

>The amount of deviation can be inferred from the dissimilarity scores in your matrix. If you have access to the PERMANOVA add-on, you could also run PERMDISP, which specifically tests for differences in the amount of variation around group centroids or medians.
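
If you do not have PRIMER to hand, the steps above can be sketched in plain Python. This is a rough illustration, not PRIMER's implementation: it computes Sørensen dissimilarity between presence/absence vectors, the ANOSIM R statistic from ranked dissimilarities, and a simple permutation p-value. All function names and the toy data are hypothetical, and the sketch assumes at least one group has more than one sample, since R needs within-group pairs.

```python
import random
from itertools import combinations


def sorensen_dissimilarity(x, y):
    """(b + c) / (2a + b + c) for two presence/absence vectors."""
    a = sum(1 for i, j in zip(x, y) if i and j)      # shared taxa
    b = sum(1 for i, j in zip(x, y) if i and not j)  # only in x
    c = sum(1 for i, j in zip(x, y) if j and not i)  # only in y
    if 2 * a + b + c == 0:
        return 0.0  # convention (an assumption): two empty samples match
    return (b + c) / (2 * a + b + c)


def average_ranks(values):
    """Ranks starting at 1, with ties given their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def anosim_r(samples, groups):
    """ANOSIM R: (mean between rank - mean within rank) / (M / 2)."""
    pairs = list(combinations(range(len(samples)), 2))
    d = [sorensen_dissimilarity(samples[i], samples[j]) for i, j in pairs]
    ranks = average_ranks(d)
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    if not within or not between:
        raise ValueError("need both within- and between-group pairs")
    m = len(pairs)
    return (sum(between) / len(between) - sum(within) / len(within)) / (m / 2)


def anosim_permutation_p(samples, groups, n_perm=999, seed=0):
    """Observed R and a p-value from shuffling the group labels."""
    rng = random.Random(seed)
    observed = anosim_r(samples, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if anosim_r(samples, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy data (hypothetical): two groups of three samples, four taxa.
samples = [
    [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0],
    [0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1],
]
groups = ["A", "A", "A", "B", "B", "B"]
r_stat, p_value = anosim_permutation_p(samples, groups)
print(r_stat, p_value)
```

R near 1 means samples within a group are more similar to each other than to samples in other groups; R near 0 means no separation. For a single metacommunity sample compared against its null, the Sørensen dissimilarity itself is the simplest deviation score.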

Lack of replication is not an issue here, as there are options to fit a model without replication.