How to use a Distance Matrix as an Independent Variable?


Let's say I have 100 locations. I know the distance between each pair of locations (i.e. a distance matrix). How can I use the distance matrix (or some derivative of it) as an independent variable to assess its significance (if any) with respect to the dependent/response variable(s)?

Any ideas?

Could you give some more information? Is it a matrix or a vector? How is each distance distributed?
If it is a vector you could probably start by assuming that the vector is multivariate normal.
The matrix's values are the Euclidean distances between all points (NxN, or here, 100x100).

Is it a matrix or a vector?
I believe it is a matrix.

How is each distance distributed?
Um, the distances are between locations on the Earth.

My dependent variables are Population Mean and Population Trend (slope) of a mammal. I have other independent variables measuring various aspects of habitat. I have the suspicion that these distances may be 'important' with regard to the dependent variables. I am just looking for a way, or ways, to do this, or at least to think about it.

Thank you kindly,
Hmm. I think modeling a matrix is pretty complicated, as you don't need a covariance matrix anymore, but something in 3 dimensions. Therefore you should probably see if you could write your model in vector form. Perhaps by the vech operator:
Besides, you still haven't described your problem yet. What is your model? I don't really get why the measurements are random either.
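
To make the vech idea above concrete, here is a minimal sketch in Python/NumPy. The coordinates are made up for illustration, and the helper name `vech` and its `include_diagonal` flag are my own choices, not something from this thread. It just stacks the lower triangle of a symmetric matrix into a vector.

```python
import numpy as np

def vech(matrix, include_diagonal=False):
    """Stack the lower triangle of a square symmetric matrix into a 1-D vector."""
    m = np.asarray(matrix, dtype=float)
    if m.ndim != 2 or m.shape[0] != m.shape[1]:
        raise ValueError("expected a square matrix")
    # k=0 keeps the diagonal (the usual vech); k=-1 drops it, which suits a
    # distance matrix because its diagonal is all zeros.
    k = 0 if include_diagonal else -1
    rows, cols = np.tril_indices(m.shape[0], k=k)
    # Elements come out row by row; the textbook vech stacks column by column,
    # so the order differs but the set of elements is the same.
    return m[rows, cols]

# Hypothetical coordinates for 4 sites (illustration only).
coords = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [0.0, 5.0]])
diff = coords[:, None, :] - coords[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))  # 4x4 Euclidean distance matrix

pairwise = vech(dist)  # 4*3/2 = 6 unique pairwise distances
```

Note that with 100 sites this gives 100*99/2 = 4950 values, one per pair of sites rather than one per site, so the vector does not line up directly with a per-site response.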

"Um, the distances are between locations on the Earth." Yeah, but we are doing statistics here, so you need statistical distributions. If you had just one measurement, how would you model it?
Thanks for your reply. I know my ambitions seem obtuse. Right now I don't have a model; I am trying to determine if there is a way I can use these distances as an independent variable. With or without it, I will probably use AIC for model selection with the other variables.

The problem I am encountering is that my other variables are, mostly, demonstrating a fair amount of spatial autocorrelation. This got me thinking that maybe these 'distances' might be 'important' in the model. I just don't know how I would use them, or if I can. Thanks for your help. Perhaps adding them as a vector would work; I obviously need to read more about that.

I don't really get why the measurements are random either.
They are not random; they were selected by the mammal based on their suitability with regard to its life history.

I really appreciate your help,
I have been reading about Mantel tests and partial-Mantel tests, but have not come to a conclusion yet. I'd be interested in reading what others have to say.



Mantel tests could work if you have two matrices (geographic distances and species, for example). From what I can gather, you have one Euclidean distance matrix and mean population size data from each site, in which case I don't think a Mantel test is what you are after.
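
For reference, this is roughly what a Mantel test computes when two matrices are available. It is a minimal NumPy sketch under my own assumptions: Pearson correlation, a one-sided permutation p-value, and simulated data with a hypothetical habitat variable. It is not a substitute for an established implementation.

```python
import numpy as np

def mantel(dist_a, dist_b, n_perm=999, seed=0):
    """Permutation Mantel test: Pearson r between two distance matrices."""
    a = np.asarray(dist_a, dtype=float)
    b = np.asarray(dist_b, dtype=float)
    if a.ndim != 2 or a.shape[0] != a.shape[1] or a.shape != b.shape:
        raise ValueError("need two square matrices of the same size")
    n = a.shape[0]
    lower = np.tril_indices(n, k=-1)  # each pair of sites counted once
    r_obs = np.corrcoef(a[lower], b[lower])[0, 1]

    rng = np.random.default_rng(seed)
    n_extreme = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        # Relabel the sites of one matrix: rows and columns move together.
        b_perm = b[np.ix_(p, p)]
        r_perm = np.corrcoef(a[lower], b_perm[lower])[0, 1]
        if r_perm >= r_obs:
            n_extreme += 1
    # One-sided p-value; the observed arrangement counts as one permutation.
    p_value = (n_extreme + 1) / (n_perm + 1)
    return r_obs, p_value

# Simulated example (made-up data): 30 sites and one habitat variable
# that follows an east-west gradient plus noise.
rng = np.random.default_rng(42)
coords = rng.uniform(0, 100, size=(30, 2))
geo = np.sqrt(((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1))
habitat = 0.5 * coords[:, 0] + rng.normal(0, 5, size=30)
habitat_diff = np.abs(habitat[:, None] - habitat[None, :])

r, p = mantel(geo, habitat_diff)
```

Whole sites are permuted rather than individual matrix entries because the pairwise distances are not independent of one another.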

If you have a single response variable and you are firmly set on using a distance matrix, then you could produce an NMDS on the Euclidean distances and overlay your means in a bubble plot. However, I would be inclined to first look at geographic distance and means in simple regression analyses to determine what type of relationship, if any, there is between the two variables. Producing a distance matrix from these geographic distances doesn't make much ecological sense to me, because in Euclidean space sites that are close together will group close together, and likewise the farther sites. We know this. However, if you had some more meaningful information about your sites (latitude, mean annual temperature, rainfall, etc.) and incorporated it into your analysis, it might be a better place to start.

At the moment (unless I have missed something obvious) you have a univariate study (distance vs. population means) and you should treat it as such.
Yes, I believe that my tests will be of a univariate nature due to the limitations imposed by my response variable (population counts). I have various other independent variables, including some that are derivatives of distance, but even when my sites are spatially independent (i.e. don't overlap) there is a fair level of spatial autocorrelation. Then I got the idea "Hey, maybe this distance (matrix) is telling the story." However, it now appears that might not be the case.
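
One way to put a number on the spatial autocorrelation described above is Moran's I, which uses the distance matrix as weights rather than as a predictor. Moran's I has not come up in this thread, so treat this as a side suggestion. The sketch below assumes inverse-distance weights, and the function name and toy data are made up for illustration.

```python
import numpy as np

def morans_i(values, dist):
    """Moran's I for one variable, using inverse-distance weights."""
    x = np.asarray(values, dtype=float)
    d = np.asarray(dist, dtype=float)
    n = x.size
    if d.shape != (n, n):
        raise ValueError("dist must be an n x n matrix matching values")
    off_diag = ~np.eye(n, dtype=bool)
    if np.any(d[off_diag] <= 0):
        raise ValueError("coincident sites: inverse distance is undefined")
    w = np.zeros_like(d)
    w[off_diag] = 1.0 / d[off_diag]  # closer sites get more weight
    z = x - x.mean()
    # I = (n / sum of weights) * (sum_ij w_ij z_i z_j) / (sum_i z_i^2)
    return (n / w.sum()) * (z @ w @ z) / (z @ z)

# Toy example: 10 sites evenly spaced along a line.
position = np.arange(10, dtype=float)
line_dist = np.abs(position[:, None] - position[None, :])

smooth = position.copy()               # values change gradually along the line
alternating = np.tile([1.0, -1.0], 5)  # neighbouring sites are opposite

i_smooth = morans_i(smooth, line_dist)
i_alternating = morans_i(alternating, line_dist)
```

Values above the null expectation of -1/(n-1) suggest that nearby sites are more alike than distant ones; values below it suggest the opposite.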

As far as the Mantel tests, the second non-Euclidean distance matrix could be created from any variable by calculating the distance (difference between values of the same unit), right?

Thanks again I enjoy the discussion,


As far as the Mantel tests, the second non-Euclidean distance matrix could be created from any variable by calculating the distance (difference between values of the same unit), right?
Correct. You can also use variables with different units, but this requires the data to be normalised prior to calculating your distance matrix.
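
As a rough illustration of the normalisation step, here is a minimal NumPy sketch. It assumes "normalised" means z-scoring each variable (other scalings exist), and the habitat numbers are invented for the example.

```python
import numpy as np

def standardized_distance_matrix(values):
    """Z-score each column, then return Euclidean distances between rows."""
    x = np.asarray(values, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # a single variable becomes one column
    sd = x.std(axis=0, ddof=1)
    if np.any(sd == 0):
        raise ValueError("a constant column cannot be standardized")
    z = (x - x.mean(axis=0)) / sd  # every column now has mean 0 and sd 1
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical habitat data for 5 sites: temperature (deg C), rainfall (mm).
habitat = np.array([
    [12.0,  800.0],
    [14.5,  650.0],
    [ 9.0, 1200.0],
    [11.0,  900.0],
    [15.0,  500.0],
])
habitat_dist = standardized_distance_matrix(habitat)
```

Without the z-scoring, rainfall would dominate the distances simply because its numbers are larger. With it, the result does not change if rainfall is recorded in metres instead of millimetres.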