Transforming binary detection variables for a temporal PCA?

I have data that I would like to analyze with PCA, but its current form makes that tricky. There are five sites, each sampled once a month from January through October (10 months of data in total). For each site and month I have the detection of the target species (yes/no, binary) and continuous measurements such as DO, nitrate, etc. (in ppm). Sites 1 through 5 all share this structure. I have already seen some interesting relationships in these data via binary logistic regression.

I thought that I might tweak the data a little to fit it into a PCA. This is tricky, because the only detection information each site has for each month is the binary yes/no. I thought I might be able to generate a detection variable for each point by assigning it the average number of detections for that month. For example, if January were Site One: 0, Site Two: 0, Site Three: 1, Site Four: 1, Site Five: 0, the proportion for that month would be 2/5 = 0.4, and 0.4/5 = 0.08. So maybe each site could be assigned a detection probability of 0.08 in place of its binary detection for PCA purposes?
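To make the arithmetic concrete, here is a minimal Python sketch of the calculation described above. The function name `monthly_detection_values` and the variable names are made up for illustration, and the January values are just the example from this post, not real data.

```python
def monthly_detection_values(detections):
    """Return (monthly proportion, per-site value) for one month.

    `detections` is a list of 0/1 flags, one per site (hypothetical layout).
    """
    n_sites = len(detections)
    # Share of sites with a detection this month, e.g. 2/5 = 0.4
    proportion = sum(detections) / n_sites
    # The second step proposed in the post: divide the proportion by the
    # number of sites again, e.g. 0.4/5 = 0.08
    per_site = proportion / n_sites
    return proportion, per_site


# Example month from the post: Sites One..Five = 0, 0, 1, 1, 0
january = [0, 0, 1, 1, 0]
proportion, per_site = monthly_detection_values(january)
print(round(proportion, 2), round(per_site, 2))
```

Note that every site in a given month ends up with the same value, so the resulting variable only varies between months, not between sites.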

This may be totally wrong; I am just attempting to think outside the box. Please let me know if you have better ideas or an alternative.


Active Member
Principal Component Analysis is a method of reducing a problem with many variables down to a few new variables (the principal components) that capture the most important variation, with the least loss of information.
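As a rough illustration of that reduction, here is a minimal sketch of PCA on standardized continuous variables using NumPy's SVD. The function name `pca` and the random input matrix are assumptions for demonstration only: the numbers are made-up placeholders (50 rows standing in for 5 sites x 10 months, 4 columns standing in for measurements such as DO and nitrate), not real measurements.

```python
import numpy as np


def pca(X, n_components=2):
    """Project the rows of X onto the top principal components."""
    X = np.asarray(X, dtype=float)
    # Standardize each column so variables in different units are comparable
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # SVD of the standardized data: rows of Vt are the component directions,
    # ordered from most to least variance explained
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    # Fraction of the total variance captured by each component
    explained = s**2 / np.sum(s**2)
    components = Vt[:n_components]
    # Coordinates of each sample in the reduced space
    scores = Z @ components.T
    return scores, components, explained[:n_components]


# Placeholder data only: 50 site-month samples, 4 continuous variables
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
scores, components, explained = pca(X, n_components=2)
print(scores.shape, explained)
```

The `explained` values show how much of the total variance the retained components keep, which is one way to judge how much information is lost by dropping the rest.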