Is Pearson correlation suitable for my analysis? Suggestions appreciated!

#1
Hi guys, it's been weeks since I started to worry about my data analysis.
I work with microarray data analysis, meaning that I have several datasets, each containing a set of genes (around 30-1034, depending on treatment). In my case I'm working with 5 independent treatments, namely 1. a "parental" toxin; 2. a metabolite from this toxin (called X); 3. another metabolite (called Y); 4. a third metabolite (Z); 5. a control group. Furthermore, I worked with 3 time points for each treatment (6, 12 and 18 h of incubation with cells).
My variables/treatments are composed of different genes (cases) that can be upregulated (positive values) or downregulated (negative values).
My plan was to use something like a "percentage similarity" to relate my data. For example, the response to metabolite "X" after 6 h is 35% similar to the response to the parental toxin after 12 h. That could be very helpful for analyzing temporal relations between metabolism and the response of the cells.
At first I tried Pearson correlation to see the similarity between the numerous treatments; I could then convert it to a percentage through the coefficient of determination (r^2), which would represent the percentage of variation shared between the variables. The thing is, I can't consider these variables as dependent on one another, because there is no causal relation between them... Right?
I also tried cosine similarity, which gave me pretty much the same thing.
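To make the comparison concrete, here is a minimal Python sketch of the two calculations, using made-up fold-change values for five genes (the names `toxin_12h` and `metab_x_6h` and all the numbers are hypothetical, not my real data). It computes Pearson's r, r^2 and the cosine similarity. Both measures are symmetric in their two arguments, and cosine similarity is Pearson's r without the mean-centering step, which may be why the two came out so close for me.

```python
import math

# Hypothetical log fold-changes for the same five genes in two treatments
# (made-up numbers for illustration only).
toxin_12h = [1.8, -0.6, 2.4, -1.1, 0.3]
metab_x_6h = [1.1, -0.2, 1.9, -1.5, 0.8]


def center(values):
    """Subtract the mean from every value."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def cosine(x, y):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def pearson(x, y):
    """Pearson's r, written as the cosine of the mean-centered vectors."""
    return cosine(center(x), center(y))


r = pearson(toxin_12h, metab_x_6h)
print(f"Pearson r         = {r:.3f}")
print(f"r^2 (shared var.) = {r * r:.3f}")
print(f"cosine similarity = {cosine(toxin_12h, metab_x_6h):.3f}")

# Swapping the arguments gives the same r: no variable has to be
# declared "dependent" to compute a correlation.
print(f"r with arguments swapped = {pearson(metab_x_6h, toxin_12h):.3f}")
```

Note that r^2 throws away the sign of r, so two treatments with exactly opposite responses would also score as highly "similar" on that scale.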
I also used binary similarity measures (Ochiai, Jaccard, simple matching, Dice, pretty much every option in SPSS), assigning 1 for presence of the gene and 0 for absence, to obtain percentage similarities, but I ended up with crazy results. Also, it's not good because I can only use 2 values (1 for presence and 0 for absence), without considering whether the gene was up- or downregulated.
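A small Python sketch of the problem with the binary approach, again with hypothetical gene names and directions (+1 for up, -1 for down; none of this is my real data). Jaccard and Dice only look at which genes are present in both lists, so a gene that goes up in one treatment and down in the other still counts as a full match.

```python
# Hypothetical gene lists: gene name -> direction (+1 up, -1 down).
toxin_genes = {"GENE_A": +1, "GENE_B": -1, "GENE_C": +1}
metab_genes = {"GENE_B": +1, "GENE_C": +1, "GENE_D": -1}


def jaccard(a, b):
    """Shared genes divided by all genes seen in either list."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def dice(a, b):
    """Twice the shared genes divided by the sum of both list sizes."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


shared = set(toxin_genes) & set(metab_genes)
# Of the shared genes, how many actually changed in the same direction?
same_direction = sum(1 for g in shared if toxin_genes[g] == metab_genes[g])

print(f"Jaccard = {jaccard(toxin_genes, metab_genes):.2f}")
print(f"Dice    = {dice(toxin_genes, metab_genes):.2f}")
print(f"shared genes = {len(shared)}, same direction = {same_direction}")
```

In this toy case two genes are shared but only one of them responds in the same direction, and neither coefficient reflects that.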
My supervisor thought about using the slope of the linear regression, Pearson, or any graph (?!) to obtain an angle and use that as a relationship measure. But I can't plot these values, because there is no dependent and independent variable. I tried plotting both ways (toxin vs metabolite and metabolite vs toxin) to see what I could do with the equations, but I'm not very confident.
So, is there any approach you would suggest? Any suggestion from statistical experts would help a poor biologist a lot! :yup:
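Here is a rough Python sketch of what happens when I fit the line both ways, with the same kind of made-up values as before (the variable names and numbers are hypothetical). The two slopes, and therefore the two angles, differ depending on which treatment goes on the x axis, but their product equals r^2, which does not depend on the direction.

```python
import math

# Hypothetical log fold-changes for the same five genes (illustration only).
toxin = [1.8, -0.6, 2.4, -1.1, 0.3]
metab = [1.1, -0.2, 1.9, -1.5, 0.8]


def sums_of_squares(x, y):
    """Return (Sxx, Syy, Sxy) computed around the means of x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def slope(x, y):
    """Least-squares slope of y regressed on x."""
    sxx, _, sxy = sums_of_squares(x, y)
    return sxy / sxx


b_yx = slope(toxin, metab)  # metabolite regressed on toxin
b_xy = slope(metab, toxin)  # toxin regressed on metabolite

angle_yx = math.degrees(math.atan(b_yx))
angle_xy = math.degrees(math.atan(b_xy))

sxx, syy, sxy = sums_of_squares(toxin, metab)
r_squared = sxy ** 2 / (sxx * syy)

print(f"slope metab~toxin = {b_yx:.3f} (angle {angle_yx:.1f} deg)")
print(f"slope toxin~metab = {b_xy:.3f} (angle {angle_xy:.1f} deg)")
print(f"product of slopes = {b_yx * b_xy:.3f}")
print(f"r^2               = {r_squared:.3f}")
```

The slope also changes if one treatment simply has larger fold-changes overall, so an angle taken from it mixes scale with similarity, whereas r does not.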