I have been trying to familiarize myself with statistics, but I'm getting confused about which methods to use for which questions.
For instance, if I want to determine which of a set of tested parameters (A, B, C, ...) are significantly different between two groups, would I use a t-test, a Mann-Whitney U test, or something else? Also, I read that when several significance tests are run at once, I need to apply an adjustment (Bonferroni, Benjamini-Hochberg, Holm) to the p-values. However, when I do these adjustments, my BH-adjusted p-values are crazy high (the smallest is around 0.45), and I'm not sure what that means. Interpreting the results of a lot of statistical tests is a bit of a challenge for me as well. Also, is this type of data set considered multivariate if I'm only grouping by type (group) and by nothing else, like age or sex?
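To make the adjustment step concrete, here is a minimal sketch of the three corrections as I understand them, written in plain Python rather than R. The function names and the example p-values are made up for illustration and are not my real data.

```python
def bonferroni(pvals):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]


def holm(pvals):
    """Step-down adjustment: the smallest p is multiplied by m, the next by m - 1, ..."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        value = min(1.0, (m - rank) * pvals[i])
        # Enforce monotonicity: an adjusted value never drops below an earlier one.
        running_max = max(running_max, value)
        adjusted[i] = running_max
    return adjusted


def benjamini_hochberg(pvals):
    """Step-up adjustment: the k-th smallest p is multiplied by m / k."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, keeping a running minimum.
    for rank in range(m - 1, -1, -1):
        i = order[rank]
        value = pvals[i] * m / (rank + 1)
        running_min = min(running_min, value)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values, one per tested parameter.
raw_p = [0.01, 0.04, 0.03, 0.20]
print(bonferroni(raw_p))
print(holm(raw_p))
print(benjamini_hochberg(raw_p))
```

In all three functions an adjusted value is never smaller than the raw p-value it came from, so the adjustment can only move results away from significance.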
What would I need to do if I wanted to answer questions such as:
1. Does group B show a significant change from group A?
2. Are the tested parameters dependent on one another?
3. Does the data follow a normal distribution? If not, what kind of distribution best fits the data?
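For question 1, this is roughly what I think a Mann-Whitney U comparison of one parameter between two groups involves. It is only a sketch under stated assumptions: the p-value uses the large-sample normal approximation with no tie correction and no continuity correction, so it will be rough for small groups, and the helper names and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # positions i..j are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, approximate two-sided p-value) for two independent samples."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n1])
    u_a = rank_sum_a - n1 * (n1 + 1) / 2
    u_b = n1 * n2 - u_a
    u = min(u_a, u_b)
    # Normal approximation (assumption: no tie or continuity correction).
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u  # z <= 0 because u is the smaller statistic
    p_value = min(1.0, 2 * NormalDist().cdf(z))
    return u, p_value


# Hypothetical measurements of one parameter in groups A and B.
group_a = [1, 2, 3, 4, 5]
group_b = [6, 7, 8, 9, 10]
print(mann_whitney_u(group_a, group_b))
```

Running this once per parameter would give one raw p-value per parameter, which is the list that then gets adjusted for multiple testing.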
General statistical questions:
4. What is the difference between a similarity matrix and performing various t-tests?
5. Why is clustering so important? What kinds of questions could clustering help answer?
6. How do similarity matrices and distance matrices differ?
7. How are QQ-plots best used?
8. When should data be transformed before performing an analysis on it? Are there certain transformations that are more ideal for answering certain questions about the data?
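For question 7, this is my current understanding of what a normal QQ-plot actually plots: the sorted observations against the quantiles a normal distribution would give at the same positions. The sketch below only computes the points (no plotting); the plotting position (i - 0.5) / n is one common convention that I am assuming, and the sample values are made up.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(sample):
    """Return (theoretical quantile, observed value) pairs for a normal QQ-plot."""
    n = len(sample)
    ordered = sorted(sample)
    # Reference normal with the sample's own mean and standard deviation.
    reference = NormalDist(mean(sample), stdev(sample))
    points = []
    for i, observed in enumerate(ordered, start=1):
        prob = (i - 0.5) / n  # assumed plotting position, strictly between 0 and 1
        points.append((reference.inv_cdf(prob), observed))
    return points


# Hypothetical measurements of a single parameter.
sample = [2.1, 3.4, 1.9, 5.0, 2.8]
for theoretical, observed in normal_qq_points(sample):
    print(round(theoretical, 3), observed)
```

If the data were close to normal, these points should fall near the line where the theoretical and observed values are equal.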
Clearly, I'm thoroughly confused. I am currently reading Numerical Ecology with R by Daniel Borcard et al. I am also consulting about 7 other books to try to understand the things that Borcard doesn't explain. I'm getting very lost. Please HELP!!