Over the last few days I have tried to understand how contrasts are used, but the more I read, the more confused I get.

As far as I understand, a contrast represents a specific hypothesis about the relationship between the means of the K levels of a factor variable. This is done via dummy coding: for each hypothesis (or rather, each null hypothesis H0), I formulate an appropriate dummy variable that takes a certain value for each factor level. In other words, the contrast translates my H0 into a term of a multiple regression equation.

For example, if I have 4 levels and my H0 is that the first and the last means are equal, I could use the contrast c(-1,0,0,1) to test this.

I have several questions; any answer is very welcome:

1.) R always shows a contrast matrix with K-1 columns instead of a single contrast. Even if I define only a single contrast via contrasts(x) <- c(-1,0,0,1) and then display it, R does something strange and creates two additional contrasts, namely

      [,1]       [,2]        [,3]
    1   -1 -0.2697521  0.42099143
    2    0 -0.3256196 -0.80247857
    3    0  0.8651239 -0.03950429
    4    1 -0.2697521  0.42099143

Why is that? Do I always need K-1 contrasts, e.g., in ANOVA? Can't I test a single contrast?
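For reference, here is a minimal sketch that should reproduce the situation described in 1.). The factor `x` with three observations per level is made up for illustration, and the exact values of the extra columns may depend on the R version. The checks at the end only look at what the extra columns have in common; the `how.many` argument of `contrasts<-` is shown as one possible way to keep a single column, not as a recommendation.

```r
# Hypothetical factor with 4 levels (3 observations each), for illustration only
x <- factor(rep(1:4, each = 3))

# Assign a single contrast; R stores a full K-1 = 3 column matrix
contrasts(x) <- c(-1, 0, 0, 1)
cm <- contrasts(x)
print(cm)

# Every column sums to (numerically) zero
print(round(colSums(cm), 6))

# Off-diagonal entries of t(cm) %*% cm are (numerically) zero,
# i.e. the added columns are orthogonal to the supplied contrast
print(round(crossprod(cm), 6))

# Asking for only one column via the how.many argument
x1 <- factor(rep(1:4, each = 3))
contrasts(x1, how.many = 1) <- c(-1, 0, 0, 1)
print(contrasts(x1))
```

In this sketch the first column is the contrast that was supplied, and the remaining columns are filled in by R so that the matrix has K-1 columns.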

2.) On Wikipedia, for example, I found some rules for defining contrasts, namely:

a) the contrast coefficients should sum to zero and

b) the coefficients for the means to be combined/averaged must be the same in magnitude and direction.

If I now look at the default contrasts R uses for a factor variable with 4 levels, these contrasts contradict the rules:

      2 3 4
    1 0 0 0
    2 1 0 0
    3 0 1 0
    4 0 0 1

If we look at the first contrast (first column), for example, its coefficients sum to 1 and not to zero. Furthermore, I thought that these default contrasts compare each level with the baseline level. So shouldn't there be a "-1" instead of a "0" in the first row of the matrix?
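The matrix in 2.) can be printed with `contr.treatment(4)`. The sketch below also fits a model on made-up, balanced toy data (the values of `y` are hypothetical) to check numerically what the coefficients under this default coding correspond to, despite the column sums not being zero.

```r
# Default (treatment) coding for a 4-level factor
k <- contr.treatment(4)
print(k)
print(colSums(k))  # each column sums to 1, not 0

# Made-up toy data: 4 groups, 3 observations each
g <- factor(rep(1:4, each = 3))
y <- c(1, 2, 3,  2, 3, 4,  4, 5, 6,  5, 6, 7)

fit <- lm(y ~ g)
group_means <- tapply(y, g, mean)

# Model coefficients for levels 2..4 ...
print(coef(fit))
# ... next to the differences of each group mean from the level-1 mean
print(group_means[-1] - group_means[1])
```

On this toy data the two printed sets of numbers can be compared directly, which is a quick way to see whether the coding really compares each level with the baseline.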

3.) What role does the absolute value of the contrast coefficients play? Assume, for example, that we want to compare the combined mean of the first two levels with the combined mean of the last two levels of a variable. What is the difference between the contrasts c(-0.5,-0.5,0.5,0.5) and c(-1,-1,1,1)? The F- and p-values are the same in both cases, but how is the magnitude of the contrast coefficients related to the actual difference between the combined means?
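The comparison in 3.) can be tried out as follows. This is only a sketch on made-up, balanced toy data; `fit_with` is a hypothetical helper, and the first contrast coefficient is picked by position (row 2 of the coefficient table, after the intercept).

```r
# Made-up toy data: 4 groups, 3 observations each
d <- data.frame(
  g = factor(rep(1:4, each = 3)),
  y = c(1, 2, 3,  2, 3, 4,  4, 5, 6,  5, 6, 7)
)

# Hypothetical helper: fit the model with a given first contrast and
# return the coefficient-table row for that contrast
fit_with <- function(cvec, data) {
  contrasts(data$g) <- cvec
  coef(summary(lm(y ~ g, data = data)))[2, ]
}

res_half <- fit_with(c(-0.5, -0.5, 0.5, 0.5), d)
res_one  <- fit_with(c(-1, -1, 1, 1), d)

# Same t value and p-value, different estimates
print(rbind(half = res_half, one = res_one))

# Difference between the combined means, computed directly
m <- tapply(d$y, d$g, mean)
mean_diff <- mean(m[3:4]) - mean(m[1:2])
print(mean_diff)
```

With this balanced toy data, the printed estimates can be compared with `mean_diff` to see which scaling of the coefficients returns the difference between the combined means directly and which returns a multiple of it.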

4.) If we test for a linear relationship across four (ordered) levels, the default contrast in R is c(-0.671,-0.224,0.224,0.671). I see that these coefficients increase linearly and sum to zero. But they represent a regression line with a positive slope, so how can they detect a linear decrease?
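The coefficients quoted in 4.) are the linear column of `contr.poly(4)`. As a small experiment, the sketch below fits an ordered factor to made-up toy data whose group means decrease, so that the sign of the estimated linear coefficient can be inspected; the data values are hypothetical.

```r
# Linear polynomial contrast for 4 ordered levels
lin <- contr.poly(4)[, ".L"]
print(round(lin, 3))

# Made-up toy data with decreasing group means (6, 5, 3, 2)
g <- factor(rep(1:4, each = 3), ordered = TRUE)
y <- c(5, 6, 7,  4, 5, 6,  2, 3, 4,  1, 2, 3)

# Ordered factors use polynomial contrasts by default
fit <- lm(y ~ g)
print(coef(fit))  # g.L is the coefficient of the linear contrast
```

Looking at the sign of `g.L` for decreasing versus increasing data may help to see how the direction of a trend shows up in the estimate rather than in the contrast coefficients themselves.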

Sorry for this bunch of naive questions; I am grateful for any answer, even a partial one.