In experiments involving human subjects, participants are sampled from a population. The same can be said of the stimuli used in experiments involving, for example, linguistic material. Because in most cases we are interested in generalizing the experimental effects to entire populations (e.g., of participants and of items), the participants and items that form the basis of an experiment are treated in the statistical analysis as random factors (random effects).

It is our understanding that including participants and items in the random-effects structure of a mixed-effects regression addresses not only generalizability, but also heteroscedasticity and non-spherical error variance. If these are not taken into account, the variance (and thus the standard error) of the coefficients may be underestimated, which increases the probability of a false positive (e.g., Baayen, Davidson, and Bates, 2008).
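To make the concern about underestimated standard errors concrete, here is a minimal simulation sketch. Everything in it is an assumption for illustration and not our actual design: two conditions, each item belonging to one condition only, normally distributed item shifts, no true condition effect, and made-up sample sizes and variances. It compares the standard error obtained when items are ignored with the spread of the estimate actually observed across repeated simulated experiments.

```python
import random
import statistics


def simulate_once(rng, n_items_per_cond=10, n_obs=20, sd_item=1.0, sd_resid=1.0):
    """Simulate one experiment with NO true condition effect.

    Returns (difference in condition means, naive standard error).
    All default values are hypothetical.
    """
    groups = []
    for _cond in range(2):
        obs = []
        for _item in range(n_items_per_cond):
            # Shift shared by every observation of this item: this is what
            # makes observations within an item correlated.
            item_shift = rng.gauss(0.0, sd_item)
            obs.extend(item_shift + rng.gauss(0.0, sd_resid) for _ in range(n_obs))
        groups.append(obs)
    a, b = groups
    diff = statistics.fmean(b) - statistics.fmean(a)
    # Naive SE: treats all observations as independent, ignoring items.
    naive_se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return diff, naive_se


rng = random.Random(1)
results = [simulate_once(rng) for _ in range(500)]
empirical_sd = statistics.stdev([d for d, _ in results])
mean_naive_se = statistics.fmean([se for _, se in results])

print(f"SD of the estimate across simulations: {empirical_sd:.3f}")
print(f"average naive standard error:          {mean_naive_se:.3f}")
```

Under these assumptions the naive standard error comes out well below the actual spread of the estimate, which is the mechanism behind the inflated false-positive rate. The size of the gap depends on the assumed item variance and on whether the predictor varies between or within items.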

Question 1:

In some circumstances, ALL items in the population are included in the experiment. This is the case in a current study of ours, in which the "items" are all possible sequences of three consecutive button presses in a sequence-learning paradigm (e.g., one item is the sequence 1-1-1, another is 1-1-2, etc.; the dependent measure is the reaction time to each sequence). In this case, where all items in the population are used in the experiment, would it still be warranted, necessary, or problematic to include items as a random effect?
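To show what "all items in the population" means here, the full item set can be enumerated directly. This is only a sketch: the number of buttons, `N_BUTTONS`, is a hypothetical value chosen for illustration and should be replaced by the real one.

```python
from itertools import product

N_BUTTONS = 4  # hypothetical number of response buttons (an assumption)

# Every ordered sequence of three presses, repeats allowed: 1-1-1, 1-1-2, ...
items = [
    "-".join(str(button) for button in seq)
    for seq in product(range(1, N_BUTTONS + 1), repeat=3)
]

print(len(items))   # N_BUTTONS ** 3 items in total
print(items[:2])    # the first two items: '1-1-1' and '1-1-2'
```

The point is that the item set is finite and fully enumerable, so the experiment exhausts it instead of sampling from it.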

Question 2:

Should one always test whether including items as a random effect is warranted, given that there may be heteroscedasticity associated with this variable, even when all items in the population are included in the experiment, as in the case described just above?
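One common way to run such a test is a likelihood ratio comparison between a model without and a model with by-item random intercepts. The sketch below is an assumption about how the test could be done, not a description of our analysis. The function name and the log-likelihood values are made up for illustration, and the two models are assumed to share the same fixed effects and to have been fitted in the same way.

```python
import math


def lrt_random_intercept(loglik_without, loglik_with):
    """Likelihood ratio test for one added variance component.

    Returns (statistic, p-value from chi-square(1), boundary-adjusted p-value).
    """
    stat = max(0.0, 2.0 * (loglik_with - loglik_without))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_chi2_1 = math.erfc(math.sqrt(stat / 2.0))
    # The null value (variance = 0) lies on the boundary of the parameter
    # space, so a 50:50 mixture of chi-square(0) and chi-square(1) is often
    # used instead, which halves the p-value when the statistic is positive.
    p_boundary = 0.5 * p_chi2_1 if stat > 0 else 1.0
    return stat, p_chi2_1, p_boundary


# Made-up log-likelihoods, purely for illustration.
stat, p_unadjusted, p_boundary = lrt_random_intercept(-1520.4, -1512.1)
print(f"LR statistic = {stat:.2f}")
print(f"p (chi-square, 1 df) = {p_unadjusted:.2g}")
print(f"p (boundary-adjusted) = {p_boundary:.2g}")
```

Whether such a test should decide the inclusion of items at all, when the item set is exhaustive, is exactly what we are asking.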

Thanks!

Antoine Tremblay

Brain and Language Lab

Georgetown University