One of the many possible answers to this question is that it depends on your perspective on measurement, and the kind of inferences you want to make.

1) Perhaps you are an **operationalist**. You operationally define "opinions" as scores on the questionnaire, and want only to make inferences about how the IV affects scores on the actual questionnaire. You are not interested in the question of whether the questionnaire is a reliable or valid measure of opinions; it's serving as your operational definition, and that's that. If this is your stance, you don't need to worry about "levels of measurement" or anything like that, although you should still check that the *distributional* assumptions of ANOVA are met (e.g., normality, homoscedasticity, and independence of errors).

2) On the other hand, maybe you take a **latent variable** stance to measurement. You see your questionnaire as an imperfect measure of a latent variable (people's opinions) that exists out there in the real world, but which isn't directly observable. You consider that scores on your questionnaire are produced by a combination of variation in this latent variable and measurement error. Because of the effects of measurement error, ANOVA is unlikely to produce particularly accurate inferences about the relationships between latent variables. If you are interested in inferences about latent variables, you will probably want to use a statistical method designed for making such inferences, e.g. structural equation modelling.

3) Finally, maybe you take a **representationalist** stance to measurement. You see your questionnaire scores as reflecting information about *empirical relations* observed amongst your sample. You don't want to have to make the ontological assumption that "opinions" exist out there in the world; measurement is just about summarising observed empirical relations. The implications of representationalism are probably the most complex. The whole idea of "levels of measurement" comes from representationalism.

From a representationalist perspective, the type of data you have means that you have observed ordinal relations. For example, if participant 1 ticks "agree" to an item, s/he has a higher level of agreement than someone who ticks "disagree". However, you are not able to empirically compare *differences* between people. E.g. if participant 1 ticks "agree" to an item, participant 2 ticks "disagree", and participant 3 ticks "strongly disagree", you can't empirically determine whether the difference between participants 1 and 2 is greater than, less than, or the same as the difference between participants 2 and 3.

Because you have only observed ordinal relations, a representationalist would argue that there are many possible ways to code responses to your items that are equally valid (as long as they are all monotonic transformations of each other). E.g. you could code response options like "strongly disagree", "disagree", "agree", and "strongly agree" as 1, 2, 3 and 4 respectively. This coding would convey the information that you have observed, e.g. that a participant responding "disagree" has a lower level of agreement than someone who ticks "agree". However, coding these 4 response categories as 1, 5, 7, and 1019 would be just as good: it conveys exactly the same information about the ordering of responses (1 < 5 < 7 < 1019).
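To make this concrete, here is a minimal Python sketch comparing the two codings mentioned above. The helper names (`same_ordering`, `adjacent_gaps`) are made up for illustration. It checks that both codings rank every pair of response options the same way, and then shows that the numeric gaps between adjacent options, which a representationalist would say you never observed, differ between the codings.

```python
# Two codings of the same four ordered response options.
options = ["strongly disagree", "disagree", "agree", "strongly agree"]
coding_a = dict(zip(options, [1, 2, 3, 4]))
coding_b = dict(zip(options, [1, 5, 7, 1019]))


def same_ordering(c1, c2):
    """Return True if both codings order every pair of options identically."""
    return all(
        (c1[x] < c1[y]) == (c2[x] < c2[y])
        for x in c1
        for y in c1
    )


def adjacent_gaps(coding):
    """Numeric differences: (agree - disagree, disagree - strongly disagree)."""
    return (
        coding["agree"] - coding["disagree"],
        coding["disagree"] - coding["strongly disagree"],
    )


order_preserved = same_ordering(coding_a, coding_b)
gaps_a = adjacent_gaps(coding_a)
gaps_b = adjacent_gaps(coding_b)

print("Same ordering under both codings:", order_preserved)
print("Gaps under coding A:", gaps_a)  # the two gaps are equal
print("Gaps under coding B:", gaps_b)  # the first gap is smaller than the second
```

Under the first coding the two gaps look equal, while under the second the first gap looks smaller. Since both codings are equally faithful to the observed ordering, neither conclusion about the gaps is supported by the data.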

Because the choice of coding scheme is arbitrary within the set of all possible monotonic transformations, we would only want to use a statistical test that is invariant across all these transformations. This is not the case for ANOVA. If you coded responses to a Likert item as 1 (strongly disagree), 2 (disagree), 3 (agree) and 4 (strongly agree), and then used these codes as the response variable in an ANOVA, you would likely get quite different results than if you coded them as 1, 5, 7, and 1019. You could, however, use a Kruskal-Wallis test, a non-parametric rank-based alternative to ANOVA. Because it depends only on the ranks of the responses, it gives the same result under any order-preserving recoding.
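The following rough sketch illustrates the point using only the Python standard library. The three groups of responses are invented purely for illustration, and the functions `f_statistic` and `kruskal_h` are hand-written helpers computing the usual one-way ANOVA F statistic and the tie-corrected Kruskal-Wallis H statistic; in practice you would more likely call `scipy.stats.f_oneway` and `scipy.stats.kruskal`, which also return p-values.

```python
import math
from collections import Counter

# Hypothetical data: each response is an index into the four ordered options
# (0 = strongly disagree, 1 = disagree, 2 = agree, 3 = strongly agree).
groups_raw = [
    [0, 0, 1, 1, 2, 1],
    [1, 2, 2, 3, 2, 1],
    [2, 3, 3, 3, 2, 3],
]
coding_a = [1, 2, 3, 4]
coding_b = [1, 5, 7, 1019]


def recode(groups, coding):
    """Map each response index to its numeric code."""
    return [[coding[r] for r in g] for g in groups]


def f_statistic(groups):
    """One-way ANOVA F statistic (between-group MS / within-group MS)."""
    values = [v for g in groups for v in g]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def kruskal_h(groups):
    """Kruskal-Wallis H statistic using midranks and the usual tie correction."""
    counts = Counter(v for g in groups for v in g)
    n = sum(counts.values())
    midrank, seen = {}, 0
    for value in sorted(counts):
        t = counts[value]
        midrank[value] = seen + (t + 1) / 2  # average rank of the tied block
        seen += t
    rank_term = sum(sum(midrank[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n * (n + 1)) * rank_term - 3 * (n + 1)
    tie_correction = 1 - sum(t**3 - t for t in counts.values()) / (n**3 - n)
    return h / tie_correction


f_a = f_statistic(recode(groups_raw, coding_a))
f_b = f_statistic(recode(groups_raw, coding_b))
h_a = kruskal_h(recode(groups_raw, coding_a))
h_b = kruskal_h(recode(groups_raw, coding_b))
h_matches = math.isclose(h_a, h_b)

print(f"ANOVA F:          coding A = {f_a:.3f}, coding B = {f_b:.3f}")
print(f"Kruskal-Wallis H: coding A = {h_a:.3f}, coding B = {h_b:.3f}")
print("H identical under both codings:", h_matches)
```

With these made-up data the F statistic changes substantially when the codes are changed, whereas H is the same under both codings, because recoding monotonically leaves every rank untouched.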