# Help with a language study

#### Pooh

I’m comparing the language production of two groups of nonnative speakers. The first group consists of speakers who learn English without any contact with native speakers, and the second group consists of those who have regular contact with native speakers. I also have a native speaker control group.
In my study, the three groups of speakers have to apologize in 8 role-play situations. The first four require them to apologize sincerely and the last four require them to apologize insincerely. Then the same situations are repeated, but this time they have to apologize using scripts.
The apologies (both sincere and insincere, in both task conditions) are extracted and played to a group of native speaker judges. The judges rate the speaker of each apology on a 7-point Likert scale, where 1 = absolutely sure the utterance is not an apology and 7 = absolutely sure the utterance is an apology.

The questions are 1) the extent to which the speakers’ intent is correctly identified by the judges, 2) whether the group with contact with native speakers will have scores closer to the native speakers’, and whether task condition (free choice vs. script) and speaker group (nonnative group 1, nonnative group 2, native speaker control) correlate with the ratings.

There’s more to this research, but this is the part I have the most trouble with.

I have almost zero knowledge of statistics, but I’ve been reading about it. Could anybody recommend statistical analyses appropriate to the data?

Thank you very much.
Pooh

#### JohnM

First off, there is a huge ongoing debate in statistics over whether results on a Likert scale can be treated as interval data and analyzed with parametric methods (which assume an underlying normal distribution). Count me among those who think it's OK. Most of the time, anyway.

There are actually 4 questions:

1) the extent to which the speakers’ intent is correctly identified by the judges
-this can be approached in a number of ways, but it depends on how you define "correct" or "incorrect"
- can you give me more background on this one?

2) whether the group with contact with native speakers will have scores closer to native speakers’
-here you can use an independent samples t-test between the "contact" and "no contact" groups

3) do task conditions (free choice vs. script) correlate with the ratings
-here you can use an independent samples t-test between the "free-choice" and "script" conditions

4) does speaker group (nonnative group 1, nonnative group 2, native speaker control) correlate with the ratings
-here you can use Dunnett's test (to compare a control vs a number of alternatives)
-or you could do a 1-way ANOVA, then do post-hoc pairwise comparisons between the groups
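
To make the test statistics behind these suggestions concrete, here is a minimal sketch in plain Python. The ratings are made up purely for illustration, and the function names `student_t` and `one_way_anova_f` are hypothetical helpers, not part of any library. It computes only the pooled-variance (Student's) t statistic and the one-way ANOVA F statistic. For p-values and for Dunnett's test you would normally turn to a statistics package; SciPy, for instance, provides `scipy.stats.ttest_ind` and `scipy.stats.f_oneway`, and recent versions also include `scipy.stats.dunnett`.

```python
from math import sqrt
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance t statistic for two independent samples."""
    n_a, n_b = len(a), len(b)
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA, with its two degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the individual ratings around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Made-up mean ratings per speaker (illustration only, not real data).
no_contact = [3, 4, 2, 5, 3, 4]
contact = [5, 4, 6, 5, 4, 6]
native = [6, 7, 6, 5, 7, 6]

t_stat = student_t(contact, no_contact)
f_stat, df1, df2 = one_way_anova_f([no_contact, contact, native])
print(f"t = {t_stat:.3f}")
print(f"F({df1}, {df2}) = {f_stat:.3f}")
```

A significant F only says that at least one group mean differs from the others; the post-hoc pairwise comparisons (or Dunnett's test against the native control) are what tell you which groups differ.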

#### Pooh

JohnM said:
First off, there is a huge ongoing debate in statistics over whether results on a Likert scale can be treated as interval data and analyzed with parametric methods (which assume an underlying normal distribution). Count me among those who think it's OK. Most of the time, anyway.

There are actually 4 questions:

1) the extent to which the speakers’ intent is correctly identified by the judges
-this can be approached in a number of ways, but it depends on how you define "correct" or "incorrect"
- can you give me more background on this one?
"Correct" is when the judges perceive the speaker of each apology as intended (some apologies are intended to be sincere, some are intended to be ostensible). "Incorrect" is when there's a mismatch between the judges' perception and the speaker's intent, e.g. the judges perceive the speaker of a particular apology as sincere when the speaker is in fact insincere.
Pooh

#### JohnM

I'll have to think this one over - there are a few different ways you could approach the analysis of this question, and it's not clear to me what exactly is being evaluated - the judges' ability or the speakers' ability - no fault of your own, it's just that this situation has some ambiguity to it.

My first thought is that you could compare the judges' "correctness" among the three groups of participants - "contact", "no contact", and "controls" - the judges should correctly gauge the speaker's intent more often or to a higher degree with the "controls," slightly less with the "contact" group, and least with the "no contact" group.

In this case, you could analyze it the same way I suggested for question 4.
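
As a rough illustration of that idea, here is a small Python sketch that turns ratings into "correct"/"incorrect" and summarizes them per speaker group. It rests on assumptions that are not part of the study description: a rating above the scale midpoint (4) is taken to mean "judged sincere", a rating below it "judged insincere", and a rating of exactly 4 counts as incorrect. The records are made up, and the names `is_correct` and `proportion_correct_by_group` are hypothetical.

```python
from collections import defaultdict

MIDPOINT = 4  # assumed cut-off on the 7-point scale (an assumption, not from the study)


def is_correct(intent, rating):
    """True if the judge's rating falls on the side of the scale matching the intent."""
    if intent == "sincere":
        return rating > MIDPOINT
    return rating < MIDPOINT  # intent == "insincere"


def proportion_correct_by_group(records):
    """Share of ratings that match the speaker's intent, per speaker group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for group, intent, rating in records:
        totals[group] += 1
        hits[group] += is_correct(intent, rating)
    return {group: hits[group] / totals[group] for group in totals}


# Made-up (group, intended meaning, judge rating) records, for illustration only.
records = [
    ("control", "sincere", 7), ("control", "insincere", 2),
    ("control", "sincere", 6), ("control", "insincere", 1),
    ("contact", "sincere", 6), ("contact", "insincere", 3),
    ("contact", "sincere", 4), ("contact", "insincere", 2),
    ("no contact", "sincere", 5), ("no contact", "insincere", 5),
    ("no contact", "sincere", 3), ("no contact", "insincere", 4),
]

scores = proportion_correct_by_group(records)
for group, share in scores.items():
    print(f"{group}: {share:.2f} correct")
```

If you compute such a score for each speaker rather than for each group as a whole, those per-speaker scores could then go into the group comparison. Where exactly to put the cut-off, and what to do with midpoint ratings, is a judgment call you would need to justify.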

Thoughts?

#### Pooh

JohnM said:
My first thought is that you could compare the judges' "correctness" among the three groups of participants - "contact", "no contact", and "controls" - the judges should correctly gauge the speaker's intent more often or to a higher degree with the "controls," slightly less with the "contact" group, and least with the "no contact" group.

In this case, you could analyze it the same way I suggested for question 4.

Thoughts?
Sorry I didn't make it clear. The judges' responses will be used as indicators of how successfully the speakers convey their intent. The assumption is as you said: the judges should gauge the native speaker controls best, the "contact" group next best, and the "no contact" group least well.

Thank you,
Pooh