Best analysis for two IVs when the data are not normally distributed?

I am totally new to stats and trying to figure out what to do! I've edited my questions below in light of new info.

I have two research questions for my study. The first is whether academic discipline and/or speaker role (instructor or student) correlate with the frequency of use of two linguistic features (two different phrases). For this part, I wanted to perform a factorial ANOVA for each linguistic feature, since I have two IVs. But the data are not normally distributed, so I don't think this will work. I'm not sure how complicated transforming the data would be. Is there something else I should use instead?

The second research question is similar to the first, except that I have divided each linguistic feature into four categories (i.e., the different ways the speakers use the phrase). I'm asking whether discipline and/or role correlate with the frequency of each subcategory. I guess each subcategory should be analyzed in the same way as above, right?

I have 66 observations (33 class transcripts, each divided into teacher language and student language; 17 transcripts are from discipline 1 and 16 from discipline 2). I have normed the occurrences of the features per 1,000 words (most transcripts are a few thousand words long).
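For concreteness, the norming and the two-factor layout could look roughly like the Python sketch below. This is only an illustration: the counts and word totals are made up, the names `rate_per_1000` and `observations` are hypothetical, and it assumes the denominator is the number of words in each speaker's portion of a transcript.

```python
from collections import defaultdict


def rate_per_1000(count: int, total_words: int) -> float:
    """Return occurrences per 1,000 words (normed frequency)."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return count * 1000 / total_words


# Hypothetical rows: (discipline, role, raw count of the phrase, words spoken).
# These values are invented for illustration, not taken from the study.
observations = [
    ("discipline_1", "instructor", 12, 3000),
    ("discipline_1", "student", 3, 1500),
    ("discipline_2", "instructor", 8, 4000),
    ("discipline_2", "student", 5, 2500),
]

# Group the normed rates into the four cells of the 2 x 2 design
# (discipline x role), which is the layout a factorial analysis works on.
cells = defaultdict(list)
for discipline, role, count, words in observations:
    cells[(discipline, role)].append(rate_per_1000(count, words))

for cell, rates in sorted(cells.items()):
    print(cell, rates)
```

With the real data, each of the four cells would hold 16 or 17 normed rates, one per transcript half.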

Any suggestions would be most welcome. I'm happy to clarify anything. Thanks very much!