# Normality and tests

#### leejones15

##### New Member
Hi all,
I am working on a project with three theoretical constructs, each binary-coded for positives and negatives (attached is a snapshot of my data; the total N = 159 and ages range from 5 to 19). A couple of questions:
1. How can I test this data for normality, so that I know whether to use parametric or non-parametric tests?
2. What test should I use to compare each construct (columns E-J) across age, gender, and location, and by whether or not they know a scientist (last column)?

TIA

#### noetsi

##### Fortran must die
All of the formal tests of normality have serious power issues and thus I stay away from them. A QQ plot is the best approach I have found to test normality.

You have to be careful about what you are testing for normality. In regression, for example, the normality assumption applies to the residuals, not the raw data.
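As a rough illustration of what a QQ plot compares, here is a minimal Python sketch that pairs each sorted value with the standard normal quantile at the same rank. The function name `qq_points`, the plotting positions `(i + 0.5) / n`, and the residual values are all assumptions made up for this example, not anything from the thread's data.

```python
from statistics import NormalDist

def qq_points(values):
    """Return (theoretical_quantile, observed_value) pairs for a normal QQ plot."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i + 0.5) / n stay strictly inside (0, 1),
    # so inv_cdf is always defined.
    return [(std_normal.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]

# Hypothetical residuals from some fitted model (made up for illustration).
residuals = [-1.8, -0.9, -0.4, -0.1, 0.0, 0.2, 0.5, 0.7, 1.1, 2.3]
points = qq_points(residuals)
for theoretical, observed in points:
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

If the printed pairs fall close to a straight line when plotted, the values are roughly consistent with a normal distribution; strong curvature at the ends suggests skew or heavy tails.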

#### leejones15

##### New Member
> Hi Lee,
>
> I don't really understand what you want to compare.
> What is the dependent variable?
> Are the predictors: Embodiment, Attainability, ...Desirability, Sci, Age?
>
> When the sample size is 30 or more, you can usually assume that the sample mean is approximately normally distributed.

The dependent variables would be embodiment, attainability, and desirability (each positive and negative), so six different tests. The predictors would be age (categorical), gender (binary), location (there are two different schools, so binary), and whether they know a scientist (binary). I'm assuming I would sum the "scores" and take the average of the dependent column for each predictor level (e.g., the total for embodiment at age 5) and use that as my continuous dependent variable.
Does that help clarify?

#### obh

##### Active Member
Hi Lee,

Why is age categorical?
Have you tried linear regression (if its assumptions are met)?

#### obh

##### Active Member
Hi Noetsi, I think the common practice is to combine a normality test with a graphical method like the QQ plot.

#### leejones15

##### New Member
I am treating age as categorical because we do not assume there will be a trend; we just need to see whether there is a significant difference in which constructs each age group associates with. When I actually run the tests, I will group ages together (5-7, 8-10, etc.).

#### obh

##### Active Member
Why not just put the real age?

Anyway, even with an ordinal age variable you can run a linear regression (if its assumptions are met).
If, for example, you run a one-way ANOVA on the age variable alone, you may miss differences due to the other predictors and get a wrong answer.

#### leejones15

##### New Member
Thanks. With a regression model, then, would logistic regression be a better choice, since the dependent variable is binary?

#### obh

##### Active Member
If the DV is binary, you should use logistic regression.

Why don't you use a continuous DV?

#### leejones15

##### New Member
The construct is either present or not present, so binary seems to make the most sense.

What about comparing gender to the constructs? That would be binary v binary.

#### leejones15

##### New Member
Yep, I'll try:
I am using a theory of role modeling to evaluate student responses to a survey about scientists. The role model theory has three constructs: goal embodiment, attainability, and desirability. I went through the survey and coded student responses that indicated a particular construct. The responses could be coded as positive or negative. For example, positive codes for desirability would be terms like "cool" and "fun", while negative codes would be words like "boring" and "nerd." It is a qualitative study in that we are evaluating the words students used, but now we want to statistically compare across age, gender, location, and whether they know a scientist.
For statistical comparison, if a student anywhere indicated a particular construct (positive or negative) they were given a '1' for that construct. If nothing was present, they were given a '0'. Even if they had multiple coded responses for one construct, they were still only given a '1' to indicate presence rather than quantity. This is the standard code practice for studies in this vein. In the end, I have something that looks like the picture on the original post (only with 159 students from ages 5 - 19).
In the end, I will have six DVs: one for each construct, both positive and negative. I want to compare those against the four predictors (age, gender, location, and whether they know a scientist).
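The presence coding described above could be sketched in Python roughly as follows. The column names such as `desirability_pos` and the example counts are hypothetical, chosen only to show how several coded responses collapse to a single '1'.

```python
CONSTRUCTS = ["embodiment", "attainability", "desirability"]

def to_presence(code_counts):
    """Collapse per-construct code counts into 0/1 presence indicators."""
    row = {}
    for construct in CONSTRUCTS:
        for valence in ("pos", "neg"):
            key = f"{construct}_{valence}"
            # Any number of coded responses (1, 2, 3, ...) becomes a single 1.
            row[key] = 1 if code_counts.get(key, 0) > 0 else 0
    return row

# Hypothetical student: three positive and one negative desirability code,
# one positive embodiment code, nothing coded for attainability.
student = {"desirability_pos": 3, "desirability_neg": 1, "embodiment_pos": 1}
indicators = to_presence(student)
print(indicators)
```

Each student ends up with six 0/1 values, one per DV.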

Does that help?

#### obh

##### Active Member
So the same student could get a 1 for both negative and positive? So Positive Embodiment and Negative Embodiment are actually two separate dependent variables? That is, all four combinations of (Pos Emb, Neg Emb) are possible for one student: (1,1), (1,0), (0,1), (0,0)?

#### obh

##### Active Member
So probably a binary logistic regression for each DV?

#### leejones15

##### New Member
It seems that way for age, at least. But what about gender, location, and knowing a scientist (also all binary)?
Also, are there tests for normality I need to run before I can use logistic regression?

#### obh

##### Active Member
A predictor variable can be binary. The assumption of normally distributed residuals does not apply to logistic regression.
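To make the point about binary predictors concrete, here is a small self-contained Python sketch that fits a logistic regression with two binary predictors by Newton-Raphson. The cell counts, the predictor names, and the helper functions `solve`, `predict`, and `fit_logistic` are all hypothetical; in practice a statistics package would do this fit.

```python
import math

def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def predict(beta, predictors):
    """Predicted probability that the DV equals 1 for one student."""
    eta = beta[0] + sum(b * x for b, x in zip(beta[1:], predictors))
    return 1.0 / (1.0 + math.exp(-eta))

def fit_logistic(X, y, iterations=25):
    """Maximum-likelihood fit by Newton-Raphson; returns [intercept, slopes...]."""
    k = len(X[0]) + 1
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for predictors, outcome in zip(X, y):
            row = [1.0] + list(predictors)  # leading 1 is the intercept term
            p = predict(beta, predictors)
            for i in range(k):
                grad[i] += (outcome - p) * row[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * row[i] * row[j]
        beta = [b + step for b, step in zip(beta, solve(hess, grad))]
    return beta

# Hypothetical counts: (gender, knows_scientist) -> (n with DV=1, n with DV=0).
cells = {(0, 0): (8, 32), (0, 1): (15, 25), (1, 0): (12, 28), (1, 1): (20, 19)}
X, y = [], []
for predictors, (ones, zeros) in cells.items():
    X += [predictors] * (ones + zeros)
    y += [1] * ones + [0] * zeros

beta = fit_logistic(X, y)
print("intercept, gender, knows_scientist:", [round(b, 3) for b in beta])
```

Nothing in the fit requires the predictors or the residuals to be normally distributed; the model only assumes a binary outcome whose log-odds are linear in the predictors.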

#### leejones15

##### New Member
What test would I use to compare a binary predictor to a binary DV, though? It wouldn't still be logistic regression, would it?

#### obh

##### Active Member
In logistic regression, you get a p-value for each coefficient from a Wald test (actually a z-test).
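For the binary-predictor-versus-binary-DV case, the fitted slope of a logistic regression with that single predictor equals the log odds ratio of the 2x2 table, so the Wald z-test can be sketched by hand. The function name `wald_test_2x2` and the counts below are made up for illustration, and the formula needs all four cells to be non-zero.

```python
import math
from statistics import NormalDist

def wald_test_2x2(a, b, c, d):
    """Wald z-test for the slope of a logistic regression with one binary predictor.

    a, b: counts with DV=1 and DV=0 when the predictor is 1
    c, d: counts with DV=1 and DV=0 when the predictor is 0
    """
    odds_ratio = (a * d) / (b * c)
    slope = math.log(odds_ratio)                   # fitted coefficient
    std_error = math.sqrt(1/a + 1/b + 1/c + 1/d)   # its standard error
    z = slope / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    return odds_ratio, z, p_value

# Hypothetical table: 20 of 40 students who know a scientist show the construct,
# versus 10 of 40 students who do not.
odds_ratio, z, p_value = wald_test_2x2(20, 20, 10, 30)
print(f"OR={odds_ratio:.2f}  z={z:.2f}  p={p_value:.3f}")
```

With more than one predictor in the model, the same z = coefficient / standard error logic applies to each coefficient, but the estimates no longer have this closed form.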