Factor Analysis Question

#1
Hi everyone, I'm a Trainee Clinical Psychologist, not a statistician but we have research to do as part of our training so I'm hoping someone here can help.

I'm working on a questionnaire that we want to factor analyze to see if there are clusters/factors within the data that would be indicative of different diagnoses. The problem I have is that while most of the items are scored on a 5-point scale, not all of them are... some are a yes/no option.

My concern is that scoring "no" as 1, and "yes" as 2 (say) means that the 2 will load with other scores of 2 which may be entirely spurious.


My specific question is: can we justify analysing it if we treat the two-part answers as if they were the poles of a 5-point scale (so 1 for "No" and 5 for "Yes")?

I think this would work, but I don't want to violate some law of FA, and my crude attempts at finding the answer online haven't worked.

Hopefully someone can help.

Thanks!
 
#2
Instead of yes/no, what you should have done is scale it out...

for example:

1 - not likely
2 - somewhat likely
3 - likely
4 - quite likely
5 - very likely

You really aren't going to be able to do any meaningful FA comparing 2-point answers with 5-point answers.

At least I don't think so... but I've been wrong before. :)
 
#3
Most exploratory factor analysis routines standardise your measures, which means all your five-point, two-point or continuous scores are put on a metric with mean 0 and standard deviation 1. So I don't think there is any worry about misinterpretation from how the yes/no answers are coded. The problem is that a two-point item might not correlate well with most of the extracted factors.
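To make the standardisation point concrete, here is a minimal sketch in Python with NumPy (my choice of tool, not something used in this thread). The data are simulated and the variable names are made up for illustration. It shows that coding "no"/"yes" as 1/2 or as 1/5 gives the same z-scores and the same correlation with a 5-point item, so the recoding should not change a correlation-based factor analysis.

```python
import numpy as np

# Simulated data only: a hypothetical latent trait drives both items.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)

# Hypothetical 5-point item (rounded and clipped to the range 1..5).
likert = np.clip(np.round(3 + latent + rng.normal(scale=0.8, size=n)), 1, 5)

# Hypothetical yes/no item driven by the same latent trait.
yes = (latent + rng.normal(scale=0.8, size=n)) > 0

# Two candidate codings of the same yes/no answers.
coded_1_2 = np.where(yes, 2.0, 1.0)  # no = 1, yes = 2
coded_1_5 = np.where(yes, 5.0, 1.0)  # no = 1, yes = 5


def zscore(x):
    """Standardise to mean 0 and standard deviation 1."""
    return (x - x.mean()) / x.std()


# Pearson correlation with the 5-point item under each coding.
r_12 = np.corrcoef(likert, coded_1_2)[0, 1]
r_15 = np.corrcoef(likert, coded_1_5)[0, 1]

print("r with 1/2 coding:", round(r_12, 4))
print("r with 1/5 coding:", round(r_15, 4))
print("same z-scores:", np.allclose(zscore(coded_1_2), zscore(coded_1_5)))
```

The two codings differ only by a positive linear transformation (5-point coding = 4 × two-point coding − 3), which is why the correlation is unaffected. This says nothing about whether a binary item is appropriate for FA in the first place, only that the choice of numbers for "no" and "yes" is not the issue.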
 
#4
Yeah, it's a problem.

Factor Analysis has a pretty strong assumption of multivariate normality. Even using the 1-5 scales only works in specific situations (I know Bengt Muthén has a paper on this), but binary items are pretty hard to justify as normal.

That said, it will run, and in many fields you'll get away with it. Doesn't mean your results will be accurate, but no one may object....
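As a rough illustration of why "it will run" is not the same as "it will be accurate", here is a small simulation sketch in Python with NumPy. Everything in it is an assumption for illustration: two normally distributed underlying variables with a true correlation of 0.6, each cut at zero to make a yes/no item. The ordinary Pearson correlation between the two binary items comes out noticeably lower than the correlation between the underlying variables, and lower correlations feed straight into weaker loadings.

```python
import numpy as np

# Simulated data only: two underlying normal variables, true correlation 0.6.
rng = np.random.default_rng(1)
n = 100_000
rho = 0.6
cov = [[1.0, rho], [rho, 1.0]]
x = rng.multivariate_normal([0.0, 0.0], cov, size=n)

# Correlation between the continuous underlying variables.
r_cont = np.corrcoef(x[:, 0], x[:, 1])[0, 1]

# Dichotomise each variable at zero ("yes" if above, "no" otherwise).
binary = (x > 0).astype(float)

# Ordinary Pearson correlation between the two yes/no items.
r_bin = np.corrcoef(binary[:, 0], binary[:, 1])[0, 1]

# Known result for a split at the median of a bivariate normal:
# the binary correlation is (2 / pi) * arcsin(rho).
expected = 2 / np.pi * np.arcsin(rho)

print("continuous r:", round(r_cont, 3))
print("binary r:    ", round(r_bin, 3))
print("theory:      ", round(expected, 3))
```

The shrinkage gets worse when the yes/no split is lopsided (say 90% "no"), which is one reason binary items tend to look weakly related to everything else in a standard FA.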

I know there exists an analysis called latent class analysis that is (as I understand) an equivalent to Factor Analysis for categorical variables. I don't know much about it, but I would suggest looking into it. I'm pretty sure that AMOS now runs it, which is part of SPSS. You used to need specialized software for it. I don't know about other major stat packages.

Karen
 
#5
Very helpful, guys. Thanks very much. It could be that FA just isn't the approach for this particular problem.

I think I'll look into that Latent Class Analysis. Thanks for the tip.

Best wishes,
Mark
 
#6
Hello,
Latent class analysis is more like an SEM version of cluster analysis. If factor analysis is what you need, you can try confirmatory factor analysis (CFA). But neither CFA nor LCA is exploratory; an a priori theory will usually be required.