Can you use factor analysis on a table that has count data?

#1
I am working on a project where my predecessor has been analyzing a rows-by-columns table of count data. Brands are the columns, and statements about those brands are the rows. Each cell contains the number of people who have consumed one of 10 brands and endorsed one of 10 statements.
Some brands were consumed more than others, and therefore had more respondents.

I have attached a representation of the table including imaginary base sizes.

I have the respondent data available. However, I would like to ask:
Is it reasonable to use factor analysis on this table (excluding base sizes)? Or is it more proper to use respondent-level data? What do you get if you use factor analysis on a table of count data like this? What changes if it were percent data instead? Should we be using a respondent-level data matrix?

This is a matter that I need to resolve by Monday. Please help.
 

gianmarco

TS Contributor
#3
Hello,
For contingency tables like that, Correspondence Analysis (CA) is often used. It has some aspects in common with Factor Analysis, being a method of dimensionality reduction; however, CA is tailored for cross-tabulations and count data.

CA is used, for instance, in marketing research to understand the relation between brands and consumers’ opinions, which looks like what you’re after.

There are many resources on the web about CA. For a quick introduction and some useful references please feel free to visit my website (cainarchaeology.weebly.com).

Best
Gm
 
#4
Hi Gianmarco and thank you for your answer.
That is what I thought.

Correspondence Analysis, as I understand it, would give us essentially a correlation map.
I also believe that Factor Analysis provides a different solution, one that emphasizes underlying structure.

Can you please confirm whether this is true? Can you please advise on why I would prefer one over the other?
I have respondent-level data available.
Also, can you please specify: how wrong is it to apply a factor analysis to this table, treating each row as an observation? What would you get out of this method?

Thank you very much
 

gianmarco

TS Contributor
#5
I have seen FA used for count data in the literature, but (to the best of my understanding) that was before CA became popular.

CA reduces the number of dimensions needed to represent the table graphically, bringing to the fore latent patterns of association between rows and columns. So, if your goal is to understand how brands differ in terms of customers’ statements, CA can be used. By the same token, CA may help you understand which statements each brand is most ‘associated’ with.
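
To make the idea concrete, here is a minimal sketch of simple CA in Python with NumPy. It assumes the usual textbook formulation (singular value decomposition of the standardized residuals from independence). The function name `correspondence_analysis` is hypothetical, and the small statements-by-brands table is made up for illustration; it is not the table from the first post.

```python
import numpy as np

def correspondence_analysis(counts):
    """Simple CA of a two-way count table (hypothetical helper, textbook SVD approach)."""
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()              # correspondence matrix (cell proportions)
    r = P.sum(axis=1)            # row masses (statements)
    c = P.sum(axis=0)            # column masses (brands)
    expected = np.outer(r, c)    # proportions expected under independence
    # Standardized residuals: how far each cell departs from independence
    S = (P - expected) / np.sqrt(expected)
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates, used to plot rows and columns on a map
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    inertia = sv ** 2            # inertia carried by each dimension
    return row_coords, col_coords, inertia

# Made-up counts: 3 statements (rows) x 3 brands (columns)
table = np.array([[30, 10, 5],
                  [12, 25, 8],
                  [6, 9, 28]])

row_coords, col_coords, inertia = correspondence_analysis(table)
share = inertia / inertia.sum()  # proportion of total inertia per dimension
```

The first two columns of `row_coords` and `col_coords` are what one would normally plot. Because the analysis starts from proportions and masses, unequal base sizes per brand are absorbed by the column masses rather than dominating the solution.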

My knowledge of FA is not extensive enough for me to recommend one technique over the other. What I can tell you is that CA is frequently used in cases like the one you described.

Best
Gm