# PCA, FA, CFA, latent variables, and making a scale.


#### psnp

##### New Member
Hi! I was wondering if anyone had any suggestions for me.

I'm working with two datasets. One contains a population of cases (n=25) from 2005, and the other contains a population of cases (n=27) from 2007. Each case has a score on each of six variables/indicators (v1, v2, v3...), and each indicator has a different maximum value (v1's max = 30, v2's max = 22, v3's max = 20...); the sum of all six indicators ranges from 0 to 100 for any given case.

The point of the variables was to construct an additive scale (the 0-100 one), and I was hoping to test the validity of that scale. On the surface, weighting the variables and adding them up this arbitrarily seemed a bit problematic, but the idea of a single variable combining the other six makes sense. My question is how to combine the variables (or how to test the way they've been weighted and combined).

I tried Principal Component Analysis and Factor Analysis, and both extracted a single component/factor with a very high eigenvalue. To me this suggests that, on the surface, a single latent variable (a combination of the six variables) makes sense. However, when I had Stata predict scores on the factor, the predictions were very highly correlated with the additive scale (r² = .98). Is that to be expected, or is something fishy?

All I want is to confirm that the additive scale is useful OR to suggest more useful weightings for the variables. I recognise that I have a very small sample here, and so confirmatory factor analysis is probably out.

Can anyone suggest how I go about combining the six variables into an optimal scale, or assessing their additive combination?

(Also: since there are two sets of countries, with data for 2005 and for 2007, I have been analysing each group separately. Should I combine the two and analyse the cases as country-years in order to double the population?)

Thanks a million!


#### vinux

##### Dark Knight
I guess PCA and FA will give similar linear combinations. If the coefficients are not all positive, I would not suggest PCA or FA (though you can still use them if you can interpret the resulting linear combination).

I guess all the coefficients of the first factor are positive. If the relative sizes of the coefficients in the additive scale and in the factor are close, then you may well get a high correlation between the two.
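To illustrate the point above, here is a rough sketch in Python/NumPy rather than Stata. It uses synthetic data, not the poster's: the first three maxima (30, 22, 20) come from the post, while the last three (12, 10, 6), the single latent trait, and the noise levels are assumptions made only so the example runs. It extracts the first principal component of the correlation matrix and correlates its scores with the plain sum of the indicators.

```python
import numpy as np

# Synthetic stand-in data: NOT the data from the post.
rng = np.random.default_rng(0)
n_cases = 25
# 30, 22, 20 are from the post; 12, 10, 6 are made up so the total is 100.
max_scores = np.array([30, 22, 20, 12, 10, 6], dtype=float)

# Assumption: one shared latent trait plus a little indicator-specific noise.
latent = rng.normal(size=(n_cases, 1))
noise = rng.normal(size=(n_cases, max_scores.size))
fractions = np.clip(0.5 + 0.15 * latent + 0.05 * noise, 0.0, 1.0)
X = fractions * max_scores        # column j lies between 0 and max_scores[j]
additive_scale = X.sum(axis=1)    # the 0-100 additive index


def first_component(data):
    """First principal component of the correlation matrix.

    Returns (weights, scores, share of total variance explained).
    """
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)   # eigenvalues in ascending order
    weights = eigvecs[:, -1]                  # eigenvector of the largest one
    if weights.sum() < 0:                     # the sign is arbitrary; fix it
        weights = -weights
    return weights, z @ weights, eigvals[-1] / eigvals.sum()


weights, pc1_scores, share = first_component(X)
r = np.corrcoef(pc1_scores, additive_scale)[0, 1]

print("PC1 weights:", np.round(weights, 2))
print("share of variance on PC1:", round(float(share), 2))
print("correlation of PC1 scores with additive scale:", round(float(r), 3))
```

When the weights all have the same sign and one component dominates, the component score and the simple sum are both mostly tracking the same underlying dimension, so a very high correlation between them is what you would expect rather than a sign of a mistake.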

You can check whether the covariance matrices for 2005 and 2007 are the same. I don't think you have enough observations to do a formal hypothesis test, though.
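Since a formal test is probably out with so few cases, one option is a purely descriptive comparison. The sketch below is only an illustration on synthetic data: the helper names `make_year` and `compare_years`, the made-up maxima (12, 10, 6), and the two summary numbers (a relative Frobenius-norm gap between the covariance matrices, and the cosine between their first principal axes) are my own choices, not a standard procedure from the thread.

```python
import numpy as np

# 30, 22, 20 are from the post; 12, 10, 6 are made up so the total is 100.
MAX_SCORES = np.array([30, 22, 20, 12, 10, 6], dtype=float)


def make_year(rng, n_cases):
    """Hypothetical generator for one year of synthetic indicator scores."""
    latent = rng.normal(size=(n_cases, 1))
    noise = rng.normal(size=(n_cases, MAX_SCORES.size))
    return np.clip(0.5 + 0.15 * latent + 0.05 * noise, 0.0, 1.0) * MAX_SCORES


def first_axis(cov):
    """Unit-length eigenvector belonging to the largest eigenvalue."""
    _, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    return eigvecs[:, -1]


def compare_years(a, b):
    """Descriptive comparison of two samples' covariance matrices.

    Returns (relative gap, cosine between first principal axes).
    A gap near 0 and a cosine near 1 suggest a similar structure.
    """
    cov_a = np.cov(a, rowvar=False)
    cov_b = np.cov(b, rowvar=False)
    gap = np.linalg.norm(cov_a - cov_b) / np.linalg.norm((cov_a + cov_b) / 2)
    cosine = abs(float(first_axis(cov_a) @ first_axis(cov_b)))  # sign-free
    return float(gap), cosine


rng = np.random.default_rng(1)
x2005 = make_year(rng, 25)   # stand-in for the 2005 cases
x2007 = make_year(rng, 27)   # stand-in for the 2007 cases

gap, cosine = compare_years(x2005, x2007)
print("relative covariance gap:", round(gap, 3))
print("cosine between first axes:", round(cosine, 3))
```

These numbers are descriptive only and carry no p-value. If the two years look alike on them, pooling the cases as country-years becomes easier to defend, with the caveat that the same country observed in both years is not an independent observation.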

By the way, what is your objective? A comparison of 2005 and 2007?