Confused about SEM and factor analysis


I am confused about what to do with this data set.

The data set has 29 variables. We have to discard ID, FTEequivalent and Size because they are unnecessary. Of the remaining ones, four are dependent variables: turnoverincreasingFIN, profitabilityincreasing, turnoverstaffpercent and competitionforjobs. The other 22 variables were considered for factor analysis. We expect 4 factors, since the questionnaire was designed that way.

The four factors are:

Employees' ability to raise their voice or participate, the V factor (V stands for Voice). Variables in the V factor: newsletterspread, attsurveyspread, howsurveysused, surveyspaidattn2, regularstaffmtg, frequencystaffwalkinwards.

Training given, the T factor. Variables in the T factor: workteams, trainingforfirstyears, averagetrainingmgmt, averagetrainingnonmgmt, averagetrainingnursing.

Performance management, the PM factor. Variables in the PM factor: formalperformanceapppercnt, performanceappusedforwage, promotionrulesusage, jobdescriptformajority, jobdescriptmodifiedformally.

Recruitment, the R factor. Variables in the R factor: internalpromotion, expendonrecruitmentproc, proportionnewemployee.

I was trying to use R code to calculate factor scores first and then test whether the four factors V, T, PM and R significantly affect the four dependent variables, but the results are not what I expected. Is it right to use factor scores as inputs to a further regression? After running into this problem, I learned that regression on factor scores produces estimates biased towards zero because of unaccounted measurement error. What should I use for this kind of analysis instead?

I guess these are common problems in psychology, but I am a little inexperienced. Some people have suggested SEM. What do you think? I don't know much about it. Kindly suggest the procedure that is most appropriate for this data; that would be a great help, and I will learn that procedure. Any quick reference is also appreciated!

Note that there are both ordinal and scale variables. Moreover, my only objective is to see whether the factors V, T, PM and R affect the dependent variables significantly.

Thanks and best regards.


It looks like you have a non-experimental design. If I understand it correctly, the participants (hospital employees?) were given a questionnaire to assess their “ability to raise their voice”, “training”, “performance management”, and “recruitment”. You think that these four “factors” should be related to whether a hospital has “turnover increasing”, “profitability increasing”, “turnover percent”, and “competition for jobs”.

Because of this, you cannot say that training “caused” the hospital to have increasing profitability or that recruitment “affected” competition for jobs. What you can say is whether or not the employee factors are related to the hospital outcomes.

As far as the analysis goes, if you want to identify the underlying structure of these factors and build a model that shows how they are all related, then you should look into SEM and factor analysis. However, you said that all you want to know is whether the factors and the dependent variables are related, so you don’t need SEM.

If the questionnaire items are on the same scale, you can combine the items for each factor into a composite score, so instead of 6 individual items for “Voice” you would have a single score. Use Cronbach’s alpha to check whether the 6 items are “similar” enough to be combined. (In SPSS, it’s under Analyze -> Scale -> Reliability Analysis; values normally fall between 0 and 1, although alpha can turn negative when items are negatively correlated, and you’d like a value of .70 or better.)
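For readers without SPSS, the reliability check and the composite score can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions: the data are simulated (200 hypothetical respondents, 6 hypothetical “Voice” items on the same scale), and the names `cronbach_alpha`, `voice_items` and `voice_score` are made up for this example, not taken from the data set in the question.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) array of item scores."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                          # number of items
    item_vars = items.var(axis=0, ddof=1)       # sample variance of each item
    total_var = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Simulated stand-in data (assumption): one shared latent trait plus item noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=200)
voice_items = latent[:, None] + rng.normal(scale=0.8, size=(200, 6))

alpha = cronbach_alpha(voice_items)

# If alpha is acceptable (about .70 or better), average the items
# into a single composite score per respondent.
voice_score = voice_items.mean(axis=1)

print(f"alpha = {alpha:.2f}")
```

With real data you would replace `voice_items` with the six Voice columns and repeat the check for each of the other three factors.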

It looks like you were trying to combine the items using R code. R can be tricky if you are not used to it, and it is difficult to see where you went wrong. A program with a more user-friendly interface would be easier for you. If you do not have another statistical package available (SAS, SPSS, etc.), try using Excel formulas to combine the variables into factors.

Then run correlational analyses and multiple regression analyses to relate the composite scores to each of the outcome variables.
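As a rough illustration of that step, here is a minimal Python/NumPy sketch that correlates four composite scores with one outcome and then fits a multiple regression by ordinary least squares. Everything in it is assumed for the example: the composites and the outcome are simulated, the coefficients in `true_beta` are invented, and the outcome is treated as continuous (an ordinal outcome may call for a different model).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 150

# Simulated composite scores (assumption); columns stand for V, T, PM, R.
X = rng.normal(size=(n, 4))

# Simulated outcome built from invented coefficients plus noise.
true_beta = np.array([0.5, 0.0, 0.3, 0.0])
y = 1.0 + X @ true_beta + rng.normal(scale=0.5, size=n)

# Pearson correlation of each composite with the outcome.
corrs = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(4)])

# Multiple regression: add an intercept column and solve by least squares.
design = np.column_stack([np.ones(n), X])
beta, *_ = np.linalg.lstsq(design, y, rcond=None)

# Standard errors and t statistics from the residual variance.
resid = y - design @ beta
dof = n - design.shape[1]
sigma2 = resid @ resid / dof
se = np.sqrt(np.diag(sigma2 * np.linalg.inv(design.T @ design)))
t_stats = beta / se

for name, b, t in zip(["const", "V", "T", "PM", "R"], beta, t_stats):
    print(f"{name:>5}: coef = {b: .3f}, t = {t: .2f}")
```

At this sample size, an absolute t statistic above roughly 2 corresponds to significance at about the 5% level. You would run one such regression per outcome variable.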

If you are not getting the results you expected, look at the questionnaire. If it is a published measure (well-validated, etc.) then it may be that the factors and the outcomes you are interested in are not related. If the questionnaire was created for the purpose of the study and has not been previously validated, then it may be an issue with the questionnaire instead.

Elite Research, LLC