Comparing PCA to multiple regression.

#1
Dear all,

I would appreciate any help on this question. I have been asked to compare the results of PCA to multiple regression. I've been given 10 variables. For the PCA, I am expected to regress the dependent variable on the 3 components that result from the dimension reduction, and for the multiple regression to simply regress it on the 10 variables I have been given. After running the PCA, I decided to drop 2 variables because that gave me a simpler and more interpretable structure. My question is the following: since I want to compare the findings and I ended up dropping 2 variables for the PCA, shouldn't I regress the dependent variable on the 8 remaining variables instead of all 10, since otherwise I would be comparing non-comparable results? Thank you in advance. Happy new year!!
 
#2
I will take a stab at this one.
I assume the two variables you dropped from your PCA had low communalities and loaded poorly? They could still have strong predictive power, though, even if they are poorly correlated with the other independent variables. Have you looked at the correlations among all the independent variables and the dependent variable? The PCA just makes all these different associations easier to interpret.
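As a rough illustration of that correlation check, here is a minimal sketch in Python with NumPy. The data are synthetic and purely hypothetical (10 simulated predictors and a simulated dependent variable), so substitute your own data matrix; the variable names are my own, not from your assignment.

```python
import numpy as np

# Synthetic stand-in data (an assumption for illustration only):
# 10 independent predictors, where the 10th is unrelated to the
# others but still contributes to the dependent variable.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 10))
y = X[:, 0] + 0.8 * X[:, 9] + rng.normal(size=n)

# Full correlation matrix of the 10 predictors plus the dependent.
corr = np.corrcoef(np.column_stack([X, y]), rowvar=False)

# Correlation of each predictor with the dependent variable.
pred_dep = corr[:-1, -1]

# Mean absolute correlation of each predictor with the other predictors.
pred_pred = np.abs(corr[:-1, :-1])
np.fill_diagonal(pred_pred, np.nan)
mean_with_others = np.nanmean(pred_pred, axis=1)

for j in range(10):
    print(f"var{j + 1}: r with dependent = {pred_dep[j]: .2f}, "
          f"mean |r| with other predictors = {mean_with_others[j]:.2f}")
```

A variable with a low mean correlation with the other predictors but a sizeable correlation with the dependent is the kind that can fall out of a PCA and still matter in the regression.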

My advice would be to run the regression with the component scores, then with the raw data (all 10 variables, and then the 8 you kept), and see which gives better predictive power. Pay attention to how well the two variables you dropped perform in the 10-variable model, and then use what you learned from all three models to make the case for dropping or keeping them.
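The three-way comparison could be sketched roughly as follows, again in Python with NumPy. This is only an illustration under stated assumptions: the data are synthetic, the helper names `fit_r2` and `pca_scores` are my own, the PCA is unrotated and run on standardized variables, and the last two columns stand in for the two variables you dropped. If your software applies a rotation, your component scores will differ.

```python
import numpy as np

def fit_r2(X, y):
    """Fit OLS with an intercept; return (R^2, adjusted R^2)."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return r2, adj

def pca_scores(X, k):
    """Scores on the first k unrotated components of standardized X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return Z @ Vt[:k].T

# Synthetic stand-in data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 10))
y = X[:, 0] + 0.5 * X[:, 1] + 0.8 * X[:, 9] + rng.normal(size=n)

keep = list(range(8))  # assumption: the last two columns were dropped

results = {
    "3 components (from 8 vars)": fit_r2(pca_scores(X[:, keep], 3), y),
    "8 raw variables": fit_r2(X[:, keep], y),
    "10 raw variables": fit_r2(X, y),
}
for name, (r2, adj) in results.items():
    print(f"{name}: R2 = {r2:.3f}, adjusted R2 = {adj:.3f}")
```

Because the models use different numbers of predictors, adjusted R-squared (or a held-out prediction error) is a fairer yardstick than raw R-squared, which can never decrease when variables are added.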
 