Principal Component Analysis (PCA): wrong method for my data? Please help :)

#1
Hi there guys, I'm facing a statistics problem and hope you might be able to help me out. (Sorry for my English, it isn't that good.)
I want to analyse the price of electricity in Germany and have already built a dataset with 70 variables
(solar production, wind production, nuclear production and so on, all on the same scale).

These variables are all more or less correlated with the price, and the dataset has about 20,000 observations (hourly resolution).

Goal: find the variables that cluster together and those that are responsible for changes in the price variable.

I am already aware of what PCA does, so you do not have to explain it to me.
Problem: when I run my PCA I have a few issues:
1) My first PC only explains 24 % of the variance, and my second and third only around 15 %, which isn't much.
2) When trying to find clusters and dependencies in the factor map, I also don't end up with a nice picture.
So I don't get a "price is high" and a "price is low" cluster in any of the PC1/PC2, PC1/PC3, PC2/PC4 and so on plots.
Further, I thought I could get the important variables by looking at the contributions to PC1 and PC2. But here too, my best variables only contribute around 5 % to PC1 and PC2, so I can't tell which variables are important and which are not :(
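
For reference, the two diagnostics described above (explained variance per component and each variable's contribution to a component) can be reproduced roughly as in the sketch below. It assumes Python with scikit-learn and uses synthetic stand-in data, not the real electricity dataset; the array sizes, the random mixing and the noise level are made up purely for illustration. Note that with 70 variables an even split would give each variable 100/70, i.e. about 1.4 %, of a component, so a contribution of 5 % is already several times the uniform share.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 70 variables driven by 5 hidden factors plus noise.
rng = np.random.default_rng(0)
n_obs, n_vars = 2000, 70
latent = rng.normal(size=(n_obs, 5))
mixing = rng.normal(size=(5, n_vars))
X = latent @ mixing + rng.normal(scale=2.0, size=(n_obs, n_vars))

# Standardize so every variable has unit variance before PCA.
X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=10, svd_solver="full").fit(X_std)

# Share of total variance explained by each component (sorted, largest first).
explained = pca.explained_variance_ratio_

# Rows of components_ are unit-length, so the squared entries of a row sum to 1.
# Multiplying by 100 gives each variable's contribution to that component in percent.
contrib = 100 * pca.components_ ** 2

uniform_share = 100 / n_vars  # contribution if all variables mattered equally

print("explained variance (first 3 PCs):", np.round(explained[:3], 3))
print("top 5 contributions to PC1 (%):", np.round(np.sort(contrib[0])[::-1][:5], 1))
print("uniform share (%):", round(uniform_share, 2))
```

Comparing each contribution against `uniform_share` rather than against 100 % gives a fairer idea of which variables stand out on a component.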


So what can I do to find the variables that cluster with high prices, or the variables that are responsible for the variance in the dataset?
Am I considering too many, too uncorrelated, or too few variables?
What am I doing wrong?


I would be so happy if someone could help me out a bit :)
 

Karabiner

TS Contributor
#2
I don't know how important the "variables that cluster" objective is. Regarding the "which variables are important"
part, you could resort to linear regression approaches that can deal with large numbers of predictors, i.e. regularization
techniques such as ridge regression or the lasso.
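
A minimal sketch of the lasso idea, assuming Python with scikit-learn. The data here are synthetic and purely illustrative: 70 made-up predictors, of which only three actually drive a made-up "price". None of the numbers come from the dataset in the question.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): only variables 0, 1 and 2 affect the price.
rng = np.random.default_rng(0)
n_obs, n_vars = 2000, 70
X = rng.normal(size=(n_obs, n_vars))
true_coef = np.zeros(n_vars)
true_coef[[0, 1, 2]] = [3.0, -2.0, 1.5]
price = X @ true_coef + rng.normal(scale=1.0, size=n_obs)

# Standardize the predictors, then fit a lasso whose penalty strength
# is chosen by 5-fold cross-validation.
model = make_pipeline(StandardScaler(), LassoCV(cv=5, random_state=0))
model.fit(X, price)

# Coefficients on the standardized scale: larger |coef| means more influence,
# and the lasso tends to shrink unhelpful predictors to exactly zero.
coef = model.named_steps["lassocv"].coef_
ranking = np.argsort(-np.abs(coef))   # variable indices, most important first
selected = np.flatnonzero(coef)       # variables with non-zero coefficients

print("top 3 variables:", ranking[:3])
print("number of variables kept:", len(selected))
```

For hourly, time-ordered data like electricity prices, plain k-fold cross-validation may be too optimistic; passing a time-aware splitter such as `sklearn.model_selection.TimeSeriesSplit` as `cv` is one option worth considering.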

Just my 2 pence.

With kind regards

Karabiner