Hello everyone!
I have a dataset with n = 100,000 rows and p = 2 variables, X and Y.
There is a trend between these two variables; however, it is very blurry and nothing is visible (too many points).
My strategy is to use a clustering algorithm (K-Means, for example) on the 100,000 rows and to...
Thank you for your answers. Indeed, anomaly detection is a big field. I am doing my PhD on it: I work on Isolation Forest, Local Outlier Factor, etc., but all those methods, like the Mahalanobis distance, only compute an anomaly score on a data frame X with p columns and n rows.
My...
Hello, thank you for your answers. Indeed, it is a dataset with n = 100,000 individuals (rows) and p = 50 columns (the first one is Y and the other 49 variables are X). All the variables are quantitative and they are not time series.
I can't go into details, but the variables in X are just...
Hello,
I have a machine learning/anomaly detection problem. I have a variable Y and several other variables X. The goal is to quantify the degree of abnormality of the data on Y, but I have to take into account the values of the other variables (the relationship between Y and X)...
Hello,
I have a heavy-tailed distribution and I was wondering what exactly Hill's estimator does. I am not sure I fully understand this notion. I know it estimates something about the tail of the distribution.
Thank you so much
Have a nice day
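For the Hill's estimator question above, here is a minimal sketch of what the estimator computes: it averages the log-ratios of the k largest observations to the (k+1)-th largest one, which estimates the tail index gamma = 1/alpha of a Pareto-type tail. The function name `hill_estimator`, the simulated Pareto sample, and the choice k = 2000 are illustrative assumptions, not something taken from the post.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the tail index gamma = 1/alpha from the k largest values."""
    # Keep positive values only (logs are needed) and sort in decreasing order.
    xs = sorted((x for x in sample if x > 0), reverse=True)
    if not 1 <= k < len(xs):
        raise ValueError("k must satisfy 1 <= k < number of positive observations")
    threshold = xs[k]  # the (k+1)-th largest observation
    # Mean log-excess of the k largest observations over the threshold.
    return sum(math.log(x / threshold) for x in xs[:k]) / k


# Hypothetical check on simulated data: an exact Pareto sample with alpha = 2,
# so the estimate should be close to gamma = 1 / alpha = 0.5.
random.seed(0)
alpha = 2.0
sample = [random.paretovariate(alpha) for _ in range(50_000)]
gamma_hat = hill_estimator(sample, k=2_000)
print(f"Hill estimate of gamma: {gamma_hat:.3f}, implied alpha: {1 / gamma_hat:.2f}")
```

In practice the estimate depends on k, so it is usually worth computing it for a range of k values and looking for a stable region rather than trusting a single choice.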
Thanks!
I think you are right :D
Just one thing :) there are still repeated measures? Is it OK to apply a simple clustering algorithm to the variables?
Hello!
I am coming to you because I have to help one of my colleagues, who is a plant biologist. The purpose of this study is to cluster 70 quantitative variables. Each of these variables represents a different protein (it is a measurement made on it; I don't know exactly how it works).
But here is the...
Hello, I have a problem. I have 9 categorical variables and I'm trying to cluster the variables.
Could you tell me if I'm doing the right thing?
1) For each pair of variables, I calculate Cramér's V. I store those associations in a matrix, which I call X.
2) I calculate 1-X and...
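The two steps described so far can be sketched as follows, using only the standard library. This covers just what is stated (pairwise Cramér's V, then 1 - X as a dissimilarity); whatever comes after that step is left out. The function names `cramers_v` and `dissimilarity` and the toy data are hypothetical, and the plain Cramér's V (without bias correction) is assumed.

```python
import math
from collections import Counter


def cramers_v(a, b):
    """Cramér's V between two categorical variables given as equal-length lists."""
    n = len(a)
    joint = Counter(zip(a, b))
    rows, cols = Counter(a), Counter(b)
    # Pearson chi-square statistic computed from observed vs expected counts.
    chi2 = 0.0
    for r, n_r in rows.items():
        for c, n_c in cols.items():
            expected = n_r * n_c / n
            observed = joint.get((r, c), 0)
            chi2 += (observed - expected) ** 2 / expected
    denom = n * (min(len(rows), len(cols)) - 1)
    # A variable with a single level carries no association information.
    return math.sqrt(chi2 / denom) if denom > 0 else 0.0


def dissimilarity(columns):
    """Map each pair of variable names to 1 - V (step 2 of the post)."""
    names = list(columns)
    return {(u, v): 1.0 - cramers_v(columns[u], columns[v])
            for u in names for v in names}


# Toy data (made up): "b" is a relabelling of "a", "c" is unrelated to "a".
data = {
    "a": ["x", "y", "x", "y"],
    "b": ["p", "q", "p", "q"],
    "c": ["p", "p", "q", "q"],
}
D = dissimilarity(data)
print(D[("a", "b")], D[("a", "c")])
```

The resulting matrix is symmetric with zeros on the diagonal, so it can be handed to any method that accepts a precomputed dissimilarity.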
Hello. The variables in X represent several optical tests in the light, and the variables in Y are measurements we've made on the eyes. My company wants to know if there is a connection between those two sets of variables.
Hello, I have a problem with a correlation study. I have to study the linear relationship between two sets of variables, X and Y. There are 12 variables in X and 4 in Y.
I applied a Canonical Correlation Analysis and it seemed that all the variables in X were correlated with 3 of the variables in Y.
I also did a...