question about correlation

ado

New Member
#1
Hi,

I am new here and would appreciate your help with a question.

Say that, generally speaking, height is strongly correlated with weight in 80% of cases; that is, the taller you are, the more you weigh.

Now, is there any way to know, for the cases not explained by the height/weight correlation, how much someone of a given height could possibly weigh, and with what probability?

Say someone is 1.80 m tall, which usually means he weighs 80 kg. In the cases where a person deviates from the norm, can we know how much he could possibly weigh, and with what probability?

Thank you in advance.
 

hlsmith

Not a robit
#2
You need to step away from correlations and hang out with their fun cousin, simple linear regression. Regression estimates a coefficient describing the relationship between two variables and can put confidence or prediction intervals around its estimates. Do you have an actual dataset to work with?
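To make the prediction-interval idea concrete, here is a minimal sketch using only the Python standard library. The height/weight numbers are made up for illustration (they are not from any real dataset), and the interval uses a normal quantile rather than Student's t, which is only a reasonable shortcut when there are plenty of observations.

```python
import math
import random
from statistics import NormalDist, mean

# Made-up height (cm) / weight (kg) data, purely for illustration.
random.seed(0)
heights = [random.uniform(150, 200) for _ in range(200)]
weights = [0.9 * h - 82 + random.gauss(0, 6) for h in heights]

def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def prediction_interval(xs, ys, x0, level=0.95):
    """Approximate interval for ONE new observation at x0.

    Assumption: normal quantile instead of Student's t (large-sample shortcut).
    """
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    mx = mean(xs)
    sxx = sum((x - mx) ** 2 for x in xs)
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * s * math.sqrt(1 + 1 / n + (x0 - mx) ** 2 / sxx)
    centre = intercept + slope * x0
    return centre - half_width, centre, centre + half_width

low, centre, high = prediction_interval(heights, weights, 180)
print(f"at 180 cm: about {centre:.1f} kg, 95% interval {low:.1f} to {high:.1f} kg")
```

The interval is for a single new person at that height, so it is wider than a confidence interval for the average weight at that height.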
 

ado

New Member
#3
Thank you for your reply. I could use linear regression, but before I try that, let me explain myself a little further. Say I have two variables, one that ranges between -10 and +10 and another that ranges between -30 and +30 (this one is more volatile). Now, I know that in, say, 80% of the cases they move proportionally in the same direction. So if one does +7, the other does +21. I want to know if there is any way to find out, in the remaining 20% of the cases, how far they could diverge and with what probabilities. How do I approach this issue? I can get the data set for the two variables from which the correlation was determined.
 

hlsmith

Not a robit
#4
I would wonder whether there is a subset that seems to defy the underlying function, i.e., whether this is just a missing-variable problem (e.g., the 80% that are positively correlated are females and the 20% that are inversely correlated are males). In that case you are missing a piece of the data-generating function.
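A quick way to check for that kind of hidden subgroup is to compute the correlation within each group and compare it with the pooled value. The sketch below uses invented data: the group labels "A" and "B", the 80/20 split, and the slopes of +3 and -3 are all assumptions chosen to mimic the ranges described above.

```python
import random
from statistics import mean

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Invented data: about 80% of rows follow y = 3x, about 20% follow y = -3x.
random.seed(1)
rows = []
for _ in range(500):
    x = random.uniform(-10, 10)
    group = "A" if random.random() < 0.8 else "B"  # hypothetical hidden variable
    slope = 3 if group == "A" else -3
    rows.append((group, x, slope * x + random.gauss(0, 3)))

pooled = pearson([x for _, x, _ in rows], [y for _, _, y in rows])
by_group = {}
for g in ("A", "B"):
    gx = [x for grp, x, _ in rows if grp == g]
    gy = [y for grp, _, y in rows if grp == g]
    by_group[g] = pearson(gx, gy)

print(f"pooled: {pooled:.2f}, group A: {by_group['A']:.2f}, group B: {by_group['B']:.2f}")
```

If the within-group correlations are both strong but of opposite sign, the moderate pooled correlation is an artefact of mixing the two groups.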
 

hlsmith

Not a robit
#5
Is the change in association random or a threshold thing? Meaning, are lower values inversely correlated, or is it just some values here and there? It may be beneficial to fit a LOESS curve or spline to your data to see if there is a curvilinear relationship. Seeing the relationship visually can help find thresholds or potential cut points. For example, 0-1.5 alcoholic drinks are associated with mild cardiovascular benefits, but more than 2 drinks are associated with negative outcomes (e.g., the enigmatic J-shaped or hockey-stick curve).
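For a feel of what a LOESS-style smoother does, here is a stripped-down sketch: a tricube-weighted local linear fit without the robustness iterations that a full LOESS implementation adds. The J-shaped data are invented (they are not real cardiovascular numbers), the function name `local_linear` is just a label for this example, and the code assumes the x values are not all identical near the point being fitted.

```python
import random

def local_linear(xs, ys, x0, frac=0.3):
    """Tricube-weighted local linear fit at x0 (simplified LOESS)."""
    k = max(3, int(frac * len(xs)))
    h = sorted(abs(x - x0) for x in xs)[k - 1]  # distance to the k-th nearest x
    w = [(1 - (abs(x - x0) / h) ** 3) ** 3 if abs(x - x0) < h else 0.0 for x in xs]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)

# Invented J-shaped data: the outcome is lowest around x = 1.5.
random.seed(2)
xs = [i * 0.1 for i in range(41)]  # 0.0 to 4.0
ys = [(x - 1.5) ** 2 + random.gauss(0, 0.15) for x in xs]

grid = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
curve = [(x0, local_linear(xs, ys, x0)) for x0 in grid]
turning_point = min(curve, key=lambda p: p[1])[0]

for x0, fitted in curve:
    print(f"x = {x0:.1f}  smoothed y = {fitted:.2f}")
print("lowest smoothed value near x =", turning_point)
```

Where the smoothed curve changes direction is a candidate threshold; a straight-line fit to the same data would hide it.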
 

ado

New Member
#6
Hi. Thank you for helping. I am not sure whether the change in association is random or a threshold thing; either way, based on my data set, I would only like to know how far the two variables could diverge and with what frequency. Correlation tells me how much two variables move in sync, OK, but is there any formula that tells how far those same two variables can go out of sync, based on the same data? To recap: I have many observations and two defined ranges for two defined variables. They are correlated 80% of the time, and in 20% of cases they are not so correlated. When they are not correlated, how extreme can it be? I am sorry, I am not an expert, so maybe it is just me asking pointless questions. For example, bond prices are inversely correlated with interest rates. If one rises, the other tends to fall in, say, 90% of the cases. In the remaining 10% of the cases, how far can one rise while the other does nothing, or could bond prices and interest rates even fall together?
 

hlsmith

Not a robit
#8
Well, the percentage correlation is confusing. You would have to run all subsets of the data to know that. Plotting a spline with prediction intervals might get at your goal. I am guessing that doing the same thing with a Bayesian twist would get you probabilities for certain extremes.
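As a simpler stand-in for the spline or Bayesian versions (this is neither of those), one rough way to put numbers on "how far and how often" is to fit a straight line and look at the empirical distribution of the residuals. The sketch below uses invented data with the ranges mentioned earlier in the thread; the noise level and the helper name `exceed_share` are assumptions made for the example.

```python
import random
from statistics import mean, quantiles

# Invented data: x in [-10, 10], y roughly 3x plus noise.
random.seed(3)
xs = [random.uniform(-10, 10) for _ in range(1000)]
ys = [3 * x + random.gauss(0, 8) for x in xs]

# Least-squares line.
mx, my = mean(xs), mean(ys)
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# How often do the two variables move in opposite directions?
opposite = sum(1 for x, y in zip(xs, ys) if x * y < 0) / len(xs)

# Empirical 2.5th and 97.5th percentiles of the deviation from the line.
cuts = quantiles(residuals, n=40)  # 39 cut points in 2.5% steps
low, high = cuts[0], cuts[-1]

def exceed_share(threshold):
    """Fraction of observations deviating from the line by more than threshold."""
    return sum(1 for r in residuals if abs(r) > threshold) / len(residuals)

print(f"opposite directions in {opposite:.1%} of cases")
print(f"95% of deviations from the line fall between {low:.1f} and {high:.1f}")
print(f"share deviating by more than 16: {exceed_share(16):.1%}")
```

These are frequencies observed in the sample, so they say nothing about deviations more extreme than anything in the data, and they assume the spread around the line is about the same at every x.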