To assess whether the relationship between an independent variable and a dependent one is meaningful, is the p-value alone adequate, the correlation, or both?

#1
To decide whether there is probably a meaningful relationship between a dependent and an independent variable, is it best to use just the t-test (at, say, the 5% level, with a p-value below this meaning the relationship is unlikely to have arisen by chance and so is probably meaningful);

or look at the variance and only consider correlations greater than, say, 0.5

or use both: a p value of less than 5% and a correlation greater than 0.5?

p.s. I thought I'd posted this yesterday, but cannot see it.
 
#3
The two characterizations are separate:

1] "The effect is statistically significant" means that the data give enough evidence, at the chosen significance level, that the true parameter has a value different from the value in the null hypothesis. In the modeling setting, this typically implies that the corresponding predictor has a relationship with the dependent variable and should stay in the model.

2] "The effect is substantial" means that the size of the true parameter is big enough to describe a relationship affecting real life.

Whether "correlation above 0.5" is substantial enough depends on your situation, on the context of your problem (looks substantial to me). Whether "p-value below 0.05" indicates statistical significance depends on the preset significance level, dictated by your research standards and the research standards in your industry.

I cannot comment on whether the t-test is appropriate. Agree with hlsmith: you have to describe the research question(s) and data.
 
noetsi

Fortran must die
#4
Of what value is a slope that is statistically significant yet too small to have any substantive effect? If you have a large enough data set, lots of things will be statistically significant while showing a very small effect size. Unfortunately, many who don't know much about statistics think the test of statistical significance tells you that the effect size is important (I have been asked to run these tests to determine substantive significance many times).

One way to think of a test of statistical significance is as necessary but not sufficient. If you don't have statistical significance, then you probably should ignore the slope. But just having it does not mean the slope (essentially the relationship between the variables, controlling for the others) matters.
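To put rough numbers on the large-sample point, here is a minimal sketch that holds a tiny correlation fixed at 0.05 and only changes the sample size. It is an illustration, not anyone's actual analysis: the function name `approx_p_value` is made up for this example, and it uses the Fisher z approximation (standard library only) rather than the exact t-test that regression software would report.

```python
import math
from statistics import NormalDist

def approx_p_value(r, n):
    """Approximate two-sided p-value for H0: true correlation = 0.

    Assumption: uses the Fisher z approximation, not the exact t-test,
    so values will differ slightly from what a stats package reports.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny correlation, increasingly large samples.
for n in (100, 3000, 10000):
    print(n, approx_p_value(0.05, n))
```

With a hundred observations this correlation is nowhere near significant, while with a few thousand it falls below 0.05, even though a correlation of 0.05 explains only about 0.25% of the variance in either case.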
 

hlsmith

Not a robit
#5
I like the above posts. I will add that a statistically significant result (e.g., a slope) can help rule out chance, though if the effect is small, an investigator may want to remember that it would not take much of an effect from an omitted variable, such as an unknown confounder, to account for it.
 
#6
Thanks to you all for your guidance. I think this confirms my approach to using the results of my linear regressions.

I am relating many lifestyle independent variables to somewhat fewer (but still a lot of) dependent biomarkers.

For some, such as blood pressure and body composition, I have nearly 3,000 daily events and so I can do multiple regression analysis (using Statistica) for many independent variables at once, but for others (measured much more sporadically) I have relatively few events (ranging from a dozen or so to nearly 100) for each independent variable against each dependent variable.

My concern was that a correlation needed to be higher for fewer events for it to be meaningful - and this would be reflected in the p-value. So, say, a correlation of 0.5 for a hundred events may be as meaningful as 0.7 for a dozen events. My problem was how to get a statistical 'feel' for this - and I think the p-value would give me this.
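One way to get that 'feel' is to ask how large a correlation has to be before it reaches p < 0.05 at a given number of events. The sketch below is only a rough guide: `min_significant_r` is a hypothetical helper written for this thread, and it relies on the Fisher z approximation rather than the exact t-test, so the thresholds are approximate, especially for very small samples.

```python
import math
from statistics import NormalDist

def min_significant_r(n, alpha=0.05):
    """Smallest |r| that is significant at level alpha (two-sided) for n pairs.

    Assumption: Fisher z approximation; an exact t-based threshold
    would be slightly different for small n.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return math.tanh(z_crit / math.sqrt(n - 3))

# Sample sizes roughly matching the ranges mentioned in this post.
for n in (12, 30, 100, 3000):
    print(n, round(min_significant_r(n), 3))
```

Under this approximation, a dozen events need a correlation somewhat above 0.55 to reach the 5% level, a hundred events need only about 0.2, and with 3,000 events even a correlation under 0.04 qualifies.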

If I understand the advice you have kindly proffered, I am on the right path.

Thank you
 

hlsmith

Not a robit
#7
"My concern was that a correlation needed to be higher for fewer events for it to be meaningful - and this would be reflected in the p-value. So, say, a correlation of 0.5 for a hundred events may be as meaningful as 0.7 for a dozen events. My problem was how to get a statistical 'feel' for this - and I think the p-value would give me this."
Just keep sampling variability in mind, which you can see by looking at 95% confidence intervals. Estimates from smaller samples can vary more from one sample to the next, so their intervals are wider. You become more certain of the estimate with larger samples.
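As a sketch of what those intervals look like for the two cases in the quoted post, the snippet below computes an approximate 95% confidence interval for a correlation. The name `correlation_ci` is invented for this example, and the interval comes from the Fisher z transformation, which assumes roughly bivariate normal data; treat it as an approximation, not as what Statistica would necessarily report.

```python
import math
from statistics import NormalDist

def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for a correlation r from n pairs.

    Assumption: Fisher z transformation with standard error 1/sqrt(n - 3).
    """
    z = math.atanh(r)
    half_width = NormalDist().inv_cdf(0.5 + level / 2) / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# The two cases from the quoted post: (correlation, number of events).
for r, n in ((0.7, 12), (0.5, 100)):
    low, high = correlation_ci(r, n)
    print(f"r={r}, n={n}: {low:.2f} to {high:.2f}")
```

The interval for 0.7 from a dozen events runs from roughly 0.2 to 0.9, while the interval for 0.5 from a hundred events is much tighter, roughly 0.34 to 0.63. Both exclude zero, but the small-sample estimate says far less about how strong the relationship actually is.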