My t-test results are significant, but are they meaningful?

#1
I have been researching the question of whether the need to downplay genetics in adoption might weaken a country's response to the coronavirus, which is a piece of genetic material. I ran some t-tests using a t-test calculator. These are two of the results:

1) Those of the 23 richest countries in the world that adopt (N = 12: US, France, Italy, Germany, UK, Australia, Canada, Spain, Switzerland, Netherlands, Sweden, Belgium) versus those of the 23 richest countries in the world that don't adopt (N = 11: China, Japan, India, South Korea, Taiwan, Brazil, Hong Kong, Russia, Mexico, Austria, Indonesia). Data: Worldometer COVID-19 cases/million, 13/07/2020.
The two-tailed P value equals 0.0319
Mean cases/million, rich countries that adopt: 4407.25
Mean cases/million, rich countries that don't adopt: 1805.82
Conclusion: the higher rate of COVID-19 cases/million in rich countries that adopt versus rich countries that don't adopt is significant. To me this suggests that adoption is a causative factor, separate from wealth, in a country's rate of COVID-19 infections.

2) All 23 countries that adopt internationally (US, Spain, France, Italy, Canada, Netherlands, Sweden, Norway, Denmark, Australia, Belgium, Cyprus, Finland, Germany, Iceland, Luxembourg, Malta, New Zealand, Norway, Switzerland, UK, Andorra, Israel) versus all countries with a population over 5,000,000. Data: Worldometer, 12/07/2020.
Mean cases/million in international adopter countries (n = 23): 3908.35
Mean cases/million in all countries over 5 million people (n = 122): 2805.03
Two-tailed P value = 0.0010
Conclusion: the increased rate of COVID19 in countries that adopt internationally versus all countries with a population greater than 5 million is highly significant.

Am I right to assume these tests are meaningful? Are they publishable?

Boruch
 

obh

#2
Hi Boruchf,

So do you mean the difference between countries that adopt domestically and countries that adopt internationally?

I feel that there is a very long "distance" between the DV (cases/million) and the IV (adopts internationally: true/false).
There are so many potential IVs that may influence COVID-19 rates (for example, culture, government, location, density, other genetic parameters).
If you could, theoretically, control all the other IVs to be the same ...
Please also read about multicollinearity.
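
As a rough illustration of what multicollinearity means in practice, here is a minimal Python sketch of the variance inflation factor (VIF) for the two-predictor case, where it reduces to 1 / (1 - r^2) with r the correlation between the predictors. The variable names and the numbers are made up purely for illustration; they are not real country data.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF of either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

# Hypothetical toy scores for five countries (illustration only, not real data)
wealth_score = [1, 2, 3, 4, 5]
density_score = [2, 4, 5, 4, 5]

vif = vif_two_predictors(wealth_score, density_score)
print(f"r = {pearson_r(wealth_score, density_score):.3f}, VIF = {vif:.2f}")
```

The more strongly two IVs move together, the larger the VIF and the harder it is to separate their effects on the DV. With more than two predictors the VIF comes from regressing each predictor on all the others, which this sketch does not cover.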
 
#3
Thanks for your reply,

I started by comparing the top countries that adopt domestically versus the top countries that adopt internationally. Next, I will study multicollinearity.


I did four t-tests.
1) Top 20 international adopters versus top 20 domestic adopters (countries appearing in both groups removed). The higher cases/million among international adopters almost reaches significance; results below.

2) Top 20 international adopters versus top 30 domestic adopters (countries appearing in both samples removed). The higher cases/million among international adopters comes even closer to significance; results below.

3) Top 20 international adopters versus all countries with population > 5 million. The higher cases/million among international adopters is highly significant; results below.

4) Top 30 domestic adopters versus all countries with population > 5 million. The tiny increase in cases/million among top domestic adopters is not significant; results below.

1)
Unpaired t test results: top 20 int. adopters versus top 20 domestic adopters
P value and statistical significance:
The two-tailed P value equals 0.0952
By conventional criteria, this difference is considered to be not quite statistically significant.

Confidence interval:
The mean of Int. Adopters minus Domestic Adopters equals 1605.40
95% confidence interval of this difference: From -299.02 to 3509.82

Intermediate values used in calculations:
t = 1.7268
df = 28
standard error of difference = 929.709

Group    Group One    Group Two
Mean     3535.27      1929.87
SD       2608.95      2481.68
SEM      673.63       640.77
N        15           15

2)

Top 30 domestic adopters versus top 20 int. Adopters 15/07/2020 (7 countries appearing in both samples removed).

Unpaired t test results
P value and statistical significance:
The two-tailed P value equals 0.0863
By conventional criteria, this difference is considered to be not quite statistically significant.

Confidence interval:
The mean of int adopters minus domestic adopters equals 1348.18
95% confidence interval of this difference: From -202.86 to 2899.22

Intermediate values used in calculations:
t = 1.7664
df = 34
standard error of difference = 763.217

Group    Int. Adopt.    Domestic Adopt.
Mean     2999.31        1651.13
SD       2358.95        2107.52
SEM      654.25         439.45
N        13             23

3)
Top 20 Int. Adopter Countries vs. Population > 5 Million, 15/07/2020
P value and statistical significance:
The two-tailed P value equals 0.0040
By conventional criteria, this difference is considered to be very statistically significant.

Confidence interval:
The mean of Population >5 Million minus Top Int. Adopters equals -1984.19
95% confidence interval of this difference: From -3325.81 to -642.57

Intermediate values used in calculations:
t = 2.9243
df = 138
standard error of difference = 678.510

Group    Population >5 Million    Top Int. Adopters
Mean     1847.41                  3831.60
SD       2807.00                  2823.66
SEM      256.24                   631.39
N        120                      20

4)

All Countries > 5 Million vs. Top Domestic Adopters

Unpaired t test results
P value and statistical significance:
The two-tailed P value equals 0.2779
By conventional criteria, this difference is considered to be not statistically significant.

Confidence interval:
The mean of Pop. > 5 Million minus Top Domestic Adopters equals -628.85
95% confidence interval of this difference: From -1769.80 to 512.09

Intermediate values used in calculations:
t = 1.0891
df = 149
standard error of difference = 577.399


Group    Pop. > 5 Million    Top Domestic Adopters
Mean     1889.38             2518.23
SD       2833.15             2822.12
SEM      257.56              515.25
N        121                 30
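
For anyone who wants to check the calculator output, here is a minimal Python sketch that recomputes t, df and the standard error of the difference from the means, SDs and Ns listed above. It assumes the calculator ran the standard equal-variance (pooled) unpaired t test, which is consistent with the reported df = N1 + N2 - 2; the function name is my own.

```python
import math

def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) two-sample t statistic from summary statistics.

    Returns (t, df, standard error of the difference in means).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (mean1 - mean2) / se_diff
    return t, df, se_diff

# (mean, SD, N) for each group, copied from the four result tables above
comparisons = {
    "1) top 20 int. vs top 20 domestic": ((3535.27, 2608.95, 15), (1929.87, 2481.68, 15)),
    "2) top 20 int. vs top 30 domestic": ((2999.31, 2358.95, 13), (1651.13, 2107.52, 23)),
    "3) pop > 5M vs top 20 int.": ((1847.41, 2807.00, 120), (3831.60, 2823.66, 20)),
    "4) pop > 5M vs top 30 domestic": ((1889.38, 2833.15, 121), (2518.23, 2822.12, 30)),
}

for label, (group1, group2) in comparisons.items():
    t, df, se = pooled_t_from_summary(*group1, *group2)
    print(f"{label}: t = {t:.4f}, df = {df}, SE of difference = {se:.3f}")
```

In comparisons 3 and 4 the first group has the smaller mean, so t comes out negative here; the calculator reports its absolute value.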

Boruch
 
#4
Suppose you had 135 observations, one per country, each with y (the number of cases) and N (the total number of people). To compare the proportion of COVID-19 cases between the two regions defined in your question, you can try proc logistic or proc glimmix:

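proc glimmix;
  class region;  /* region: the two-level grouping of countries */
  /* events/trials syntax: y cases out of N people in each country */
  model y/N = region / dist=binomial link=logit solution;
run;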

You can add other covariates to the model statement, and also random effects for other geographic region variables.
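
To give a feel for what the region effect in such a model measures, here is a minimal Python sketch. With region as the only term in a binomial logit model, the maximum-likelihood estimate of the region coefficient is the log odds ratio of the pooled counts, with Wald standard error sqrt(1/a + 1/b + 1/c + 1/d). The counts below are hypothetical, and the function name is my own; this is not a replacement for the SAS procedures.

```python
import math

def log_odds_ratio(cases_a, pop_a, cases_b, pop_b):
    """Log odds ratio (region A vs region B) and its Wald standard error."""
    a, b = cases_a, pop_a - cases_a  # region A: cases, non-cases
    c, d = cases_b, pop_b - cases_b  # region B: cases, non-cases
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return log_or, se

# Hypothetical pooled counts (illustration only, not real data)
log_or, se = log_odds_ratio(cases_a=30, pop_a=1000, cases_b=10, pop_b=1000)
lower, upper = log_or - 1.96 * se, log_or + 1.96 * se  # approximate 95% Wald interval
print(f"odds ratio = {math.exp(log_or):.3f}, "
      f"95% CI {math.exp(lower):.3f} to {math.exp(upper):.3f}")
```

With real country-level data this standard error would likely be far too small, because countries within a region differ much more than binomial sampling alone allows. That is one reason to add covariates and random effects.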