Methods for group comparison

#1
Straight to the point:

I'm writing my first article, comparing our national (Finnish) lung cancer population with the Swedish and Danish ones to see whether there are significant differences between the patient populations. I've gathered real-world data on a total of ~600 patients here in Finland, and I'm trying to compare it with the other Nordic populations to find some shred of evidence explaining the shortcomings in our national lung cancer care reported in a massive study published a few years ago. I chose two studies as points of comparison: one Swedish and one Danish article.

The problem:

I do not have access to the other Nordic cancer registries, nor was I able to gain access to the primary datasets behind said articles. All I was left with was the data presented in the articles themselves, which was enough for simple "X > Y"-type table comparisons. But since I lack the primary data, and hence standard errors, variances and so on, I ran out of tools to perform any meaningful statistical analyses of the population differences. The data in the articles is just registry data presented in bulk (e.g. survival of smoking males in the Swedish population, with p-values, total N, proportions and so on), along with Kaplan-Meier survival plots.

To my knowledge I'm left with just very simple and superficial population comparisons, but it won't hurt to ask. Are there any methods I could use to strengthen my analysis, or should I abandon my efforts as futile and move on to analyses within my own population only? Sure, I still have my comparison tables, but to me it feels like my article isn't on solid ground with just those.
 

hlsmith

#2
You will likely have issues. I would conduct analyses using your own data and make superficial comparisons in your discussion section. Even if you had the data, there could be missing confounders or selection bias making the sets not comparable.
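
For the superficial comparisons, published totals and proportions are enough to run a crude two-proportion z-test without any primary data. Below is a minimal sketch in Python, assuming you can read an event count and a total N for the same subgroup from each article. The function name and the counts in the usage line are made up for illustration and are not from any of the studies. It is an unadjusted comparison, so the confounding and selection-bias caveats above still apply in full.

```python
import math


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Two-sided z-test for a difference in proportions, from aggregate counts only.

    Returns (difference in proportions, z statistic, two-sided p-value).
    """
    p_a = events_a / n_a
    p_b = events_b / n_b
    # Pooled proportion under the null hypothesis of no difference.
    pooled = (events_a + events_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a - p_b, z, p_value


# Hypothetical counts, invented for illustration only.
diff, z, p = two_proportion_z_test(events_a=60, n_a=100, events_b=40, n_b=100)
print(f"difference={diff:.3f}, z={z:.2f}, p={p:.4f}")
```

A result like this belongs in the discussion section as a descriptive aid rather than as evidence of a real population difference.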

Of note, I know that in R there is a package for simulating survival data. For fun, it may be interesting to see if you could use that to play around with creating a plasmode simulation or some such.
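
To give a rough idea of what such a simulation does, here is a minimal sketch in plain Python (not the R package itself). It draws Weibull survival times under proportional hazards by inverting the survival function, then applies administrative censoring at the end of follow-up. In a plasmode-style setup the covariate values would come from your real patients. Here the covariate, the effect size `beta`, the baseline parameters `lam` and `gamma`, and the follow-up length are all hypothetical placeholders.

```python
import math
import random


def simulate_weibull_times(covariates, beta, lam=0.1, gamma=1.5,
                           max_follow_up=5.0, seed=1):
    """Simulate right-censored Weibull survival times under proportional hazards.

    Survival function: S(t | x) = exp(-lam * t**gamma * exp(beta * x)).
    Inverting S at a uniform draw u gives
        t = (-log(u) / (lam * exp(beta * x))) ** (1 / gamma).
    Returns a list of (x, observed_time, event_indicator) tuples.
    """
    rng = random.Random(seed)
    rows = []
    for x in covariates:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        t = (-math.log(u) / (lam * math.exp(beta * x))) ** (1.0 / gamma)
        event = int(t <= max_follow_up)  # 1 = event observed, 0 = censored
        rows.append((x, min(t, max_follow_up), event))
    return rows


# Hypothetical binary covariate (e.g. 1 = smoker), invented for illustration.
covariates = [0] * 300 + [1] * 300
data = simulate_weibull_times(covariates, beta=1.0)
print(data[:3])
```

The simulated rows can then be fed to whatever survival analysis you plan to run, to check how it behaves when the true effect is known.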
 
#3
Thank you for the quick reply!

But yeah, this is how I saw it too. As a rookie I have allocated way too much time creating this article only to find myself in this spot, hehe. But I guess I need to learn things the hard way.

Is the R package you are talking about perhaps the one named "simsurv"? I've studied R and become somewhat proficient with it, so it wouldn't hurt to learn something new and perhaps even test it. I can't even consult my university hospital's statistician, since he is on holiday, so I'm left soloing all this and trying to make sense of it all.