Sample size considerations


I am fairly new to the field of statistics and have a question about study design and outcome considerations.
I have an overall study group of 3950 observed time stamps.
With several subfilters applied, I reach a sample size of 22-25 for the last 7-8 years.
Is it possible to predict the number of future cases for the next 2 years based on this data?
Can I vary this prediction by using different significance levels and statistical power?

I hope this does not sound too cryptic. All I want to do is produce a prediction interval for the near future for the subset I have used so far.
If you have further questions or suggestions, feel free to ask.
For anyone interested: I tried to use G*Power to determine the right sample size for the optimized power and significance level.
I am not 100% sure how to use the test statistics in the future. Maybe someone with more knowledge could help.
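
As a rough sketch of what such a prediction interval could look like, here is a small simulation in Python. It rests on assumptions that are not confirmed anywhere in this thread: the cases are treated as a Poisson process with a constant yearly rate, the figures 23 cases and 7.5 years are simply midpoints of the ranges given above, and the Gamma prior (shape 0.5) is an arbitrary weakly informative choice. The function names are made up for illustration.

```python
import math
import random


def poisson_draw(rng, mean):
    """Draw one Poisson variate with Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def predictive_interval(events, years_observed, years_ahead,
                        level=0.95, draws=20000, seed=1):
    """Simulated Gamma-Poisson predictive interval for a future case count.

    Assumption: constant event rate; Gamma(0.5) prior on the yearly rate.
    """
    rng = random.Random(seed)
    shape = events + 0.5           # posterior shape of the yearly rate
    scale = 1.0 / years_observed   # posterior scale of the yearly rate
    counts = sorted(
        poisson_draw(rng, rng.gammavariate(shape, scale) * years_ahead)
        for _ in range(draws)
    )
    tail = (1.0 - level) / 2.0
    lower = counts[int(tail * draws)]
    upper = counts[min(draws - 1, int((1.0 - tail) * draws))]
    return lower, upper


if __name__ == "__main__":
    # 23 cases in 7.5 years are assumed midpoints, not real data.
    for level in (0.80, 0.95, 0.99):
        print(level, predictive_interval(23, 7.5, 2.0, level=level))
```

Changing `level` only makes the interval wider or narrower. It does not change the underlying prediction, and statistical power does not enter this calculation at all.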


Well-Known Member

How do you plan to predict?
What do you want to predict?
Does the prediction change over time?
What test do you want to run?
First of all, thanks for the interest.
It is a Bayesian Hierarchical Logistic Regression Model.
Now I want to compare the performance of this model under different sample sizes, significance levels, or statistical power.
From other statistical models I know that you can change the critical values by using the t or z transformation.
Maybe change the model?


Less is more. Stay pure. Stay poor.
@cheesecake21 your descriptions are very scattered. We would be better able to help if you outright described the data you have and the setting in an understandable way. The link is to a clinical trial, but nothing above seems related to randomized clinical trials. In addition, your first post makes it seem like you have univariate time series data, then you mention an MLM Bayesian logistic model, which isn't time series but I guess could be repeated measures?

Just write out what data you have, perhaps provide a small sample of it, and then describe the sample size and format of the variables along with the objective, using the variables in that description.

I am sorry for my unclear description.
The colleagues in the paper on the clinical trial used repeated measures in different time frames to predict the outcome of genetic markers.
They used the Bayesian Hierarchical Logistic Regression Model.
My question here is more hypothetical, because there is no clinical data yet.

Google Books Link

This is more of a model comparison or a theoretical discussion.
We are now discussing whether there is a minimum sample size required for this approach or whether we can switch to another method.
Maybe we could use different significance levels or power levels for the testing.
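
To illustrate how the significance level and power trade off against the required sample size, here is a minimal Python sketch using the standard normal-approximation formula for comparing two proportions. This is a frequentist calculation of the kind that power-analysis tools perform, not a sample size rule for the Bayesian hierarchical logistic model discussed above. The proportions 0.5 and 0.3 are purely hypothetical, and `n_per_group` is a name made up for this example.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # critical value for alpha
    z_power = NormalDist().inv_cdf(power)              # quantile for target power
    variance = p1 * (1.0 - p1) + p2 * (1.0 - p2)
    return math.ceil((z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Hypothetical proportions, chosen only to show the pattern.
    for alpha in (0.10, 0.05, 0.01):
        for power in (0.80, 0.90):
            print(alpha, power, n_per_group(0.5, 0.3, alpha=alpha, power=power))
```

A stricter significance level or a higher target power both push the required sample size up, so with only 22-25 observations the detectable effect would have to be large under this kind of test.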