Hi, I will be glad to read your thoughts.

What sample size is large enough that I can assume normality based on the Central Limit Theorem? Some sources say 30, some say 100.

I ran some simulations, and it seems that for some skewed data (finite variance, independent ...) you need a sample size bigger than 200.

And what counts as "reasonably symmetrical"?

I ran simulations from F(8,8) to F(19,19) with a large number of repeats (100,000) and checked what sample size brings the distribution of the sample average close to the Normal distribution.

What is "close to the Normal distribution"? I thought about two options:

1. **Sample distribution's Skewness** < 0.5 and **Sample distribution's excess Kurtosis** < 0.5

2. SW test: since it is limited to 5000 observations, take an average over blocks of size 5000. It is definitely too powerful for a large sample, so maybe use a low p-value threshold such as 0.01 or 0.001.

Currently I use option 1; I may also try option 2.
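For anyone who wants to reproduce the option 1 check, here is a minimal sketch using only the Python standard library. It is not my actual simulation code: the function names are made up for illustration, the skewness and kurtosis are the simple moment-based (biased) estimators, which is an assumption, and you would want far more repeats than shown for stable results.

```python
import random
import statistics


def f_variate(rng, d1, d2):
    """Draw one F(d1, d2) value as a ratio of scaled chi-square draws."""
    # A chi-square with k degrees of freedom is Gamma(shape=k/2, scale=2).
    num = rng.gammavariate(d1 / 2, 2) / d1
    den = rng.gammavariate(d2 / 2, 2) / d2
    return num / den


def skew_and_excess_kurtosis(values):
    """Moment-based (biased) skewness and excess kurtosis of a list."""
    n = len(values)
    m = statistics.fmean(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3


def sample_mean_shape(d, n, repeats, seed=0):
    """Shape of the simulated distribution of the mean of n F(d, d) draws."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(f_variate(rng, d, d) for _ in range(n))
        for _ in range(repeats)
    ]
    return skew_and_excess_kurtosis(means)


def smallest_n(d, candidates, repeats, limit=0.5, seed=0):
    """First candidate n whose sample means satisfy option 1, else None."""
    for n in candidates:
        skew, kurt = sample_mean_shape(d, n, repeats, seed)
        if abs(skew) < limit and abs(kurt) < limit:
            return n
    return None
```

A call such as `smallest_n(12, range(10, 300, 10), repeats=2000)` scans candidate sample sizes for F(12,12). With so few repeats the estimated kurtosis is noisy, so the answer can move between seeds.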

And I ran the following regression:

DV=sample size

IVs: Population's parameters: Skewness, Excess Kurtosis, Skewness*Kurtosis, Excess Kurtosis^2, Skewness^2
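As a sanity check on these terms, here is what theory says under the assumption of i.i.d. draws with a finite fourth moment (for F(d1, d2) that needs d2 > 8, so the F(8,8) end of my range may fall outside it). Writing $\gamma_1$ for the population skewness and $\gamma_2$ for the population excess kurtosis (my notation here), the mean of $n$ draws has

$$
\operatorname{Skew}(\bar X_n) = \frac{\gamma_1}{\sqrt{n}}, \qquad \operatorname{ExKurt}(\bar X_n) = \frac{\gamma_2}{n}
$$

so a criterion of $|\operatorname{Skew}| < 0.5$ and $|\operatorname{ExKurt}| < 0.5$ is met when

$$
n > \max\left(\frac{\gamma_1^2}{0.5^2},\ \frac{|\gamma_2|}{0.5}\right) = \max\left(4\gamma_1^2,\ 2|\gamma_2|\right)
$$

For example, an exponential population has $\gamma_1 = 2$ and $\gamma_2 = 6$, giving $n > \max(16, 12) = 16$. If this reasoning applies, Skewness^2 and Excess Kurtosis would be expected to carry most of the fit, though the simulated shape statistics are noisy estimates, so fitted coefficients need not match exactly.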

A potential problem: in practice you know only the sample statistics (Skewness, Kurtosis), while the regression is based on the true population values.

Your thoughts? Any recommended articles?
