I have a set of data points from a genetic study, and I wish to ascertain which of these data points are statistically significant. The data are not normally distributed. Values range from zero to the low thousands, with a large number of zero values.

At present, I am using a bootstrap method to determine which values are significant. A new data set is formed by sampling the original data set 100,000 times without replacement, and the distribution of the resampled data set is used to determine significance cut-off values for the original data set (i.e. the 1% cut-off value is the 1,000th highest value in the resampled data set). Individual *p*-values are similarly determined for each data point in the original data set according to where it falls in the distribution of values in the resampled data set.
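
To make the procedure concrete, here is a minimal Python sketch of the resampling, cut-off, and *p*-value steps described above. The function name `resample_cutoff_and_pvalues`, its parameters, and the choice of an upper-tail *p*-value (the fraction of resampled values at least as large as the observed one) are assumptions for illustration, not the code actually used in the study.

```python
import bisect
import random


def resample_cutoff_and_pvalues(values, n_draws=100_000, alpha=0.01,
                                with_replacement=False, seed=0):
    """Resample `values`, then return (cutoff, p_values).

    Hypothetical helper: the name and parameters are illustrative only.
    """
    rng = random.Random(seed)
    if with_replacement:
        resampled = rng.choices(values, k=n_draws)
    else:
        # Sampling without replacement requires n_draws <= len(values);
        # random.sample raises ValueError otherwise.
        resampled = rng.sample(values, n_draws)
    resampled.sort()

    # The alpha cut-off is the k-th highest resampled value,
    # e.g. the 1,000th highest for alpha = 0.01 and 100,000 draws.
    k = max(1, round(alpha * n_draws))
    cutoff = resampled[-k]

    # Assumed upper-tail empirical p-value: fraction of resampled
    # values greater than or equal to each observed value.
    p_values = [
        (n_draws - bisect.bisect_left(resampled, v)) / n_draws
        for v in values
    ]
    return cutoff, p_values
```

Under this definition the smallest attainable *p*-value is `1 / n_draws`, and every zero in the original data receives a *p*-value of 1, since all resampled values are at least zero.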

I am concerned that the excess of zero values and the fact that the original data set does not follow a normal distribution will mean that this method of determining significance will be biased or inaccurate.

Is this something I should be concerned about, and if so, does anyone know a way around it?

Also, can anyone recommend a good method for controlling the false discovery rate in the 1% cut-off value or the individual *p*-values? I tried using the BY method for the individual *p*-values, but it converted every value to 1.
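
For reference, the sketch below shows what the BY (Benjamini-Yekutieli) adjustment computes, alongside the less conservative BH (Benjamini-Hochberg) adjustment. It is a plain-Python illustration under the standard step-up definitions; the function name `fdr_adjust` is hypothetical, and this is not the implementation from any particular package.

```python
def fdr_adjust(p_values, method="bh"):
    """Return step-up FDR-adjusted p-values ("bh" or "by").

    Hypothetical helper for illustration only.
    """
    m = len(p_values)
    # BY multiplies the BH adjustment by the harmonic sum
    # c(m) = 1 + 1/2 + ... + 1/m; BH uses c(m) = 1.
    c_m = sum(1.0 / i for i in range(1, m + 1)) if method == "by" else 1.0

    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])

    adjusted = [0.0] * m
    running_min = 1.0  # adjusted values are capped at 1
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m * c_m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up example: 99 uninformative tests and one small p-value.
example = [1.0] * 99 + [0.001]
bh = fdr_adjust(example, method="bh")
by = fdr_adjust(example, method="by")
```

Because the smallest raw *p*-value is multiplied by roughly `m * c(m)`, adjusted values are pushed to the cap of 1 whenever the raw *p*-values are not small relative to the number of tests `m`.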