(Solved) Drawback of random sampling - need someone to explain better

#1
This is what my research skills book mentions:

The first few hundred cases selected using simple random sampling normally consist of bunches of cases whose numbers are closer together followed by a gap and then further bunching. For more than a few hundred cases, this pattern occurs far less frequently. Because of the technique’s random nature, it is, therefore, possible that the chance occurrence of such patterns will result in certain parts of a population being over- or under-represented.

I don't fully understand this. Can anyone explain better? Maybe with an example if possible?
 

RamonNL

New Member
#2
It is not the clearest text I've read. But what I understand is that, as you increase the size of your random sample, it becomes more representative of the population. In other words, it becomes like a mirror of the population that gets sharper and sharper as the sample size increases. Because you are selecting at random (selection is left to chance and chance alone, rather than to your own judgment and bias), there is always the possibility that some groups of customers are over-represented, also by chance. The effect of this risk (which is bad for representativeness) decreases as you increase the size of your random sample. Consider reviewing minimum sample size, the central limit theorem, and the normal distribution to consolidate your understanding of this topic. Hope this helps.
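
To make the "over-represented by chance" point concrete, here is a minimal Python sketch, not taken from the book. It assumes a hypothetical population of 100,000 numbered cases split into 10 equal blocks of case numbers; the function names `stratum_shares` and `worst_deviation` are made up for illustration. It draws a simple random sample and reports how far each block's share of the sample is from its 10% share of the population.

```python
import random

def stratum_shares(population_size, sample_size, n_strata, rng):
    """Draw a simple random sample of case numbers (without replacement) and
    return the share of the sample falling into each equal-width block."""
    sample = rng.sample(range(population_size), sample_size)
    width = population_size // n_strata
    counts = [0] * n_strata
    for case in sample:
        # min() keeps any leftover cases at the end inside the last block
        counts[min(case // width, n_strata - 1)] += 1
    return [c / sample_size for c in counts]

def worst_deviation(shares):
    """Largest gap between a block's sample share and its population share."""
    expected = 1 / len(shares)
    return max(abs(s - expected) for s in shares)

rng = random.Random(42)  # fixed seed (an arbitrary choice) so the run is repeatable
for n in (50, 500, 5000):
    shares = stratum_shares(100_000, n, 10, rng)
    print(n, [round(s, 3) for s in shares], round(worst_deviation(shares), 3))
```

The exact figures depend on the seed, but the worst deviation tends to shrink as the sample grows, which is the "mirror getting sharper" idea above.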
 

obh

Active Member
#3
The text is not very clear ... maybe they had the following example in mind:

Take the mean count for each range to be probability * sample size:

probabilities:
0.01, 0.03, 0.05, 0.08, 0.09, 0.1, 0.11, ...

Mean counts for a small sample size (10):
0.1, 0.3, 0.5, 0.8, 0.9, 1, 1.1, 1.3, 1.6, 1.9, 2.1, 2.5, 3, 3.5, 3.8
You may get something like this (with gaps):
0, 0, 0, 1, 0, 1, 0, 1, 1, 0, 2, 0, 2, 3, 2

Mean counts for a large sample size (10,000):
100, 300, 500, 800, 900, 1000
You may get the following, without gaps:
95, 345, 477, 766, 921, 1011
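
A short worked calculation may help show why the gaps appear only in the small sample. This is a sketch under an assumption that is not stated in the post: the count $X_i$ of sampled cases in range $i$ is treated as binomial with sample size $n$ and probability $p_i$ (reasonable when the population is much larger than the sample).

```latex
\[
\mathbb{E}[X_i] = n p_i, \qquad
\operatorname{sd}(X_i) = \sqrt{n p_i (1 - p_i)}, \qquad
\frac{\operatorname{sd}(X_i)}{\mathbb{E}[X_i]} = \sqrt{\frac{1 - p_i}{n p_i}}
\]
\[
n = 10,\ p_i = 0.01:\quad
\mathbb{E}[X_i] = 0.1,\quad
\operatorname{sd}(X_i) \approx 0.31,\quad
P(X_i = 0) = 0.99^{10} \approx 0.90
\]
\[
n = 10{,}000,\ p_i = 0.01:\quad
\mathbb{E}[X_i] = 100,\quad
\operatorname{sd}(X_i) \approx 9.95,\quad
\frac{\operatorname{sd}(X_i)}{\mathbb{E}[X_i]} \approx 0.10
\]
```

So with 10 cases the first range is empty about 90% of the time, while with 10,000 cases its count typically stays within roughly 10% of the mean of 100.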
 

hlsmith

Less is more. Stay pure. Stay poor.
#4
Well, I am going to extend the concept a little here. You are presented with a supposedly fair coin. You are asked to flip it, say, 5 times and you get 4/5 heads. Is it a fair coin, given the 80% proportion of heads? Well, it could be. You do this over and over, then plot the results in a histogram. Over the long run you will have a better idea of whether it is a fair coin, because you will see close to 50% heads. Next scenario: you do the same thing, but each time you flip the coin 1000 times. By chance, a run of heads can still happen, but since there are so many flips the proportion of heads ends up close to 50% every time.
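
The two scenarios can be tried out with a small simulation. This is only a sketch: the helper name `heads_proportions`, the 2000 repetitions, and the seed are arbitrary choices for illustration, and the coin is assumed to be exactly fair.

```python
import random
import statistics

def heads_proportions(flips_per_run, runs, rng):
    """Repeat an experiment of `flips_per_run` fair coin flips `runs` times
    and return the proportion of heads observed in each run."""
    return [
        sum(rng.random() < 0.5 for _ in range(flips_per_run)) / flips_per_run
        for _ in range(runs)
    ]

rng = random.Random(0)  # fixed seed so the run is repeatable
for flips in (5, 1000):
    props = heads_proportions(flips, 2000, rng)
    # smallest and largest proportion seen, and the spread across runs
    print(flips, min(props), max(props), round(statistics.pstdev(props), 3))
```

With 5 flips per run, proportions anywhere between 0 and 1 are plausible; with 1000 flips per run, the proportions cluster tightly around 0.5, so the spread printed in the last column is much smaller.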
 

noetsi

Fortran must die
#6
Random sampling is preferable if you actually sample correctly and have a meaningful sampling frame and a large enough number of cases.
 
#7
It is not the clearest text I've read. But what I understand is that, as you increase the size of your random sample, it becomes more representative of the population. In other words, it becomes like a mirror of the population that gets sharper and sharper as the sample size increases. Because you are selecting at random (selection is left to chance and chance alone, rather than to your own judgment and bias), there is always the possibility that some groups of customers are over-represented, also by chance. The effect of this risk (which is bad for representativeness) decreases as you increase the size of your random sample. Consider reviewing minimum sample size, the central limit theorem, and the normal distribution to consolidate your understanding of this topic. Hope this helps.
Thanks a lot. That helped.
 
#8
Well, I am going to extend the concept a little here. You are presented with a supposedly fair coin. You are asked to flip it, say, 5 times and you get 4/5 heads. Is it a fair coin, given the 80% proportion of heads? Well, it could be. You do this over and over, then plot the results in a histogram. Over the long run you will have a better idea of whether it is a fair coin, because you will see close to 50% heads. Next scenario: you do the same thing, but each time you flip the coin 1000 times. By chance, a run of heads can still happen, but since there are so many flips the proportion of heads ends up close to 50% every time.
That was very helpful. Thank you.
 

noetsi

Fortran must die
#9
Personally, I think sample size gets discussed primarily in the context of statistical power and the null hypothesis (in statistics). But what is likely more important is whether you can generalize from your sample. And that depends less on having a random sample or a specific number of cases than on having a good sampling frame. Just because you have enough cases to get a reasonable p-value does not mean you can, IMHO, generalize from your sample.

That is my humble opinion. I have not seen a formal discussion of this, because authors who talk about sample size tend to be in statistics (or polling), and those who talk about external validity (aka generalizing) tend to be methods experts, who don't get quoted as much as they should in statistics texts.