small sample size

#1
For my MSc thesis I am testing 6 ex-alcoholics, of whom 2 have abstained from alcohol for 2-5 years, 2 for 6-10 years, and 2 for 11-15 years, plus 6 controls. I will be testing their ability to recognize emotions in a computer task, so the dependent variable will be accuracy. What analysis could I use for this, given that the sample size is so small?
 

noetsi

Fortran must die
#2
It depends on which statistic you are running. If the number of predictor variables exceeds the sample size (and here the intercept counts as a predictor), then I believe you won't have enough degrees of freedom to run many methods. It is also true that for some methods (such as logistic regression) some of the results are only correct with larger sample sizes (the results are only asymptotically accurate). For regression for example common rules of thumb are to have 100 plus cases with more for every predictor variable (although you can actually run with fewer, it is just dangerous to do so particularly as you violate the assumptions of the method).

Your power will be extremely low so that your type II error rate will be very high. And generalizing will be extremely difficult. Have you considered qualitative approaches such as interviews rather than quantitative ones? (I don't know, of course, whether your question supports that.) I have not seen statistical analysis with just six people before.

It has been pointed out that in a high-quality controlled experiment, when you have reason to believe the results are generalizable to the larger population, my comments above are not valid. Those who make this point strongly disagree with the rules of thumb, which of course are just general approaches that may not apply to a specific analysis. The sources I have seen, which focus on observational data in the social sciences, strongly assert the need for large samples, but in other fields the perspective is different.
 

CowboyBear

Super Moderator
#3
For regression for example common rules of thumb are to have 100 plus cases with more for every predictor variable (although you can actually run with fewer, it is just dangerous to do so particularly as you violate the assumptions of the method).
I think we've discussed before that this comment is not valid. Regression by ordinary least squares does not assume a large sample. If its distributional assumptions are met, the coefficients will be consistent, efficient, and unbiased, and the sampling distribution of the coefficients will be normal, even in small samples. Regression assumptions are discussed in our paper here. By coincidence the previous article published in that journal showed that you can legitimately use a t-test with as few as 2 people in each subsample. Note that a student's t-test is regression: regression with a single binary predictor. :)
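
To make the "t-test is regression" point concrete, here is a minimal sketch in plain Python. The accuracy scores are invented for illustration only (they are not OP's data), and the function names `pooled_t` and `ols_slope_t` are just labels chosen for this example. It computes the equal-variance two-sample t statistic directly, then computes the t statistic for the slope of an ordinary least squares regression on a 0/1 group indicator, and the two should agree.

```python
import math


def pooled_t(group_a, group_b):
    """Student's two-sample t statistic (equal variances assumed) and its df."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in group_a)
    ss_b = sum((v - mean_b) ** 2 for v in group_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_b - mean_a) / se, df


def ols_slope_t(x, y):
    """t statistic for the slope in the simple regression y = b0 + b1 * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2  # two estimated coefficients: intercept and slope
    se_slope = math.sqrt(rss / df / sxx)
    return slope / se_slope, df


# Hypothetical accuracy scores, made up for this illustration.
controls = [0.90, 0.85, 0.88, 0.92, 0.80, 0.87]
patients = [0.78, 0.82, 0.75, 0.85, 0.70, 0.80]

# Binary predictor: 0 = control, 1 = ex-alcoholic.
x = [0] * len(controls) + [1] * len(patients)
y = controls + patients

t_test, df_test = pooled_t(controls, patients)
t_reg, df_reg = ols_slope_t(x, y)
print("t-test:    ", t_test, "df =", df_test)
print("regression:", t_reg, "df =", df_reg)
```

The slope is simply the difference between the two group means, and the residual variance is the pooled variance, which is why the two statistics coincide.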

Regression is more robust to distributional violations with large sample sizes, but a large sample size is not itself one of the assumptions made. IMO people worry far too much about sample size and not nearly enough about where the sample came from. E.g., if you have no random selection from a population, and no random assignment to conditions, then what are you trying to make inferences about in the first place? :confused:

Your power will be extremely low so that your type II error rate will be very high. And generalizing will be extremely difficult.
This is more valid, and power is probably the main problem in OP's situation. (OP: can you recruit a bigger sample?) But keep in mind that sample size is not the only determinant of power: If the effect is very large, the sample needed to detect the effect might be very small. If the effect is tiny, the necessary sample might be huge. The variability of the response variable also matters.
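
The dependence of power on effect size can be sketched with a small simulation. Everything here is an assumption for illustration: normally distributed scores with equal variance, 6 participants per group as in OP's design, a two-sided test at the 5% level, and effect sizes expressed in standard deviation units (so a more variable response means a smaller standardized effect for the same raw difference). The function `simulated_power` is a name made up for this example.

```python
import math
import random

N_PER_GROUP = 6      # assumed: 6 ex-alcoholics vs 6 controls
T_CRIT_DF10 = 2.228  # two-sided 5% critical value of t with 6 + 6 - 2 = 10 df


def pooled_t(group_a, group_b):
    """Student's two-sample t statistic (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in group_a)
    ss_b = sum((v - mean_b) ** 2 for v in group_b)
    pooled_var = (ss_a + ss_b) / (n_a + n_b - 2)
    return (mean_b - mean_a) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))


def simulated_power(effect_size, n_sims=2000, seed=1):
    """Share of simulated studies that reject H0 for a given standardized effect."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        controls = [rng.gauss(0.0, 1.0) for _ in range(N_PER_GROUP)]
        patients = [rng.gauss(effect_size, 1.0) for _ in range(N_PER_GROUP)]
        if abs(pooled_t(controls, patients)) > T_CRIT_DF10:
            rejections += 1
    return rejections / n_sims


power_small = simulated_power(0.2)  # small effect: 0.2 SD difference
power_large = simulated_power(3.0)  # very large effect: 3 SD difference
print("power for 0.2 SD effect:", power_small)
print("power for 3.0 SD effect:", power_large)
```

With these assumptions the small effect is rarely detected while the very large one almost always is, even though the sample size is identical in both cases.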
 

Dason

Ambassador to the humans
#4
By coincidence the previous article published in that journal showed that you can legitimately use a t-test with as few as 2 people in each subsample. Note that a student's t-test is regression: regression with a single binary predictor. :)
And technically if you assume that both groups have equal variance you can get away with one group having two observations and the other only having a single observation. Note that this probably isn't the best idea though unless you're *extremely* confident the assumptions are met.
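
As a rough sketch of why this works arithmetically: under the equal-variance assumption the variance estimate is pooled across both groups, so a group with a single observation contributes nothing to the sum of squares but the test still has n1 + n2 - 2 = 1 degree of freedom. The numbers below are invented, and `pooled_t` is just a name chosen for this example.

```python
import math

T_CRIT_DF1 = 12.706  # two-sided 5% critical value of t with 1 df


def pooled_t(group_a, group_b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a = sum(group_a) / n_a
    mean_b = sum(group_b) / n_b
    # A single observation has zero deviation from its own mean,
    # so all the variance information comes from the other group.
    ss_a = sum((v - mean_a) ** 2 for v in group_a)
    ss_b = sum((v - mean_b) ** 2 for v in group_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df


# Made-up values: two observations in one group, one in the other.
t_stat, df = pooled_t([1.0, 3.0], [5.0])
print("t =", t_stat, "df =", df)
print("significant at 5%:", abs(t_stat) > T_CRIT_DF1)
```

With only 1 degree of freedom the critical value is very large, so the difference has to be enormous relative to the spread of the two-observation group before the test rejects, which fits the caution above.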