I have some summary stats on people's drink intake and I'd like to work out the variance.
Frustratingly, I have no access to the full data set, so I'm having to infer its shape as far as possible from a handful of descriptive stats.
I have access to the mean amount, the standard error of the mean (SEM), and the weighted and unweighted bases (segmented by sex, age, etc.).
I am guessing I can calculate the SD for each segment by multiplying the SEM by the square root of the unweighted base (i.e. the actual sample size), and then square the SD to get the variance.
Is this valid, or is there something I've overlooked?
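To make the calculation concrete, here is a minimal Python sketch of what I have in mind. The function name and the example numbers are made up for illustration, and it assumes the published SEM was computed as SD / sqrt(n) on the unweighted base, which may not hold if the SEM was adjusted for weighting.

```python
import math

def sd_from_sem(sem, n):
    """Back out the standard deviation, assuming SEM = SD / sqrt(n)."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return sem * math.sqrt(n)

# Hypothetical segment: SEM of 0.5 units on an unweighted base of 400
sd = sd_from_sem(0.5, 400)
variance = sd ** 2
print(sd, variance)  # 10.0 100.0
```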
If the SD for a segment turns out to be considerably larger than its mean, what can I hypothesise about the skewness of the underlying data? Bear in mind that I know there are no zero values (non-drinkers were excluded), so every recorded intake is strictly positive.
Based on my knowledge of the population involved, my assumption is that the bulk of the sample will have a relatively low intake, with a long narrow tail of people drinking up to several times the 'normal' amount. I'd like to be able to point to the variance to support this assumption.
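As a rough illustration of the kind of argument I'd like to make, the sketch below assumes (and it is only an assumption, which I cannot check without the raw data) that intake within a segment is roughly lognormal. Under that assumption the mean and SD alone determine the implied median and skewness by matching moments. The function name and the example figures are hypothetical.

```python
import math

def lognormal_from_moments(mean, sd):
    """Match a lognormal to a given mean and SD (both must be positive).

    Returns the implied coefficient of variation, median and skewness.
    """
    if mean <= 0 or sd <= 0:
        raise ValueError("mean and sd must be positive")
    cv = sd / mean                       # coefficient of variation
    sigma2 = math.log(1.0 + cv ** 2)     # variance of log(intake)
    mu = math.log(mean) - sigma2 / 2.0   # mean of log(intake)
    return {
        "cv": cv,
        "median": math.exp(mu),          # always below the mean
        "skewness": (cv ** 2 + 3.0) * cv,
    }

# Hypothetical segment where the SD exceeds the mean
result = lognormal_from_moments(mean=10.0, sd=15.0)
print(result)
```

If the lognormal assumption is reasonable, an SD larger than the mean would imply a median well below the mean and strong positive skew, which matches the long-tail picture I described. A different positive distribution would give different numbers.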