sampling

  1. C

    Simple Algebraic Calculation about underestimation

    One study found that the non-coverage rate for the second-level intercept variance is 8.9%, and the non-coverage rate for the second-level slope variance is 8.8%. Although the coverage is not grotesquely wrong, the 95% confidence interval is clearly too short. The amount of...
  2. P

    Sampling in a bounded region using an MCMC approach that ensures sample uniformity

    Hi everybody, tired of waiting any longer for your remarks. Thank you all. P.
  3. R

    Sampling Behavior of a count

    Undercoverage is a problem that occurs in surveys when some groups in the population are underrepresented in the sampling frame used to select the sample. We can check for undercoverage by comparing the sample with known facts about the population. a) Suppose we take an SRS of n= 500 people...
  4. C

    Are unbiasedness and accuracy of the estimates all that determine the sample size?

    When determining sample size, why focus on the unbiasedness and accuracy of the estimates? Are these two properties, unbiasedness and accuracy of the estimates, all that determine the sample size? In a simulation study of a multilevel model, the authors chose that combination of sample size...
  5. N

    Bayesian inference with unequal sampling

    I have a "two-column" data set, with a multi-class categorical variable A, and two-class variable B. It is assumed that each observation is independent. For each category of variable A, I want to make a Bayesian estimate of a binomial parameter for class 1 of variable B, consistent with the...
  6. C

    Interesting abstract question - statisticians, please check this

    I have a massive dataset (tens of millions of rows and hundreds of dimensions). The dimensions are of all conceivable data types. How do I arrive at the sample that is 1) smallest and 2) most representative of the population with respect to all the dimensions? If you can direct me to any...
  7. J

    Sampling Samples from a Big Data Set in R

    I have a large data set (23 million records, ~9 GB) coming into R and am trying to figure out the best way to draw a sample from it. The plan I have right now is: 1) Break down the dataset into smaller pieces of around 4 million records or 1.5 GB 2) Draw a random sample from each 3)...
  8. R

    Disproportional Sampling for Uneven Case Controls

    Hello, I am looking to do a random sampling analysis of a case/control dataset containing 6X as many controls as cases. Therefore, I need to correct for this overabundance of controls (without simply removing the controls) using disproportional sampling. Is switching the # cases and...
  9. G

    sampling distribution question help.

    "Mendel’s laws of inheritance indicate that individuals in a second-generation cross have a 75% chance of carrying a dominant trait and a 25% chance of carrying a recessive trait. Inheritance occurs independently in each individual. What is the sampling distribution of the proportion of...
  10. I

    What is booster sampling?

    We are doing convenience sampling rather than random sampling for research that we are conducting. The authority is concerned about the reliability of the sampling. They have suggested that we look into booster sampling. Can any of you please tell us more about this? or redirect to...
  11. S

    Help understanding probability in a simple random sample

    Quoted is an extract from Sample Survey Principles and Methods, Vic Barnett (2002), p. 34: The concept of probability averaging only arises in relation to some prescribed probability sampling schemes. Thus, for simple random sampling we have the concept of the expected value of y_i, the ith...
  12. S

    'Alternate' proof that the expected value of the sample mean is the population mean

    It would be appreciated if someone could verify that this makes sense. By definition, \bar{x} = \frac{\sum x_i}{n}. So taking its expectation we get E[\bar{x}] = \frac{1}{n} E[\sum x_i]. Now, as we have a population of size N and a sample of size n, we have {N\choose n} different samples and...
  13. P

    sampling methods - randomness sometimes a challenge

    Hi everyone, I need your advice about a sampling method that might work in this quite specific context. The story goes.. I need to conduct a survey in local fruit markets in Africa. Most of the fruit marketplaces are street markets and I cannot gather a list of all the people that trade in...
  14. G

    Sampling for rare events

    I am trying to estimate the event rate in a population where the event (money laundering) is rare by definition, by drawing a representative sample. Assuming an event rate of 1%, I calculated the sample size to estimate the event rate in the population using the general sample size calculation...
  15. C

    Computing sample size and Best Critical Region

    Let X_1,X_2,\ldots,X_n denote a random sample from a normal distribution N(\theta,100). Show that C=\{(x_1,x_2,\ldots,x_n):c\leq \bar x=\frac{\sum_1^n x_i}{n}\} is a best critical region for testing H_0:\theta=75 against H_1:\theta=78. Find n and c so that...
  16. S

    Proper Sampling - Can I sample like this without destroying reliability?

    The population that I need to extract a sample from is all firms from country A with activities in country B. The firms are classified into two categories: as having a subsidiary in country B, or not having a subsidiary in country B (the activity could then be exports, which does not require a...
  17. J

    Checking the Independence.

    Let X_i\sim N(\mu,\sigma^2), where i=1,2,\ldots,n, and Z_i\sim N(0,1), where i=1,2,\ldots,n. Prove that \frac{(\bar X-\mu)}{\sigma} and \sum_{i=1}^n\frac{(X_i-\bar X)^2}{\sigma^2} are independent, which implies \bar X and \sum_{i=1}^n(X_i-\bar X)^2 are independent. If I show that \bar...
  18. J

    Cumulants of the non-central chi-square distribution

    The cumulant generating function is defined as the logarithm of the moment generating function: K_X(t)=\log M_X(t). Let X be a non-central \chi^2 variate with n degrees of freedom and non-centrality parameter \lambda. The moment generating function of X is...
  19. S

    calculating p-value from resampling of the null distribution

    Hi all, I am a PhD student in Bioinformatics and I'm dealing with some statistics issues in my research; I was hoping you could lend me a hand :D I am dealing with a pool of datapoints (let's call it Q). I selected a subset of Q according to some particular characteristics, I will call...
  20. B

    Sampling a hemisphere using an arbitrary distribution

    I am writing a ray tracer and I wish to fire rays from a point **p** into a hemisphere above that point according to some distribution. 1) I have derived a method to uniformly sample within a solid angle (defined by theta) above **p** [1] \phi = 2\pi\xi_1 \alpha = \arccos...
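
    The excerpt above is cut off before the expression for \alpha, so the poster's exact formula is not known. As a rough illustration only, the sketch below uses the standard inverse-CDF result for uniform directions inside a cone of half-angle \theta about the z axis, \cos\alpha = 1 - \xi_2(1 - \cos\theta), together with \phi = 2\pi\xi_1 from the excerpt. The function name `sample_cone_uniform` and the choice of z as "up" at **p** are assumptions made for this example, not taken from the original post.

```python
import math
import random


def sample_cone_uniform(theta_max, rng=random):
    """Return a unit vector drawn uniformly (by solid angle) from the cone
    of half-angle theta_max around the +z axis.

    Hypothetical helper for illustration; theta_max = pi/2 gives the full
    hemisphere above the point.
    """
    xi1 = rng.random()  # uniform in [0, 1)
    xi2 = rng.random()  # uniform in [0, 1)

    # Azimuth is uniform around the axis, as in the excerpt.
    phi = 2.0 * math.pi * xi1

    # Assumed formula: cos(alpha) is uniform on [cos(theta_max), 1], which
    # makes the density constant per unit solid angle.
    cos_alpha = 1.0 - xi2 * (1.0 - math.cos(theta_max))
    sin_alpha = math.sqrt(max(0.0, 1.0 - cos_alpha * cos_alpha))

    return (sin_alpha * math.cos(phi), sin_alpha * math.sin(phi), cos_alpha)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded only to make the demo repeatable
    for _ in range(3):
        print(sample_cone_uniform(math.pi / 2, rng))
```

    Sampling \cos\alpha rather than \alpha itself is the design choice that matters here: drawing \alpha uniformly would cluster rays near the axis. A non-uniform distribution would need a different mapping from \xi_2 to \alpha, which this sketch does not attempt.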