Search results

  1. hlsmith

    Does a reduced sample size after joining two datasets count as missing data?

    Almost all biases can be considered missing data. Model misspecification = missing knowledge, confounding = missing variable, selection bias = missing individuals, information bias = missing accuracy/classification metrics, missing data = missing data. Why can't you deterministically join these...
  2. hlsmith

    Entropy Balancing in Panel Data Setting

    DiD is big in the econ field and it seems your example aligns with that and the general terminology used. I am not overly familiar with the speak, but I would imagine the entropy is pseudo-comparable to balancing background imbalances via propensity scores. The weights function to balance...
  3. hlsmith

    Curving qq plot: What does this mean?

    Well it also seems like your data is bounded by zero and zero is a regular value. What is up with that?
  4. hlsmith

    paired vs independent, which one

    To clarify with examples: Paired: compare people's weights before and after an intervention. Independent: compare people's weights that got an intervention to people that did not get the intervention. Independent: compare people's weights that got an intervention to people that did not get an...
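
    As a rough illustration of the distinction above, here is a minimal R sketch on simulated data. The means, SDs, and group size are made-up assumptions, not values from the thread.

```r
# Simulated data; every number here is an assumption for illustration only.
set.seed(1)
n <- 30

# Paired: the same people are measured before and after an intervention.
before <- rnorm(n, mean = 80, sd = 10)
after  <- before - rnorm(n, mean = 2, sd = 3)
paired_fit <- t.test(after, before, paired = TRUE)

# Independent: one group got the intervention, a different group did not.
treated <- rnorm(n, mean = 78, sd = 10)
control <- rnorm(n, mean = 80, sd = 10)
indep_fit <- t.test(treated, control)  # Welch test by default

paired_fit$method
indep_fit$method
```

    The paired test works on the within-person differences, so it uses n - 1 degrees of freedom; the independent test compares two separate groups.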
  5. hlsmith

    Block randomization

    I haven't done this in R yet, but in the past I have done block and varied-block-size randomization in SAS; code is available online if you want to rewrite it in R. I would be interested to see if you find a relevant package - which it would be difficult to imagine a package doesn't...
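
    For what it's worth, a base-R sketch of permuted-block randomization with randomly varied block sizes might look like the following. `block_randomize()` and its arguments are hypothetical names made up for this example; this is not the SAS code mentioned above and not a function from any package.

```r
# Permuted-block randomization with randomly varied block sizes.
# block_randomize() is a hypothetical helper written for this sketch.
block_randomize <- function(n, arms = c("A", "B"), multipliers = c(1, 2, 3)) {
  assignments <- character(0)
  while (length(assignments) < n) {
    # Choose how many times each arm appears in this block
    # (block size = length(arms) * m).
    m <- multipliers[sample.int(length(multipliers), 1)]
    # Shuffle the arms within the block.
    block <- sample(rep(arms, times = m))
    assignments <- c(assignments, block)
  }
  # The final block may be cut short to reach exactly n assignments.
  assignments[seq_len(n)]
}

set.seed(2024)
alloc <- block_randomize(24)
table(alloc)
```

    Because every complete block is balanced, the arms can differ in size only by whatever is lost when the last block is truncated.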
  6. hlsmith

    Questions about maximum sampling error in stratified sampling

    This is not my area, but if you go to the webpage they convincingly convey that yes, this is the sampling error formula. However, in all of their examples they are applying this to binary items. So if you didn't have a "p" and had a continuous variable like weight - what would you use?
  7. hlsmith

    Question about hierarchical Cluster Analysis

    For clarification are you conducting hierarchical cluster analysis where you end up with a dendrogram looking figure or hierarchical multilevel modeling of clusters (regression)? Thanks and welcome to the forum!
  8. hlsmith

    Questions concerning the analysis of my Data

    set.seed(1234)
    y1 = rnorm(15, 0, 1)
    y2 = exp(y1)
    par(mfrow=c(1,2))
    hist(y1, main="normal, n=15")
    hist(y2, main="lognormal, n=15")
  9. hlsmith

    Questions concerning the analysis of my Data

    I would have to echo @Karabiner's comment on assumptions related to distributions. I can create an n=15 simulation of a random normal distribution and it can look skewed, given how small the sample is.
  10. hlsmith

    Bayesian proxy to Huber-White Estimators

    I am working on a study where a very small portion of the sample contributes more than one observation. I was wondering if there was a Bayesian proxy to the Huber-White estimator or robust regression? I was actually planning on using non-informative priors since it is novel research and there...
  11. hlsmith

    Sample Size n=5

    That is a very particular question; not sure anyone here will know the answer. I would imagine cost, time, precedent, and size of difference, along with variability, should come into play. You could potentially simulate the process to explore sample sizes and power.
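
    A simulation along those lines could be sketched as below. The two-sample t-test, the effect size (`delta`), the SD, and the candidate sample sizes are all assumptions picked for illustration; the real process would need its own data-generating model.

```r
# Simulation-based power for a two-sample t-test.
# delta, sd, alpha, and reps are illustrative assumptions.
sim_power <- function(n, delta = 1, sd = 1, alpha = 0.05, reps = 2000) {
  p <- replicate(reps, {
    x <- rnorm(n, mean = 0, sd = sd)      # control group
    y <- rnorm(n, mean = delta, sd = sd)  # group shifted by delta
    t.test(x, y)$p.value
  })
  # Power estimate: share of simulated studies that reject at level alpha.
  mean(p < alpha)
}

set.seed(42)
sizes <- c(5, 10, 20, 40)
power <- sapply(sizes, sim_power)
data.frame(n_per_group = sizes, power = round(power, 2))
```

    Rerunning with different values of `delta` and `sd` shows how strongly the answer depends on the assumed difference and variability.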
  12. hlsmith

    Questions concerning the analysis of my Data

    Can you provide a couple more pieces of information? Was the intervention randomized? What is the overall sample size and how many people are in each of the 4 subgroups? Thanks and welcome to the forum!
  13. hlsmith

    Permutations question

    Great stuff. Yeah I was going to harp on the complexities of genomics, but thankfully you had a disclaimer. It would be ideal if you passed the disclaimer along to the patrons. Also, of interest would be that if we uncoiled our DNA it would stretch to like Jupiter three times and if you spoke...
  14. hlsmith

    Double or combined probability

    Not that it matters, but it isn't apparent if the person losing is the server or receiver?
  15. hlsmith

    Test Durbin Watson(error independence)

    Or maybe something about why they are different, which I would say is that you are proposing a different structural relationship between the variables (additive vs. multiplicative?).
  16. hlsmith

    Confounding Without Interaction?

    I didn't watch the video, but interaction is when the mutual presence of the two predictors changes their effects on the outcome, on an additive or multiplicative scale, in a synergistic or antagonistic direction. Think about risk for lung cancer and the increased risk for smokers and...
  17. hlsmith

    Skewness and Kurtosis of a t-distribution

    Since I got a like on my prior crass post: what do you think I call their joint distribution?
  18. hlsmith

    Very high standardized coefficient beta (binary logistic regression)

    Perhaps we are talking about two different things, but a standardized binary variable would be 0 at its prevalence and a unit would be a one-SD increase, right? I didn't think it would have the same coefficient. I didn't follow your original thoughts about collinearity?
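
    One way to check this is a small simulation. The sketch below uses made-up data (prevalence 0.3 and a log-odds effect of 1.2 are assumptions) and compares the logistic coefficient on the raw 0/1 variable with the coefficient on its standardized version.

```r
# Simulated data; prevalence and effect size are illustrative assumptions.
set.seed(7)
x <- rbinom(200, 1, 0.3)                    # raw binary predictor
y <- rbinom(200, 1, plogis(-1 + 1.2 * x))   # binary outcome
z <- as.numeric(scale(x))                   # (x - mean(x)) / sd(x)

fit_x <- glm(y ~ x, family = binomial)
fit_z <- glm(y ~ z, family = binomial)

# z still takes only two values; z = 0 corresponds to x = mean(x),
# i.e. the sample prevalence, and one unit of z is one SD of x.
sort(unique(z))

b_x <- unname(coef(fit_x)["x"])
b_z <- unname(coef(fit_z)["z"])
c(raw = b_x, standardized = b_z, raw_times_sd = b_x * sd(x))
```

    The standardized coefficient equals the raw coefficient multiplied by `sd(x)`, so the two differ unless the SD happens to be 1.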
  19. hlsmith

    Skewness and Kurtosis of a t-distribution

    But I thought these were
  20. hlsmith

    Very high standardized coefficient beta (binary logistic regression)

    I have never been able to process in my mind how a person would interpret a standardized categorical (e.g., > 2 groups) variable in a regression. Even a binary variable that was standardized is pretty darn weird to interpret. If you can't interpret them, do you really need to standardize them...