Recent content by rogojel

  1. rogojel

    Simulating a logistic regression - scary results

    Hi, I tried to work out the necessary sample size for a logistic regression by simulation and got some scary results. If anyone could check the code below, it would be a great help. I simulate a logistic regression with two normalized variables, one having a fixed odds ratio of 1.4, the other a...
  2. rogojel

    Support vector machines - what's the point?

    hi, I just finished the chapter on SVMs from the Statistical Learning book of Hastie and Tibshirani. Their focus is to use the SVMs for classification - and they also show that the SVMs are equivalent to the logistic regression with nonlinear predictors (there is even an impressive exercise to...
  3. rogojel

    testing the bootstrap

    hi, below is an experiment I just did, testing whether a bootstrap could give me better results than simple repeated sampling. Is there any error in the logic/code? It seems that bootstrapping would just amplify the sampling error without adding any value - what am I missing? This is...
  4. rogojel

    Bootstrap and hypothesis test

    Hi, I have to decide whether two groups have the same variance or not. My sample size is less than impressive: 13 data points per group. I ran Bartlett's test and got a p-value of 0.09 - but I suspected that the relatively high value might be due to the small sample size. So, I ran a...
  5. rogojel

    Turning an R script into an executable

    hi, before R I used to be a big fan of Perl, and there we had an option to turn a script into an exe file so that people could use it without having to install Perl. Is there any similar solution for R? regards
  6. rogojel

    p-values in regression trees

    Hi, I am analyzing some pretty hopeless datasets where the link between the DVs and IVs is quite weak. I observed, however, that when I take the two groups resulting from the first partition in the tree, I can generally get a nicely low p-value with a t-test. Is this some property of the trees I...
  7. rogojel

    Bookclub: ISLR

    Hi, I just pledged to work my way through this book (Introduction to Statistical Learning by James, Witten, Hastie, Tibshirani). Anyone willing to join? Keeping the discipline would be more fun if we worked together. regards
  8. rogojel

    Pitching R in Pharma

    hi, I will pitch R to a pharma company on Monday. Their biggest fear is that they cannot validate R, so they cannot use it in any regulatory situation, like proving something statistically to the FDA, because they cannot prove that the packages they used in R are correct...
  9. rogojel

    Model vs. test

    Hi, we just had an interesting discussion about the chi-squared test that could be generalized. I think the core of the discussion was whether one should prefer a simple statistical test or a statistical model - e.g. a chi-squared test vs. a logistic regression. Given that in practice one is...
  10. rogojel

    Physical interpretation of an AR(n) series

    Hi, I am learning TS analysis, but my angle is probably different from the usual. Being in six sigma, I need to reduce the variability of a process that I can prove is basically AR(3), which, with 3 runs on average per day, means one day's runs influence the next day in the mathematical model...
  11. rogojel

    Interpretation of an AR model

    hi, looking at process times of a rather complex machine, I found that an AR(2) model describes it quite well. Does it make sense to interpret this as an indication that the machine has a "memory" of about 2 runs? E.g. insufficient cleaning between runs? thanks a lot!
  12. rogojel

    What does Python offer that R can't?

    Thanks a lot! This puts me squarely in the R camp - great arguments on both sides.
  13. rogojel

    What does Python offer that R can't?

    Hi, I am looking at Coursera data analysis specializations and most (basically all except Johns Hopkins) offer Python as the language of choice for their courses. I really like R but I started to wonder if I am missing something? I just see no advantages in Python for data analysis, that would...
  14. rogojel

    beating the rules at discrete sampling

    hi, I am looking at the question of how to sample rare occurrences, like a rare defect in a manufacturing line. If I use the standard sample size formula, the sample size goes into the hundreds, which is unreasonable due to the long waiting times. My idea is to measure the time intervals between...
  15. rogojel

    Interpretation of a one-way ANOVA

    hi, I am working through a designed experiments textbook and I would like to see if I got one exercise right. The story is about measuring the effect of baking powder on the height of biscuits. The experimental factor is the amount of soda with, say, 3 levels - let us call it low amount, medium...