Hi,
I tried to work out the necessary sample size for a logistic regression by simulations and got some scary results. If anyone could check the code below, it would be a great help.
I simulate a logistic regression with two normalized variables, one having a fixed odds ratio of 1.4, the other a...
hi,
I just finished the chapter on SVMs from the Statistical Learning book of Hastie and Tibshirani. Their focus is to use the SVMs for classification - and they also show that the SVMs are equivalent to the logistic regression with nonlinear predictors (there is even an impressive exercise to...
hi,
below is an experiment I just did, testing whether a bootstrap would give me better results than simple repeated sampling. Is there any error in the logic/code? It seems that bootstrapping just amplifies the sampling error without adding any value - what am I missing?
This is...
Hi,
I have to decide whether two groups have the same variance or not. My sample size is less than impressive, 13 data points per group. I ran Bartlett's test and got a p-value of 0.09 - but I suspected that the relatively high value might be due to the small sample size.
So, I ran a...
hi,
before R I used to be a big fan of Perl, and there we had an option to turn a script into an exe file so that people could use it without having to install Perl.
Is there any similar solution for R?
regards
Hi,
I am analyzing some pretty hopeless datasets where the link between the DVs and IVs is quite weak. I observed, however, that when I take the two groups resulting from the first partition in the tree, I can generally get a nicely low p-value with a t-test. Is this some property of the trees I...
Hi,
I just pledged to work my way through this book (Introduction to Statistical Learning by James, Witten, Hastie, Tibshirani). Anyone willing to join? Keeping the discipline would be more fun if we worked together.
regards
hi,
I will pitch R to a pharma company on Monday. Their biggest fear is that they cannot validate R, so they cannot use it in any regulatory context, like proving something statistically to the FDA, because they cannot prove that the packages they used in R are correct...
Hi,
we just had an interesting discussion about the chi-squared test, that could be generalized. I think the core of the discussion was whether one should prefer a simple statistical test or a statistical model - e.g. a chi-squared test vs. a logistic regression.
Given that in practice one is...
Hi,
I am learning TS analysis, but my angle is probably different from the usual. Being in Six Sigma, I need to reduce the variability of a process that I can prove is basically AR(3), which, with 3 runs per day on average, means one day's runs influence the next day in the mathematical model...
hi,
looking at process times of a rather complex machine, I found that an AR(2) model describes them quite well. Does it make sense to interpret this as an indication that the machine has a "memory" of about 2 runs? E.g. insufficient cleaning between runs?
thanks a lot!
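A rough way to probe the "memory of about 2 runs" reading is to simulate an AR(2) series and fit a longer autoregression to it. The sketch below is only an illustration, not the machine's data: the coefficients 0.5 and 0.3, the series length and the helper names (simulate_ar2, fit_ar, sample_acf) are all made up for the example. It shows that the direct dependence stops after lag 2 (the fitted lag-3 and lag-4 coefficients are close to zero), while the plain autocorrelation is still clearly non-zero at lag 5, because the influence of older runs is passed along through the intermediate ones.

```python
import numpy as np


def simulate_ar2(phi1, phi2, n, sigma=1.0, burn_in=200, seed=0):
    """Simulate x[t] = phi1*x[t-1] + phi2*x[t-2] + noise (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    eps = rng.normal(0.0, sigma, size=n + burn_in)
    x = np.zeros(n + burn_in)
    for t in range(2, n + burn_in):
        x[t] = phi1 * x[t - 1] + phi2 * x[t - 2] + eps[t]
    return x[burn_in:]  # drop the start-up transient


def fit_ar(x, p):
    """Least-squares fit of an AR(p) model; returns coefficients for lags 1..p."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # Column k holds the series shifted back by k steps, aligned with y = x[p:].
    lags = np.column_stack([x[p - k:n - k] for k in range(1, p + 1)])
    coef, *_ = np.linalg.lstsq(lags, x[p:], rcond=None)
    return coef


def sample_acf(x, lag):
    """Sample autocorrelation at a single positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[lag:], x[:-lag]) / np.dot(x, x))


# Assumed values, chosen only so that the process is stationary.
x = simulate_ar2(phi1=0.5, phi2=0.3, n=5000)
coef = fit_ar(x, p=4)
rho5 = sample_acf(x, lag=5)

print("fitted AR(4) coefficients:", np.round(coef, 3))
print("autocorrelation at lag 5:", round(rho5, 3))
```

So "memory of 2 runs" seems a fair description of the direct dependence, but the effect of a disturbance can still be visible several runs later.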
Hi,
I am looking at Coursera data analysis specializations and most (basically all except Johns Hopkins) offer Python as the language of choice for their courses. I really like R but I started to wonder if I am missing something? I just see no advantages in Python for data analysis, that would...
hi,
I am looking at the question of how to sample rare occurrences, like a rare defect in a manufacturing line. If I use the standard sample size formula, the sample size goes into the hundreds, which is unreasonable due to the long waiting times.
My idea is to measure the time intervals between...
hi,
I am working through a designed experiments textbook and I would like to see if I got one exercise right.
The story is about measuring the effect of baking powder on the height of biscuits. The experimental factor is the amount of soda with, say, 3 levels - let us call it low amount, medium...