  1. D

    Big data analysis - What statistical method should I use?

    I am currently working on my master's thesis, which investigates textual factors of "viral headlines". My professor gave me a data set for R that contains about 4,000 packages. Each package has about 3-5 different headlines for the same article, and their resulting click rates. Somewhat like an...
  2. Just a Very Curious Nerd

    Interpreting Ridge Trace Plot

    In my research, I aimed to fit a regression model with four predictors and one response variable. When I found high collinearity among the predictors, I was advised to handle this problem using ridge regression. So, I developed this analysis using R's glmnet package, and I...
  3. G

    Multi-level paired data: multiple linear model or linear mixed-effects model? Or other? (with R software)

    Hello, I have a statistics question about how best to analyse my data. I collected forest floor samples at 10 sites, 1 sample per site. In each sample, I quantified the density of 5 different bacteria strains. I want to know if one bacteria strain is more represented on average than others, and...
  4. L

    Data analysis in R on path data.

    Hello all. I am working on a database collected by walk/path observations in animals. The data are recorded as X and Y coordinates. Do you have any suggestions about which R package I can use and which analysis would be best to test for differences between conditions? For now I found a nice...
  5. E

    Book Review - Essentials of Bio statistics: An overview with the help of software

    I would like to receive reviews of the following book: Essentials of Bio-statistics: An overview with the help of software (www.ijsmi.com/book.php). This book intends to provide an overview of biostatistics concepts and methodology through the use of statistical software. It helps clinicians...
  6. ybarnatan

    Which non-parametric test should I use while running GLM?

    I'm trying to analyze some experimental data about animal behaviour using R and would need some help or advice regarding which non-parametric test I should use. The variables I have are: - Response variable: "Vueltasmin", a numeric one - Explanatory variable: "Condicion", a factor with 6 levels...
  7. J

    Arguments imply differing number of rows: 1, 0

    I am trying to generate a CDF plot using ggplot and have looked at some examples online. However, when I try to replicate them I get the following error: "arguments imply differing number of rows: 1, 0". I searched, and from what I gather nrows != ncol, and that doesn't work for a...
  8. T

    HELP - citation model in R

    Hello everyone. I need help with the citation model as explained in the article "Bibliometric tools applied to analytical articles: the example of gene transfer‐related research". The model is expressed as C = KY + S, where C = cumulative number of citations, Y = article life, K = citation rate (number of...
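
    Since C = KY + S is linear in Y, one plausible starting point is an ordinary least-squares fit, with K as the slope and S as the intercept. The sketch below is only an illustration: the data are synthetic, and the names `K_hat` and `S_hat` are hypothetical, not taken from the article.

```r
# Synthetic example data (not from the article): article life in years
# and cumulative citations generated exactly from C = 3 * Y + 2.
Y <- 1:10
C <- 3 * Y + 2

# Fit C = K * Y + S by least squares: K is the slope, S the intercept.
fit <- lm(C ~ Y)
K_hat <- unname(coef(fit)["Y"])
S_hat <- unname(coef(fit)["(Intercept)"])

print(c(K = K_hat, S = S_hat))
```

    With real citation counts the fit will not be exact, so the residuals are worth inspecting before interpreting K.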
  9. C

    R: Date Calculation

    Suppose I have some data against time: time <- seq(ISOdate(2007,7,1,0), ISOdate(2008,4,5,23), by = "1 hour"); y <- rnorm(n = length(time)); dat <- data.frame(time = time, y = y). Now I am trying to create another variable `day_index` which will take value 1 for 2007-07-01...
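
    One possible way to build such an index is sketched below. The post is cut off before the full rule is stated, so this assumes that `day_index` should increase by one per calendar day, starting at 1 on the first date.

```r
# Data as described in the post; set.seed() is added only for reproducibility.
set.seed(1)
time <- seq(ISOdate(2007, 7, 1, 0), ISOdate(2008, 4, 5, 23), by = "1 hour")
y <- rnorm(n = length(time))
dat <- data.frame(time = time, y = y)

# Assumption: day 1 is the first calendar date, day 2 the next, and so on.
# ISOdate() returns times in GMT, so the dates are extracted in that time zone.
dates <- as.Date(dat$time, tz = "GMT")
dat$day_index <- as.integer(dates - dates[1]) + 1L

head(dat)
```

    Every hourly row of the same calendar date gets the same index, so each value of `day_index` covers 24 rows here.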
  10. L

    Finite mixture bivariate probit model

    Hello all, I would like to model my data using a bivariate probit model, but would like to explain unobserved heterogeneity of the coefficients in the model using a finite mixture. That is, I would like to explain the characteristics of the segments (subgroups) when I estimate my bivariate...
  11. L

    I need immediate help on how I can use R to do my project!!

    The project that I have decided to take is about how the success rate of a football team (considering a single league of 18 teams) depends on different factors like wages, revenue earned by the team, etc. I need immediate help!!
  12. A

    Help Needed: Multivariate Regression Using R

    I am testing a couple of CAPM-based models for my dissertation, and I have a healthy number of stocks to regress: 5000 according to my last calculations. They'd have to be done 75 stocks at a time (in a portfolio). The number makes it an unrealistic task to accomplish manually. I have...
  13. A

    negative R² in output of randomForest package

    Hi everyone, I am currently writing my master's thesis about random forests and just started to work with the R software. When I run my model, the output looks like this: "Mean of squared residuals: 0.0002441535", "% Var explained: -8.82". Can anyone explain to me why I get a negative R²? I...
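
    A pseudo R² of the form 1 - MSE / Var(y) drops below zero whenever the predictions have a larger mean squared error than simply predicting the mean of y. The base-R sketch below illustrates that arithmetic on toy numbers; `pseudo_r2` is a hypothetical helper written for this illustration, not a function from the randomForest package.

```r
# Hypothetical helper: 1 - MSE / (population variance of y).
pseudo_r2 <- function(y, pred) {
  mse <- mean((y - pred)^2)
  1 - mse / mean((y - mean(y))^2)
}

# Toy data, chosen so the arithmetic is easy to follow by hand.
y <- c(1, 2, 3, 4, 5)

# Predicting the mean everywhere gives exactly 0.
r2_mean <- pseudo_r2(y, rep(mean(y), length(y)))

# Predictions that are worse than the mean give a negative value.
r2_bad <- pseudo_r2(y, c(5, 4, 3, 2, 1))

print(c(mean_only = r2_mean, reversed = r2_bad))
```

    A negative value therefore signals that the model predicts worse than the mean of the response would, which is worth checking against the predictors and tuning parameters.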
  14. J

    what stats to use?

    Dear All, as I am not experienced in stats, I would like to ask for some stats suggestions for a study. Study one: control impact with one impacted site and one reference site. Two species and biometrics taken. I need to see if the impacted site increases biometrics. Study two...
  15. C

    R code

    How does this R code work? Why does it return 3 values? `0:2/5` gives `[1] 0.0 0.2 0.4`
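
    The output follows from operator precedence: in R, `:` binds more tightly than `/`, so the expression builds the sequence 0, 1, 2 first and then divides each element by 5. A minimal sketch (the variable names are only for illustration):

```r
# `:` has higher precedence than `/`, so 0:2/5 means (0:2) / 5.
x <- 0:2 / 5
y <- (0:2) / 5

# Forcing the division first gives a different result: 0:(0.4) is just 0.
z <- 0:(2 / 5)

print(x)
print(identical(x, y))
print(z)
```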
  16. E

    Dredge in R

    Hi everyone, my knowledge of the R statistical software is poor. Could someone help me understand how to use dredge in R? How do I present all top models within 2 AIC units and the model-averaged parameter estimates? Thank you in advance, Tom
  17. M

    R graphs and capabilities

    Hello everyone, I am a MATLAB-user (in my previous work & education) but my company has R installed (and I don’t think there is the will to get an expensive MATLAB license). I am vaguely familiar with R as I used it for a statistics class 8 years ago, but I haven’t touched it since then...
  18. R

    [R – lme function] Comparing curves in longitudinal data (growth curves)

    I think my problem is best solved by the use of multi-level models. However, I’d like to confirm that what I’m doing is right, and that my use of the function is right too. I work with cells (biology), doing electrophysiology. I apply to each cell a series of currents, in 20 pA steps, ranging from...
  19. S

    Median Interneuron Distance in R in the package Kohonen

    Hi all, I am trying to detect outliers using self-organizing models in the R package Kohonen. In most of the papers, the MID matrix (median interneuron distance) is used to assess possible outliers. Therefore, I am trying to compute the MID matrix from the output of the package Kohonen...
  20. L

    Please help: is this R syntax correct?

    I have a question about my data analysis. I conducted an experimental trial with repeated measures by hour, as below: Design: Completely Randomized Design (in greenhouses), there are 6 box unit...