[R] How to get normal/rankit scores from non-normal data with ties


I would like to see whether pathogen levels influenced certain bumblebee colony development parameters such as the number of queen pupae produced.

Due to non-normality of the data, I would like to apply a rankit transformation, as suggested by [Bishara & Hittner (2012)][1].

To define this transformation, let x_r be the ascending rank of x, such that x_r = 1 for the lowest value of x. The RIN transformation function used here is

f(x) = Φ^(-1)((x_r - 0.5) / n)

where Φ^(-1) is the inverse normal cumulative distribution function and n is the sample size (Bliss, 1967).

I could not figure out how to do the rankit transformation in R. I tried this:

    # rank() uses ties.method = "average" by default, so tied values share one rank
    my.df$queen.pupae_rankit <- qnorm((rank(my.df$queen.pupae) - 0.5) / length(my.df$queen.pupae))

However, the ties seem to prevent the rankit scores from being normally distributed.
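
A minimal sketch of what I mean, using made-up counts rather than my real data (the vector `queen.pupae` below is purely hypothetical): tied observations all get the same rank and therefore the same rankit score, so the transformed variable keeps the same clumps as the original.

```r
# Hypothetical counts with many ties (NOT my real data)
queen.pupae <- c(0, 0, 0, 0, 1, 1, 2, 3, 5, 8)
n <- length(queen.pupae)

# Default ties.method = "average": tied values share their mean rank
r <- rank(queen.pupae)

# Rankit transformation as in the formula above
rankit <- qnorm((r - 0.5) / n)

# Side-by-side view: equal counts map to equal scores
print(data.frame(queen.pupae, rank = r, rankit = round(rankit, 3)))

# Only as many distinct scores as there are distinct counts
n.distinct <- length(unique(rankit))
print(n.distinct)
```

With this toy vector the four zeros collapse onto a single score, which is the pattern I see in my own data.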


Therefore, I would like to know:

  1. How can I get rankit scores from data with ties?
  2. Is `qnorm` actually the correct function for the inverse normal cumulative distribution function?
  3. [Bishara & Hittner (2012)][1] used the rankit scores in Pearson correlations rather than regressions. My understanding is that in a regression only the independent variable has to be normally distributed. Should I nevertheless also transform the dependent variable, as [Bishara & Hittner (2012)][1] did?
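
Regarding question 2, this is the quick check I ran. It assumes that `pnorm` is the standard normal cumulative distribution function Φ, and the probabilities in `p` are arbitrary values I picked for illustration. `qnorm` does seem to undo `pnorm`, but I am not sure whether that is all that is meant by Φ^(-1) in the formula.

```r
# Arbitrary probabilities chosen for illustration
p <- c(0.025, 0.5, 0.975)

# Candidate for the inverse normal CDF
z <- qnorm(p)

# If qnorm inverts pnorm, mapping back should recover p
round.trip.ok <- isTRUE(all.equal(pnorm(z), p))
print(round.trip.ok)
```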