# Increase variability

#### trinker

##### ggplot2orBust
I need to take a variable that is roughly normally distributed around 0, with lower and upper bounds of -1 and 1, and apply a transformation that flattens the peak and widens the tails.

It just occurred to me that Fisher's z may be what I'm looking for. If so, I'll mark this thread as solved.

#### trinker

##### ggplot2orBust
No, that's not it.

The data looks something like this:
Code:
x <- rnorm(10000, sd = .2)

#### trinker

##### ggplot2orBust
Ahh, the distribution is a bit different from what I had originally thought: it's bimodal. Here's a link to the data as an R .RData file:

Code:
load(url("http://dl.dropbox.com/u/61803503/dist.RData"))
x
plot(x)
hist(x)
Here's what it looks like:

#### Dason

It's still not clear to me what you want to do and why you want to do it.

#### trinker

##### ggplot2orBust
Yeah, sorry, to me it's clear. The deal is that I have the differences in proportions of word use between two people. I then want to color the words in a word cloud as a gradient based on who used each word more. The problem is that right now there's so little variability that if I use red and blue as the two gradient colors, all the words come out purple. I actually demo how to do this here: http://trinkerrstuff.wordpress.com/2012/11/13/gradient-word-clouds/

However, that solution only works for that one case. The distribution of the proportion differences will change.

#### Dason

You could use the rankings instead of the raw data.
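A minimal sketch of the rank idea, assuming simulated differences in place of the real data (the vector `d` and the red/blue endpoints are stand-ins, not taken from the thread). Ranks are evenly spaced no matter how tightly the raw values cluster, so the colors spread over the whole gradient:

```r
# Hypothetical stand-in data: proportion differences clustered tightly around 0
set.seed(1)
d <- rnorm(200, sd = 0.02)

# Rank-based rescaling: evenly spaced values in (0, 1], whatever the raw spacing
u <- rank(d) / length(d)

# Interpolate between red (low) and blue (high) using base R's colorRamp()
ramp <- colorRamp(c("red", "blue"))
cols <- rgb(ramp(u), maxColorValue = 255)

head(cols)
```

The largest difference maps to pure blue and the smallest to nearly pure red, with everything else spaced evenly in between.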

#### trinker

##### ggplot2orBust
Hmm, that may make sense. Zipf did that. Let me think on that. Thanks for the idea, Dason.

#### Dason

Or you could transform the data to be an 'ideal normal spread'.

Code:
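transToNorm <- function(x){
  # Dividing by n + 1 keeps the probabilities strictly inside (0, 1),
  # so qnorm() never returns -Inf or Inf
  qnorm(rank(x) / (length(x) + 1))
}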

j <- runif(1000)
hist(j)
k <- transToNorm(j)
hist(k)
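One property worth checking is that the transformed values depend only on the ranks, so any strictly increasing transformation of the input gives the same result. A small sketch, assuming exponential test data purely for illustration (`k1` and `k2` are hypothetical names):

```r
transToNorm <- function(x) {
  qnorm(rank(x) / (length(x) + 1))
}

set.seed(42)
j <- rexp(1000)              # strongly right-skewed, all positive

k1 <- transToNorm(j)
k2 <- transToNorm(log(j))    # log() is strictly increasing, so ranks are unchanged

all.equal(k1, k2)            # same output either way
range(k1)                    # finite and symmetric about 0
```

This is why the shape of the original distribution (skewed, bimodal, or tightly peaked) does not matter for the result.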

#### trinker

##### ggplot2orBust
I think I was approaching the problem all wrong. Here's the final approach I took based on your rank comment:

Code:
load(url("http://dl.dropbox.com/u/61803503/dist.RData"))
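breaks <- 10        # target number of bins across both halves
low <- x[x < 0]     # values below zero
high <- x[x > 0]    # values above zero
# Quantile cut points within each half, so the bins hold roughly equal counts
lcuts <- quantile(low, seq(0, 1, length.out = round(breaks/2)))
hcuts <- quantile(high, seq(0, 1, length.out = round(breaks/2)))
# Combine with the bounds -1, 0 and 1, then sort and drop duplicates
cts <- as.numeric(unique(sort(c(-1, lcuts, 0, hcuts, 1))))
cut(x, breaks = cts)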
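To go from the bins to word colors, one option is to give each bin its own step on a palette. This is only a sketch: the bimodal `x` below is simulated as a stand-in for the data file, and the red/purple/blue palette and the name `word_cols` are assumptions.

```r
# Hypothetical stand-in for the bimodal data in dist.RData
set.seed(10)
x <- c(rnorm(500, mean = -0.05, sd = 0.02),
       rnorm(500, mean =  0.05, sd = 0.02))

breaks <- 10
low  <- x[x < 0]
high <- x[x > 0]
lcuts <- quantile(low,  seq(0, 1, length.out = round(breaks / 2)))
hcuts <- quantile(high, seq(0, 1, length.out = round(breaks / 2)))
cts <- as.numeric(unique(sort(c(-1, lcuts, 0, hcuts, 1))))
bins <- cut(x, breaks = cts)

# One color per bin, running from red through purple to blue
pal <- colorRampPalette(c("red", "purple", "blue"))(nlevels(bins))
word_cols <- pal[as.integer(bins)]

table(bins)   # counts per bin
```

Note that the outermost bins and the ones touching 0 hold very few values, because the quantile cut points already include the minimum and maximum of each half.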