# Share your functions & code

#### trinker

##### ggplot2orBust
RANDOM COLORS

Description
This is a function I use when I'm looking at scatterplots of data and would like each point to be a unique color. This helps me with certain visualizations where I want individual data points to stand out; sometimes keeping everything in black blurs it all together. I became annoyed with writing out the argument to get random color choices (if you just use a preset palette the color choices are often very similar, because adjacent entries in the numeric sequence are very similar in shade; hence the desire for randomization). This function can also be used to quickly change the palette settings (which contain eight colors by default) to as many different random colors as you want (see the example at the bottom).

There very well may be a more efficient method of doing this but I use the following function and keep it in a sourced .Rdata file or in the startup function .First() as a sourced object. Generally, I find the colors or rainbow argument to be most useful in differentiating points, thus the assignment of color.choice as rainbow. Not the most amazing function but it's a time saver for me.

FUNCTION
ran.col(c(dataframe, vector, number), color.choice = c(colors, rainbow, heat, terrain, topo, cm))

The first argument accepts a data frame, vector, or number. The second argument is any of the built-in color palettes.

FUNCTION CODE
Code:
ran.col <- function(x, color.choice = "rainbow"){
    color.choice <- substitute(color.choice)
    color.choice <- as.character(color.choice)

    if(is.data.frame(x) == TRUE) {
        switch(color.choice,
               colors = sample(colors()[-1], nrow(x), replace = FALSE),
               rainbow = sample(rainbow(10000), nrow(x), replace = FALSE),
               heat = sample(heat.colors(10000), nrow(x), replace = FALSE),
               terrain = sample(terrain.colors(10000), nrow(x), replace = FALSE),
               topo = sample(topo.colors(10000), nrow(x), replace = FALSE),
               cm = sample(cm.colors(10000), nrow(x), replace = FALSE))
    }else if(is.vector(x) == TRUE & length(x) != 1) {
        switch(color.choice,
               colors = sample(colors()[-1], length(x), replace = FALSE),
               rainbow = sample(rainbow(10000), length(x), replace = FALSE),
               heat = sample(heat.colors(10000), length(x), replace = FALSE),
               terrain = sample(terrain.colors(10000), length(x), replace = FALSE),
               topo = sample(topo.colors(10000), length(x), replace = FALSE),
               cm = sample(cm.colors(10000), length(x), replace = FALSE))
    }else if(is.numeric(x) == TRUE) {
        switch(color.choice,
               colors = sample(colors()[-1], x, replace = FALSE),
               rainbow = sample(rainbow(10000), x, replace = FALSE),
               heat = sample(heat.colors(10000), x, replace = FALSE),
               terrain = sample(terrain.colors(10000), x, replace = FALSE),
               topo = sample(topo.colors(10000), x, replace = FALSE),
               cm = sample(cm.colors(10000), x, replace = FALSE))
    }
}
EXAMPLE
Code:
x11(16,8)
par(mfrow = c(2,3))
with(mtcars,plot(mpg,disp,pch=19,col=ran.col(mtcars,colors),main="COLORS"))
with(mtcars,plot(mpg,disp,pch=19,col=ran.col(mtcars,rainbow),main="RAINBOW"))
with(mtcars,plot(mpg,disp,pch=19,col=ran.col(mtcars,heat),main="HEAT"))
with(mtcars,plot(mpg,disp,pch=19,col=ran.col(mtcars,terrain),main="TERRAIN"))
with(mtcars,plot(mpg,disp,pch=19,col=ran.col(mtcars,topo),main="TOPO"))
with(mtcars,plot(mpg,disp,pch=19,col=ran.col(3,cm),main="CM"))

ran.col(6,colors)
#USING TO SET PALETTE
palette()  #current palette
palette(ran.col(10))   #set palette
palette()    #current palette
with(mtcars,plot(mpg,disp,pch=19,col=cyl,main="COLORS"))
palette("default")    #return to default
with(mtcars,plot(mpg,disp,pch=19,col=cyl,main="COLORS"))

PS I'm trying to follow the style guide rules. Hoping I am, but if not, let me know so I can correct it.


#### Dason

I'm still not exactly sure how you're doing indenting but that's alright. Here's how I would indent it. If you ever use Emacs+ESS there is a nice little function that will correct indentation for you...

Code:
ran.col <- function(df, color.choice = "rainbow"){
    color.choice <- substitute(color.choice)
    color.choice <- as.character(color.choice)
    switch(color.choice,
           colors = sample(colors()[-1], nrow(df), replace = FALSE),
           rainbow = sample(rainbow(10000), nrow(df), replace = FALSE),
           heat = sample(heat.colors(10000), nrow(df), replace = FALSE),
           terrain = sample(terrain.colors(10000), nrow(df), replace = FALSE),
           topo = sample(topo.colors(10000), nrow(df), replace = FALSE),
           cm = sample(cm.colors(10000), nrow(df), replace = FALSE))
}
I used spaces to align the options in the switch with color.choice. That is, I indented to the same level as the switch (one indent) and then used spaces from there; this keeps everything aligned on any system.

#### trinker

##### ggplot2orBust
Dason,

Thanks for the continued feedback. I made some adjustments to the code to accept vectors and numbers as well. I rewrote it and tried to follow your indentation guide. I updated the original post.

May I ask why everything gets 5 indentation spaces but colors in the switch argument gets 8? Continue to critique, that will help me. Sometimes I'm not sure how to indent the if else portions.

#### Dason

Everything gets one indent (I don't use spaces) because it is only one level deep (it's all inside the function). The options in the switch statement get aligned with its first argument using spaces because they aren't new commands, just part of a switch statement that would be far too long to write as a single line.

Here is how I would do something like an if/else
Code:
fact <- function(n){
    if(n < 0){
        warning("Inappropriate input to factorial")
    }else if(n == 0 | n == 1){
        return(1)
    }else{
        return(n*fact(n-1))
    }
}

#### Dason

That's an interesting way to think about it.

One way to get used to indenting is to learn Python. You'll learn about indenting... or else.

#### trinker

##### ggplot2orBust
}else if(n == 0 | n == 1){
Thanks a lot Dason. I didn't know I could put if and else together like that. I thought [R] would garble the two together and was adding in many unnecessary curly braces. I posted a more updated version (edited original post). Am I getting closer? Anything you'd do differently?

#### Dason

One thing they teach you while taking any sort of programming class is that you want to avoid rewriting code. You have quite a bit of repeated code with only a slight change. The way I would probably go about doing that is to introduce a new variable - modify that using the if/else and then use the switch statement with that.

Code:
blarg <- function(x){

    if(is.data.frame(x)){
        len <- nrow(x)
    }else if(TRUE){    # something else
        len <- x
    }else{
        len <- 5
    }

    switch(color.thingy,
           blah = len,
           foo = bar)
}
I know that code doesn't make any sense but your code is on an entirely different page so I just made it up. Hopefully you get what I'm saying here though.
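To make Dason's suggestion concrete, here is one way trinker's ran.col could be collapsed along those lines: one if/else chain to work out how many colors are needed, then a single switch(). This is just a sketch of the idea, not a version posted in the thread:

```r
# Sketch: derive the number of colors first, then call switch() once,
# instead of repeating the whole switch() in each branch.
ran.col <- function(x, color.choice = "rainbow"){
    color.choice <- as.character(substitute(color.choice))

    if(is.data.frame(x)){
        n <- nrow(x)
    }else if(is.vector(x) && length(x) != 1){
        n <- length(x)
    }else{
        n <- x    # assume a single number was supplied
    }

    switch(color.choice,
           colors  = sample(colors()[-1], n, replace = FALSE),
           rainbow = sample(rainbow(10000), n, replace = FALSE),
           heat    = sample(heat.colors(10000), n, replace = FALSE),
           terrain = sample(terrain.colors(10000), n, replace = FALSE),
           topo    = sample(topo.colors(10000), n, replace = FALSE),
           cm      = sample(cm.colors(10000), n, replace = FALSE))
}
```

Called the same way as before — ran.col(mtcars, heat) still returns one color per row of mtcars and ran.col(6) returns six random rainbow colors — but the switch() now appears only once.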

#### Jake

I came up with some code in response to this thread. Basically the OP was interested in looking at the relative variance between two levels of a factor that is crossed with subjects (as in the typical case of a sample of subjects, all of whom have a pre-score and a post-score) using bootstrapping techniques. The code I came up with and a little output are below; I follow that with some geeky details and then a question or two about getting my code to play even nicer with the boot() function from the boot package. (The integration is okay now but could definitely be better.)

The part that I'm mainly interested in discussing/modifying is the getvarstats() function, but it will be useful to also show the simulated data and some of the output from boot() to give a better sense of the issues I'm concerned with.
Code:
> library(data.table)
> library(boot)
> set.seed(12345) # I've got the same combination on my luggage!
>
> ### make some paired data with unequal variances
> dat <- data.table(subject=rep(1:50,2),
+                   prepost=rep(c(-1,1),each=50),
+                   subint=rep(rnorm(50,mean=0,sd=5),2),
+                   subslope=rep(rnorm(50,mean=5,sd=3),2),
+                   error=c(rnorm(50,mean=0,sd=5),rnorm(50,mean=0,sd=10)),
+                   key="subject,prepost")
> dat$dv <- round(55 + dat$subint + dat$subslope*dat$prepost + dat$error,2)
> dat <- data.table(subject=1:50,
+                   pre=dat[prepost==-1]$dv,
+                   post=dat[prepost==1]$dv,
+                   key="subject")
>
> ### examine
> head(dat)
     subject   pre  post
[1,]       1 55.67 45.11
[2,]       2 41.92 74.87
[3,]       3 51.40 61.57
[4,]       4 40.05 50.72
[5,]       5 55.75 59.93
[6,]       6 37.40 49.23
> nrow(dat)
[1] 50
> dat[,list(mean_pre=mean(pre),mean_post=mean(post))]
     mean_pre mean_post
[1,]  49.8998   62.8648
> dat[,list(var_pre=var(pre),var_post=var(post))]
      var_pre var_post
[1,] 79.73543 142.5389
> cor(dat$pre,dat$post)
[1] 0.2546421
>
> ### bootstrap!
> getvarstats <- function(data, seeds) {
+   index <- max.col(matrix(c(c(seeds[2:length(seeds)],seeds[1]),seeds),ncol=2))-1
+   index[length(seeds)] <- !index[length(seeds)]
+   index <- c(1:length(seeds)+index*length(seeds),
+              1:length(seeds)+index*length(seeds)+length(seeds))
+   values <- c(data$pre,data$post,data$pre)
+   d <- data.table(pre=values[index[1:length(seeds)]],
+                   post=values[index[seq(length(seeds)+1,2*length(seeds))]])
+   return(c(vardiff = var(d$post) - var(d$pre),
+            varratio = var(d$post)/var(d$pre),
+            postvar = var(d$post),
+            prevar = var(d$pre)))
+ }
> resamples <- 10000
> results <- boot(data=dat, statistic=getvarstats, R=resamples)
> results

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = dat, statistic = getvarstats, R = resamples)

Bootstrap Statistics :
original      bias    std. error
t1*  62.803472 -62.6755404  36.6695677
t2*   1.787648  -0.7566453   0.2547327
t3* 142.538903  10.3294624  18.4133146
t4*  79.735431  73.0050028  18.4070700
>
> vardiff <- var(dat$post) - var(dat$pre)
> vardiff
[1] 62.80347
> varratio <- var(dat$post)/var(dat$pre)
> varratio
[1] 1.787648
>
> hist(results$t[,1],breaks=100)
> nullCI_diff <- quantile(results$t[,1],probs=c(.025,.5,.975))
> nullCI_diff
2.5%         50%       97.5%
-71.3018270   0.2681338  72.0280165
> p_diff <- mean(results$t[,1] > var(dat$post)-var(dat$pre)
+               | results$t[,1] < var(dat$pre)-var(dat$post))
> p_diff
[1] 0.0855
>
> hist(results$t[,2],breaks=100)
> nullCI_ratio <- quantile(results$t[,2],probs=c(.025,.5,.975))
> nullCI_ratio
2.5%       50%     97.5%
0.6213139 1.0017508 1.6167661
> p_ratio <- mean(results$t[,2] > var(dat$post)/var(dat$pre)
+               | results$t[,2] < var(dat$pre)/var(dat$post))
> p_ratio
[1] 0.0153
>
> boot.ci(results)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 10000 bootstrap replicates

CALL :
boot.ci(boot.out = results)

Intervals :
Level      Normal              Basic             Studentized
95%   ( 53.61, 197.35 )   ( 53.56, 196.94 )   ( 53.08, 290.01 )

Level     Percentile            BCa
95%   (-71.34,  72.05 )   ( 54.06, 136.01 )
Calculations and Intervals on Original Scale
Warning : BCa Intervals used Extreme Quantiles
Some BCa intervals may be unstable
Warning message:
In norm.inter(t, adj.alpha) : extreme order statistics used as endpoints
Getting the boot() function from the boot package to play nice with the paired data is tricky, because boot() resamples the data on whole rows only. So the function to compute the difference and ratio has to first transform the sampled indices received from boot() into random binary pairs and then use those to shuffle the within-subject levels of the IV, rather than just using the sampled indices themselves. This is further complicated by the fact that boot() first "probes" the function by passing in the unsampled indices so that it knows what the actual estimates are. So to play nice with boot(), the difference/ratio function must perform its internal shuffling such that it returns the original estimates when given the unsampled indices. This precludes the "obvious" internal shuffling methods, such as shuffling based on internally generated random numbers or on whether each sampled index is even or odd.

The solution I came up with was to base the samples on whether each sampled index is greater than index[i+1], breaking ties at random, and with index[length(index)] being compared with index[1] in a circular fashion. When the indices are sampled independently with replacement, as boot() does, then each comparison should return TRUE or FALSE with equal probability; and in the special case where the indices are intact, they should return all FALSEs and a single TRUE at the last index. So the function reverses the value at the last index and then shuffles based on those.
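The "intact indices" special case is easy to verify in isolation. Here is a simplified sketch of the comparison scheme described above, using a plain vector comparison rather than the max.col() call from getvarstats (max.col() is only needed to break ties at random, and there are no ties when the indices are intact):

```r
# Compare each index with its circular successor. With intact indices
# 1..n the result is all FALSE except a single TRUE at the last
# position, which the function then reverses to recover the original
# (unswapped) pre/post pairing.
n <- 10
seeds <- 1:n                          # what boot() passes on its probe call
shifted <- c(seeds[2:n], seeds[1])    # circular shift by one
flag <- as.integer(seeds > shifted)   # no ties possible with intact indices
flag
# [1] 0 0 0 0 0 0 0 0 0 1
flag[n] <- !flag[n]                   # reverse the last value -> all zeros
```

With indices sampled independently with replacement, as in a real bootstrap replicate, each comparison is instead TRUE or FALSE with (approximately) equal probability, which is what drives the random within-pair shuffling.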

It all works fine this way, but it seems that the boot() function is still getting confused at some point. The issue can be seen in the output from print(results) above, where it says that each resampled statistic has considerable nonzero "bias," and in the output from print(boot.ci(results)), where only the "percentile" CI looks right. These issues almost certainly arise from the assumptions that boot() makes about the structure of the input data, which are not true here. Here is what the output from above looks like if you (inappropriately) allow boot() to resample the data on whole rows as it assumes it is doing:
Code:
> wrongvarstats <- function(data, seeds) {
+   d <- data[seeds,]
+   return(c(vardiff = var(d$post) - var(d$pre),
+            varratio = var(d$post)/var(d$pre),
+            postvar = var(d$post),
+            prevar = var(d$pre)))
+ }
> resamples <- 10000
> results2 <- boot(data=dat, statistic=wrongvarstats, R=resamples)
> results2

ORDINARY NONPARAMETRIC BOOTSTRAP

Call:
boot(data = dat, statistic = wrongvarstats, R = resamples)

Bootstrap Statistics :
original      bias    std. error
t1*  62.803472 -1.58627384  23.2238257
t2*   1.787648  0.01713536   0.3433451
t3* 142.538903 -3.18166162  23.8999107
t4*  79.735431 -1.59538778  10.4654121
>
> vardiff2 <- var(dat$post) - var(dat$pre)
> vardiff2
[1] 62.80347
> varratio2 <- var(dat$post)/var(dat$pre)
> varratio2
[1] 1.787648
>
> hist(results2$t[,1],breaks=100)
> CI_diff <- quantile(results2$t[,1],probs=c(.025,.5,.975))
> CI_diff
2.5%       50%     97.5%
18.10261  60.67636 107.99748
>
> hist(results$t[,2],breaks=100)
> CI_ratio <- quantile(results2$t[,2],probs=c(.025,.5,.975))
> CI_ratio
2.5%      50%    97.5%
1.220047 1.779401 2.556624
>
> boot.ci(results2)
BOOTSTRAP CONFIDENCE INTERVAL CALCULATIONS
Based on 10000 bootstrap replicates

CALL :
boot.ci(boot.out = results2)

Intervals :
Level      Normal              Basic             Studentized
95%   ( 18.87, 109.91 )   ( 17.57, 107.52 )   ( 24.36, 116.92 )

Level     Percentile            BCa
95%   ( 18.09, 108.04 )   ( 23.14, 115.60 )
Calculations and Intervals on Original Scale
As you can see, boot.ci() typically gives CIs around the original estimate, not around the estimate under the null hypothesis. I think it does this by doing something like resampling under the null and then adding a constant equal to the original estimate to the final vector of samples, but I'm not really sure yet. (That should work fine for the variance difference but not for the variance ratio, right?)
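For intuition on that last question, here is a made-up numeric sketch (toy parameters, not output from the bootstrap above): re-centering a null distribution by adding the estimate is natural for a difference statistic, while the analogous re-centering for a ratio statistic is multiplicative, so blindly adding a constant would put a ratio CI in the wrong place.

```r
# Toy null distributions (parameters invented for illustration)
set.seed(42)
null_diff  <- rnorm(10000, mean = 0, sd = 36)        # null difference centers on 0
null_ratio <- exp(rnorm(10000, mean = 0, sd = 0.25)) # null ratio centers on 1

est_diff  <- 62.8   # observed variance difference (from the output above)
est_ratio <- 1.79   # observed variance ratio

# Difference: shift the null draws by the estimate to center the CI on it
quantile(est_diff + null_diff, c(.025, .975))

# Ratio: the analogous re-centering is multiplicative; adding est_ratio
# to null_ratio would center the interval near 1 + 1.79, which is wrong
quantile(est_ratio * null_ratio, c(.025, .975))
```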

Now, the boot package comes with lots of nifty features in addition to those I just showed, so I'd like for my approach to the original problem to work seamlessly with boot() so that it will be really easy to get plots, CIs around the actual estimates (not just the null), etc. But I can't quite figure out what I need to change to do so. Right now my approach works fine but it's not really any better than if I had just written the resampling code all from scratch -- that is, it can't really take advantage of most of the features from the boot package. Thoughts?

#### Dason

I've never really been a fan of the boot package. I'm sure it could be nice but if I'm going to bootstrap I usually just code it myself...

#### trinker

##### ggplot2orBust
OPEN AND SEARCH THE WEB WITHIN [R]

THANK YOUS
Thank you to Dason and bryangoodrich for your help on this function. I appreciate both your insights and knowledge.

DESCRIPTION
I spend 70% of my computer time in R. Often I want to go to a web page (including facebook) and don't want to click open the browser and type in the url or find the page in favorites. I'm what you might call efficient; others may call it lazy. I want to do this from within R. This function opens a web page and/or searches a search engine from within R. You may, over a ten-year period of your life, save up to 3.5 days otherwise wasted clicking and searching. This function should be customized to suit your tastes and make it speedy for you. Plus it makes you super cool.

If you use emacs or something else this may not be the best approach for you.

CUSTOMIZING THE FUNCTION
The power of this function is in its ability to be customized by the user. You can create a shortcut (think of it like a favorites) for any website or search engine.

1) For websites simply add the following line to the body of the function:
customized.website.name = browseURL("the.url"),
NOTE: the shorter and easier to remember the customized.website.name is the better.

2) For search engine customization I am going to direct you to a thread where I learned how to customize the search from dason and bryangoodrich

Delete the preloaded websites I've included and add what you want to make this function efficient.

FUNCTION
web(site, search, url)

site- the name of the website you want to visit
search- the search term you want to search for
url- the url you want to open (may include the www. prefix but does not have to)

DIRECTIONS
1) To simply open a website, type the pre-customized website name into the web function in the following manner:
web(website.name) or web(site = website.name)

2) To search a search engine (or website with search capabilities), type the pre-customized search engine name into the web function in the following manner:
web(search.engine.name, search.term) or web(site = search.engine.name, search = search.term)
NOTE: quotation marks are not required around a single-word search term but are required for multi-word searches

3) To open a URL that is not preloaded into the function use the following format:
web(URL, url = the.url) or web(site = URL, url = the.url)
NOTE: the first argument (site) must be specified as URL and then the url argument (must be explicitly written as url =)
NOTE: the www. prefix is optional

THE CODE
Code:
web <- function(site = "ts", search = NULL, url = NULL){
    s <- substitute(site)
    s <- as.character(s)

    s2 <- substitute(search)
    s2 <- as.character(s2)

    s3 <- substitute(url)
    s3 <- as.character(s3)

    switch(s,

           talkstats = browseURL("http://www.talkstats.com"),
           ts = browseURL("http://www.talkstats.com"),    #talkstats shortcut
           textbook = browseURL("http://udel.edu/~mcdonald/statintro.html"),
           gmail = browseURL("http://www.gmail.com"),
           hotmail = browseURL("http://www.hotmail.com"),
           rseek = browseURL("http://www.rseek.org/"),
           ucla = browseURL("http://www.ats.ucla.edu/stat/"),
           ucla.ex = browseURL("http://www.ats.ucla.edu/stat/r/dae/"),  #examples from ucla
           ucla.r = browseURL("http://www.ats.ucla.edu/stat/r/"),       #[R] section from ucla
           rnews = browseURL("http://www.jstatsoft.org/"),
           quickr = browseURL("http://www.statmethods.net/"),
           prettycode = browseURL("http://www.inside-r.org/pretty-r/tool"),
           social.research.methods = browseURL("http://www.socialresearchmethods.net/"),
           srm = browseURL("http://www.socialresearchmethods.net/"),
           stats.formulas = browseURL("http://stattrek.com/Lesson1/Formulas.aspx?Tutorial=Stat"),
           formulas = browseURL("http://stattrek.com/Lesson1/Formulas.aspx?Tutorial=Stat"),
           stats.dictionary = browseURL("http://stattrek.com/Help/Glossary.aspx"),
           dictionary = browseURL("http://stattrek.com/Help/Glossary.aspx"),
           dict = browseURL("http://stattrek.com/Help/Glossary.aspx"),
           symbols = browseURL("http://stattrek.com/Lesson1/Notation.aspx"),
           stats.symbols = browseURL("http://stattrek.com/Lesson1/Notation.aspx"),

           #SEARCH ENGINES

           g.s = browseURL(paste("http://www.google.com/#q=", s2, sep = "")),
           rseek.search = browseURL(paste("http://www.rseek.org/?cx=010923144343702598753%3Aboaz1reyxd4&newwindow=1&q=",s2,"&sa=Search+functions%2C+lists%2C+and+more&cof=FORID%3A11&siteurl=www.rseek.org%2F#933",sep = "")),
           r.s = browseURL(paste("http://www.rseek.org/?cx=010923144343702598753%3Aboaz1reyxd4&newwindow=1&q=",s2,"&sa=Search+functions%2C+lists%2C+and+more&cof=FORID%3A11&siteurl=www.rseek.org%2F#933",sep = "")),
           wikipedia.search = browseURL(paste("http://en.wikipedia.org/wiki/",s2,sep = "")),
           wiki.search = browseURL(paste("http://en.wikipedia.org/wiki/",s2,sep = "")),
           w.s = browseURL(paste("http://en.wikipedia.org/wiki/",s2,sep = "")),
           quickr.search = browseURL(paste("http://www.statmethods.net/search/index.asp?QU=",s2,"&Action=Search",sep = "")),
           q.s = browseURL(paste("http://www.statmethods.net/search/index.asp?QU=",s2,"&Action=Search",sep = "")),
           talkstats.search = browseURL(paste("http://www.google.com/#q=",s2,"+site:www.talkstats.com", sep = "")),
           ts.s = browseURL(paste("http://www.google.com/#q=",s2,"+site:www.talkstats.com", sep = "")),
           dictionary.search = browseURL(paste("http://www.merriam-webster.com/dictionary/",s2,sep = "")),
           d.s = browseURL(paste("http://www.merriam-webster.com/dictionary/",s2,sep = "")),

           #CODE FOR url ARGUMENT

           URL = browseURL(if (substring(s3, 1, 4) == "www."){
                               paste("http://", s3, sep = "")
                           }else{
                               paste("http://www.", s3, sep = "")
                           }))
}
EXAMPLES
Code:
#=======================================
#SIMPLY CALLING A WEBSITE FROM WITHIN R
#=======================================
web(talkstats)
web(ts)
web(ucla.ex)
web(stats.symbols)

#=======================================
#SEARCHING THE WEB FROM WITHIN R
#=======================================
web(rseek.search, "discrimnant analysis") #spelled wrong

#=======================================
#OPENING A URL THAT IS NOT PRELOADED
#=======================================
web(URL, url = talkstats.com)

#### trinker

##### ggplot2orBust
OPEN DOCUMENTS, FILES & PROGRAMS FROM WITHIN [R]

DESCRIPTION
This is a follow-up to the previous time-saving web() function I posted. In fact, it could be incorporated into the web function as a single retrieve [ret()] function, but I chose to keep them separate. Like the web() function, this will be useful in cutting down time spent searching for documents, files, or programs when you don't want to leave R. It may or may not be useful to you depending on your use of R and familiarity with other programs that may be better (such as emacs). This function must be customized to suit your tastes and make it speedy for you. This one saves me a ton of time searching for and opening documents I use frequently. For me it was well worth my time.

CUSTOMIZING THE FUNCTION
You must enter the paths for folders, documents, programs, etc. into the body of the function. Because this function searches the file paths of your computer, your paths will differ from other users' paths.

To customize alter the following code in the body:
file.name = shell.exec("file path"),

file.name- (what you will call the function in R)
For instance, I have a folder for generalized linear models where I store articles I find useful. I use GLM as the file.name for this location.

file.path-This is where you put the target file path. Make sure it's in quotes.

THE FUNCTION
ret(file)

file = file, program, folder etc

THE CODE
Code:
ret <- function(file){

    FILE <- substitute(file)
    FILE <- as.character(FILE)

    switch(FILE,
           file.name = shell.exec("file path"),
           #.
           #.
           #.
           #etc.
           file.name = shell.exec("file path"))
}
EXAMPLES
I cannot provide a working example because you must customize the function to your own computer. However, my code looks something like the following, with many more files in the body.

Code:
ret <- function(file){

    FILE <- substitute(file)
    FILE <- as.character(FILE)

    switch(FILE,
           anova = shell.exec("C:user/stats/linear models/anova"),
           GLM = shell.exec("C:user/stats/generalized linear models"),
           formulas = shell.exec("C:user/stats/formulas.doc"))
}

#### Dason

Although I don't think it will screw anything up I would suggest changing your variable name inside of your function. Avoiding variable names such as "F", "T", "c", "I", "q" is just a good idea.

There's technically nothing stopping you from using them but it's just good practice to try to avoid them.

I think it's especially dangerous to use F or T though. If you're feeling especially malicious and somebody has an open R session and isn't paying attention, try running the following code and watch them wonder why everything is acting weird...
Code:
T <- FALSE
F <- TRUE

#### trinker

##### ggplot2orBust
Excellent point Dason. That oversight has gotten me in trouble before when running a function line by line (I assigned a value to t which is a very necessary object in R). I am going to make changes to the code to reflect this. Thanks.

#### trinker

##### ggplot2orBust
OPEN PACKAGE/LIBRARY HELP MANUALS FROM WITHIN [R]

DESCRIPTION
Often I'm in R and want to look at a package's help manual (pdf). The call to:
Code:
library(help="package.name")
opens a summary file from R that does not provide a great deal of detail. I searched for a function to fetch the manuals without physically searching the Internet. help.start() was the closest thing but does not get the pdf help manual. I asked on the r-help list and there appears to be no function for doing this. Maybe the rest of the world isn't as lazy as me. So using what Dason taught me I made a simple function to get help manuals for libraries.

I think this would be a nice addition to the .First() function in the startup.

THE FUNCTION
man(library, method=web)

library- the package name you want the manual for
method- 1) web (Internet connection needed but faster)
        2) system (no Internet needed as it builds the pdf from the system LaTeX files, but slower; searches the computer for a pdf first)
        3) system.safe (faster than system but only searches the wd before creating the pdf from LaTeX; use if system fails)

THE CODE
Code:
man <- function(library, method=web){

    LIB <- substitute(library)
    LIB <- as.character(LIB)

    METH <- substitute(method)
    METH <- as.character(METH)

    switch(METH,
           web = {
               if(LIB == "utils"){
                   shell.exec("http://stat.ethz.ch/R-manual/R-patched/library/utils/html/00Index.html")
               }else if(LIB == "base"){
                   shell.exec("http://stat.ethz.ch/R-manual/R-patched/library/base/html/00Index.html")
               }else{
                   browseURL(paste("http://cran.r-project.org/web/packages/", LIB, "/", LIB, ".pdf", sep = ""))
               }
           },

           system.safe = if (file.exists(paste(getwd(), "/", LIB, ".pdf", sep = ""))){
                             shell.exec(paste(getwd(), "/", LIB, ".pdf", sep = ""))
                         }else{
                             path <- find.package(LIB)
                             system(paste(shQuote(file.path(R.home("bin"), "R")), "CMD", "Rd2pdf", shQuote(path)))
                         },

           system = {
               x <- dir("C:/", pattern = paste(LIB, ".pdf", sep = ""), full.names = TRUE, ignore.case = TRUE, recursive = TRUE)
               if(length(x) > 1){
                   x <- x[1]    # take the first match
               }
               if (file.exists(x)){
                   shell.exec(x)
               }else{
                   path <- find.package(LIB)
                   system(paste(shQuote(file.path(R.home("bin"), "R")), "CMD", "Rd2pdf", shQuote(path)))
               }
           }
    )
}
NOTE: Using the system method will create a pdf of the helpfile in your working directory. This can be deleted from R using
unlink("packageName.pdf")  # substitute the actual package name, e.g. unlink("plyr.pdf")

EXAMPLE
Code:
man(car)
man(plyr, method=web)
man(plyr, method=system)
unlink("plyr.pdf")  # If you generated a pdf manual from LaTeX this deletes it from your wd
NOTE: This could also be added to the web function listed above.

Thanks to Prof. Brian Ripley for his help on using the LaTeX files to generate a pdf help manual (LINK)

EDIT: To speed up this function, insert your computer name and more of the file directory into this line of the code:
Code:
system = {x <- dir("C:/", pattern=paste(LIB,".pdf",sep=""), full.names=TRUE, ignore.case=TRUE, recursive=TRUE)
#As in...
system = {x <- dir("C:/trinkercomputer/user/", pattern=paste(LIB,".pdf",sep=""), full.names=TRUE, ignore.case=TRUE, recursive=TRUE)


#### Dason

I'm not a huge fan of those I guess. They just seem like mashing together the help file for the package and all of the help files for the functions. Is there something useful in them that I'm missing?

#### trinker

##### ggplot2orBust
I like the display of all the functions and data sets on the left side of the pdf. Sometimes I remember that a certain package does certain things (like psych does psychometrics) and I'm just curious about psychometric functions related to what I'm doing. For me they're a big learning tool. I remember going through the help manual for base or plyr or plotrix and just being amazed by all the functions. It was like Christmas.