Today I Learned: ____

bryangoodrich

Probably A Mammal
@trinker, what are you trying to do? If you just want to add formatting, I'd make a wrapper around sprintf that you can apply to your data frames or lists. The function itself allows a lot of flexibility in controlling output, and you can paste together any parameters you want to send to it before you call it. I think format and related functions also have arguments for spacing and alignment.
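For instance, a minimal sketch of that wrapper idea (the function name and format string here are just made up for illustration):

Code:
fmt <- function(x, spec = "%8.2f") sprintf(spec, x)  # wrap sprintf with a default format
df <- data.frame(a = rnorm(3), b = rnorm(3))
as.data.frame(lapply(df, fmt))  # apply the formatter to every column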
 

Dason

Ambassador to the humans
mp3 would be awesome.
Mp3s aren't the easiest to get playing through R. On Windows you can use the tuneR package, and it's not too bad to get one playing. On a Mac you need to set some options first. I've never gotten tuneR to work on Linux, but you could just issue a system command to play the file if you know what you're doing.
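Roughly, the Windows route looks like this (a sketch, not a tested recipe; the file name is a placeholder):

Code:
library(tuneR)
w <- readMP3("song.mp3")  # decode the mp3 into a Wave object
play(w)                   # hand the Wave to the default player
# On a Mac, point tuneR at a player first, e.g. setWavPlayer("/usr/bin/afplay")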
 

Dason

Ambassador to the humans
TIL: melt (from reshape) works on multidimensional arrays. This actually makes a couple of the difficulties I've had with multidimensional arrays go away (along with me learning to think about apply in more than 1 dimension).
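A quick illustration with a toy array:

Code:
library(reshape)
a <- array(1:24, dim = c(2, 3, 4))
head(melt(a))  # one row per cell: an index column for each dimension plus the value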
 

noetsi

Fortran must die
That we are going to monitor things that show little variation while ignoring the key sources of variation, and that we won't be able to do anything if we find problems in our study, because the contracts are written that way.

Lesson to be learned: talk to geeks (or semi-geeks like me) before you write long-term contracts.
 

bryangoodrich

Probably A Mammal
TIL: My R has native support for making SVG images. You can check yours by opening R and typing capabilities()['cairo']. If it returns TRUE, then you can render vector graphics (including SVG and PostScript). While this makes it easy to reuse high-quality images from R, I'm more interested in the interactive and dynamic side of SVG I saw the other day, thanks to Trinker. So in searching for dynamic SVG I stumbled across OmegaHat's SVGAnnotation. Unlike some other SVG libraries you can use from R, SVGAnnotation takes the direction of post-processing your SVG results: pulling them back into R and adding the extra capabilities there. I still need to learn more about the SVG specification and how it renders interactive displays, but I'm confident OmegaHat has done a good job of showing how that is done. For instance, their paper points out that an SVG is nothing but an XML document, which can be edited to add new specifications. Apparently one can even include JavaScript to make the document dynamic when loaded into an appropriate vehicle (e.g., a web browser).
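To see the native support in action, something like this writes a plot straight to an SVG file (a minimal sketch; the file name is arbitrary):

Code:
capabilities()["cairo"]  # TRUE means cairo-based devices are available
svg("demo.svg", width = 6, height = 4)  # open the SVG device
plot(1:10, main = "Rendered as vector graphics")
dev.off()  # close the device to finish writing the file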

Seriously, check out the examples and their code. They're not too complicated. I'm totally impressed with this, and I want to try to make use of it in my GIS project now if I can (e.g., maybe plot a map and add tooltips or a dynamic slider to switch through years). To get the most out of it, you're probably going to have to learn a little HTML, CSS, and JavaScript, but it is well worth it. If you want to do anything with web publishing today, you need to know those basics (PHP helps, too, for server-side scripting). If I have the time, expect to see this on my GIS project by the end of this semester.
 

trinker

ggplot2orBust
TIL: You can use negative indexing with lapply/sapply/apply etc.

as in:
Code:
lapply(1:10, function(i) dataframe[, -i])  # drop column i on each pass
Yeah, this seems obvious, but I never thought of a reason to use it until my psychometrics HW last night. I'll provide two examples: 1) a short example and 2) the actual psychometric situation I used it in (in psychometrics you often do item-if-deleted stats).

SIMPLE EXAMPLE:
Code:
set.seed(13)
x <- sample(1:100, 15)
y <- sapply(seq_along(x), function(i) mean(x[-i]))
y
# this gives us a vector of means for x as 
# each element is removed from the vector
REAL USE
Code:
set.seed(15)
dat <- data.frame(matrix(sample(0:1, 25, TRUE), 5, 5))
rownames(dat) <- 1:5; names(dat) <- paste("item", 1:5, sep="")
require(CTT)
CTT2 <- reliability(dat) 

DF <- data.frame(item.num = paste("item", 1:5, sep=""), 
    adj_cron_alpha = CTT2$alpha.if.deleted,
    adj_tot_cor = CTT2$pbis)

names(CTT2)  # gives lots of goodies, but not the scale mean or sd if deleted for each item
# FIX: first create functions that take the data set and return the mean & sd stats

mean.if.deleted <- function(x){
    y <- reliability(x)
    c(y$scale.mean)
}

sd.if.deleted <- function(x){
    y <- reliability(x)
    c(y$scale.sd)
}

# now use negative indexing to compute a new data set without each item and feed it to the functions
DF$mean.if.delete <- sapply(seq_along(dat), function(x) mean.if.deleted(dat[, -x]))
DF$sd.if.delete <- sapply(seq_along(dat), function(x) sd.if.deleted(dat[, -x]))

DF
I thought it was pretty cool and figured I'd share, since I typically see the apply family used to feed a vector/data.frame/matrix of values rather than indices (I learned the feeding of indices from Dason a few weeks back). Just a different use of the apply functions.
 

Jake

Cookie Scientist
At trinker's request:
Today I recalled that you can use stack() to concatenate two vectors into a data frame that includes a factor column indicating which vector the values came from. For example:
Code:
> (groupA <- rnorm(5))
[1] -0.9445818  0.9587041 -0.5147924 -0.9498970 -0.5534835
> (groupB <- rnorm(5))
[1]  0.08415508  0.14757817 -1.29038128  0.74918147  0.04405623
> stack(list(A = groupA, B = groupB))
        values ind
1  -0.94458179   A
2   0.95870412   A
3  -0.51479240   A
4  -0.94989702   A
5  -0.55348354   A
6   0.08415508   B
7   0.14757817   B
8  -1.29038128   B
9   0.74918147   B
10  0.04405623   B
 

Dason

Ambassador to the humans
TIL: Another way to convert from a list into a dataframe.

Code:
## Making babies - or fake data... I forget.
j <- lapply(1:10, rnorm, n=4)

## My typical way of flattening into
## a matrix/dataframe
do.call(rbind, j)

## Can do this with plyr relatively easily too
library(plyr)
## Either using I
ldply(j, I)
## or just returning the vector...
ldply(j, function(x){x})
 

Dason

Ambassador to the humans
Quote:
I like the do.call approach. I'm glad you've drilled it into me enough times; I see it as a viable option now when I'm dealing with a list.
I do too. It's much faster and doesn't rely on plyr. But I saw my professors using the other method I mentioned (the second version though - where you just return x instead of using I) and I guess it just never occurred to me that this was another possible way to do it.

Now, which do you prefer for getting the nth element of each element in a list of vectors:

Code:
## Making babies - or fake data... I forget.
j <- lapply(1:10, rnorm, n=4)

# This one?
do.call(rbind, j)[,1]
# Or this guy.
sapply(j, function(x){x[1]})
 

bryangoodrich

Probably A Mammal
Probably the sapply approach. The do.call route is like taking two steps to do the same thing: make the frame and then grab the field you want. The sapply approach operates on each element itself, grabbing the nth value you want, and then puts the results together into a vector. At scale, I'm not sure which would be faster. If do.call is, I'd probably opt for that approach!
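If you wanted to settle the speed question, a quick timing sketch would do it (toy data; results will vary by machine):

Code:
j <- lapply(1:1e4, rnorm, n = 4)          # a bigger list so the timing is visible
system.time(do.call(rbind, j)[, 1])       # flatten first, then subset
system.time(sapply(j, function(x) x[1]))  # grab the element directly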
 

Dason

Ambassador to the humans
I also prefer the sapply version. I haven't tested the speed, but I partly like it more because it's more likely to complain if something goes wrong: the rbind approach will recycle vectors, and you might not get what you expect if the list holds vectors of different lengths.
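A tiny example of that pitfall (toy vectors of unequal length):

Code:
j <- list(1:4, 1:2)
do.call(rbind, j)             # silently recycles 1:2 across the row
sapply(j, function(x){x[3]})  # gives NA for the short vector instead of a recycled value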
 

trinker

ggplot2orBust
TIL: Dason referenced this in the R Learning Project Thread

As a Windows user, I often find packages that aren't available for download as zip files. I thought I had to go find special versions of packages elsewhere. Nope:

Code:
install.packages("PATH/TO/THE/.tar.gz/FILE/LIKE:/SVGAnnotation.tar.gz", repos=NULL, type="source")
 

bryangoodrich

Probably A Mammal
Quote:
As a Windows user, I often find packages that aren't available for download as zip files... install.packages("PATH/TO/THE/.tar.gz/FILE/LIKE:/SVGAnnotation.tar.gz", repos=NULL, type="source")
Yeah, but does it compile? Do you need to have a C compiler in your PATH?
 

Dason

Ambassador to the humans
Windows users can use type="source" as long as the package contains no C/C++ code. If the package does contain compiled code, they need Rtools.
 

trinker

ggplot2orBust
You can increase your memory limits in R. First check them with:

Code:
memory.limit()  # for Windows users, or
mem.limits()    # on other platforms
And then use the size argument to increase the limit; ?memory.limit is very informative:

Code:
memory.limit(size=3400)