[R Graphics] Beautiful graphics thread

Win

New Member
#61
3D Interactive Plot

Here's a 3D interactive plot I like from the ebook "R Fundamentals & Graphics". I've just started learning BLENDER and want to see if I can transfer these 3D images to BLENDER and save it as a .COLLADA file. Has anyone ever done that? Apple has a software called iBook Author which can accommodate interactive 3D images in COLLADA format in ebooks. Their promo video is really cool!

Code:
library(rgl)
x <- seq(-3,3, length=50)
y <- x
f <- function (x,y) {x*exp (-x^2-y^2)}
z <- outer (x,y,f)
plot3d (x,y,z, box=F, axes=F, xlab="",ylab="",zlab="")
surface3d (x,y,z,col=heat.colors(30))
 

trinker

ggplot2orBust
#65
Just for fun. I saw this image (from: http://www.pewinternet.org/2015/01/.../pi_2015-01-29_science-and-society-00-01/)and wanted to replicate it as an exercise.



Here's my result:



I provided the data beolw. See if you can do better. I used ggplot2 + Inkskape for some touchups. I left the code.

Code:
dat <- data.frame(
    freq = c(37, 88, 47, 89, 28, 68, 65, 98, 68, 86, 50, 87, 59, 82, 45, 65, 52, 32, 59, 47, 68, 78, 39, 31, 64, 68)
)

dat[["sector"]] <- rep(c("U.S. adults", "AAAS scientist"), nrow(dat)/2)

field <- c("Biomedical sciences", "Climate, energy, space sciences")
dat[["field"]] <- factor(rep(field, c(10, 16)), levels=field)

items <-  c(
    "Safe to eat genetically\nmodified foods", 
    "Favor the use of animas\nin research", 
    "Safe to eat foods grown\nwith pesticides",  
    "Humans have evolved over\ntime",  
    "Childhood vaccines such\nas MMR should be required  ",  
    "Climate change is mostly\ndue to human activity",  
    "Growing world population\nwill be a major problem",  
    "Favor building more\nnuclear power plants",  
    "Favor more offshore\ndrilling",  
    "Astronaughts essential for\nfuture of U.S. space program",  
    "Favor increased use of\nbioengineered fuel",
    "Favor increased use of\nfracking",  
    "Space station has been\na good investment for U.S."
)


dat[["Item"]] <- factor(rep(items, each=2), levels=rev(items))

library(dplyr)

diffs <- dat %>%
    group_by(Item, field) %>%
    summarise(
        diff = abs(diff(freq)),
        mid = mean(freq),
        xmin = min(freq),
        xmax=max(freq)
    )


ext <- rep("", nrow(diffs))
ext[c(5, 13)] <- "point gap"

ext2 <- rep("", nrow(diffs))
ext2[c(5, 13)] <- "%"

#amt <- 2
amt <- 1.5

library(ggplot2)

ggplot(dat, aes(y = Item, x=freq)) +
    geom_segment(data=diffs, aes(x=xmin, xend=xmax, y=Item, yend=Item), color="royalblue2", size=5.5) +
    geom_point(color = "navy", size=5.5) +
    geom_point(shape=21, color = "navy", size=4.8, aes(fill=sector)) +
    facet_grid(field~., space="free", scales="free") +
    scale_fill_manual(values = c("white", "darkblue"), guide=FALSE) +
    theme_bw() +
    coord_cartesian(xlim = c(24, 102)) +
    ylab(NULL) + xlab(NULL) +
    geom_text(data = diffs, color = "white", 
        aes(label = paste(diff, ext), x=mid, y=Item), 
        size=3, vjust = .4, hjust=.2) +
    geom_text(data = diffs, color = "grey60", aes(label = paste0(xmin, ext2), x=xmin -amt, y=Item), 
        size=3, vjust = .4, hjust=1) +     
    geom_text(data = diffs, color = "grey60", aes(label = paste0(xmax, ext2), x=xmax + amt, y=Item), 
        size=3, vjust = .4, hjust=0)   +
    theme(
        axis.text.y = element_text(hjust=0),
        axis.text.x = element_blank(),
        axis.ticks = element_blank(),
        panel.grid.minor.x = element_blank(),
        panel.grid.major.x = element_blank(),
        panel.grid.major.y = element_blank(),
        panel.border = element_blank(),
        strip.background = element_rect(color=NA, fill="grey75"),
        strip.text.y = element_text(size=12, face="bold", hjust=.05),
        panel.margin = unit(1, "lines")
    ) + 
    geom_hline(color="grey60", linetype="dotted", size=.7,  aes(yintercept=.5))
 

hlsmith

Omega Contributor
#70
Good work. However, I am not a fan of the figure - it is kind of confusing or not immediately readable. It needs the full scale 0-100 listed some where relative to the values.

Perhaps you can also see what happens if you color code the magnitude of the gaps based on say yellow to red continuum, instead of all being blue.
 

trinker

ggplot2orBust
#71
I can appreciate your thoughts on Pew's figure. It could probably use some upgrades. I'm not sure I agree with the points you raise. These is my thinking and justification. Feel free to disagree. I love chatting around these issues and debating them as I have mostly qualitative people in my department and these aren't the issues they discuss so the debate is usually in my head :)

hlsmith said:
It needs the full scale 0-100 listed some where relative to the values.
The scale is implicit with the inclusion of percentages. Whenever we can get rid of an axis scale and put it with the geom our eyes are doing less back and forth. I like that the scale info is on the geom as it gives precise values like a table would but it's in grey so it's not primary in where our eyes go. We have to focus attention to see these. The percentages are only done once in each multiple to avoid extra ink. I liked that. Related, I liked the direct labels of Scientist and Us adults. This again reduces back and forth looking up of labels.

hlsmith said:
Perhaps you can also see what happens if you color code the magnitude of the gaps based on say yellow to red continuum, instead of all being blue.
I can see your point on the gradient but this information is already coded in the length between the points. Adding a gradient would probably aid in comparison but the best preattentive attribute we have for comparing continuous values is length. Color gradient is typically a poorer choice, particularly with smaller geoms like points. I think because the segments have different zero values comparison is not as easy and a gradient may improve this. It also runs the risk of distracting from other trends shown that are already using color (i.e., the way the scientists and people flop on 3 items; this is shown with color on the point and a gradient may distract from this).

I liked it because it minimized the data ink ratio without compromising information presented. This blog post on data ink I thought was pretty powerful: http://darkhorseanalytics.com/blog/data-looks-better-naked/
 

hlsmith

Omega Contributor
#72
I can tell you like your graph. I had not notice the flip-flopping of the group orders and the respective differences. That is confusing as well, I may not have even noticed that it happened in the same categories. Perhaps you should use a different color when the groups flip-flop. You did a good job copying the figure, now you need to improve it. The original could not have been perfect or without any room for improvement - now is the time to maximize readability.

Yeah, I have no one at my work to bounce ideas off of, I can appreciate your situation!
 

trinker

ggplot2orBust
#73
hlsmith said:
now you need to improve it
Agreed. This sounds like a challenge. Anyone that wants to submit an idea I included the data and as a csv for non R users as well.

I was thinking a bumpchart may show the trends better.

View attachment 5081

Code:
dat <- structure(list(freq = c(37, 88, 47, 89, 28, 68, 65, 98, 68, 86, 
    50, 87, 59, 82, 45, 65, 52, 32, 59, 47, 68, 78, 39, 31, 64, 68
    ), sector = c("U.S. adults", "AAAS scientist", "U.S. adults", 
    "AAAS scientist", "U.S. adults", "AAAS scientist", "U.S. adults", 
    "AAAS scientist", "U.S. adults", "AAAS scientist", "U.S. adults", 
    "AAAS scientist", "U.S. adults", "AAAS scientist", "U.S. adults", 
    "AAAS scientist", "U.S. adults", "AAAS scientist", "U.S. adults", 
    "AAAS scientist", "U.S. adults", "AAAS scientist", "U.S. adults", 
    "AAAS scientist", "U.S. adults", "AAAS scientist"), field = structure(c(1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("Biomedical sciences", 
    "Climate, energy, space sciences"), class = "factor"), Item = structure(c(13L, 
    13L, 12L, 12L, 11L, 11L, 10L, 10L, 9L, 9L, 8L, 8L, 7L, 7L, 6L, 
    6L, 5L, 5L, 4L, 4L, 3L, 3L, 2L, 2L, 1L, 1L), .Label = c("Space station has been\na good investment for U.S.", 
    "Favor increased use of\nfracking", "Favor increased use of\nbioengineered fuel", 
    "Astronauts essential for\nfuture of U.S. space program", "Favor more offshore\ndrilling", 
    "Favor building more\nnuclear power plants", "Growing world population\nwill be a major problem", 
    "Climate change is mostly\ndue to human activity", "Childhood vaccines such\nas MMR should be required  ", 
    "Humans have evolved over\ntime", "Safe to eat foods grown\nwith pesticides", 
    "Favor the use of animas\nin research", "Safe to eat genetically\nmodified foods"
    ), class = "factor")), .Names = c("freq", "sector", "field", 
    "Item"), row.names = c(NA, -26L), class = "data.frame")
 

trinker

ggplot2orBust
#74
I decided a bar plot may be best here. I toned down colors all together and focused on the differences between scientists and the US people. TO me that's the narrative. As such I ordered the Items by larges difference. I didn't go with direct percent labels this time to draw attention to the geoms.



All in ggplot2 + grid/gridExtra:

Code:
library(dplyr) 

sect_dat <- split(dat, dat[["sector"]])


diffs <- dat %>%
    group_by(Item, field) %>%
    summarise(
        diff = -diff(freq),
        mid = mean(freq),
        xmin = min(freq),
        xmax=max(freq)
    )

dat[["Item"]] <- factor(as.character(dat[["Item"]]), 
    levels = as.character(unlist(diffs[order(diffs[["diff"]]), "Item"])))

gr1 <- ggplot(data = dat) + 
    geom_bar(data=subset(sect_dat[[1]], field == "Climate, energy, space sciences"), 
        aes(x=Item, weight=freq), fill="grey90") +
    geom_bar(data=subset(sect_dat[[2]], field == "Climate, energy, space sciences"), 
        aes(x=Item, weight=freq), width=.5, fill="grey70") + 
    coord_flip(ylim = c(0, 100)) +
    scale_y_continuous(breaks=c(0, 25, 50, 75, 100), labels=function(x) paste0(x, "%")) +
    theme_minimal() + xlab(NULL) + ylab(NULL) +
    theme( 
        axis.ticks = element_blank(),
        panel.grid.minor = element_blank(),
        panel.grid.major.y = element_blank(),
        axis.text.y = element_text(hjust=0),
        title = element_text(hjust=0, color="grey40"),
        axis.text = element_text(color = "grey50"),
        plot.margin=unit(c(-2,6,3,1),"mm")
    ) + ggtitle("Climate, energy, space sciences")

gr2 <- ggplot(data = dat) + 
    geom_bar(data=subset(sect_dat[[1]], field != "Climate, energy, space sciences"), 
        aes(x=Item, weight=freq), fill="grey90") +
    geom_bar(data=subset(sect_dat[[2]], field != "Climate, energy, space sciences"), 
        aes(x=Item, weight=freq), width=.5, fill="grey70") + 
    coord_flip(ylim = c(0, 100)) +
    scale_y_continuous(breaks=c(0, 25, 50, 75, 100), labels=function(x) paste0(x, "%")) +
    theme_minimal() + xlab(NULL) + ylab(NULL) +
    theme( 
        axis.ticks = element_blank(),
        panel.grid.minor = element_blank(),
        panel.grid.major.y = element_blank(),
        axis.text.y = element_text(hjust=0),
        title = element_text(hjust=0, color="grey40"),
        axis.text.x = element_blank(),
        axis.text = element_text(color = "grey50"),
        plot.margin=unit(c(0,6,0,1),"mm")
    ) + ggtitle("Biomedical sciences") +
    annotate("text",  x = 5, y = 2, hjust = 0, size=3.75, 
        label = "U.S. adults", color = "grey95")+ 
    annotate("text",  x = 5, y = 85, hjust = 1, size=3.75, 
        label = "AAAS scientist", color = "grey50")

library(gridExtra)
library(plotflow)

gA <- ggplotGrob(gr1)
gB <- ggplotGrob(gr2)
maxWidth <- unit.pmax(gA$widths[2:5], gB$widths[2:5])
gA$widths[2:5] <- as.list(maxWidth)
gB$widths[2:5] <- as.list(maxWidth)
grid.arrange(gB, gA, heights=c(.38, .62))
 

trinker

ggplot2orBust
#76
That's a Wickham specific term from ggplot2, his extension of Wilkinson's Grammar of Graphics. It's the geometric shape that represents the data. Attributes of that geometric shape (e.g., color, fill, alpha, shape) are the visible aspects that can encode information or be used to highlight.

So bars, lines, points, etc: http://docs.ggplot2.org/current/
 
#80
In the process of trying to understand a particle swarm optimization algorithm I ended up coding one.
It makes for a neat visualization, so I thought I would share it here.

Code:
################################################################################
## Experiment with PSO
################################################################################

## make data
mu <- 5
sigma <- 10

Y <- rnorm(1000,mu,sigma)

## make likelihood function
LL <- function(param){

    tryCatch(
        -sum(dnorm(Y,param[1],param[2],log=TRUE)),
        error=function(e) -log(0)
)
}


## upper and lower bounds in 2 dimensions
alpha <- c(-100,0)
omega <- c(100,100)

## initial swarm position
theta <- c(1,1)

## velocity scaler
beta <- 0.001
## partical independence and swarm dependence weights
phi <- c(0.015,0.085)


### plot likelihood landscape
plot(0,0,type="n",xlab="mu",ylab="sigma",
     xlim=c(-100,100),ylim=c(0,100))
abline(v=mu,h=sigma)
legend("topright",legend=c("partical","swarm's current best", "solution"),
       pch=c(20,3,4),col=c("blue","grey","red"),bty="n")

## Partical swarm optimizer
## @param S number of particles (swarm size)
## @param N maximum number of iterations
PSO <- function(LL,theta,alpha,omega,beta,phi,S=10,N=100){

    G <- theta
    K <- length(theta)
    V <- B <- P <- matrix(ncol=K,nrow=S)
       

    ## initialize 
    for(p in 1:S){
        P[p,] <- runif(K,alpha,omega)
        V[p,] <- runif(K,-abs(omega-alpha),abs(omega-alpha))
        if(LL(P[p,])<LL(G)) G <- P[p,]
    }

    ## set initial values as best position particles
    B <- P

    ## plot initial positons
    points(P,pch=20,col=rgb(0,0,1,alpha=0.8))
    
    ## start actual swarm
    for(i in 1:N){
        I <- P
        for(p in 1:S){
            ## movement vector
            rp <- runif(2,0,1)
            rg <- runif(2,0,1)
            M <- beta*V[p,]+phi[1]*rp*(B[p,]-P[p,])+phi[2]*rg*(G-P[p,])
            P[p,] <- P[p,] + M
            if(LL(P[p,])<LL(B[p,])) B[p,] <- P[p,]
            if(LL(P[p,])<LL(G)) G <- P[p,];print(G)
            points(G[1],G[2],col="grey",pch=3)
            }

        ## plot updated positons
        for(ss in 1:S){
        segments(I[ss,1],I[ss,2],P[ss,1],P[ss,2],
                 ,lwd=2,col=rgb(0,0,1,alpha=0.24))
        }
    Sys.sleep(0.1)
    }

    points(G[1],G[2],pch=4,col="red",cex=2)
    
}



PSO(LL,theta,alpha,omega,beta,phi,S=100,N=50)

Should work if you copy and paste!