Including Five-Number Summaries on a Side-By-Side Boxplots Graph

Rstatshelp

New Member
Hi,

I am having difficulty finding out how to include five-number summaries on a graph of side-by-side boxplots. I would like to include a five-number summary for each of 2 boxplots on one graph.
I was wondering if someone could please assist me. Thank you in advance for any help you may be able to provide.

If it is helpful, the code I have used so far is as follows:

Code:
> cheese.matrix <- matrix (
+ c(310,220,420,240,45,180,40,90,270,130,180,260,250,340,290,310),
+ nrow=8,
+ ncol=2)
> boxplot.cheesedata <- boxplot(cheese.matrix, use.cols = TRUE, horizontal = TRUE, xlab = "Sodium Content", main = "Sodium Content of Two Types of Cheese", names=c("Real Cheese", "Cheese Substitute"))

Rstatshelp

New Member
Forgive me... Perhaps a better way for me to have posted the code in the original post would have been:

Code:
cheese.matrix <- matrix(c(310,220,420,240,45,180,40,90,270,130,180,260,250,340,290,310), nrow=8, ncol=2)
boxplot.cheesedata <- boxplot(cheese.matrix, use.cols = TRUE, horizontal = TRUE, xlab = "Sodium Content", main = "Sodium Content of Two Types of Cheese", names=c("Real Cheese", "Cheese Substitute"))

GretaGarbo

Human
What are the five numbers? And where do you want them in the graph?

Rstatshelp

New Member
The five numbers, for each boxplot, would be:
*the minimum/smallest data value
*Q1 (the lower quartile)
*Q2 (the median)
*Q3 (the upper quartile)
*the maximum/largest data value

So, for the first/bottom boxplot labeled "Real Cheese" (taking data values 310, 220, 420, 240, 45, 180, 40, and 90) the five numbers I came up with were:
*the minimum/smallest data value = 40
*Q1 (the lower quartile) = 67.5 (that is: (45+90)/2)
*Q2 (the median) = 200 (that is: (180+220)/2)
*Q3 (the upper quartile) = 275 (that is: (240+310)/2)
*the maximum/largest data value = 420

For the second/top boxplot labeled "Cheese Substitute" (taking data values 270, 130, 180, 260, 250, 340, 290, and 310), the five numbers I came up with were:
*the minimum/smallest data value = 130
*Q1 (the lower quartile) = 215 (that is: (180+250)/2)
*Q2 (the median) = 265 (that is: (260+270)/2)
*Q3 (the upper quartile) = 300 (that is: (290+310)/2)
*the maximum/largest data value = 340
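
As a cross-check on the hand calculations above, here is a minimal sketch using base R's fivenum(), which returns the minimum, lower hinge, median, upper hinge, and maximum. The name five.numbers and the row labels are only illustrative and not part of the original code.

```r
# Sodium data from the thread: column 1 = real cheese, column 2 = substitute
cheese.matrix <- matrix(
  c(310, 220, 420, 240, 45, 180, 40, 90,
    270, 130, 180, 260, 250, 340, 290, 310),
  nrow = 8, ncol = 2,
  dimnames = list(NULL, c("Real Cheese", "Cheese Substitute"))
)

# Apply fivenum() to each column; the result is a 5 x 2 matrix
five.numbers <- apply(cheese.matrix, 2, fivenum)
rownames(five.numbers) <- c("Min", "Q1", "Median", "Q3", "Max")  # illustrative labels
print(five.numbers)
```

For these data the hinges agree with the quartiles worked out by hand. Note that quantile() with its default settings interpolates differently, so it may report slightly different values for Q1 and Q3.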

I would ideally like each of the 5 numbers for each five-number summary to be placed directly above the vertical line of the respective boxplot corresponding to the given data value.

Thanks in advance for any help...

gianmarco

TS Contributor
To keep it simple, I would use annotations in the subtitle zone.

Code:
cheese.matrix <- matrix(c(310,220,420,240,45,180,40,90,270,130,180,260,250,340,290,310), nrow=8, ncol=2)
A <- 10 # calculate here for instance your 1st quartile
B <- 20 # calculate here other statistics
subtitle <- paste0("Here some info: ", A, "\nhere some other info with the n used to skip to the next row: ", B)
boxplot.cheesedata <- boxplot(cheese.matrix, use.cols = TRUE, horizontal = TRUE, xlab = "", main = "Sodium Content of Two Types of Cheese", names=c("Real Cheese", "Cheese Substitute"), cex.main=0.8, sub=subtitle, cex.sub=0.7)
Hope this helps
Gm
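
To make the placeholders A and B above concrete, here is one possible sketch that fills the subtitle with the actual five-number summaries. It assumes fivenum() is an acceptable way to compute them; group.names and summary.lines are hypothetical helper names introduced only for this example.

```r
cheese.matrix <- matrix(c(310,220,420,240,45,180,40,90,270,130,180,260,250,340,290,310), nrow=8, ncol=2)
group.names <- c("Real Cheese", "Cheese Substitute")  # hypothetical helper

# One line of text per group: "<name>: min, Q1, median, Q3, max"
summary.lines <- sapply(seq_len(ncol(cheese.matrix)), function(j) {
  paste0(group.names[j], ": ",
         paste(fivenum(cheese.matrix[, j]), collapse = ", "))
})

# "\n" puts each group's summary on its own row of the subtitle
subtitle <- paste(summary.lines, collapse = "\n")

boxplot(cheese.matrix, use.cols = TRUE, horizontal = TRUE, xlab = "",
        main = "Sodium Content of Two Types of Cheese",
        names = group.names, cex.main = 0.8,
        sub = subtitle, cex.sub = 0.7)
```

A two-line subtitle can crowd the bottom margin, so cex.sub or the margins set with par(mar = ...) may need adjusting.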

Rstatshelp

New Member
Thank you GretaGarbo and gianmarco for your comments and assistance.
It took much(!) effort, as I am new to R and statistics in general, but I was able to achieve that which I was looking to accomplish by doing the following:

Code:
cheese.matrix.final <- matrix(c(310,220,420,240,45,180,40,90,270,130,180,260,250,340,290,310), nrow=8, ncol=2)
boxplot.cheesedata.final <- boxplot(cheese.matrix.final, use.cols = TRUE, horizontal = TRUE, xlab = "Sodium Content", main = "Sodium Content of Two Types of Cheese", names=c("Real Cheese", "Cheese Substitute"))
text(x = boxplot.cheesedata.final$stats - .5, y = col(boxplot.cheesedata.final$stats) + .5, labels = boxplot.cheesedata.final$stats)
Thanks again.
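
For anyone reading later, here is a short sketch of why the text() approach works, as far as I understand it. The object returned by boxplot() has a stats component with one column per box, and col() turns that matrix into group indices that serve as the vertical positions in a horizontal plot. The name bp and the 0.45 offset are arbitrary choices for this illustration.

```r
cheese.matrix <- matrix(c(310,220,420,240,45,180,40,90,270,130,180,260,250,340,290,310), nrow=8, ncol=2)

bp <- boxplot(cheese.matrix, use.cols = TRUE, horizontal = TRUE,
              xlab = "Sodium Content",
              main = "Sodium Content of Two Types of Cheese",
              names = c("Real Cheese", "Cheese Substitute"))

# 5 x 2 matrix: lower whisker, lower hinge, median, upper hinge, upper whisker
print(bp$stats)

# Same shape as bp$stats; each entry is its column number (1 or 2),
# which is where that box sits on the vertical axis
print(col(bp$stats))

# Put each value at its own x position, slightly above its box
text(x = bp$stats, y = col(bp$stats) + 0.45, labels = bp$stats, cex = 0.8)
```

One caveat: the first and last rows of stats are the whisker ends, which match the minimum and maximum only when there are no outliers (as is the case with these data). Passing range = 0 to boxplot() makes the whiskers extend to the data extremes.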
