Making Heat Maps In R
November 15, 2016
Amanda Birmingham (abirmingham at ucsd.edu)
Heat maps are a staple of data visualization for numerous tasks, including differential expression analyses on microarray and RNA-Seq data. Many people have already written heat-map-plotting packages for R, so it takes a little effort to decide which to use; here I investigate the performance of the six that I found referenced most frequently online.
My main goals (YMMV) beyond basic plotting were to be able to (a) annotate rows and columns with metadata information, (b) include scales and labels in the figure itself (since figures are often reused in presentations without caption information), and (c) do as much label customization as possible with the shallowest learning curve. I also wanted automatic dendrogram creation, which ruled out ggplot2 and other graphics-only packages. Note that throughout I have accepted the default colors for every heat map tool, as these are pretty easy to change after the fact if you care.
TL;DR: I recommend using heatmap3 (NB: not “heatmap.3”). It mimics the easy-to-use interface of heatmap.2 but can be extended into more complex settings if you later find you need fine-grained control.
Set-Up
This blog post is adapted from a Jupyter Notebook and supporting files that you can download from GitHub and run yourself if you have a Jupyter Notebook server with an R kernel installed.
All test “data” used here are just random numbers 🙂
# This line prevents SVG output, which does not play well with export to HTML
options(jupyter.plot_mimetypes = c("text/plain", "image/png"))
# Load the example "data"
gLogCpmData = as.matrix(read.table("heatmap_test_matrix.txt"))
gLogCpmData
# Load the example annotation/metadata
gAnnotationData = read.table("heatmap_test_annotation.txt")
gAnnotationData
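The two input files are not reproduced in this post. To run the examples without them, a minimal sketch along the following lines could stand in. The function name `makeFakeHeatmapData`, the dimensions, the gene and sample labels, and the "Placebo" category are assumptions made for illustration; only the `sample_name` and `subject_drug` column names and the two MiracleDrug labels come from the code in this post.

```r
# Hypothetical stand-in for heatmap_test_matrix.txt and
# heatmap_test_annotation.txt; sizes and labels are assumptions
makeFakeHeatmapData <- function(numGenes = 20, numSamples = 12, seed = 1) {
    set.seed(seed)
    sampleNames <- paste0("sample", seq_len(numSamples))
    geneNames <- paste0("gene", seq_len(numGenes))
    # Random numbers only, like the test "data" used in the post
    logCpm <- matrix(rnorm(numGenes * numSamples), nrow = numGenes,
                     dimnames = list(geneNames, sampleNames))
    # One annotation row per sample (i.e., per matrix column)
    annotations <- data.frame(
        sample_name = sampleNames,
        subject_drug = rep(c("MiracleDrugA", "MiracleDrugB", "Placebo"),
                           length.out = numSamples),
        stringsAsFactors = FALSE)
    list(logCpm = logCpm, annotations = annotations)
}

fake <- makeFakeHeatmapData()
dim(fake$logCpm)
head(fake$annotations, 3)
```

`fake$logCpm` and `fake$annotations` can then be passed wherever `gLogCpmData` and `gAnnotationData` are used.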
# Make helper function to map metadata category to color
mapDrugToColor<-function(annotations){
    # Known drugs get fixed colors; anything else falls through to red
    colorsVector = ifelse(annotations["subject_drug"]=="MiracleDrugA",
        "blue", ifelse(annotations["subject_drug"]=="MiracleDrugB",
        "green", "red"))
    return(colorsVector)
}
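The nested ifelse above gets harder to read as categories are added. As an alternative sketch (not the approach used in the rest of this post), a named vector can act as the lookup table. The names `drugColors` and `mapDrugToColorVector` and the `fallback` argument are hypothetical; note that this version takes the drug column itself and returns a plain character vector rather than a one-column matrix.

```r
# Hypothetical lookup table: category name -> color
drugColors <- c(MiracleDrugA = "blue", MiracleDrugB = "green")

# Map each drug label to its color; unknown labels get the fallback color
mapDrugToColorVector <- function(drugs, fallback = "red") {
    colors <- unname(drugColors[as.character(drugs)])
    colors[is.na(colors)] <- fallback
    colors
}

mapDrugToColorVector(c("MiracleDrugA", "Placebo", "MiracleDrugB"))
# "blue" "red" "green"
```

Adding a category then means adding one entry to `drugColors` instead of another level of nesting.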
heatmap
heatmap is the heat map function built into base R, so it needs no installation:
# Test heatmap with column annotations
testHeatmap<-function(logCPM, annotations) {
    sampleColors = mapDrugToColor(annotations)
    heatmap(logCPM, margins=c(5,8), ColSideColors=sampleColors)
}
testHeatmap(gLogCpmData, gAnnotationData)
heatmap.2
heatmap.2 is a heat map plotting function from the gplots package:
install.packages("gplots")
library(gplots)
# Test heatmap.2 with column annotations and custom legend text
testHeatmap2<-function(logCPM, annotations) {
    sampleColors = mapDrugToColor(annotations)
    heatmap.2(logCPM, margins=c(5,8), ColSideColors=sampleColors,
        key.xlab="log CPM",
        key=TRUE, symkey=FALSE, density.info="none", trace="none")
}
testHeatmap2(gLogCpmData, gAnnotationData)
I turned off a few of the default options (density.info, trace) to make the graphic a bit less busy. The default main legend is nice, but I don’t see an option to include a legend for the annotation information.
aheatmap
aheatmap, which stands for “annotated heatmap”, is a heat map plotting function from the NMF package:
install.packages("NMF")
library(NMF)
# Test aheatmap with column annotations
testAheatmap<-function(logCPM, annotations) {
    aheatmap(logCPM, annCol=annotations["subject_drug"])
}
testAheatmap(gLogCpmData, gAnnotationData)
Yay, legends for both the main data and the annotations! However, note that something weird is going on here: The dendrograms aren’t showing up right. It appears that somehow the body of the heatmap is overlapping with the finer levels of the dendrograms at both top and left. There may be a way to fix this by digging further into the settings of aheatmap, but since I’m looking for something easy to use out-of-the-box, I consider this a disqualifier for my usage.
pheatmap
pheatmap is the plotting function of the pheatmap package:
install.packages("pheatmap")
library(pheatmap)
# Test pheatmap with two annotation options
testPheatmap<-function(logCPM, annotations) {
    drug_info = data.frame(annotations[,"subject_drug"])
    rownames(drug_info) = annotations[["sample_name"]]
    # Assign the column annotation straight from
    # the input annotation dataframe
    pheatmap(logCPM, annotation_col=drug_info,
        annotation_names_row=FALSE,
        annotation_names_col=FALSE,
        fontsize_col=5)
    # Assign the column annotation to an intermediate
    # variable first in order to change the name
    # pheatmap uses for its legend
    subject_drug = annotations[["subject_drug"]]
    drug_df = data.frame(subject_drug)
    rownames(drug_df) = annotations[["sample_name"]]
    pheatmap(logCPM, annotation_col=drug_df,
        annotation_names_row=FALSE,
        annotation_names_col=FALSE,
        fontsize_col=5)
}
testPheatmap(gLogCpmData, gAnnotationData)
Again, nice to have legends for both main and annotation information. Note that:
- You control the annotation legend title through the variable name, which I consider suboptimal as variable names often do not read nicely as English text.
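- The function uses row names to match annotations to data, so the annotations must be supplied as a dataframe (not a matrix or vector) whose row names match the column names of the data. The data itself can stay a matrix, as it does here.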
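Because the legend title comes from the annotation dataframe's column name, one way around the variable-name limitation is to set that column name explicitly after building the dataframe. The sketch below is illustrative only: `makeColumnAnnotation`, `legendTitle`, and the toy inputs are hypothetical, and how well a title containing spaces renders may depend on the pheatmap version.

```r
# Hypothetical helper: build a one-column annotation data frame whose column
# name (which pheatmap uses as the legend title) is chosen explicitly
makeColumnAnnotation <- function(annotations, legendTitle = "Drug treatment") {
    annotationDf <- data.frame(annotations[["subject_drug"]],
                               row.names = annotations[["sample_name"]],
                               stringsAsFactors = FALSE)
    names(annotationDf) <- legendTitle
    annotationDf
}

# Toy inputs (assumptions, not the files used in this post)
toyAnnotations <- data.frame(
    sample_name = c("s1", "s2", "s3", "s4"),
    subject_drug = c("MiracleDrugA", "MiracleDrugB", "MiracleDrugA", "Other"),
    stringsAsFactors = FALSE)
set.seed(1)
toyLogCpm <- matrix(rnorm(20), nrow = 5,
                    dimnames = list(paste0("gene", 1:5),
                                    toyAnnotations$sample_name))

drugAnnotation <- makeColumnAnnotation(toyAnnotations)

# Annotation row names must match the data's column names
stopifnot(identical(rownames(drugAnnotation), colnames(toyLogCpm)))

# Only attempt the plot if pheatmap is installed
if (requireNamespace("pheatmap", quietly = TRUE)) {
    pheatmap::pheatmap(toyLogCpm, annotation_col = drugAnnotation)
}
```

The explicit `stopifnot` check is there because a mismatch between annotation row names and data column names is an easy mistake to make when the two come from separate files.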
heatmap3
heatmap3 is the central function of the heatmap3 package. Beware that this is different from “heatmap.3”, of which there are numerous versions (apparently a lot of people felt heatmap.2 needed an upgrade!). I don’t investigate those others here because I haven’t seen users discuss them online very often.
install.packages("heatmap3")
library(heatmap3)
# Test heatmap3 with several annotation options
testHeatmap3<-function(logCPM, annotations) {
    sampleColors = mapDrugToColor(annotations)
    # Assign just column annotations
    heatmap3(logCPM, margins=c(5,8), ColSideColors=sampleColors)
    # Assign column annotations and make a custom legend for them
    heatmap3(logCPM, margins=c(5,8), ColSideColors=sampleColors,
        legendfun=function()showLegend(legend=c("MiracleDrugA",
        "MiracleDrugB", "?"), col=c("blue", "green", "red"), cex=1.5))
    # Assign column annotations as a mini-graph instead of colors,
    # and use the built-in labeling for them
    ColSideAnn<-data.frame(Drug=annotations[["subject_drug"]])
    heatmap3(logCPM, ColSideAnn=ColSideAnn,
        ColSideFun=function(x)showAnn(x),
        ColSideWidth=0.8)
}
testHeatmap3(gLogCpmData, gAnnotationData)
This one follows the syntax of heatmap.2, which is good if you already know the latter. However, its added functionality is quite complicated … definitely complicated enough to get me into trouble (e.g., in the second option above, my annotation legend runs into my heat map and I’ve lost the main legend). It may also be complicated enough to get me out of trouble again (e.g., via explicit setting of the legend and/or heat map placement) but it would clearly take more digging.
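In the second call above, the legend labels and colors are typed out separately from mapDrugToColor, so the two can silently drift apart. A minimal sketch of deriving both from one lookup follows; `drugLegend` and `colorsForDrugs` are hypothetical names, the toy data are assumptions, and this does nothing about the legend placement problem described above.

```r
# Single source of truth for legend labels and colors (hypothetical name);
# the "?" entry is the catch-all for any other value
drugLegend <- c(MiracleDrugA = "blue", MiracleDrugB = "green", "?" = "red")

# Look up each drug's color, falling back to the "?" color when unknown
colorsForDrugs <- function(drugs) {
    drugs <- as.character(drugs)
    known <- drugs %in% names(drugLegend)
    unname(ifelse(known, drugLegend[drugs], drugLegend[["?"]]))
}

toyDrugs <- c("MiracleDrugA", "MiracleDrugB", "Other", "MiracleDrugA")
sampleColors <- colorsForDrugs(toyDrugs)

# Only attempt the plot if heatmap3 is installed
if (requireNamespace("heatmap3", quietly = TRUE)) {
    set.seed(1)
    toyLogCpm <- matrix(rnorm(20), nrow = 5,
                        dimnames = list(paste0("gene", 1:5), paste0("s", 1:4)))
    # One-column color matrix, mirroring what mapDrugToColor returns
    heatmap3::heatmap3(toyLogCpm, margins = c(5, 8),
        ColSideColors = cbind(Drug = sampleColors),
        legendfun = function() heatmap3::showLegend(
            legend = names(drugLegend), col = unname(drugLegend), cex = 1.5))
}
```

With this arrangement, changing a color or adding a drug in `drugLegend` updates the side bar and the legend together.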
annHeatmap2
annHeatmap2 is the core function of the Heatplus package. Unlike the other packages discussed in this evaluation, Heatplus is available through the Bioconductor bioinformatics software project rather than through CRAN.
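# Install Heatplus from Bioconductor. biocLite() was the installer when this
# post was written; current Bioconductor releases use BiocManager instead.
if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager")
BiocManager::install("Heatplus")
library(Heatplus)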
# Test annHeatmap2 with column annotations
testAnnHeatmap2<-function(logCPM, annotations){
    ann.dat = data.frame(annotations[,"subject_drug"])
    plot(annHeatmap2(logCPM, legend=2,
        ann = list(Col = list(data = ann.dat))))
}
testAnnHeatmap2(gLogCpmData, gAnnotationData)
Like heatmap3, annHeatmap2 does metadata annotations as a mini-graph; apparently it doesn’t do such annotations as color bars (?!). It requires that all annotation (and dendrogram, etc.) options be passed in as lists, which clearly offers a lot of power but is sort of heavyweight.
Summary
Below I summarize the features I assessed for these tools:
feature | heatmap | heatmap.2 | aheatmap | pheatmap | heatmap3 | annHeatmap2 |
---|---|---|---|---|---|---|
source | built-in | CRAN | CRAN | CRAN | CRAN | Bioconductor |
can add main legend | x | x | x | x | x | |
can control main legend text | x | x | ||||
can add row/col annotations | x | x | x | x | x | x |
can specify annotations as table column | x | x | x | |||
can add annotation legend | x | x | x | ~ | ||
can control annotation legend text | ~ | x | ~ | |||
notes | no clear advantages | pretty nice results with little tinkering | Appears UNUSABLE as dendrograms show up wrong, at least in notebook | pretty nice results with little tinkering | powerful but complicated | powerful but complicated; doesn’t support ColSideColors? |
Recommendation
I recommend using heatmap3 (NB: not “heatmap.3”). It mimics the easy-to-use interface of heatmap.2 but can be extended into more complex settings if you later find you need fine-grained control.