Outputting Beautiful Jupyter Notebooks (R-Kernel Edition)

Amanda Birmingham (abirmingham at ucsd.edu)

Jupyter notebooks are wonderful, but eventually you will need to present your work to someone unable (or unwilling) to view it on a notebook server. Unfortunately, there are surprising difficulties in printing or otherwise outputting Jupyter notebooks attractively into a static, offline format. These difficulties are not limited to Python-kernel notebooks: R-kernel notebooks have their own issues. Here’s a description of those issues, and a work-around that doesn’t require learning to modify jinja2 templates.

Table of Contents

HTML Output: Mangled Graphics Text

At first blush, it looks as though the HTML conversion built into Jupyter notebooks (shown below) works fine for R-kernel notebooks, as no errors are thrown and the output generally looks attractive.
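
Since nbconvert is the tool Jupyter uses for this conversion, the same export can be scripted. Here is a minimal sketch that calls the `jupyter nbconvert` command line from R. The file name `my_notebook.ipynb` is a placeholder, and the sketch assumes `jupyter` is on your PATH:

```r
# Placeholder file name; substitute your own notebook.
notebook <- "my_notebook.ipynb"

# Build the nbconvert arguments: convert the notebook to a standalone HTML file.
args <- c("nbconvert", "--to", "html", shQuote(notebook))

# Only attempt the conversion if jupyter is on the PATH and the file exists.
if (nzchar(Sys.which("jupyter")) && file.exists(notebook)) {
  system2("jupyter", args)
} else {
  message("Skipping: jupyter or ", notebook, " not found")
}
```

This should produce the same HTML as the menu export, so it has the same plot problem described next.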

However, as you scroll through your document, you will find that something sinister has happened to plots after the first plot that uses legends/axis labels. For example, in my sample notebook, the first plot with a legend looks great:

… but all subsequent ones have some of their text labels sadly mangled (e.g., look at the legend at the far right of the plot below):

Apparently the cause of this mess is twofold: (a) the Jupyter R kernel, IRkernel, by default outputs all graphics as inline SVG, but (b) the nbconvert tool that Jupyter uses to create HTML doesn’t "honor the ‘isolated’: true flag" in the metadata that tells it to put each SVG in its own iframe. About half of this statement is Greek to me, but feel free to get more details from the horse’s mouth: the nbconvert issue is at https://github.com/jupyter/nbconvert/issues/129 and was still open as of 08/25/2016.
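
To see which plot formats your own kernel is set to emit, you can inspect the `jupyter.plot_mimetypes` option (the same one the workaround later in this post changes). This is a small sketch; outside an IRkernel session the option is typically unset, in which case the lookup returns NULL:

```r
# Look up the plot formats IRkernel is configured to emit.
# Returns NULL if the option is unset (e.g., in a plain R console).
mimetypes <- getOption("jupyter.plot_mimetypes")
print(mimetypes)

# TRUE if SVG is among the configured formats, in which case
# the HTML export may show the mangled labels described above.
uses_svg <- "image/svg+xml" %in% mimetypes
print(uses_svg)
```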

PDF Output: Line Truncation

So, let’s just output as PDF, which also "works" (i.e., doesn’t error out) for R-kernel notebooks, right?

Wrong! In PDFs, it is true that the plots all look lovely:

However, something else has gone off the rails! The HTML output, for all its plot failures, does pretty well with text: it tries to coerce tables to fit the screen size (and often succeeds), and when that fails it adds a scrollbar to allow access to content too wide for the screen:

Unfortunately, no such grace is forthcoming from the PDF output:

As shown by the gray bar at the right of each of these screenshots, long content, whether tabular or textual, simply runs off the edge of the PDF page. Sadly, this problem plagues all Jupyter notebooks regardless of kernel, as they all use nbconvert to make PDFs. Oh, the humanity!

Workaround: HTML to PDF Without SVG

Fortunately, this mishigas can be side-stepped with minimal loss of quality and sanity. As described at https://github.com/IRkernel/IRkernel/issues/331, simply add this line to the top of your R-kernel notebook:

options(jupyter.plot_mimetypes = c("text/plain", "image/png"))

Then be sure to restart your kernel and re-run the notebook so the setting takes effect for every plot:

What effect? Well, it tells IRkernel to stop trying to output all graphics as SVG and to make them inline PNGs instead. (By the way, this is also helpful if you have very LARGE plots that are bloating the on-disk size of your notebook or causing it to hang when rendered.) PNGs look just slightly different from their SVG counterparts: a little blockier, since PNGs are raster graphics while SVGs are vector graphics. I notice it most in the text:

SVG

PNG
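
If the PNG text looks too blocky for your taste, raising the raster resolution may help. This is a minimal sketch, assuming the repr package that IRkernel relies on for rendering honors its `repr.plot.res` option (pixels per inch). The value 200 is an arbitrary illustration, not a recommendation from the IRkernel issue:

```r
# Ask IRkernel for PNG only (the workaround described above).
options(jupyter.plot_mimetypes = c("text/plain", "image/png"))

# Assumption: repr.plot.res sets the PNG resolution in pixels per inch.
# 200 is an illustrative value; higher values give sharper text
# but also larger images stored in the notebook.
options(repr.plot.res = 200)

# Confirm what the session will actually use.
print(getOption("jupyter.plot_mimetypes"))
print(getOption("repr.plot.res"))
```

The trade-off is the usual one for raster graphics: more pixels means crisper labels and a bigger notebook file.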

But what if you need a PDF? Well, with the HTML conversion licked, we can now get an acceptable PDF by simply opening the HTML version of the notebook in the browser and using the browser’s ability to "print" it to PDF (as shown here in Chrome):

This gives us unmangled plots that, by the way, are appropriately placed so they aren’t broken across pages (unlike PDFs created from HTML by tools such as wkhtmltopdf):

It also wraps long text lines:

and fits tables to the page width, where possible:

You still (of course) lose the view of very wide tables, as the HTML scroll bars don’t work in PDF, but at some point you’ve got to accept that you can’t fit 10 pounds in a 5-pound sack!

The "print to PDF" option in the browser also reproduces the appearance of the HTML notebook much more faithfully than Jupyter nbconvert’s PDF conversion, which imposes a very LaTeX-y format (this of course makes sense, as the nbconvert PDF conversion goes by way of LaTeX). Finally, "print to PDF" is also noticeably faster than nbconvert, although the time it takes to generate a PDF either way is unlikely to be a bottleneck!
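
If you have many notebooks to convert, clicking through the print dialog gets old. Chrome can also print to PDF from the command line in headless mode, and the sketch below drives that from R. Everything specific here is an assumption for illustration: the binary name `google-chrome`, the file names, and that headless output matches what the print dialog produces (it may differ, for example in headers and footers):

```r
# Hypothetical paths; substitute your own files.
html_file <- "my_notebook.html"
pdf_file  <- "my_notebook.pdf"

# Assumption: the Chrome binary is called "google-chrome" on this machine.
chrome <- unname(Sys.which("google-chrome"))

# Headless Chrome flags: render the page without a window and write a PDF.
args <- c("--headless",
          paste0("--print-to-pdf=", shQuote(pdf_file)),
          shQuote(html_file))

# Run only when both the browser and the HTML file are available.
if (nzchar(chrome) && file.exists(html_file)) {
  system2(chrome, args)
} else {
  message("Skipping: Chrome or ", html_file, " not found")
}
```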