Bringing interactivity to network visualization in Jupyter notebooks: visJS2Jupyter
Brin Rosenthal (sbrosenthal at ucsd.edu)
Introduction
Data is everywhere these days, and being able to interact with visual representations of that data in real time can help bring it to life. You need look no further than the D3 (Data-Driven Documents) examples page to see this. If you haven’t spent time browsing through the D3 examples library, I would highly recommend doing so, but be warned: it is easy to spend a few captivating hours there! (A few of my favorites: collision avoidance, collapsible force layout, NCAA March Madness predictions, preferential attachment).
Unfortunately, D3 is pretty nontrivial to learn, which can be a significant barrier to those of us looking for a quick but awesome solution. There are some good visualization libraries which offer similar interactivity and are simpler to use. One of our favorites is vis.js.
If you’re anything like me, you love the fast and flexible development and documentation environment that Jupyter notebooks provide. But I had been frustrated with the limited interactivity available for plotting data. While matplotlib, seaborn, and networkx provide nice static ways of graphing data and networks, they left me wanting more. Python widgets are ok, but a bit clunky (see earlier post…).
A group of us at the CCBB had the idea to write a tool which would bring D3-style interactivity (through vis.js) into Jupyter notebook cells. This turned out to be quite simple. We repurposed some existing HTML code from another project that sets the styles of nodes and edges in a network, and modified it so that style arguments can be passed in through a function. Every time this function is called, a new style_file.html is created, containing the properties set by the user. This style_file.html is then loaded into the Jupyter cell using the Python HTML module, and the network is rendered in the cell. Once we figured these pieces out, we had a fully interactive graph, right there in the Jupyter notebook cell! We can now freely pan, zoom, click and drag nodes, and even embed more information in the node and edge hover-bubbles. One of the coolest things about this tool is that it is almost infinitely flexible, and we’ve designed it to work with networkx graph formats, since networkx is one of the most standard Python graph libraries.
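To make the style-file idea concrete, here is a minimal sketch of the pattern, not the package’s actual code. The helper name write_style_file, the page skeleton, and the exact file-naming scheme are all assumptions made for illustration; the only details taken from this post are that each call writes a fresh HTML file and that a time stamp keeps the files distinct.

```python
import json
import os
import tempfile
from string import Template

# Hypothetical page skeleton. A real page would also load vis.js and
# build a network from the two arrays below.
PAGE = Template("""<html>
<body>
<div id="mynetwork"></div>
<script>
  var nodes = $nodes;
  var edges = $edges;
</script>
</body>
</html>
""")


def write_style_file(nodes, edges, time_stamp=0, out_dir="."):
    """Write one HTML file per call and return its path (illustrative only)."""
    # Assumed naming scheme: the time stamp is appended to the file name.
    path = os.path.join(out_dir, "style_file{}.html".format(time_stamp))
    with open(path, "w") as handle:
        handle.write(PAGE.substitute(nodes=json.dumps(nodes),
                                     edges=json.dumps(edges)))
    return path


# Two calls with different time stamps produce two separate files.
demo_dir = tempfile.mkdtemp()
demo_nodes = [{"id": 0}, {"id": 1}]
demo_edges = [{"source": 0, "target": 1}]
first = write_style_file(demo_nodes, demo_edges, time_stamp=0, out_dir=demo_dir)
second = write_style_file(demo_nodes, demo_edges, time_stamp=1, out_dir=demo_dir)
print(os.path.basename(first), os.path.basename(second))
```

In a notebook, a file written this way could then be shown in a cell with one of the display classes in IPython.display (HTML or IFrame, for instance); which one the package uses internally is not something this sketch assumes.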
In this post, I’ll walk you through two simple examples of how to use visJS2Jupyter.
Installation
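To install, run “pip install visJS2jupyter” in your terminal. To import, use the statement “import visJS2jupyter.visJS_module” in your notebook. The examples below refer to the module simply as visJS_module; importing it with “from visJS2jupyter import visJS_module” makes it available under that name. Source code for the package may be found here: https://github.com/ucsd-ccbb/visJS_2_jupyter.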
Use example with default parameters
Now that we have the package installed, we’re going to walk through a very simple use example, using only the default parameters. First, we need a network to draw. Let’s make a random one using the networkx function ‘connected_watts_strogatz_graph’. This network has 30 nodes, each of which is initially connected to its nearest neighbors in a ring (we pass k = 5; networkx joins k - 1 = 4 neighbors when k is odd). Each of these connections is then randomly rewired with probability 0.2. We will also need the lists of nodes and edges that comprise this graph.
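import networkx as nx
from visJS2jupyter import visJS_module
G = nx.connected_watts_strogatz_graph(30, 5, .2)
nodes = list(G.nodes())  # plain lists, so len() and indexing work in any networkx version
edges = list(G.edges())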
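If the ring-and-rewire construction is unfamiliar, the following pure-Python sketch may help. It is not networkx’s implementation: the function name ring_and_rewire is hypothetical, and unlike connected_watts_strogatz_graph it makes no attempt to guarantee that the result is connected. It only illustrates the two steps described above.

```python
import random


def ring_and_rewire(n, k, p, seed=None):
    """Return a set of undirected edges (u, v), stored with u < v."""
    rng = random.Random(seed)
    half = k // 2  # neighbors joined on each side of the ring

    def key(u, v):
        return (u, v) if u < v else (v, u)

    # Step 1: ring lattice, each node joined to `half` neighbors per side.
    edges = set(key(u, (u + step) % n)
                for u in range(n) for step in range(1, half + 1))

    # Step 2: with probability p, replace each lattice edge (u, v)
    # by an edge from u to a node it is not yet connected to.
    for u in range(n):
        for step in range(1, half + 1):
            if rng.random() >= p:
                continue
            old = key(u, (u + step) % n)
            free = [w for w in range(n) if w != u and key(u, w) not in edges]
            if old in edges and free:
                edges.remove(old)
                edges.add(key(u, rng.choice(free)))
    return edges


lattice = ring_and_rewire(30, 5, 0.0)          # no rewiring at all
rewired = ring_and_rewire(30, 5, 0.2, seed=1)  # some edges moved
print(len(lattice), len(rewired))  # rewiring moves edges but keeps their number
```

With p = 0 the result is the plain ring lattice; raising p moves more of the edges to random targets while leaving the total edge count unchanged.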
Next, we will simply construct dictionaries which contain all of the node-specific and edge-specific traits which will be passed to the visualizer. (Note that we also need to make a node_map here, which maps the names of the nodes in the graph to integers, because of the way visJS interprets node/edge data).
nodes_dict = [{"id":n} for n in nodes]
node_map = dict(zip(nodes,range(len(nodes)))) # map to indices for source/target in edges
edges_dict = [{"source":node_map[u], "target":node_map[v],
              "title":'test'} for u, v in edges]
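In the random graph above the node names already happen to be the integers 0 to 29, so node_map looks like an identity mapping. The small sketch below uses made-up string names (purely hypothetical data) to show what the mapping does when node names are not integers.

```python
# Hypothetical string-named nodes and edges, standing in for the
# lists taken from a networkx graph.
nodes = ["alice", "bob", "carol"]
edges = [("alice", "bob"), ("bob", "carol")]

nodes_dict = [{"id": n} for n in nodes]

# Map each node name to its position in the node list.
node_map = dict(zip(nodes, range(len(nodes))))

# Edges refer to nodes by those integer positions, not by name.
edges_dict = [{"source": node_map[u], "target": node_map[v], "title": "test"}
              for u, v in edges]

print(node_map)    # {'alice': 0, 'bob': 1, 'carol': 2}
print(edges_dict)
```

The edge from "bob" to "carol" ends up with source 1 and target 2, which is the integer form the visualizer expects.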
Now all that’s left is calling the visualizer function:
visJS_module.visjs_network(nodes_dict, edges_dict, time_stamp=0)
Done! Now we are free to click, drag, and zoom at will. Note that if you click on a node, that node’s nearest neighbors are highlighted.
Now that we have the basic use example under our belt, let’s move on to something more complicated, because there is so much potential here!
More complicated use example
In this example, we will start by mapping some features to node and edge properties. To map node/edge attributes to properties, simply add the property to the graph as a node/edge attribute (using nx.set_node_attributes and nx.set_edge_attributes), then use the return_node_to_color function to select which property you would like to map to the node colors. You can map anything you want to node color, as long as you represent it numerically. You can also choose which matplotlib colormap you’d like to use for the mapping. For example, let’s calculate the node-level clustering coefficient, betweenness centrality, and degree for the random network we made above, and add them as attributes.
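# add node attributes to color-code by
cc = nx.clustering(G)
degree = dict(G.degree())  # dict() so this also works when degree() returns a view
bc = nx.betweenness_centrality(G)
# keyword arguments, because the positional order of name/values differs between networkx versions
nx.set_node_attributes(G, name='clustering_coefficient', values=cc)
nx.set_node_attributes(G, name='degree', values=degree)
nx.set_node_attributes(G, name='betweenness_centrality', values=bc)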
Now that we’ve added each of these properties as node attributes, let’s map the node colors to betweenness centrality, and use the matplotlib colormap spring_r for our color scheme. We can also set the node transparency using alpha (1 = fully opaque, 0 = fully transparent), and we can choose which section of the colormap we’d like to use. Here the lowest value of betweenness centrality is mapped to the color 10% of the way along spring_r, and the highest value to the color 90% of the way along. This is useful if you like a colormap but want to skip part of it (if it starts too light or too dark, for example). You can also transform your color scale, using the ‘color_vals_transform’ argument. Valid options are ‘log’, ‘sqrt’, and ‘ceil’.
import matplotlib as mpl
import matplotlib.cm  # makes mpl.cm available
node_to_color = visJS_module.return_node_to_color(G,field_to_map='betweenness_centrality',cmap=mpl.cm.spring_r,
                                                  alpha = 1, color_max_frac = .9,color_min_frac = .1)
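The arithmetic behind color_min_frac and color_max_frac can be sketched in a few lines. This is not the package’s implementation of return_node_to_color; the function value_to_cmap_fraction and the sample values are hypothetical, and the sketch assumes a simple linear rescaling with no transform applied.

```python
def value_to_cmap_fraction(values, color_min_frac=0.0, color_max_frac=1.0):
    """Linearly rescale node values into [color_min_frac, color_max_frac]."""
    lowest = min(values.values())
    highest = max(values.values())
    span = highest - lowest
    fractions = {}
    for node, value in values.items():
        # Position of this value between the lowest (0) and highest (1).
        unit = (value - lowest) / span if span else 0.0
        fractions[node] = color_min_frac + unit * (color_max_frac - color_min_frac)
    return fractions


# Made-up betweenness centrality values for three nodes.
sample_bc = {"a": 0.0, "b": 0.05, "c": 0.2}
fractions = value_to_cmap_fraction(sample_bc, color_min_frac=0.1, color_max_frac=0.9)
print(fractions)
# "a" (lowest) lands at 0.1 and "c" (highest) at 0.9, up to floating-point
# rounding. A matplotlib colormap object can be called with such a
# fraction to obtain an RGBA color.
```

Restricting the fractions to the 0.1 to 0.9 range is what trims off the two ends of the colormap.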
Now that we have our color mapping, we can fill out nodes_dict, node_map, and edges_dict, as we did in the simple example. This time, however, we will set more node and edge level properties, including:
- The positions of each node (x and y), using the output from nx.spring_layout
- The color of each node, using our color mapping node_to_color
- The degree of each node (if degree is passed in, it is used to map node size by default)
- Dummy values for the node title field (this is what will show up in the hover)
- The color of each edge (for now we set every edge to the same color, gray, but you can easily individualize the edge colors too, using visJS_module.return_edge_to_color(…))
This is the current list of properties you can modify at the node level:
- ‘node_shape’
- ‘color’
- ‘border_width’
- ‘title’ (i.e. the hover information)
- ‘degree’: the degree of each node, used for the default size mapping
- The default node size is mapped to the node degree, but you can override that default by setting ‘node_size_field’ in the visjs_network function. For example, simply add a ‘node_size’ key:value entry to the nodes_dict, and call visjs_network with node_size_field = ‘node_size’.
All of the above are optional additions to nodes_dict. Default values will be filled in if they are missing.
pos = nx.spring_layout(G)
nodes_dict = [{"id":n,"color":node_to_color[n],
"degree":nx.degree(G,n),
"x":pos[n][0]*1000,
"y":pos[n][1]*1000} for n in nodes
]
node_map = dict(zip(nodes,range(len(nodes)))) # map to indices for source/target in edges
edges_dict = [{"source":node_map[u], "target":node_map[v],
              "color":"gray","title":'test'} for u, v in edges]
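The node-size override mentioned in the property list can be sketched the same way. The per-node values below are made up for illustration, and the final call is left as a comment because it needs visJS2jupyter and a running notebook; only the ‘node_size’ key and the node_size_field argument come from the description above.

```python
# Hypothetical per-node values that should drive node size
# (for example, clustering coefficients).
size_values = {0: 0.5, 1: 0.2, 2: 0.9}

# Add a 'node_size' entry to each node's dictionary.
sized_nodes_dict = [{"id": n, "node_size": size_values[n]}
                    for n in sorted(size_values)]
sized_edges_dict = [{"source": 0, "target": 1}, {"source": 1, "target": 2}]

print(sized_nodes_dict)

# In a notebook with visJS2jupyter installed, the call would then be:
# visJS_module.visjs_network(sized_nodes_dict, sized_edges_dict,
#                            time_stamp=2, node_size_field='node_size')
```

Note the fresh time_stamp value, so that this network gets its own style-file rather than overwriting an earlier one.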
We’ll also pass in some more graph-level properties (properties that aren’t node and edge specific). These include:
- node_size_multiplier: multiply each node’s size by this (useful if you have very few or very many nodes)
- node_color_highlight_border
- node_color_highlight_background
- node_color_hover_border
- node_color_hover_background
- node_font_size
- edge_arrow_to: Should we draw arrows at the target end?
- edge_color_highlight
- edge_color_hover
- edge_width: how wide should the edges be?
- physics_enabled, min_velocity, max_velocity: controls the physics of the nodes
- time_stamp: This value is appended to the end of the style-file name, so a new file is created instead of the old one being overwritten. You need a unique style-file for every network you render within the same Jupyter notebook.
We have mapped most (still working on getting the complete list) of the modifiable fields from visJS network into our package. You can find documentation on the full list here.
visJS_module.visjs_network(nodes_dict,edges_dict,time_stamp=1,
node_size_multiplier=5,
node_size_transform = '',
node_color_highlight_border='red',
node_color_highlight_background='#D3918B',
node_color_hover_border='blue',
node_color_hover_background='#8BADD3',
node_font_size=25,
edge_arrow_to=True,
edge_color_highlight='#8A324E',
edge_color_hover='#8BADD3',
edge_width=3,
physics_enabled=True,
min_velocity=1,
max_velocity=15)
Ok there we go! Now we have drawn a much more interesting network. Click on the image below to be redirected to the interactive version, hosted on bl.ocks.org.
For an even more complicated use case, see this notebook I wrote (http://bl.ocks.org/brinrosenthal/raw/fd7d7277ce74c2b762d3a4d66326215c/). In this example, we display the bipartite network composed of diseases in The Cancer Genome Atlas (http://cancergenome.nih.gov/), and the top 25 most common mutations in each disease. We also overlay information about drugs which target those mutations. Genes which have a drug targeting them are displayed with a bold black outline. The user may hover over each gene to get a list of associated drugs.