Archive for December, 2016

A Step-By-Step Guide to Generating Gene Interaction Networks with GeneMANIA

December 19, 2016

Amanda Birmingham (abirmingham at ucsd.edu)

The excellent network-building web tool GeneMANIA has recently been given a facelift. Its new user interface is minimal and uncluttered–so much so that it took me a little trial and error to figure out what all the buttons and options did! In case you’re in the same boat, here’s the missing manual for how to use the new-and-improved GeneMANIA website to generate a network from your gene list. (To be clear, I don’t speak for the GeneMANIA project and am not associated with it in any way–just a fan 🙂 )

  1. Before beginning, prepare a list of the genes for which you wish to make a network. (Note that GeneMANIA accepts gene symbols or NCBI Gene IDs, but not Ensembl gene IDs.) Ensure each gene identifier is on a separate line, as shown below:

  2. Visit genemania.org. The home page looks like this:

  3. If your organism of interest is not human (the default), click on the small picture of a person and select the desired organism from the resulting drop-down box:

  4. Click in the white text box with the letters “e.g.” in it. When it expands, paste the list of your genes of interest into it (to run GeneMANIA on a sample list of genes, click on the “e.g.”). Any genes that cannot be found in GeneMANIA’s database will be highlighted in red, as shown below. These genes will be ignored in the rest of the analysis, so check to ensure that they do not represent your most interesting genes!

  5. (Optional step) If you want to modify the interactions that are included in your network, click the three-dot icon next to the white text box. This displays all the interaction sources and allows you to select which should be included by checking their boxes:

  6. (Optional step) Clicking on “Customize advanced options” at the bottom of this menu provides the option to modify additional behaviors. For example, in some cases GeneMANIA defaults to adding 20 “related” genes into the network you build (determined based on the input genes’ interactions). You can change this number by sliding the slider under “Max resultant genes”.

  7. Click on the magnifying glass button to begin network creation. The magnifying glass will change to a spiral and the text box to a progress bar while the network creation is in process:

  8. The finished network will appear in the main screen. Genes with identified interactions will be connected, while those without identified interactions will be shown in a row across the bottom of the screen:

    • Genes that you input are shown with cross-hatched circles of a uniform size, while those that were added as “related” genes by GeneMANIA are shown with solid circles whose size is proportional to the number of interactions they have.

    • Different kinds of interactions are represented by different colored connector lines. Click on the three horizontal lines icon at the right of the screen to see the color legend and to select or deselect which interactions are displayed:

    • Mousing over a single gene’s circle highlights only its interactions:

    • Clicking on a single gene displays information about it and offers options to remove it from the network or rerun a new network analysis based on only that gene. Click the X at the top right of this info box to make it disappear when you are done with it.

    • Clicking the info button on the left of the screen shows basic information about the network, such as how many genes and interactions it includes, as well as links to the help documentation and other resources. Click again on the info button to clear it from the screen when you are done with it.

    • Clicking the pie chart button on the bottom left of the screen displays the functions associated with genes in the network and their FDR and coverage (as number of genes annotated with that function in the network versus number of genes annotated with that function in the genome):

    Checking any of the functions’ checkboxes will assign colors to those functions; once the functions list is hidden again, genes in the network image that are annotated with the chosen functions have their circles colored with the relevant colors:

    Clicking on the X next to any selected function clears its coloring from the network.

    • To move the network around the screen, click on any background space. A gray circle will appear under the mouse; drag the network to your desired position and release the mouse.

    • GeneMANIA stores your search history. To view that history, click on the circular arrow button on the bottom of the screen:

    From here, you can rerun a past search by clicking on its image, clear an individual past search by clicking on the red X at its top right corner, or clear all history by clicking on the large red button at the far left. To hide history information, click the circular arrow again.

  9. (Optional step) There are several options for modifying the on-screen layout of the network.

    • To reposition a single gene manually, simply grab that gene’s circle and drag it to where you would like it to be. To make the modified network fill the screen again, click the diagonal arrow button on the left of the screen:

    • To redraw the network in a circular layout, click the target button on the left of the screen. Depending on the size of your network, the re-layout may take a few moments. The resultant layout will look something like this:

    • To redraw the network in a linear layout, click the two downward arrows button on the left of the screen. Depending on the size of your network, the re-layout may take a few moments. The resultant layout will look something like this:

    • To return to the default layout, click the intertwined arrows button on the left of the screen. Depending on the size of your network, the re-layout may take a few moments.

  10. Once you have adjusted the network to your satisfaction, you can save it in one of several formats. These options are visible when clicking on the floppy disk button on the left of the screen:

    • Report generates and displays a PDF report of the network analysis, including the GeneMANIA software version, network image, search parameters, interaction sources searched, gene details, and interaction sources in which interactions for these genes were found. It can be downloaded from your browser window like any other PDF once generated.

    • Network image As shown downloads a JPEG of the image as currently shown on the screen.

    • Network image With plain, top labels downloads a JPEG of the network with gene labels shown above their circles rather than in them:

    • Network downloads a text file detailing the connected nodes in the network and the details of their connections:

    • Networks data downloads a text file listing the details (including citation information) for each interaction source used in the network generation:

    • Attributes data downloads a text file listing the attributes identified for each gene.

    Note that this list will be empty (except for the header line) if the Attributes checkbox is not checked in the Networks menu during set-up of the network generation:

    … or if the Max resultant attributes slider is set to zero in the Customize advanced options menu during set-up of the network generation:

    … or if there are no attributes found for the input genes in the selected Attributes sources.

    • Genes data downloads a text file with a list of all genes included in the network, as well as links to their NCBI Gene records and their GeneMANIA-assigned scores in the network:

    • Functions data downloads a text file containing functions found to be identified with genes in the network and the enrichment level of those functions (shown as FDR) based on the number of genes with that function in the network and the number of genes with that function in the genome. This list may be empty except for the header line if no such functions are identified.

    • Interactions data downloads a text file containing much the same information as the Network option but in a slightly different format:

    • Search parameters as text does not appear to function at the moment.

    • Search parameters as JSON downloads a representation of the search parameters for the network in JSON-formatted text:

  11. If you use GeneMANIA for your research, please give credit where credit is due: cite it! The appropriate citation is:

    • Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, Maitland A, Mostafavi S, Montojo J, Shao Q, Wright G, Bader GD, Morris Q. The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010 Jul;38(Web Server issue):W214-20. doi: 10.1093/nar/gkq537. PubMed PMID: 20576703; PubMed Central PMCID: PMC2896186.
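Step 1 above (preparing the gene list) can be scripted when the identifiers come out of another tool. The following is a minimal sketch, not part of GeneMANIA: the function name `prepare_gene_list` is hypothetical, and the pattern used to recognize Ensembl gene IDs is an assumption based on their usual form.

```python
import re

# Assumption: Ensembl gene IDs look like ENSG00000141510 (human) or
# ENSMUSG00000059552 (mouse): "ENS", an optional species code, "G",
# 11 digits, and an optional ".version" suffix.
ENSEMBL_GENE_ID = re.compile(r"^ENS[A-Z]*G\d{11}(\.\d+)?$")


def prepare_gene_list(raw_ids):
    """Split identifiers into (accepted, rejected) lists.

    Blank entries and duplicates are dropped. Ensembl-style IDs are set
    aside because the guide notes that GeneMANIA does not accept them.
    """
    accepted, rejected, seen = [], [], set()
    for raw in raw_ids:
        gene = raw.strip()
        if not gene or gene in seen:
            continue
        seen.add(gene)
        if ENSEMBL_GENE_ID.match(gene):
            rejected.append(gene)
        else:
            accepted.append(gene)
    return accepted, rejected


accepted, rejected = prepare_gene_list(
    ["TP53", "BRCA1", " 672 ", "", "TP53", "ENSG00000141510"]
)

# One identifier per line, ready to paste into the GeneMANIA text box.
gene_list_text = "\n".join(accepted)
print(gene_list_text)
```

Anything that lands in `rejected` would need to be converted to a gene symbol or NCBI Gene ID before it can be used.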

Visualizing the similarity of two networks

December 9, 2016

 

Julia Len (jlen at ucsd.edu)

Introduction

When working with networks, it is often useful to consider how similar two networks are. There are, however, a number of ways of quantifying network similarity. One could simply count the nodes the two networks have in common, but this would miss any structural similarity, or lack thereof, between the edges. For example, two networks can have identical node sets but completely disjoint edge sets. Note, however, that for two networks to share edges, they must share nodes as well (since edges are defined by the nodes they connect).
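To make the distinction concrete, here is a small sketch that scores node overlap and edge overlap separately. The Jaccard index is used only as one possible measure (it is my choice for illustration, not something visJS2jupyter computes), and the helper names are hypothetical.

```python
def undirected_edges(edges):
    """Store each edge as a frozenset so (u, v) and (v, u) compare equal."""
    return {frozenset(edge) for edge in edges}


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Two toy networks on the same four nodes with no edge in common.
nodes_1, edges_1 = {0, 1, 2, 3}, [(0, 1), (2, 3)]
nodes_2, edges_2 = {0, 1, 2, 3}, [(0, 2), (1, 3)]

node_similarity = jaccard(set(nodes_1), set(nodes_2))
edge_similarity = jaccard(undirected_edges(edges_1), undirected_edges(edges_2))
print(node_similarity, edge_similarity)  # 1.0 0.0
```

The node score says the networks are identical while the edge score says they have nothing in common, which is exactly the case a node-only comparison would miss.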

In this post, we will introduce a network overlap visualization function (draw_graph_union) in the visJS2jupyter package, and explore a few possible scenarios.

Installation

To install visJS2jupyter, run

    pip install visJS2jupyter

in your terminal. To import the visualizations module, use the statement

    import visJS2jupyter.visualizations as visualizations

in your Jupyter notebook. The source code is also available on GitHub here.

Simple example with default parameters

We will now go through a simple example using two 10-node networks whose intersection is exactly 5 nodes. We create two small, random networks using the networkx function ‘connected_watts_strogatz_graph’. Each network will have 10 nodes arranged in a ring, with each node initially connected to its nearest neighbors (we pass k = 5; networkx treats an odd k as k - 1, so each node starts with 4 neighbors). These connections are then randomly rewired with probability 0.1.

    import networkx as nx

    G1 = nx.connected_watts_strogatz_graph(10,5,.1)

    G2 = nx.connected_watts_strogatz_graph(10,5,.1)

This produces two networks that both have nodes labelled from 0 to 9, so their intersection is every node of each graph. This is an unexciting case, so let’s relabel some nodes so that the networks share only 5 nodes. We can do this by relabelling nodes 0 to 4 of the second graph, G2, as 10 to 14 using the networkx function ‘relabel_nodes’, which leaves G2 with nodes 5 to 14. The code for this is shown below:

    old_nodes = range(5)

    new_nodes = range(10,15)

    new_node_labels = dict(zip(old_nodes,new_nodes))

    G2 = nx.relabel_nodes(G2,new_node_labels)

Now nodes 0 to 4 belong to only G1, nodes 5 to 9 belong to G1 and G2, and nodes 10 to 14 belong to only G2. Let’s see what this looks like by using draw_graph_union:

    visualizations.draw_graph_union(G1,G2)

And that’s it! We get an interactive graph fairly quickly and easily. Notice that the nodes are color-coded and shaped based on which network they belong to. For instance, nodes in the intersection of G1 and G2 are orange triangles, while nodes that belong only to G1 are red circles, and nodes that belong only to G2 are yellow squares. Also notice that edges found in both G1 and G2 are colored red while all other edges are colored blue.
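The grouping behind those colors and shapes can be reproduced with plain set operations. The sketch below is only an illustration of the idea: `classify_union` is a hypothetical helper, not the package's implementation, and the edge lists are made up. With networkx graphs you could pass `G1.nodes()`, `G1.edges()`, `G2.nodes()` and `G2.edges()` instead.

```python
def classify_union(nodes_1, edges_1, nodes_2, edges_2):
    """Group nodes and edges by which of the two networks they belong to."""
    nodes_1, nodes_2 = set(nodes_1), set(nodes_2)
    # frozenset makes (u, v) and (v, u) count as the same undirected edge.
    edges_1 = {frozenset(e) for e in edges_1}
    edges_2 = {frozenset(e) for e in edges_2}
    return {
        "nodes_only_1": nodes_1 - nodes_2,
        "nodes_both": nodes_1 & nodes_2,
        "nodes_only_2": nodes_2 - nodes_1,
        "edges_both": edges_1 & edges_2,
        "edges_single": edges_1 ^ edges_2,  # in exactly one network
    }


# Node labels match the example above; the edges are invented for illustration.
groups = classify_union(
    range(10), [(4, 5), (5, 6), (6, 7)],
    range(5, 15), [(6, 5), (6, 7), (9, 10)],
)
print(sorted(groups["nodes_only_1"]))  # [0, 1, 2, 3, 4]
print(sorted(groups["nodes_both"]))    # [5, 6, 7, 8, 9]
print(sorted(groups["nodes_only_2"]))  # [10, 11, 12, 13, 14]
print(len(groups["edges_both"]))       # 2
```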

You can take a look at the sample notebook here.  Notice that hovering over a node pops up a tooltip with information about the node’s name and graph membership.

From the previous example, we saw that draw_graph_union depicts not only the intersection of nodes but also the intersection of edges. Let’s take a look at how this works with two networks having identical nodes but only a few overlapping edges.

Identical nodes, some overlapping edges

We’ll again be using the connected_watts_strogatz_graph to create our two networks, but this time both networks will contain 50 nodes:

    G1 = nx.connected_watts_strogatz_graph(50,5,.1)

    G2 = nx.connected_watts_strogatz_graph(50,5,.1)

This produces two networks with identical nodes and randomly intersecting edges. We want the edge sets to share exactly 5 edges. Python’s built-in set object can help with this. We can get the edges for G1 and G2, convert them to sets, and find their intersection using &. We can then subtract the intersecting edges from each set of edges.

    edges_1 = set(G1.edges())

    edges_2 = set(G2.edges())

    intersecting_edges = edges_1 & edges_2

    edges_1_disjoint = edges_1 - intersecting_edges

    edges_2_disjoint = edges_2 - intersecting_edges

This produces two disjoint sets of edges. We now want to add back in 5 edges from the intersection into each disjoint set.

    for i in range(0,5):

        new_edge = intersecting_edges.pop()

        edges_1_disjoint.add(new_edge)

        edges_2_disjoint.add(new_edge)

We can then remove the current edges from G1 and add back in the desired edges. We do the same for G2.

    G1.remove_edges_from(edges_1)

    G1.add_edges_from(list(edges_1_disjoint))

    G2.remove_edges_from(edges_2)

    G2.add_edges_from(list(edges_2_disjoint))
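One caveat with intersecting raw edge tuples: an undirected edge is reported in a single orientation, so in general the same edge can appear as (u, v) in one graph and (v, u) in the other, and the two tuples would not match. The sketch below normalizes edges before comparing them. `trim_shared_edges` is a hypothetical helper that works on plain edge lists, and the toy edges are invented for illustration.

```python
def trim_shared_edges(edges_1, edges_2, n_keep):
    """Return two edge sets that share exactly n_keep edges.

    Edges are stored as frozensets so that orientation does not matter.
    """
    e1 = {frozenset(e) for e in edges_1}
    e2 = {frozenset(e) for e in edges_2}
    shared = e1 & e2
    if len(shared) < n_keep:
        raise ValueError("the networks share fewer than n_keep edges")
    # Sort so the choice of surviving shared edges is reproducible.
    keep = set(sorted(shared, key=sorted)[:n_keep])
    drop = shared - keep
    return e1 - drop, e2 - drop


new_1, new_2 = trim_shared_edges(
    [(0, 1), (1, 2), (2, 3), (3, 4)],
    [(1, 0), (2, 1), (3, 2), (4, 0)],  # same edges, opposite orientation
    n_keep=2,
)
print(len(new_1 & new_2))  # 2
```

With networkx graphs, you could pass `G1.edges()` and `G2.edges()`, then rebuild each graph with `G.remove_edges_from(list(G.edges()))` followed by `G.add_edges_from(tuple(e) for e in new_edges)`.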

Let’s now draw the two graphs using draw_graph_union, but this time let’s customize the graph a bit. The function sets the default color of the nodes to matplotlib’s colormap autumn and the default color of the edges to matplotlib’s colormap coolwarm. However, matplotlib has many wonderful colormaps available that we can choose from (click here for more details). To set the colormap of the nodes and edges, use the arguments node_cmap and edge_cmap. If you decide to change the colormap, make sure to import the matplotlib package:

    import matplotlib as mpl

We can add other customizations as well, such as setting edge width and edge shadows. The function accepts any argument available in visJS_module in the visJS2jupyter package, which opens up many customization options. Now, let’s see what this looks like overall:

    visualizations.draw_graph_union(G1,G2,

        node_cmap=mpl.cm.cool,

        edge_cmap=mpl.cm.winter_r,

        edge_width=5,

        edge_shadow_enabled=True,

        edge_shadow_size=2)

As you can see in the graph above, there is now only one set of nodes, all of which are triangle shaped because all the nodes overlap. The edges are mostly colored in green except for 5 edges in blue: the edges in the intersection. Notice that the edges and nodes are colored differently from before and the edges now have added shadows. You can take a look at the interactive notebook with this example here.

One network contains the other network

So far, we’ve seen graphs whose node sets overlap only partly and graphs whose node sets are identical. What happens if the set of nodes for one graph is a subset of the nodes for the other graph? We’ll take a look at this case now.

We again create two networks using connected_watts_strogatz_graph. Graph 1 will have 50 nodes and graph 2 will have 20 so that all of the second graph’s nodes intersect with graph 1. We will then call draw_graph_union on these two graphs. This time, we use some more features of the function. We can set the name of the nodes for graph 1 and graph 2. Notice that previously when we hovered over a node, the tooltip showed something like “graph 1 + graph 2”. Using the arguments node_name_1 and node_name_2, we can customize what is shown in the tooltip. Pretty cool!
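The graph-creation code for this case is not shown in the post. A minimal sketch, assuming the same k = 5 and p = 0.1 as in the earlier examples (the seeds are my addition, only to make the sketch reproducible):

```python
import networkx as nx

# Assumption: same neighbor count and rewiring probability as the earlier
# examples; only the number of nodes differs between the two graphs.
G1 = nx.connected_watts_strogatz_graph(50, 5, 0.1, seed=1)
G2 = nx.connected_watts_strogatz_graph(20, 5, 0.1, seed=2)

# Nodes are labelled 0..49 and 0..19, so every node of G2 is also in G1.
print(set(G2.nodes()) <= set(G1.nodes()))  # True
```

Because networkx labels the nodes from 0 upward, no relabelling is needed to make the smaller graph's node set a subset of the larger one.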

If you’ve played around with some of the example notebooks, you’ve probably noticed that the nodes move around when dragged as if they have a gravitational field. This is the physics_enabled feature. It is enabled by default for graphs with fewer than 100 nodes and turned off for larger graphs. One nice feature is that you can override this by setting the physics_enabled argument to True or False. Let’s turn off this setting for this example.

    visualizations.draw_graph_union(G1,G2,edge_width=5,

        node_name_1="superset",

        node_name_2="subset",

        physics_enabled=False)

In just a couple of lines of code, we have produced an interactive network! Notice that when we hover over a node, it has the name that we set, just like we wanted. You can also see that dragging a node around makes it stay stuck in place, so the physics_enabled setting has been turned off. The example notebook can be found here.

Overall, draw_graph_union provides a quick and easy way to create customizable and interactive visualizations for network similarities, enabling visual assessment of what two networks share and what they don’t.   
