Why we graph

Network visualisation is important and non-trivial. As Tufte (1983: 9) said:

“At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers”

All of this is crucial with networks. As a first step, network visualisation – or graphing – offers us a way to vet our data for anything strange that might be going on, both revealing and informing our assumptions and intuitions.

It is also crucial for communicating to others the lessons that we have learned through investigation. While there may be many dead-ends and time-sinks to visualisation, it is worth taking the time to make sure that your main points are easy to appreciate.

Brandes et al. (1999) argue that visualising networks demands thinking about:

  • substance: a concise and precise delivery of insights to the researcher and/or readers
  • design: the ergonomics of function are 98% of the purpose of good design, aesthetics only 2%
  • and algorithm: the features of, e.g., the layout algorithm

Different approaches

There are several main packages for plotting in R, as well as several for plotting networks in R. Plotting in R is typically based around two main approaches:

  • the ‘base’ approach in R by default, and
  • the ‘grid’ approach made popular by the famous and very flexible {ggplot2} package.[1]

Approaches to plotting graphs or networks in R can be similarly divided:

  • two classic packages, {igraph} and {sna}, both build upon the base R graphics engine,
  • newer packages {ggnetwork} and {ggraph} build upon a grid approach.[2]

While the coercion routines available in {manynet} make it easy to use any of these packages’ graphing capabilities, {manynet} itself builds upon the ggplot2/ggraph engine to help quickly and informatively graph networks.
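
As a minimal sketch of what such coercion might look like, the following converts one of the bundled datasets to an {igraph} object and hands it to {igraph}'s base-graphics plot method. It assumes that {manynet} and {igraph} are installed; the object name g is only illustrative.

```r
library(manynet)

# Coerce a {manynet} dataset to an igraph object
g <- as_igraph(ison_adolescents)

# The plot method for igraph objects draws with base R graphics
plot(g)
```

The same object could then be passed to any other {igraph} function, which is the main appeal of coercing rather than re-entering the data.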

Dimensions of visualisation

On her excellent and helpful website, Katya Ognyanova outlines some key dimensions of control that network researchers have to play with:

  • vertex position (layout)
  • vertex shape (e.g. circles, squares)
  • vertex color
  • vertex size
  • vertex labels
  • edge shape (e.g. straight, bends)
  • edge type (e.g. solid, dashed)
  • edge color
  • edge size
  • edge arrows

In the following sections, we will learn how {manynet} provides sensible defaults on many of these elements, and offers ways to extend or customise them.


  1. ‘gg’ stands for the Grammar of Graphics.

  2. Others include: ‘Networkly’, for creating 2-D and 3-D interactive networks that can be rendered with plotly and easily integrated into shiny apps or markdown documents; ‘visNetwork’, which uses JavaScript (vis.js) to make interactive networks (http://datastorm-open.github.io/visNetwork/); and ‘networkD3’, which uses JavaScript (D3) to make interactive networks (https://www.r-bloggers.com/2016/10/network-visualization-part-6-d3-and-r-networkd3/).

Graphing

graphr

The graphr() function in {manynet} is a quick and easy way to get a clear first look at a network, for preliminary investigation, before adding further specifications. Let’s quickly visualise one of the ison_ datasets included in the package.

graphr(ison_lotr)

We can also specify the colours, groups, shapes, and sizes of nodes and edges in the graphr() function using the following parameters:

  • node_color
  • node_shape
  • node_size
  • node_group
  • edge_color
  • edge_size

graphr(ison_lotr, node_color = "Race")
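
Several of these parameters can be combined in one call. The sketch below is an assumption-laden illustration rather than a recommended recipe: it stores betweenness as a node attribute under the made-up name betw, then sizes nodes by that attribute while colouring them by "Race". How the sizes are scaled is left to graphr().

```r
library(manynet)

# Store a node-level measure as a plain numeric attribute ("betw" is a made-up name)
lotr <- ison_lotr %>%
  mutate(betw = as.numeric(node_betweenness(ison_lotr)))

# Colour nodes by an existing attribute and size them by the new one
p <- graphr(lotr, node_color = "Race", node_size = "betw")
p
```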

graphs

graphr() is not the only graphing function included in {manynet}. To graph sets of networks together, graphs() makes sure that two or more networks are plotted together. This might be a set of ego networks, subgraphs, or waves of a longitudinal network.

graphs(to_subgraphs(ison_lotr, "Race"),
       waves = c(1,2,3,4))

grapht

grapht() is another option, rendering network changes as a gif.

ison_lotr %>%
  mutate_ties(year = sample(1:12, 66, replace = TRUE)) %>%
  to_waves(attribute = "year", cumulative = TRUE) %>%
  grapht()

Titles and subtitles

Append ggtitle() to add a title. {manynet} works well with both {ggplot2} and {ggraph} functions that can be appended to create more tailored visualisations of the network.

graphr(ison_adolescents) + 
  ggtitle("Visualisation")

Arrangements

{manynet} also uses the {patchwork} package for arranging graphs together, e.g. side-by-side or above one another. The syntax is quite straightforward and is used throughout these vignettes.

graphr(ison_adolescents) + graphr(ison_algebra)
graphr(ison_adolescents) / graphr(ison_algebra)

Legends

While {manynet} attempts to provide legends where necessary, in some cases the legends offer insufficient detail, such as in the following figure, or are absent.

ison_lotr %>% 
  mutate(maxbet = node_is_max(node_betweenness(ison_lotr))) %>% 
  graphr(node_color = "maxbet")

{manynet} supports the {ggplot2} way of adding legends after the main plot has been constructed, using guides() to add in the legends, and labs() for giving those legends particular titles. Note that we can use "\n" within the legend title to make the title span multiple lines.

ison_lotr %>% 
  mutate(maxbet = node_is_max(node_betweenness(ison_lotr))) %>% 
  graphr(node_color = "maxbet") +
  guides(color = "legend") + 
  labs(color = "Maximum\nBetweenness")

An alternative to colors and legends is to use the ‘node_group’ argument to highlight groups in a network. This works best for quite clustered distributions of attributes.

graphr(ison_lotr, node_group = "Race")

Layouts

The aim of graph layouts is to position nodes in a (usually) two-dimensional space so as to optimise some analytically useful and aesthetically pleasing criterion. Quality measures might include:

  • minimising the crossing number, i.e. the number of edges/ties that cross in the graph (planar graphs require no crossings)
  • minimising the slope number, i.e. the number of distinct edge slopes in the graph (where vertices are represented as points on a Euclidean plane)
  • minimising the bend number, i.e. the number of bends across all edges in the graph (every graph has a right angle crossing (RAC) drawing with three bends per edge)
  • minimising the total edge length
  • minimising the maximum edge length
  • minimising the edge length variance
  • maximising the angular resolution or sharpest angle of edges meeting at a common vertex
  • minimising the bounding box of the plot
  • evening the aspect ratio of the plot
  • displaying symmetry groups (subgraph automorphisms)
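
Some of these measures are simple to compute for a given layout. As a small illustration in base R, the following evaluates the total, maximum, and variance of edge lengths for a hypothetical four-node layout; the coordinates and ties are invented for the example.

```r
# Hypothetical layout: four nodes on a unit square, plus one diagonal tie
coords <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1))
ties   <- rbind(c(1, 2), c(2, 3), c(3, 4), c(4, 1), c(1, 3))

# Euclidean length of each tie under this layout
delta       <- coords[ties[, 1], ] - coords[ties[, 2], ]
tie_lengths <- sqrt(rowSums(delta^2))

# Three of the quality measures listed above
layout_quality <- c(total    = sum(tie_lengths),
                    maximum  = max(tie_lengths),
                    variance = var(tie_lengths))
layout_quality
```

Comparing such numbers across candidate layouts of the same network is one way to make the choice between them less impressionistic.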

A range of graph layouts are available across the {igraph}, {graphlayouts}, and {manynet} packages that can be used together with graphr().

Force-directed layouts

Force-directed layouts update some initial placement of vertices through the operation of some system of metaphorically-physical forces. These might include attractive and repulsive forces.

(graphr(ison_southern_women, layout = "kk") + ggtitle("Kamada-Kawai") |
   graphr(ison_southern_women, layout = "fr") + ggtitle("Fruchterman-Reingold") |
   graphr(ison_southern_women, layout = "stress") + ggtitle("Stress Minimisation"))

The Kamada-Kawai method inserts a spring between all pairs of vertices that is the length of the graph distance between them. This means that edges with a large weight will be longer. KK offers a good layout for lattice-like networks, because it will try to space the network out evenly.

The Fruchterman-Reingold method uses an attractive force between directly connected vertices, and a repulsive force between all vertex pairs. The attractive force is proportional to the edge’s weight, thus edges with a large weight will be shorter. FR offers a good baseline for most types of networks.

The Stress Minimisation method is related to the KK algorithm, but offers better runtime, quality, and stability and so is generally preferred. Indeed, {manynet} uses it as the default for most networks. It has the advantage of returning the same layout each time it is run on the same network.
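
The point about repeatability can be checked directly. The sketch below assumes the {igraph} and {graphlayouts} packages are installed and uses the Zachary karate club network bundled with {igraph} rather than a {manynet} dataset. It computes each layout twice and compares the resulting coordinates.

```r
library(igraph)
library(graphlayouts)

g <- make_graph("Zachary")   # Zachary's karate club, bundled with igraph

# Stress minimisation: two separate runs on the same network
stress_1 <- layout_with_stress(g)
stress_2 <- layout_with_stress(g)
isTRUE(all.equal(stress_1, stress_2))   # expected to agree

# Fruchterman-Reingold starts from random positions, so runs usually differ
fr_1 <- layout_with_fr(g)
fr_2 <- layout_with_fr(g)
isTRUE(all.equal(fr_1, fr_2))
```

For randomised layouts such as FR, calling set.seed() before plotting is the usual way to make a figure reproducible.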

Other force-directed layouts available include:

  • Simulated annealing (Davidson and Harel 1993): "dh"
  • Graph embedder (Frick et al. 1995): "gem"
  • Graphopt (Schmuhl): "graphopt"
  • Distributed recursive graph layout (Martin et al. 2008): "drl"

Layered layouts

Layered layouts arrange nodes into horizontal (or vertical) layers, positioning them so that they reduce crossings. These layouts are best suited for directed acyclic graphs or similar.

(graphr(ison_southern_women, layout = "bipartite") + ggtitle("Bipartite")) /
(graphr(ison_southern_women, layout = "hierarchy") + ggtitle("Hierarchy")) /
(graphr(ison_southern_women, layout = "railway") + ggtitle("Railway"))

Note that "hierarchy" and "railway" use a different algorithm to {igraph}’s "bipartite", and generally perform better, especially where there are multiple layers. Whereas "hierarchy" tries to position nodes to minimise overlaps, "railway" sequences the nodes in each layer to a grid so that nodes are matched as far as possible. If you want to flip the horizontal and vertical, you could flip the coordinates, or use something like the following layout.

graphr(ison_southern_women, layout = "alluvial") + ggtitle("Alluvial")
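
The other option mentioned above, flipping the coordinates, might be sketched as follows. This assumes that coord_flip() from {ggplot2} can simply be appended to a graphr() plot like other {ggplot2} functions; it may print a message about replacing an existing coordinate system, and label angles may need adjusting afterwards.

```r
library(manynet)
library(ggplot2)

# coord_flip() swaps the x and y axes of a ggplot-based plot
p <- graphr(ison_southern_women, layout = "hierarchy") +
  coord_flip() +
  ggtitle("Hierarchy, flipped")
p
```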

Other layered layouts include:

  • Tree: "tree"
  • Dominance layouts

Circular layouts

Circular layouts arrange nodes around (potentially concentric) circles, such that crossings are minimised and adjacent nodes are located close together. In some cases, location or layer can be specified by attribute or mode.

graphr(ison_southern_women, layout = "concentric") + ggtitle("Concentric")

Other such layouts include:

  • circular: "circle"
  • sphere: "sphere"
  • star: "star"
  • arc or linear layouts: "linear"

Spectral layouts

Spectral layouts arrange nodes according to the eigenvectors of the Laplacian matrix of a graph. These layouts tend to exaggerate clustering of like-nodes and the separation of less similar nodes in two-dimensional space.

graphr(ison_southern_women, layout = "eigen") + ggtitle("Eigenvector")
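
To make the idea concrete, here is a minimal base R sketch of a spectral layout for a hypothetical toy network of two triangles joined by a single tie. It is not the code behind the "eigen" layout. It only illustrates the general technique of using the eigenvectors belonging to the smallest non-zero Laplacian eigenvalues as coordinates.

```r
# Adjacency matrix: two triangles joined by one tie (invented toy network)
n <- 6
A <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3),   # first triangle
              c(4, 5), c(4, 6), c(5, 6),   # second triangle
              c(3, 4))                     # bridging tie
A[ties] <- 1
A <- A + t(A)                              # make the matrix symmetric

# Graph Laplacian: degree matrix minus adjacency matrix
L <- diag(rowSums(A)) - A

# eigen() returns eigenvalues in decreasing order, so the smallest come last
eig <- eigen(L, symmetric = TRUE)

# Skip the trivial (zero) eigenvalue in position n; use the next two as x and y
coords <- eig$vectors[, c(n - 1, n - 2)]

plot(coords, type = "n", xlab = "", ylab = "")
text(coords, labels = seq_len(n))
```

On the x-axis, the two triangles fall on opposite sides of zero, which illustrates the tendency of spectral layouts to separate clusters.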

Somewhat similar are multidimensional scaling techniques, that visualise the similarity between nodes in terms of their proximity in a two-dimensional (or more) space.

graphr(ison_southern_women, layout = "mds") + ggtitle("Multidimensional Scaling")

Other such layouts include:

  • Pivot multidimensional scaling: "pmds"
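
The multidimensional scaling idea described above can also be sketched in base R. The example below treats shortest-path distance as the dissimilarity between nodes of a hypothetical toy network, and passes the distance matrix to cmdscale() from {stats}. This illustrates the technique only, and is not necessarily how the "mds" layout is implemented.

```r
# Adjacency matrix: two triangles joined by one tie (invented toy network)
n <- 6
A <- matrix(0, n, n)
ties <- rbind(c(1, 2), c(1, 3), c(2, 3),
              c(4, 5), c(4, 6), c(5, 6),
              c(3, 4))
A[ties] <- 1
A <- A + t(A)

# Shortest-path distances between all pairs of nodes (Floyd-Warshall)
D <- ifelse(A == 1, 1, Inf)
diag(D) <- 0
for (k in seq_len(n)) {
  D <- pmin(D, outer(D[, k], D[k, ], "+"))
}

# Classical MDS: find 2-D coordinates whose distances approximate D
coords <- cmdscale(D, k = 2)

plot(coords, type = "n", xlab = "", ylab = "", asp = 1)
text(coords, labels = seq_len(n))
```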

Grid layouts

Grid layouts arrange nodes based on some Cartesian coordinates. These can be useful for making sure all nodes’ labels are visible, but horizontal and vertical lines can overlap, making it difficult to distinguish whether some nodes are tied or not.

graphr(ison_southern_women, layout = "grid") + ggtitle("Grid")

Other grid layouts include:

  • orthogonal layouts for e.g. printed circuit boards
  • grid snapping for other layouts

Colors

Who’s hue?

By default, graphr() will use a color palette that offers fairly good contrast and, since v1.0.0 of {manynet}, better accessibility. However, a different hue might offer a better aesthetic or identifiability for some nodes. Because the graphr() function is based on the grammar of graphics, it’s easy to extend or alter aesthetic aspects. Here let’s try and change the colors assigned to the different races in the ison_lotr dataset.

graphr(ison_lotr, node_color = "Race")

graphr(ison_lotr, node_color = "Race") +
  ggplot2::scale_colour_hue()

Grayscale

Other times color may not be desired. Some publications require grayscale images. To use a grayscale color palette, replace _hue from above with _grey:

graphr(ison_lotr, node_color = "Race") +
  ggplot2::scale_colour_grey()

Manual override

Or we may want to choose particular colors for each category. This is pretty straightforward to do with ggplot2::scale_colour_manual(). Some common color names are available, but otherwise hex color codes can be used for more specific colors. Unspecified categories are coloured (dark) grey.

graphr(ison_lotr, node_color = "Race") +
  ggplot2::scale_colour_manual(
    values = c("Dwarf" = "red",
               "Hobbit" = "orange",
               "Maiar" = "#DEC20B",
               "Human" = "lightblue",
               "Elf" = "lightgreen",
               "Ent" = "darkgreen")) +
  labs(color = "Color")

Theming

Perhaps you are preparing a presentation, representing your institution, department, or research centre at home or abroad. In this case, you may wish to theme the whole network with institutional colors and fonts. Here we demonstrate one of the color scales available in {manynet}, the colors of the Sustainable Development Goals:

graphr(ison_lotr, node_color = "Race") + 
  scale_color_sdgs()

More institutional scales and themes are available, and more can be implemented upon pull request.

Further flexibility

For more flexibility with visualizations, {manynet} users are encouraged to use the excellent {ggraph} package. {ggraph} is built upon the venerable {ggplot2} package and works with tbl_graph and igraph objects. As with {ggplot2}, {ggraph} users are expected to build a particular plot from the ground up, adding explicit layers to visualise the nodes and edges.

library(ggraph)
ggraph(ison_greys, layout = "fr") + 
  geom_edge_link(edge_colour = "dark grey", 
                  arrow = arrow(angle = 45,
                                length = unit(2, "mm"),
                                type = "closed"),
                  end_cap = circle(3, "mm")) +
  geom_node_point(size = 2.5, shape = 19, colour = "blue") +
  geom_node_text(aes(label=name), family = "serif", size = 2.5) +
  scale_edge_width(range = c(0.3,1.5)) +
  theme_graph() +
  theme(legend.position = "none")

As we can see in the code above, we can specify various aspects of the plot to tailor it to our network.

First, we can alter the layout of the network using the layout = argument to create a clearer visualisation of the ties between nodes. This is especially important for larger networks, where nodes and ties are more easily obscured or misrepresented. In {ggraph}, the default layout is the “stress” layout. The “stress” layout is a safe choice because it is deterministic and fits well with almost any graph, but it is also a good idea to explore and try out other layouts on your data. More layouts can be found in the {graphlayouts} and {igraph} R packages. To use a layout from the {igraph} package, enter only the last part of the layout algorithm name (e.g. layout = "mds" for layout_with_mds()).

Second, geom_node_point() draws the nodes as geometric shapes (e.g. circles, squares, or triangles), and lets us specify the presentation of nodes in terms of their shape (shape=, using R’s point shape codes 0 to 25), size (size=), or colour (colour=). We can also use aes() to map these to node attributes. To add labels, use geom_node_text() or geom_node_label() (which draws labels within a box). The font (family=), font size (size=), and colour (colour=) of the labels can be specified.
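
As an illustration of mapping node attributes with aes(), the sketch below uses the Zachary karate club network bundled with {igraph} instead of a {manynet} dataset. The attribute names deg and hub, and the cut-off of ten ties, are invented for the example.

```r
library(igraph)
library(ggraph)

g <- make_graph("Zachary")
V(g)$deg <- degree(g)          # numeric node attribute
V(g)$hub <- V(g)$deg >= 10     # logical node attribute (arbitrary cut-off)

p <- ggraph(g, layout = "stress") +
  geom_edge_link0(edge_colour = "grey70") +
  # size and colour vary by attribute; shape is set globally
  geom_node_point(aes(size = deg, colour = hub), shape = 19) +
  theme_graph(base_family = "sans")
p
```

Arguments placed inside aes() vary with the data and produce legends, whereas arguments placed outside it, such as shape = 19 here, apply to every node.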

Third, we can also specify the presentation of edges in the network. To draw edges, we use geom_edge_link0() or geom_edge_link(). Using the latter function makes it possible to draw a straight line with a gradient. The following features can be tailored either globally or matched to specific edge attributes using aes():

  • colour: edge_colour=

  • width: edge_width=

  • linetype: edge_linetype=

  • opacity: edge_alpha=

For directed graphs, arrows can be drawn using the arrow= argument and the arrow() function (from {grid}, re-exported by {ggplot2}). The angle, length, and arrowhead type can be specified there, and the padding between the arrowhead and the node can be set with end_cap=.
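
The edge options above can be combined in a single layer. The following sketch uses a small directed ring from {igraph} with an invented weight attribute, mapping edge width and opacity to that attribute while setting colour, arrows, and end caps globally.

```r
library(igraph)
library(ggraph)

g <- make_ring(8, directed = TRUE)
E(g)$weight <- seq_len(ecount(g))    # invented edge attribute

p <- ggraph(g, layout = "stress") +
  geom_edge_link(aes(edge_width = weight, edge_alpha = weight),  # mapped to data
                 edge_colour = "steelblue",                      # set globally
                 arrow = arrow(angle = 30,
                               length = unit(2, "mm"),
                               type = "closed"),
                 end_cap = circle(3, "mm")) +   # gap between arrowhead and node
  geom_node_point(size = 3) +
  scale_edge_width(range = c(0.3, 1.5)) +
  theme(legend.position = "bottom")
p
```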

To change the position of the legend, add the theme() function from {ggplot2} and set its legend.position argument. The legend can be positioned at the top, bottom, left, or right, or removed using “none”.

For more see David Schoch’s excellent resources on this.

Exporting plots to PDF

We can print the plots we have made to PDF by point-and-click by selecting ‘Save as PDF…’ from under the ‘Export’ dropdown menu in the plots panel tab of RStudio.

If you want to do this programmatically, say because you want to record how you have saved it so that you can e.g. make some changes to the parameters at some point, this is also not too difficult.

After running the (gg-based) plot you want to save, use the command ggsave("my_filename.pdf") to save your plot as a PDF to your working directory. If you want to save it somewhere else, you will need to specify the file path (or change the working directory, but that might be more cumbersome). If you want to save it as a different filetype, replace .pdf with e.g. .png or .jpeg. See ?ggsave for more.
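
A minimal sketch of the programmatic route follows. It passes the plot explicitly rather than relying on the last plot displayed, and writes to a temporary directory so that the example does not clutter the working directory. The width and height are arbitrary choices.

```r
library(manynet)
library(ggplot2)

p <- graphr(ison_adolescents)

# Hypothetical output location; replace with your own file path
out <- file.path(tempdir(), "my_filename.pdf")

# The file extension determines the output format
ggsave(out, plot = p, width = 7, height = 5, units = "in")
file.exists(out)
```

Recording the dimensions in the call means that the figure can be regenerated at exactly the same size after any later changes to the plot.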

Visualisation

by James Hollway