1  Best practices

1.1 Using ggplot

Aim to use the ggplot2 R package for creating plots. ggplot2 is a powerful visualization package and should let us create the kinds of plots we want to present. If we all use ggplot2, we can write code that others can re-use. It also makes it easy to write custom themes and color palettes that can be shared, which gives plots a consistent look within and between projects.

Always save the plot as an object in the environment, in addition to printing it where it belongs in the document.

Pipe the data into ggplot() to make the code more portable and readable. Keep the steps separate.
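
The two points above can be sketched as follows. This is a minimal illustration, assuming the built-in mtcars data set and the base R pipe (|>, available from R 4.1; the magrittr pipe %>% works the same way here). The object names are hypothetical.

```r
library(ggplot2)

# Step 1: prepare the data, separately from the plotting code
cars_by_cyl <- mtcars |>
  transform(cyl = factor(cyl))

# Step 2: pipe the prepared data into ggplot() and save the plot as an object
mpg_by_weight <- cars_by_cyl |>
  ggplot(aes(x = wt, y = mpg, color = cyl)) +
  geom_point()

# Step 3: print the saved object where the plot belongs in the report
mpg_by_weight
```

Because the plot is stored as mpg_by_weight, it can be printed again, modified, or saved to a file later without re-running the data preparation.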

1.2 Re-usability

Ideally, we will come up with a set of functions and default options for the key types of plots we use and like. Otherwise, define sets of common plot options in one place, and store repeated ggplot elements together as a list, to avoid repetition and clutter and to allow easy ‘global’ adjustment.
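
One possible shape for this, as a sketch only: the names report_elements and scatter_plot are hypothetical, and the particular theme choices are placeholders rather than agreed defaults.

```r
library(ggplot2)

# Hypothetical shared defaults, defined once in one place.
# A list of ggplot elements can be added to a plot with `+`.
report_elements <- list(
  theme_minimal(base_size = 12),
  theme(legend.position = "bottom")
)

# Hypothetical helper for a plot type we use often.
# {{ }} passes unquoted column names through to aes().
scatter_plot <- function(data, x, y) {
  ggplot(data, aes(x = {{ x }}, y = {{ y }})) +
    geom_point() +
    report_elements
}

weight_plot <- scatter_plot(mtcars, wt, mpg)
weight_plot
```

Changing report_elements in one place then adjusts every plot that uses it, while individual plots can still add or override elements afterwards.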

However, there are times when too much automation makes things less flexible and increases mental overhead. We will aim for good compromises.

1.3 Labeling

Use readable but concise labels. Examples of good labels are:

  • “Amount donated”
  • “GWWC” (we can assume people know what GWWC is)

An example of a bad label is:

  • amt_don
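
In ggplot2, readable labels can be set with labs() so that raw variable names do not end up on the axes. The sketch below uses made-up data; the data frame and its values are purely illustrative.

```r
library(ggplot2)

# Made-up data for illustration only; amt_don is the raw variable name
donations <- data.frame(
  charity = c("A", "B", "C"),
  amt_don = c(120, 80, 45)
)

# Replace the raw variable names with readable, concise labels
donation_plot <- ggplot(donations, aes(x = charity, y = amt_don)) +
  geom_col() +
  labs(x = "Charity", y = "Amount donated")

donation_plot
```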

1.4 Consistency

Be consistent throughout an article or report in terms of:

  • Labels

  • Color scheme

  • Colors matched to categories (e.g., if you match a particular color to a gender, make sure to keep using the same color to represent that gender)

  • Fonts
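
One way to keep colors matched to categories is a named color vector that every plot re-uses. This is a sketch under assumptions: the category names, hex codes, and the helper scale_color_gender are hypothetical, and the data are made up.

```r
library(ggplot2)

# Hypothetical palette: each category is tied to one color by name
gender_colors <- c(
  "Female" = "#1B9E77",
  "Male"   = "#D95F02",
  "Other"  = "#7570B3"
)

# Reusable scale; because the values are named, each category keeps
# its color even if another category is absent from a given plot
scale_color_gender <- function(...) {
  scale_color_manual(values = gender_colors, ...)
}

# Made-up data for illustration only
respondents <- data.frame(
  age      = c(23, 35, 41, 29, 52, 38),
  donation = c(50, 120, 200, 80, 150, 95),
  gender   = c("Female", "Male", "Other", "Female", "Male", "Other")
)

donation_by_age <- ggplot(respondents, aes(x = age, y = donation, color = gender)) +
  geom_point() +
  scale_color_gender()

donation_by_age
```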

1.5 Anonymity

We should make sure that the real data we plot is anonymous. There are a few techniques we can use to ensure the anonymity of the data.

  1. Plot approximations of the data, rather than the actual data itself. This can be done with geoms such as geom_jitter() or with the position argument of other geoms.
  2. Disable hover information in interactive plots.
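
The first technique can be sketched as follows, using made-up data. The jitter amounts are placeholder values, not recommendations; how much noise is enough depends on the data.

```r
library(ggplot2)

# Made-up data for illustration only
set.seed(1)
responses <- data.frame(
  group  = rep(c("Control", "Treatment"), each = 20),
  amount = round(runif(40, min = 0, max = 100))
)

# geom_jitter() adds random noise to the plotted positions;
# width and height control the amount of noise in data units
jittered_plot <- ggplot(responses, aes(x = group, y = amount)) +
  geom_jitter(width = 0.2, height = 2)

# The same idea through the position argument of geom_point();
# the seed makes the jitter reproducible between renders
jittered_plot_alt <- ggplot(responses, aes(x = group, y = amount)) +
  geom_point(position = position_jitter(width = 0.2, height = 2, seed = 1))

jittered_plot
```

How to disable hover information depends on the interactive plotting library in use, so it is not shown here.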