{ggplot2}’s two-level aesthetics: dataviz-level or geom-specific – Graph | Plot | Chart

In {ggplot2} it’s aes() that does all the heavy lifting.

But there are two levels of aesthetics, and a few ways of specifying them. If you don’t feel confident with aes(), start by reading the “What does aes() actually do?” section further down.

Dataviz-level aesthetics
Geom-specific aesthetics

The differences and nuances between these are unfortunately often misunderstood and lead to confusion. So let’s get into them.

Dataviz-level aesthetics

These are the aesthetics used in most data visualisations built with {ggplot2}, but you’ll see them written in two slightly different ways:

library("tidyverse")
library("fivethirtyeight")

# Traditional
bechdel |> 
  ggplot(aes(x = budget_2013, 
             y = domgross_2013))

library("tidyverse")
library("fivethirtyeight")

# Explicit
bechdel |> 
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013)

Both of these will produce the exact same chart and there are no functional differences between them, but we’re huge fans of the “explicit dataviz-level aesthetics” and recommend that you use it for three reasons:

Fewer nested brackets make code easier to read!
Commenting out aesthetics and messing about with them is much easier than in the “traditional” format.
It is extremely flexible when you get into advanced dataviz with {ggplot2}, in two important use cases:

Add conditional aesthetics when building functions
Programmatically map over base plots with alternating dataviz-level aesthetics
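Here is a minimal sketch of both use cases. The helper name plot_budget_vs_gross(), its colour_by_test argument and the y_columns vector are hypothetical names invented for this illustration; the sketch assumes the binary (PASS/FAIL) column of bechdel is a sensible thing to colour by.

```r
library("tidyverse")
library("fivethirtyeight")

# Use case 1: conditional aesthetics inside a function.
# Hypothetical helper (not from the original post): the colour mapping is
# only added to the dataviz-level aesthetics when it is asked for.
plot_budget_vs_gross <- function(colour_by_test = FALSE) {
  p <- bechdel |>
    ggplot() +
    aes(x = budget_2013,
        y = domgross_2013) +
    geom_point()

  if (colour_by_test) {
    # Adding aes() to an existing plot merges into its dataviz-level mappings
    p <- p + aes(colour = binary)
  }

  p
}

# Use case 2: map over a base plot, swapping the y aesthetic each time.
# The base plot has no aesthetics yet, so it can't be printed on its own.
base_plot <- bechdel |>
  ggplot() +
  geom_point()

y_columns <- c("domgross_2013", "intgross_2013")

plots <- y_columns |>
  set_names() |>
  map(\(y_column) base_plot +
        aes(x = budget_2013,
            y = .data[[y_column]]))
```

Calling plot_budget_vs_gross(colour_by_test = TRUE) gives the coloured version, and plots$intgross_2013 pulls out one of the mapped charts. With the traditional format you would have to rebuild the ggplot() call itself to get either behaviour.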

For consistency, build all your charts in the explicit style instead of swapping between traditional and explicit. It will also make scope creep much easier to deal with!

Great. But what does aes() actually do?

What does aes() actually do?

{ggplot2} uses aes() to turn your data into dataviz. It does that by creating aesthetic mappings between your data and the coordinate systems of your chart. Let’s take a proper look at that chart:

library("tidyverse")
library("fivethirtyeight")

bechdel |> 
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013)

We have axes! Where did they come from? Well, {ggplot2} sends the aesthetic mappings over to the {scales} package, which builds the axes that we see. In fact, {scales} is the unsung hero of {ggplot2}, which simply wouldn’t work without it. It’s such an integral part of many R tools that you’ll find {scales} in the R-lib Infrastructure collection of packages, instead of the {tidyverse}.
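To make that a little more concrete, here is a small sketch under two assumptions: that peeking at the object aes() returns is a fair way to see what a "mapping" is, and that dollar labels suit these budget columns. The variable names mapping and labelled_axes are just for this illustration.

```r
library("tidyverse")
library("fivethirtyeight")

# aes() only records which columns map to which aesthetics;
# nothing is looked up in the data until the chart is built
mapping <- aes(x = budget_2013,
               y = domgross_2013)
names(mapping)

# The same empty chart as before, but asking {scales} for
# dollar-formatted axis labels on both axes
labelled_axes <- bechdel |>
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013) +
  scale_x_continuous(labels = scales::label_dollar()) +
  scale_y_continuous(labels = scales::label_dollar())

labelled_axes
```

The mappings themselves haven’t changed at all here; only the way the scales label the axes has.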

In our R Dataviz Bootcamp we go deep into how to take control of these scales, but for now let’s move on to geom-specific aesthetics.

Geom-specific aesthetics

You need to use geom-specific aesthetics in one of three circumstances:

A geom needs additional aesthetics not provided at the dataviz-level
A geom is being given different aesthetics to the dataviz-level aesthetics
A geom is being given an entirely different dataset and new aesthetic mappings are required.

Let’s go through these in turn.

Some geoms need additional aesthetics

Some geoms have specific aesthetics they need to work. A nice example for our chart is geom_vline(), which requires xintercept. It would be confusing if that appeared in the dataviz-level aesthetics, and in some cases it could break other geoms.

bechdel |> 
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013) +
  geom_point() +
  geom_vline(aes(xintercept = mean(budget_2013, na.rm = TRUE)))

Be careful with the more fiddly geoms as there are big differences between x, xend and xmax. Refer to the documentation for the specific geoms you’re working with.
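As a rough illustration of how different those aesthetics are, here is a sketch using a tiny made-up dataset. The toy_spans data frame and its column names are hypothetical and exist only for this example.

```r
library("ggplot2")

# Hypothetical toy data, invented purely for this illustration
toy_spans <- data.frame(
  start = c(1, 3),
  end   = c(2, 5),
  row   = c(1, 2)
)

span_plot <- ggplot(toy_spans) +
  # geom_segment() draws a line from (x, y) to (xend, yend)
  geom_segment(aes(x = start, xend = end,
                   y = row, yend = row)) +
  # geom_rect() fills the box bounded by xmin/xmax and ymin/ymax
  geom_rect(aes(xmin = start, xmax = end,
                ymin = row + 0.1, ymax = row + 0.3),
            alpha = 0.3)

span_plot
```

The same two columns end up as x and xend for one geom and as xmin and xmax for the other, which is why these belong at the geom level.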

Geom is given different aesthetics

Something you could do is visualise the mean() and median() values like this:

bechdel |> 
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013) +
  geom_point() +
  geom_point(aes(x = mean(budget_2013, na.rm = TRUE),
                 y = mean(domgross_2013, na.rm = TRUE)),
             colour = "#1E8F91",
             size = 10) +
  geom_point(aes(x = median(budget_2013, na.rm = TRUE),
                 y = median(domgross_2013, na.rm = TRUE)),
             colour = "#AA3D4F",
             size = 10)

Now. This is not an easy chart to work with. If we wanted to add legends to this it would be chaos. Oh, and {ggplot2} gives us this warning:

1: In geom_point(aes(x = mean(budget_2013, na.rm = TRUE), y = mean(domgross_2013,
  All aesthetics have length 1, but the data has 1794 rows.
ℹ Please consider using annotate() or provide this layer with data containing a single row.

But, the thing is that annotate() isn’t your friend here as you can’t create aesthetic mappings with it.
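For comparison, this is roughly what the annotate() route would look like for the mean point. It is a sketch only, and annotated_plot is a name made up for this example.

```r
library("tidyverse")
library("fivethirtyeight")

annotated_plot <- bechdel |>
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013) +
  geom_point() +
  # annotate() takes fixed values rather than aes() mappings,
  # so the columns have to be pulled out of bechdel by hand
  annotate("point",
           x = mean(bechdel$budget_2013, na.rm = TRUE),
           y = mean(bechdel$domgross_2013, na.rm = TRUE),
           colour = "#1E8F91",
           size = 10)

annotated_plot
```

Because the colour is a fixed value instead of a mapping, no legend is created for the annotated point.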

The best solution to this problem is to give the geom an entirely different dataset.

Geom is given an entirely different dataset

This is where {ggplot2} gets really powerful because we can combine multiple datasets into one chart and take advantage of shared {scales} within the chart.

To show that elegantly, let’s use a little bit of {tidyverse} jiggery-pokery to make a nice tidy dataset with the summary statistics we showed before:


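bechdel_summary_stats_2013 <- bechdel |> 
  # Mean and median of every 2013-adjusted column, e.g. budget_2013_mean
  summarise(across(contains("2013"),
                   list(mean = ~mean(.x, na.rm = TRUE),
                        median = ~median(.x, na.rm = TRUE)))) |> 
  # Split names like budget_2013_mean into variable and measure
  pivot_longer(everything(),
               names_sep = "_2013_",
               names_to = c("variable", "measure")) |> 
  # One row per measure, one column per variable
  pivot_wider(names_from = variable,
              values_from = value)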

measure	budget	domgross	intgross
mean	55464608	95174784	197837985
median	36995786	55993640	96239640

Now, let’s add that dataset into our chart using geom-specific aesthetics and the data argument.

bechdel |> 
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013) +
  geom_point() +
  geom_point(data = bechdel_summary_stats_2013,
             aes(x = budget,
                 y = domgross,
                 colour = measure),
             size = 10)

That’s a really powerful construction for a data visualisation thanks to smartly combining dataviz-level and geom-specific aesthetics.
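Because colour is now a real aesthetic mapping, it gets a scale and a legend of its own. As a sketch, assuming you want to reuse the two hex colours from the earlier attempt, you could pin them to the measures like this. The names summary_stats and summary_plot are hypothetical, and the summary is rebuilt for just the two plotted columns so the example runs on its own.

```r
library("tidyverse")
library("fivethirtyeight")

# Rebuild the summary dataset for the two columns used in the chart
summary_stats <- bechdel |>
  summarise(across(c(budget_2013, domgross_2013),
                   list(mean = ~mean(.x, na.rm = TRUE),
                        median = ~median(.x, na.rm = TRUE)))) |>
  pivot_longer(everything(),
               names_sep = "_2013_",
               names_to = c("variable", "measure")) |>
  pivot_wider(names_from = variable,
              values_from = value)

summary_plot <- bechdel |>
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013) +
  geom_point() +
  geom_point(data = summary_stats,
             aes(x = budget,
                 y = domgross,
                 colour = measure),
             size = 10) +
  # Assumed colour choice: the same hex codes as the earlier chart
  scale_colour_manual(values = c(mean = "#1E8F91",
                                 median = "#AA3D4F"))

summary_plot
```

Both layers share the x and y scales, while the colour scale only applies to the layer that maps colour.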

Combining everything together into (an admittedly bad) chart, here we have all of the different combinations of aes():

bechdel |> 
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013) +
  geom_point() +
  geom_point(data = bechdel_summary_stats_2013,
             aes(x = budget,
                 y = domgross,
                 colour = measure),
             size = 10) +
  geom_vline(aes(xintercept = max(budget_2013, na.rm = TRUE))) +
  geom_hline(aes(yintercept = max(domgross_2013, na.rm = TRUE)))

Wait. What about inherit.aes?

Can you predict what this chart will look like?

bechdel |> 
  ggplot() +
  aes(x = budget_2013,
      y = domgross_2013) +
  geom_point(aes(y = intgross_2013), inherit.aes = FALSE)

That’s a question for another blog post.