Mapping Data to Graphics

Dr. Nathaniel Cline

Agenda

  1. Grammar of Graphics

  2. Grammatical Layers

  3. Review and to do

Stories about the Census


  1. Factoid

  2. Interaction

  3. Comparison

Data Story Critique


Make your own mobility animation

History

Before walking through the modern grammar of graphics, let’s look at some historical examples of creative ways people have mapped data to aesthetics to tell a story.

W.E.B. Du Bois at the Paris Exhibition

Mapping data to aesthetics

Aesthetic

Visual property of a graph

Position, shape, color, etc.


Data

A column in a dataset

Describing Minard

Data            Aesthetic            Graphic/Geometry
Longitude       Position (x-axis)    Point
Latitude        Position (y-axis)    Point
Army size       Size                 Path
Army direction  Color                Path
Date            Position (x-axis)    Line + text
Temperature     Position (y-axis)    Line + text

Describing Minard in R

Data            aes()   geom
Longitude       x       geom_point()
Latitude        y       geom_point()
Army size       size    geom_path()
Army direction  color   geom_path()
Date            x       geom_line() + geom_text()
Temperature     y       geom_line() + geom_text()

Recall the basic ggplot framework

ggplot(data = DATA) +
  GEOM_FUNCTION(mapping = aes(AESTHETIC MAPPINGS))
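As a quick refresher, here is one way the template can be filled in. This is only a sketch: it assumes the mpg data set that ships with ggplot2 as a stand-in for DATA, and the name p_template is made up for illustration.

```r
library(ggplot2)

# mpg ships with ggplot2; it stands in for DATA in the template (an assumption,
# not a data set used elsewhere in these slides).
p_template <- ggplot(data = mpg) +
  # geom_point() plays the role of GEOM_FUNCTION;
  # x = displ and y = hwy are the AESTHETIC MAPPINGS.
  geom_point(mapping = aes(x = displ, y = hwy))

p_template
```

Each placeholder in the template has exactly one counterpart here: a data frame, a geom function, and a set of column-to-aesthetic mappings inside aes().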

Minard

ggplot(data = troops) +
  geom_path(mapping = aes(x = longitude,
                          y = latitude,
                          color = direction,
                          size = survivors))

This is a dataset named troops:

# A tibble: 3 × 4
  longitude latitude direction survivors
  <chr>     <chr>    <chr>     <chr>    
1 24        54.9     A         340000   
2 24.5      55       A         340000   
3 …         …        …         …        

ggplot(data = troops,
       mapping = aes(x = longitude,
                     y = latitude,
                     color = direction,
                     size = survivors)) +
  geom_path(lineend = "round", linejoin = "mitre") + 
  scale_size_continuous(range = c(1, 20),
                        labels = scales::comma) +
  theme_gray(base_size = 20)

Remember this?

Data                      aes()   geom
Wealth (GDP/capita)       x       geom_point()
Health (Life expectancy)  y       geom_point()
Continent                 color   geom_point()
Population                size    geom_point()

This is a dataset named gapminder_2007:

# A tibble: 3 × 5
  country     continent gdpPercap   lifeExp pop     
  <chr>       <chr>     <chr>       <chr>   <chr>   
1 Afghanistan Asia      974.5803384 43.828  31889923
2 Albania     Europe    5937.029526 76.423  3600523 
3 …           …         …           …       …       

ggplot(data = gapminder_2007,
       mapping = aes(x = gdpPercap,
                     y = lifeExp,
                     color = continent,
                     size = pop)) +
  geom_point() +
  scale_x_log10()

ggplot(data = gapminder_2007,
       mapping = aes(x = gdpPercap,
                     y = lifeExp,
                     color = continent,
                     size = pop)) +
  geom_point() +
  scale_x_log10() +
  scale_size_continuous(range = c(1, 15),
                        labels = scales::comma) +
  theme_gray(base_size = 20)

Layers


We can layer data, aesthetics, and geometries onto the base ggplot() call with +, the ggplot2-specific operator for adding layers. It plays a role similar to the pipe, but the two are not interchangeable.
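To make the layering idea concrete, the sketch below builds one plot a layer at a time and stores each intermediate step. It assumes the mpg data set that ships with ggplot2, and the object names (base, with_points, and so on) are made up for illustration.

```r
library(ggplot2)

# Step 1: data and aesthetics only. Printing this gives empty axes, no marks.
base <- ggplot(data = mpg, mapping = aes(x = displ, y = hwy))

# Step 2: add a geom layer, with one extra aesthetic mapped inside it.
with_points <- base + geom_point(mapping = aes(color = class))

# Step 3: add a scale that changes how class is turned into colors.
with_scale <- with_points + scale_color_viridis_d()

# Step 4: add a theme that changes the non-data look of the plot.
with_theme <- with_scale + theme_gray(base_size = 20)

with_theme
```

Because every step is a complete ggplot object, you can print any of them to see what that layer contributed.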

Aesthetics: color(discrete)

Aesthetics: color(continuous)

Aesthetics: size

Aesthetics: fill

Aesthetics: shape

Aesthetics: alpha
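The aesthetics above behave differently depending on the type of column they are mapped to. The sketch below contrasts a discrete and a continuous color mapping, and then sets a few aesthetics to fixed values instead of mapping them. It assumes the mpg data set that ships with ggplot2; the object names are made up for illustration.

```r
library(ggplot2)

# Discrete: class is a character column, so each category gets its own color.
p_discrete <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) +
  geom_point()

# Continuous: cty is numeric, so color becomes a gradient with a color bar.
p_continuous <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point()

# Fixed values: alpha, shape, and size are set outside aes(),
# so they apply to every point and produce no legend.
p_fixed <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(alpha = 0.5, shape = 17, size = 3)
```

The rule of thumb: inside aes() an aesthetic follows a column of the data, outside aes() it is a constant.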

Example geom     What it makes
geom_col()       Bar charts
geom_text()      Text
geom_point()     Points
geom_boxplot()   Boxplots
geom_sf()        Maps
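Two of the geoms in the table can be sketched as follows. This assumes the mpg data set that ships with ggplot2; class_counts is a small helper table built here only for the bar chart, and all object names are made up for illustration.

```r
library(ggplot2)

# geom_boxplot(): one box per vehicle class, summarizing highway mileage.
p_box <- ggplot(mpg, aes(x = class, y = hwy)) +
  geom_boxplot()

# geom_col() draws bar heights straight from a column, so count rows first.
# as.data.frame(table(...)) returns the columns class and Freq.
class_counts <- as.data.frame(table(class = mpg$class))

p_col <- ggplot(class_counts, aes(x = class, y = Freq)) +
  geom_col()
```

The data and the x mapping barely change between the two plots; swapping the geom is what changes the kind of chart.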

Scales

Scales control how the values of a mapped variable are translated into an aesthetic: axis breaks, transformations, color palettes, and so on.

Example layer                      What it does
scale_x_continuous()               Make the x-axis continuous
scale_x_continuous(breaks = 1:5)   Manually specify axis ticks
scale_x_log10()                    Log the x-axis
scale_color_gradient()             Use a gradient
scale_fill_viridis_d()             Fill with discrete viridis colors
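Two rows of the table, manual axis ticks and a color gradient, can be sketched like this. It assumes the mpg data set that ships with ggplot2; the break values and the two colors are arbitrary choices for illustration.

```r
library(ggplot2)

p_scales <- ggplot(mpg, aes(x = displ, y = hwy, color = cty)) +
  geom_point() +
  # Place x-axis ticks at whole numbers only (engine displacement in litres).
  scale_x_continuous(breaks = 2:7) +
  # cty is continuous, so a two-color gradient applies.
  scale_color_gradient(low = "lightblue", high = "darkblue")

p_scales
```

Note that neither scale changes which variable is mapped; they only change how the mapped values are displayed.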

scale_x_log10()

ggplot(gapminder_2007, aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point() +
  scale_x_log10()

scale_color_viridis_d()

ggplot(gapminder_2007, aes(x = gdpPercap, y = lifeExp, color = continent, size = pop)) +
  geom_point() +
  scale_x_log10() +
  scale_color_viridis_d()

Data Viz Catalog

Review

  • We explored how to translate a graphic into a grammar (ggplot2): data, aesthetics, and geoms

  • We will keep practicing this translation process, and the process of adding layers to a ggplot graphic

  • Next time we will talk about some other dimensions (like time) and we will explore some more creative approaches to data storytelling

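library(ggplot2)
library(gganimate)  # transition_time(), ease_aes()
library(gapminder)  # assumed source of the gapminder data set

ggplot(gapminder, aes(x = gdpPercap, y = lifeExp,
                      size = pop, color = country)) +
  geom_point(alpha = 0.7) +
  scale_size(range = c(2, 12)) +
  scale_x_log10(labels = scales::dollar) +
  # Hide both legends: there are too many countries to list
  guides(size = "none", color = "none") +
  facet_wrap(~continent) +
  # Special gganimate stuff
  labs(title = 'Year: {frame_time}', x = 'GDP per capita', y = 'Life expectancy') +
  # Animate over the year column, interpolating between observed years
  transition_time(year) +
  ease_aes('linear')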

To-Do

Before next class:

  • Make a graph using the county census data set. You can use any geom you like, but you must map a variable to color.

    • Note: pay attention to whether you are mapping color to a discrete variable or a continuous variable.

    • Use ggsave("yourfilename.png") after your ggplot code and post it to our Teams site

  • Pick a data viz from Data Viz Catalog and be ready to summarize in class what it is and when it might be useful.
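For the graph in the first to-do item, the workflow might look like the sketch below. It is only an illustration: midwest (county-level data from the 2000 US census that ships with ggplot2) stands in for the class county census data set, and the chosen columns and the name p_county are assumptions.

```r
library(ggplot2)

# midwest ships with ggplot2; it is a stand-in for the class data set.
# state is a character column, so color is mapped to a discrete variable.
p_county <- ggplot(midwest, aes(x = percollege,
                                y = percbelowpoverty,
                                color = state)) +
  geom_point(alpha = 0.7)

# Save the plot to a PNG in the working directory.
# width and height are in inches by default.
ggsave("yourfilename.png", plot = p_county, width = 8, height = 5)
```

Passing plot = explicitly is optional (ggsave() defaults to the last plot displayed), but it makes clear which graph ends up in the file.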