… based on the concepts of functions as verbs that manipulate data frames
select: pick columns by name
arrange: reorder rows
slice: pick rows using index(es)
filter: pick rows matching criteria
distinct: filter for unique rows
mutate: add new variables
summarise: reduce variables to values
group_by: for grouped operations
In programming, a pipe is a technique for passing information from one process to another.
Think about the following sequence of actions - find keys, unlock car, start car, drive to work, park.
%>%: used mainly in dplyr pipelines; it pipes the output of the previous line of code in as the first input of the next line of code
+: used in ggplot2 plots for "layering"; the plot is created in layers, separated by +
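As a minimal sketch of how the dplyr verbs and %>% fit together, the example below runs each verb on mtcars, a data frame that ships with R. The dataset, the column choices, and the object names (powerful, top3, by_cyl and so on) are assumptions made for illustration; they are not from the course material.

```r
library(dplyr)

# mtcars ships with R; it is used here only as a convenient example data frame.
powerful <- mtcars %>%
  select(mpg, cyl, hp, wt) %>%       # select: pick columns by name
  filter(hp > 100) %>%               # filter: pick rows matching criteria
  mutate(hp_per_wt = hp / wt) %>%    # mutate: add a new variable
  arrange(desc(hp_per_wt))           # arrange: reorder rows

top3 <- powerful %>% slice(1:3)          # slice: pick rows using indexes
cyl_values <- mtcars %>% distinct(cyl)   # distinct: keep unique rows

# group_by + summarise: reduce each group to one row of summary values
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n_cars = n())

# The same grouped summary without the pipe has to be read inside-out
by_cyl_nested <- summarise(group_by(mtcars, cyl),
                           mean_mpg = mean(mpg), n_cars = n())
```

Each piped step reads top to bottom in the order the actions happen, much like the find keys, unlock car, start car sequence, whereas the nested version starts from the innermost call.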
A grammar of graphics is a tool that enables us to concisely describe the components of a graphic
ggplot() is the main function in ggplot2
Commonly used characteristics of plotting characters that can be mapped to a specific variable in the data are
colour
shape
size
alpha (transparency)
aes()
geom_*()
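A short sketch of how ggplot(), aes(), a geom_*() function, and + combine is given below. It again assumes the built-in mtcars data; the choice of variables, the fixed alpha value, and the axis labels are illustrative assumptions, not part of the original slides.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +           # data plus axis mappings
  geom_point(aes(colour = factor(cyl), size = hp),    # colour and size mapped to variables
             alpha = 0.6) +                           # alpha set to a fixed value, not mapped
  geom_smooth(method = "lm", formula = y ~ x,         # second layer: a straight-line fit
              se = FALSE) +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon",
       colour = "Cylinders", size = "Horsepower")
```

Calling print(p) draws the plot. Anything placed inside aes() varies with the data, while an argument given outside aes(), such as alpha here, applies the same value to every point. shape can be mapped in the same way as colour.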
Faceting: smaller plots that display different subsets of the data
Useful for exploring conditional relationships and large data
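One way to produce such small plots in ggplot2 is facet_wrap(). The sketch below assumes the built-in mtcars data and splits on cyl purely as an example; the object name p_facet is hypothetical.

```r
library(ggplot2)

p_facet <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~ cyl)   # one panel per distinct value of cyl
```

Calling print(p_facet) draws one panel per cylinder count, so the relationship between weight and fuel economy can be compared conditional on cyl.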
Econ 255 - Data Storytelling