SMM635 - Week 2
Bayes Business School
Part 1: Grammar of Graphics
Part 2: Visual Forms
By the end of today’s session, you will:
Traditional Approach:
Note
We can use labels or conceptual categories
Grammar Approach:
Note
We can refer to a chart’s constitutive components
“Grammar makes language expressive. A language consisting of words and no grammar expresses only as many ideas as there are words.” - Leland Wilkinson
Important
A pie chart is just a stacked bar chart in polar coordinates! 🤯
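To make the claim above concrete, here is a minimal sketch in plain Python. The counts are made-up illustration values and the helper names (`stacked_segments`, `to_polar`) are hypothetical, not part of any plotting library. It computes the segments of a single stacked bar and then reinterprets the bar's length axis as an angle, which yields the pie slices.

```python
import math

def stacked_segments(values):
    """Return (start, end) fractions of one stacked bar, each in [0, 1]."""
    total = sum(values)
    segments, running = [], 0.0
    for v in values:
        start = running / total
        running += v
        segments.append((start, running / total))
    return segments

def to_polar(segments):
    """Map the bar's length axis onto an angle in radians (0 to 2*pi)."""
    return [(s * 2 * math.pi, e * 2 * math.pi) for s, e in segments]

counts = [50, 30, 20]           # hypothetical category counts
bar = stacked_segments(counts)  # stacked bar in Cartesian coordinates
pie = to_polar(bar)             # same data, polar coordinates: pie slices

for (b0, b1), (a0, a1) in zip(bar, pie):
    print(f"bar {b0:.2f}-{b1:.2f}  ->  slice {math.degrees(a0):.0f}-{math.degrees(a1):.0f} deg")
```

The data and the stacking logic are identical in both forms; only the coordinate mapping changes.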
Source: https://r.qcbs.ca/
Data Variables
Visual Variables
Best for: Comparisons, counts
Note
All data in a single plot
Note
Creates separate panels for each region
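Conceptually, faceting is a group-by on the data before drawing. The sketch below is a simplified illustration in plain Python, not how any particular plotting library implements it. The rows, the `sales` field, and the `facet` helper are hypothetical; only the idea of one panel per region comes from the slide.

```python
from collections import defaultdict

def facet(rows, by):
    """Split rows into one list per distinct value of the `by` field."""
    panels = defaultdict(list)
    for row in rows:
        panels[row[by]].append(row)
    return dict(panels)

# Hypothetical data for illustration only
rows = [
    {"region": "North", "sales": 10},
    {"region": "South", "sales": 7},
    {"region": "North", "sales": 12},
    {"region": "East", "sales": 5},
]

panels = facet(rows, by="region")
for region, subset in panels.items():
    print(region, [r["sales"] for r in subset])
```

Each key of `panels` would become one panel, drawn with the same aesthetics and geometry as the single-plot version.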
Note
Adds linear regression line with confidence interval
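As a rough sketch of what a statistics layer computes, the snippet below fits an ordinary least squares line in plain Python. The data points and the `fit_line` helper are hypothetical, and the confidence interval mentioned above is deliberately left out to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points lying exactly on y = 2x + 1
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]

slope, intercept = fit_line(xs, ys)
# The fitted values are what the trend line draws on top of the points
fitted = [slope * x + intercept for x in xs]
```

The statistics layer transforms the data (here, into fitted values) and the geometry layer draws the result over the raw points.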
Note
The standard x-y coordinate system
Note
Swaps x and y axes for horizontal bars
Note
Clean, minimal design
Note
Black and white with borders
Note
Classic look with axis lines only
Part 1: Foundation
DATA → AESTHETICS → GEOMETRY
Part 2: Refinement
FACETS → STATISTICS → COORDINATES → THEME
Note
Each layer adds information without obscuring previous layers
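One way to picture the layering idea is as a plot specification that grows one component at a time. The sketch below is a toy model in plain Python, not a real plotting API. The `add_layer` helper and the example values (`sales.csv`, `region`, `sales`) are hypothetical; the seven layer names are the ones from the two flows above.

```python
LAYERS = [
    "data", "aesthetics", "geometry",                 # Part 1: Foundation
    "facets", "statistics", "coordinates", "theme",   # Part 2: Refinement
]

def add_layer(spec, name, value):
    """Return a new spec with one more layer; earlier layers are kept as-is."""
    if name not in LAYERS:
        raise ValueError(f"unknown layer: {name}")
    new_spec = dict(spec)  # copy, so the previous spec is not modified
    new_spec[name] = value
    return new_spec

# Hypothetical example values
base = add_layer({}, "data", "sales.csv")
base = add_layer(base, "aesthetics", {"x": "region", "y": "sales"})
base = add_layer(base, "geometry", "bar")

refined = add_layer(base, "coordinates", "flipped")
refined = add_layer(refined, "theme", "minimal")
```

Because each step returns a new specification, `base` remains a complete, drawable plot while `refined` builds on it.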
Continuous Data
Categorical Data
Histograms divide data into bins and count observations in each bin.
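The binning step can be sketched in a few lines of plain Python. This assumes equal-width bins spanning the data range, with the maximum value counted in the last bin; the sample data and the `histogram` helper are hypothetical.

```python
def histogram(values, bins):
    """Count observations in `bins` equal-width bins spanning the data range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("need at least two distinct values to form bins")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value would index one past the end, so clamp it
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, counts

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # hypothetical observations
edges, counts = histogram(data, bins=4)
print(edges)   # bin boundaries
print(counts)  # observations per bin
```

Changing `bins` changes the shape of the histogram, which is why the bin count is worth adjusting when exploring a distribution.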
Density plots show a smoothed version of the distribution.
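A common way to obtain that smoothed curve is kernel density estimation. The sketch below assumes a Gaussian kernel and a hand-picked bandwidth; the sample data and the `gaussian_kde` helper are hypothetical, and plotting libraries typically choose the bandwidth automatically.

```python
import math

def gaussian_kde(values, bandwidth):
    """Return a function estimating the density at x with a Gaussian kernel."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Each observation contributes a small bell curve centred on itself
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density

data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]  # hypothetical observations
density = gaussian_kde(data, bandwidth=1.0)

# Evaluate on a grid; these (x, y) pairs are what the density curve draws
grid = [i * 0.5 for i in range(0, 21)]
curve = [(x, density(x)) for x in grid]
```

A smaller bandwidth gives a bumpier curve and a larger one a smoother curve, which plays the same role as the bin count in a histogram.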
Bar charts use bar length to encode category counts or values.
Pie charts show parts of a whole as slices of a circle.
X Variable | Y Variable | Best Chart Types |
---|---|---|
Continuous | Continuous | Scatter plot, Line chart |
Continuous | Categorical | Box plot, Violin plot |
Categorical | Categorical | Heatmap, Grouped bars |
Time | Continuous | Line chart, Area chart |
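The table above can be encoded as a simple lookup, sketched here in plain Python. The `CHART_GUIDE` dictionary and `suggest_charts` function are hypothetical names; the entries are copied from the table.

```python
# Keys are (x type, y type); values are the suggested chart types
CHART_GUIDE = {
    ("continuous", "continuous"): ["Scatter plot", "Line chart"],
    ("continuous", "categorical"): ["Box plot", "Violin plot"],
    ("categorical", "categorical"): ["Heatmap", "Grouped bars"],
    ("time", "continuous"): ["Line chart", "Area chart"],
}

def suggest_charts(x_type, y_type):
    """Return suggested chart types, or an empty list for unlisted pairs."""
    return CHART_GUIDE.get((x_type.lower(), y_type.lower()), [])

print(suggest_charts("Time", "Continuous"))
print(suggest_charts("Categorical", "Time"))  # not in the table
```

Treat the result as a starting point rather than a rule: the best chart also depends on the question being asked of the data.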
Scatter plots display individual data points in 2D space.
A scatter plot with a trend line adds a fitted model to show the relationship between the two variables.
2D density plots show concentration of points as contours or filled regions.
Grouped box plots compare distributions across multiple categories.
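Each box in a grouped box plot is drawn from a five-number summary computed per category. The sketch below uses Python's standard `statistics.quantiles` with the inclusive method. The group labels and values are hypothetical, and whisker and outlier rules (which vary between tools) are left out.

```python
import statistics

def five_number_summary(values):
    """Minimum, quartiles, and maximum: the numbers a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }

# Hypothetical measurements for two categories
groups = {
    "A": [1, 2, 3, 4, 5],
    "B": [2, 4, 6, 8, 10],
}

summaries = {name: five_number_summary(vals) for name, vals in groups.items()}
for name, s in summaries.items():
    print(name, s)
```

Placing the boxes side by side on a shared axis is what makes the distributions directly comparable.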
Violin plots combine box plots with kernel density estimation.
Jittered strip charts show every individual data point, adding a small random offset so that overlapping points remain visible.
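Jittering can be sketched as adding a small uniform random offset to each point's position along the categorical axis. The `jitter` helper, the offset width, and the sample positions are hypothetical; the seed is fixed only so the output is reproducible.

```python
import random

def jitter(positions, width=0.2, seed=0):
    """Shift each position by a random offset in [-width, width]."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    return [p + rng.uniform(-width, width) for p in positions]

# Hypothetical: six points that all sit at category position 1
positions = [1, 1, 1, 1, 1, 1]
jittered = jitter(positions, width=0.2, seed=42)
print(jittered)
```

Only the position along the categorical axis is jittered; the measured values on the other axis stay exact.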
Strategies for encoding multiple variables:
Note
Dataset Variables:
Tip
📊 The Grammar of Graphics provides a systematic framework for creating any visualization
🔧 Complex visualizations are built from simple, reusable components
🎨 Visual variables (position, size, color, etc.) are tools for encoding information
📈 Choose chart types based on data types and relationships
🔄 Iteration and layering lead to rich, informative graphics
🌐 Course website: https://simonesantoni.github.io/data-viz-smm635
💬 Office hours: Wednesdays 3-5 PM
SMM635 - Data Visualization | Week 2