
Key Questions:
Topics:
Networks are not random - they exhibit consistent structural patterns:
Central Question: What social, biological, or physical processes create these patterns?

Typical network: hubs, clusters, paths
Two complementary perspectives:
Challenge: Disentangling these effects requires sophisticated statistical models
Major categories of network formation processes:
| Category | Description | Example |
|---|---|---|
| Dyadic | Two-node processes | Homophily, reciprocity |
| Triadic | Three-node processes | Transitivity, balance |
| Higher-order | Four+ node processes | K-cores, cliques |
| Node-level | Individual preferences | Preferential attachment |
| Exogenous | External factors | Geography, institutions |
Dyadic processes operate on pairs of nodes (dyads)
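A dyad in a directed network can be in one of three states: mutual (i ↔ j), asymmetric (i → j or j → i, but not both), or null (no tie)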
Dyadic processes explain why specific pairs of nodes connect

Definition: The tendency for directed edges to be reciprocated
Mechanism: If i → j exists, j → i becomes more likely
Examples:
Statistical signature: Higher proportion of mutual dyads than expected by chance
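The statistical signature above can be checked by counting dyad states and comparing the number of mutual dyads with a chance baseline. The following is a minimal sketch, assuming a directed edge list and a null model that places the same number of edges uniformly at random; the toy edge list and function names are illustrative, not from the source.

```python
from itertools import combinations

def dyad_census(nodes, edges):
    """Count mutual, asymmetric, and null dyads in a directed graph."""
    edge_set = set(edges)
    mutual = asymmetric = null = 0
    for i, j in combinations(nodes, 2):
        forward = (i, j) in edge_set
        backward = (j, i) in edge_set
        if forward and backward:
            mutual += 1
        elif forward or backward:
            asymmetric += 1
        else:
            null += 1
    return mutual, asymmetric, null

def expected_mutual(n_nodes, n_edges):
    """Expected mutual dyads when n_edges arcs are placed uniformly at random.

    With N = n(n-1) possible arcs, both arcs of a given dyad are present with
    probability m(m-1) / (N(N-1)), and there are N/2 dyads.
    """
    n_arcs = n_nodes * (n_nodes - 1)
    return n_edges * (n_edges - 1) / (2 * (n_arcs - 1))

# Toy directed network (assumed data): two reciprocated pairs, two one-way ties.
toy_nodes = range(5)
toy_edges = [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 4)]

census = dyad_census(toy_nodes, toy_edges)      # (2, 2, 6)
baseline = expected_mutual(5, len(toy_edges))   # about 0.79
```

Here the observed count of mutual dyads (2) exceeds the chance expectation (about 0.79), which is the pattern reciprocity predicts. Whether such a gap is statistically meaningful requires a formal test.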

Definition: “Similarity breeds connection” - nodes with similar attributes connect more
Types:
Formula: \(\Pr(e_{ij} = 1 \mid \phi(A_i, A_j))\)
where \(\phi\) is a similarity or distance function.
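One way to make the formula concrete is to pass the similarity score through a logistic link. This is a sketch under stated assumptions: the logistic form, the two example similarity functions, and the `baseline` and `strength` values are illustrative choices, not part of the source.

```python
import math

def phi_same(a_i, a_j):
    """Categorical similarity: 1 if the attributes match, else 0."""
    return 1.0 if a_i == a_j else 0.0

def phi_closeness(a_i, a_j):
    """Continuous similarity: negative absolute difference (0 is most similar)."""
    return -abs(a_i - a_j)

def tie_probability(a_i, a_j, phi, baseline=-2.0, strength=1.0):
    """Pr(e_ij = 1 | phi(A_i, A_j)) under an assumed logistic link.

    baseline and strength are hypothetical parameters chosen for illustration.
    """
    score = baseline + strength * phi(a_i, a_j)
    return 1.0 / (1.0 + math.exp(-score))

p_same = tie_probability("blue", "blue", phi_same)   # same attribute
p_diff = tie_probability("blue", "red", phi_same)    # different attribute
p_near = tie_probability(20, 21, phi_closeness)      # ages one year apart
p_far = tie_probability(20, 40, phi_closeness)       # ages twenty years apart
```

With a positive `strength`, more similar pairs get a higher tie probability; a negative value would represent heterophily.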
Empirical Evidence:

Critical distinction:
Problem: Observational data shows correlation, not causation!
Empirical example - Wikipedia editors (Crandall et al., 2008):
Solutions:
Heterophily: Attraction to difference (opposite of homophily)
Distance/Geography: Physical proximity increases connection probability
Status homophily: Nodes connect within same status level
Resource exchange: Complementary needs drive connections

Triadic processes involve three nodes forming triangles
Key configurations:
Core insight: The presence of two edges affects the probability of the third

Definition: “Friends of friends become friends”
Mechanism: If A→B and B→C, then A→C becomes more likely
Also called: Triadic closure, clustering
Examples:
Measurement: Clustering coefficient, transitivity ratio
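The transitivity ratio can be computed as the share of connected triples (two-paths) that are closed into triangles. A minimal sketch, assuming undirected ties for simplicity; the toy edge list is illustrative.

```python
from itertools import combinations

def transitivity_ratio(edges):
    """Closed triples divided by all connected triples (undirected graph)."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    triples = closed = 0
    for center, neighbors in adjacency.items():
        # Every pair of neighbors forms a two-path through `center`.
        for a, b in combinations(neighbors, 2):
            triples += 1
            if b in adjacency[a]:
                closed += 1  # the two-path is closed into a triangle
    return closed / triples if triples else 0.0

# Toy network (assumed data): a triangle 0-1-2 with one pendant node 3.
toy_edges = [(0, 1), (1, 2), (0, 2), (2, 3)]
ratio = transitivity_ratio(toy_edges)  # 3 closed triples out of 5 = 0.6
```

Each triangle contributes three closed triples (one per center node), so this equals the usual "3 × triangles / connected triples" definition.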

Kossinets & Watts (2006) [3]: Email network at large US university
Data: ~22,000 students over one year
Key findings:
Triadic closure (shared friends):
Focal closure (shared activities/classes):
Implication: Both structural and contextual effects matter!
[3] Kossinets, G., & Watts, D. J. (2006). Empirical analysis of an evolving social network. Science, 311(5757), 88-90.

Focal Closure (Easley & Kleinberg, 2010) [2]:
Membership Closure:
Key insight: Context matters as much as network structure!
[2] Easley, D., & Kleinberg, J. (2010). Networks, crowds, and markets: Reasoning about a highly connected world (Vol. 1). Cambridge: Cambridge University Press.

Structural balance (Heider, 1946): Cognitive consistency in signed networks
Balanced triangles (all relationships consistent):
Unbalanced triangles (tension):
Prediction: Networks evolve toward balanced states
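In the standard formulation of structural balance, a triangle is balanced when the product of its three edge signs is positive. The sketch below applies that rule to a small signed network; the data and function name are illustrative assumptions.

```python
from itertools import combinations

def classify_triangles(signed_edges):
    """Split complete triangles into balanced and unbalanced.

    signed_edges maps frozenset({u, v}) to +1 (positive tie) or -1 (negative tie).
    """
    nodes = sorted({node for pair in signed_edges for node in pair})
    balanced, unbalanced = [], []
    for a, b, c in combinations(nodes, 3):
        pairs = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if not all(pair in signed_edges for pair in pairs):
            continue  # not a complete triangle
        product = 1
        for pair in pairs:
            product *= signed_edges[pair]
        (balanced if product > 0 else unbalanced).append((a, b, c))
    return balanced, unbalanced

# Toy signed network (assumed data).
toy_signs = {
    frozenset(("A", "B")): +1,
    frozenset(("B", "C")): +1,
    frozenset(("A", "C")): +1,
    frozenset(("A", "D")): -1,
    frozenset(("B", "D")): -1,
    frozenset(("C", "D")): +1,
}
balanced, unbalanced = classify_triangles(toy_signs)
```

Triangles with zero or two negative ties come out balanced; those with one or three negative ties come out unbalanced.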
16 possible triadic configurations (triad census) [1]:
Key patterns:
Different processes favor different configurations
Each tells us something about network formation mechanisms!
[1] Batagelj, V., & Mrvar, A. A subquadratic triad census algorithm for large sparse networks with small maximum degree. University of Ljubljana. http://vlado.fmf.uni-lj.si/pub/networks/doc/triads/triads.pdf

Transitivity: A→B→C implies A→C (hierarchical)
Cyclicity: A→B→C implies C→A (circular)
Distinction matters for understanding network function!
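The two closure patterns can be told apart by examining every directed two-path A→B→C and checking which third edge, if any, is present. A minimal sketch with illustrative toy data:

```python
from itertools import permutations

def count_closures(edges):
    """Return (two_paths, transitive, cyclic) counts for a directed graph."""
    edge_set = set(edges)
    nodes = {node for edge in edge_set for node in edge}
    two_paths = transitive = cyclic = 0
    for a, b, c in permutations(nodes, 3):
        if (a, b) in edge_set and (b, c) in edge_set:
            two_paths += 1
            if (a, c) in edge_set:
                transitive += 1   # A→B→C closed by A→C
            if (c, a) in edge_set:
                cyclic += 1       # A→B→C closed by C→A
    return two_paths, transitive, cyclic

hierarchy = [("A", "B"), ("B", "C"), ("A", "C")]
cycle = [("A", "B"), ("B", "C"), ("C", "A")]

hierarchy_counts = count_closures(hierarchy)  # (1, 1, 0)
cycle_counts = count_closures(cycle)          # (3, 0, 3)
```

The hierarchy contains one two-path, closed transitively; the cycle contains three two-paths, all closed cyclically.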

Challenge: Network data violates independence assumptions
Classical approach fails: Standard statistical tests assume independent observations
Solutions:
Logic:
Key decision: What to preserve when randomizing?

In network analysis, randomization tests of this kind are often called Conditional Uniform Graph (CUG) tests.
Purpose: Test whether an observed network differs significantly from random networks
Fundamental Aspects:
Interpretation:
- p < 0.05: The observed statistic differs significantly from what random networks produce
- p ≥ 0.05: Cannot reject the null hypothesis that the observed statistic arose from a random network
Applications: Detect non-random patterns in social networks, communication networks, biological networks
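A test of this kind can be sketched as a simulation: hold some features of the observed network fixed, draw many random networks with those features, and see how often the random statistic is at least as large as the observed one. The version below is an assumption-laden sketch: it conditions on the number of nodes and edges only, uses the count of mutual dyads as the statistic, and runs on made-up data.

```python
import random
from itertools import permutations

def count_mutual(edges):
    """Number of reciprocated dyads in a directed edge list (integer node ids)."""
    edge_set = set(edges)
    return sum(1 for (i, j) in edge_set if i < j and (j, i) in edge_set)

def cug_test(n_nodes, edges, statistic, n_sims=2000, seed=0):
    """One-sided test conditioning on network size and number of edges."""
    rng = random.Random(seed)
    possible_arcs = list(permutations(range(n_nodes), 2))
    observed = statistic(edges)
    at_least_as_large = 0
    for _ in range(n_sims):
        random_edges = rng.sample(possible_arcs, len(edges))
        if statistic(random_edges) >= observed:
            at_least_as_large += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    p_value = (at_least_as_large + 1) / (n_sims + 1)
    return observed, p_value

# Toy network (assumed data): 8 nodes, 10 edges arranged as 5 mutual pairs.
toy_edges = [(0, 1), (1, 0), (2, 3), (3, 2), (4, 5), (5, 4),
             (6, 7), (7, 6), (0, 2), (2, 0)]
observed, p_value = cug_test(8, toy_edges, count_mutual)
```

The choice of what to hold fixed (size, density, dyad census, degree sequence) defines the null hypothesis, so the same network can look "random" under one conditioning and "non-random" under another.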
Quadratic Assignment Procedure: Test relationship between network matrices
Use case: Does network Y depend on network X?
Example: Does friendship (Y) depend on geographic proximity (X)?
Procedure:
Advantage: Accounts for network dependencies
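The core of QAP can be sketched as follows: compute the correlation between the off-diagonal cells of X and Y, then repeatedly relabel the nodes of Y (permuting rows and columns together) to build a reference distribution. This is a simplified illustration of the correlation variant on made-up data, not a full regression implementation.

```python
import random

def offdiag_corr(x, y):
    """Pearson correlation between the off-diagonal cells of two square matrices."""
    n = len(x)
    xs = [x[i][j] for i in range(n) for j in range(n) if i != j]
    ys = [y[i][j] for i in range(n) for j in range(n) if i != j]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5

def qap_test(x, y, n_perms=2000, seed=0):
    """One-sided QAP test: is X more correlated with Y than with relabeled Y?"""
    rng = random.Random(seed)
    n = len(x)
    observed = offdiag_corr(x, y)
    hits = 0
    for _ in range(n_perms):
        order = list(range(n))
        rng.shuffle(order)
        # Permute rows and columns with the same node relabeling.
        y_perm = [[y[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if offdiag_corr(x, y_perm) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perms + 1)

# Toy data (assumed): 10 people in two dorms; X = same dorm, Y = friendship.
dorm = [0] * 5 + [1] * 5
proximity = [[int(i != j and dorm[i] == dorm[j]) for j in range(10)] for i in range(10)]
friendship = [row[:] for row in proximity]  # friendship follows dorm exactly here
observed_r, p_value = qap_test(proximity, friendship)
```

Relabeling whole nodes keeps each network's internal structure intact while breaking the link between X and Y, which is why the procedure respects row and column dependence.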
Core idea: Model probability of observing a network as function of its features
Form: \[P(Y = y) = \frac{1}{\kappa} \exp\left(\sum_k \theta_k s_k(y)\right)\]
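Where \(s_k(y)\) are network statistics (counts of configurations such as edges or triangles), \(\theta_k\) are the corresponding coefficients, and \(\kappa\) is the normalizing constant obtained by summing over all possible networks.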
Think of it as logistic regression for networks
Instead of: P(person votes yes) = f(age, income, …)
We have: P(network has this structure) = f(edges, triangles, homophily, …)
Key insight: Model entire network as outcome, not individual ties
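For a very small network the ERGM distribution can be computed exactly by enumerating every possible graph. The sketch below does this for undirected graphs on four nodes with two statistics (edges and triangles); the coefficient values are illustrative assumptions, and real software estimates them from data instead.

```python
import math
from itertools import combinations, product

NODES = range(4)
DYADS = list(combinations(NODES, 2))  # 6 possible undirected ties

def network_stats(graph):
    """Return (edge count, triangle count) for a set of dyads (i, j), i < j."""
    triangles = sum(
        1 for a, b, c in combinations(NODES, 3)
        if {(a, b), (b, c), (a, c)} <= graph
    )
    return len(graph), triangles

def ergm_distribution(theta):
    """Exact P(Y = y) for every graph on NODES, plus the constant kappa."""
    graphs = [
        frozenset(dyad for dyad, present in zip(DYADS, bits) if present)
        for bits in product([0, 1], repeat=len(DYADS))
    ]
    weights = [
        math.exp(sum(t * s for t, s in zip(theta, network_stats(g))))
        for g in graphs
    ]
    kappa = sum(weights)
    return {g: w / kappa for g, w in zip(graphs, weights)}, kappa

# Hypothetical coefficients: ties are costly (-1.0), triangles are rewarded (+1.0).
distribution, kappa = ergm_distribution(theta=(-1.0, 1.0))
uniform, kappa_uniform = ergm_distribution(theta=(0.0, 0.0))  # all graphs equally likely

empty_graph = frozenset()
complete_graph = frozenset(DYADS)
```

With all coefficients at zero, every one of the 64 graphs has probability 1/64; nonzero coefficients shift probability toward graphs rich in the rewarded configurations.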
Advantages:
| Term | Effect | Interpretation |
|---|---|---|
| `edges` | Density | Overall connection probability |
| `mutual` | Reciprocity | Tendency for reciprocation |
| `triangle` | Transitivity | Tendency for closure |
| `nodematch("attr")` | Homophily | Same-attribute connection |
| `nodecov("attr")` | Attribute effect | Node attribute influence |
| `gwesp` | Shared partners | Weighted transitivity |
Model specification: Choose terms based on theory!
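Positive coefficient (θ > 0): Configuration is more likely than expected by chance, conditional on the other terms in the model
Negative coefficient (θ < 0): Configuration is less likely than expected by chance, conditional on the other terms in the model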
Example interpretation:
edges: -2.5 (low baseline density)
mutual: 1.8 (strong reciprocity)
triangle: 0.4 (moderate transitivity)
nodematch("gender"): 0.6 (gender homophily)
Reading: Controlling for density, reciprocity, and triangles, same-gender ties are more likely than mixed-gender ties (the conditional odds of a tie are multiplied by exp(0.6) ≈ 1.8)
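ERGM coefficients act on the conditional log-odds of a single tie: adding a tie changes each statistic by some amount, and the log-odds of that tie is the coefficient-weighted sum of those changes. The sketch below applies this to the example coefficients; the dictionary keys and the specific tie scenarios are illustrative.

```python
import math

# Coefficients from the example interpretation above.
theta = {"edges": -2.5, "mutual": 1.8, "triangle": 0.4, "nodematch.gender": 0.6}

def conditional_tie_probability(change):
    """Probability of a tie given the rest of the network.

    `change` maps each statistic to how much it would increase
    if the tie were added (its change statistic).
    """
    log_odds = sum(theta[name] * delta for name, delta in change.items())
    return 1.0 / (1.0 + math.exp(-log_odds))

# A tie that only adds an edge: about 0.076
baseline = conditional_tie_probability({"edges": 1})
# A same-gender tie: about 0.130
same_gender = conditional_tie_probability({"edges": 1, "nodematch.gender": 1})
# A tie that reciprocates an existing one: about 0.332
reciprocating = conditional_tie_probability({"edges": 1, "mutual": 1})
# Odds ratio for same-gender versus mixed-gender ties: about 1.82
gender_odds_ratio = math.exp(theta["nodematch.gender"])
```

Because terms such as `mutual` and `triangle` depend on other ties, these probabilities are conditional on the rest of the network rather than independent tie probabilities.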
Degeneracy problem: Some specifications lead to extreme networks
Computational challenge: Calculating κ is often intractable
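The difficulty comes from the number of networks that κ sums over. A quick calculation, assuming directed networks without self-loops:

```python
def n_directed_networks(n_nodes):
    """Number of possible directed networks: one on/off choice per ordered pair."""
    return 2 ** (n_nodes * (n_nodes - 1))

small = n_directed_networks(3)    # 64
medium = n_directed_networks(5)   # 1,048,576
large = n_directed_networks(30)   # a number with hundreds of digits
```

Even a 30-node network is far beyond exhaustive enumeration.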
Solutions:
So far: Models of static networks (single time point)
Reality: Networks evolve over time
Dynamic approaches:
Key assumption: Actors make rational decisions about ties
Process:
Utility function includes:
Software: RSiena package in R
Advantage: Separates selection from influence effects!
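The actor-oriented logic can be illustrated with a single simplified "mini-step": one actor considers toggling each outgoing tie (or changing nothing) and chooses among the resulting networks with multinomial-logit probabilities. This is a loose sketch, not RSiena's API; the evaluation function, its two effects, and the parameter values are all hypothetical.

```python
import math

def evaluation(actor, edges, beta_outdegree=-1.5, beta_reciprocity=2.0):
    """Hypothetical utility: ties are costly, reciprocated ties are rewarded."""
    out_ties = [(i, j) for (i, j) in edges if i == actor]
    reciprocated = sum(1 for (i, j) in out_ties if (j, i) in edges)
    return beta_outdegree * len(out_ties) + beta_reciprocity * reciprocated

def ministep_probabilities(actor, nodes, edges):
    """Choice probabilities over 'no change' (key None) or toggling actor -> j."""
    edges = frozenset(edges)
    options = {None: edges}
    for j in nodes:
        if j != actor:
            # Symmetric difference adds the tie if absent, removes it if present.
            options[j] = edges ^ {(actor, j)}
    scores = {choice: math.exp(evaluation(actor, net)) for choice, net in options.items()}
    total = sum(scores.values())
    return {choice: score / total for choice, score in scores.items()}

# Toy state (assumed data): node 1 has sent a tie to node 0; node 0 now moves.
probabilities = ministep_probabilities(actor=0, nodes=[0, 1, 2], edges={(1, 0)})
```

Under these assumed parameters, reciprocating the tie from node 1 is the most likely move, keeping the network unchanged is second, and sending an unreciprocated tie to node 2 is least likely.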

| Approach | Time | Question | Strength |
|---|---|---|---|
| ERGM | Static | What structure? | Multiple mechanisms |
| SAOM | Panel | How does it change? | Selection vs influence |
| REM | Continuous | When do events occur? | Precise timing |
| TERGM | Discrete steps | What changes? | Network evolution |
Choice depends on:
Network structure emerges from systematic processes, not randomness
Dyadic processes (reciprocity, homophily) explain pairwise connections
Triadic processes (transitivity, balance) explain local clustering
Statistical tests must account for network dependencies (permutation, QAP)
ERGMs model network probability as function of multiple mechanisms
Dynamic models capture how networks evolve over time
Next steps: Applying these models to real data!
Core texts:
Software:
statnet package (R): ERGMs
RSiena package (R): SAOMs
igraph package (R/Python): Basic network analysis
Online resources: INSNA workshops, summer institutes