Types of Two-Level Branch Predictors

Detailed Explanation of the 9 Types of Two-Level Adaptive Branch Predictors
The Two-Level Adaptive Branch Predictor, introduced by Yeh and Patt (1991), is an
advanced dynamic branch prediction technique that achieves high accuracy by using two
levels of branch history tracking:
1. First Level (Branch History Register - BHR or BHSR) → Captures recent branch
outcomes.
2. Second Level (Pattern History Table - PHT) → Uses history patterns to predict
future branches.
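The interaction of the two levels can be sketched as follows. This is an illustrative toy, not any particular processor's implementation; the 4-bit history length and the "weakly not-taken" initial state are assumptions of the sketch.

```python
# Minimal two-level predictor core: a k-bit Branch History Register (BHR)
# indexes a Pattern History Table (PHT) of 2-bit saturating counters.

K = 4                      # history length (assumption for this sketch)
bhr = 0                    # last K outcomes packed into an integer, newest at the LSB
pht = [1] * (1 << K)       # 2-bit counters, initialised to "weakly not-taken"

def predict():
    """Predict taken if the counter for the current history pattern is >= 2."""
    return pht[bhr] >= 2

def update(taken):
    """Train the counter for the current pattern, then shift the outcome into the BHR."""
    global bhr
    if taken:
        pht[bhr] = min(3, pht[bhr] + 1)
    else:
        pht[bhr] = max(0, pht[bhr] - 1)
    bhr = ((bhr << 1) | int(taken)) & ((1 << K) - 1)
```

Trained on a repeating loop pattern such as taken-taken-taken-not-taken, the 4-bit history uniquely identifies each position in the pattern, so the predictor becomes essentially perfect after a short warm-up.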
Yeh and Patt categorized nine different implementations of two-level predictors based on:
First-Level Branch History Mechanism (Global G, Per-Address P, Per-Set S)
Second-Level Pattern Table Mechanism (Global g, Per-Address p, Per-Set s)
These implementations use the notation XAy, where:
X is the first-level history type (G, P, or S),
A stands for Adaptive (the saturating-counter FSM used for prediction),
y is the second-level pattern table type (g, p, or s).
1. First-Level Mechanism (Branch History Register -
BHR)
The first level of the predictor determines how the branch history is maintained:
(G) Global History
Single global BHR shared by all branches.
Tracks the last N dynamic branches.
Advantage: Captures long-range correlations between branches.
Disadvantage: Can cause aliasing, where different branches interfere with each
other.
(P) Per-Branch (Per-Address) History
Each branch has its own individual history register.
Advantage: Avoids aliasing between different branches.
Disadvantage: Requires more storage.
(S) Per-Set History
Branches are divided into sets; each set shares a history register.
Uses hashing or PC-based partitioning.
Advantage: Reduces aliasing while using less storage than P.
Disadvantage: Some aliasing remains.
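The three first-level schemes differ only in which history register a given branch consults. A small sketch of that selection (the register names, the 4-set count, and the modulo set mapping are illustrative assumptions, not a real design):

```python
NUM_SETS = 4

global_bhr = 0                    # (G): a single register shared by every branch
per_branch_bhr = {}               # (P): one register per branch address
per_set_bhr = [0] * NUM_SETS      # (S): one register per set of branches

def history_for(pc, scheme):
    """Return the history register contents this branch would use."""
    if scheme == "G":
        return global_bhr
    if scheme == "P":
        return per_branch_bhr.get(pc, 0)        # each branch sees only its own past
    if scheme == "S":
        return per_set_bhr[pc % NUM_SETS]       # simple PC-based set partitioning
    raise ValueError(scheme)
```

Note how (P) needs a register per static branch while (S) bounds storage at NUM_SETS registers, which is exactly the aliasing/storage trade-off described above.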
2. Second-Level Mechanism (Pattern History Table -
PHT)
The second level consists of Pattern History Tables (PHTs) that use 2-bit saturating
counters to make predictions based on history patterns.
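Each PHT entry is a tiny four-state FSM. A sketch of the standard 2-bit saturating counter, with the usual state encoding:

```python
# 0 = strongly not-taken, 1 = weakly not-taken,
# 2 = weakly taken,      3 = strongly taken.

def counter_predict(state):
    """Predict taken in the two upper states."""
    return state >= 2

def counter_update(state, taken):
    """Move one step toward the actual outcome, saturating at 0 and 3."""
    return min(3, state + 1) if taken else max(0, state - 1)
```

The two-step hysteresis is why a single anomalous outcome (e.g. a loop exit) does not immediately flip a strongly-trained prediction.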
(g) Global Pattern Table
Single shared PHT for all branches.
Indexed using global branch history.
Advantage: Captures long-range correlations.
Disadvantage: High aliasing.
(p) Per-Branch (Per-Address) Pattern Table
Each branch gets its own PHT.
Indexed using per-branch history.
Advantage: Reduces aliasing significantly.
Disadvantage: Requires more storage.
(s) Per-Set Pattern Table
Groups of branches share a PHT.
Indexed using set-based partitioning (e.g., hashing PC bits).
Advantage: Reduces storage compared to p while limiting aliasing.
Disadvantage: Some aliasing may still occur.
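Mirroring the first level, the second-level variants differ only in which PHT a branch consults. A sketch of the three organisations (table counts and sizes are illustrative assumptions):

```python
NUM_SETS = 4
K = 4                                            # history bits indexing each PHT

global_pht = [1] * (1 << K)                      # (g): one table for all branches
per_branch_pht = {}                              # (p): one table per branch address
per_set_pht = [[1] * (1 << K) for _ in range(NUM_SETS)]  # (s): one table per set

def pht_for(pc, scheme):
    """Return the pattern table this branch would index with its history."""
    if scheme == "g":
        return global_pht
    if scheme == "p":
        return per_branch_pht.setdefault(pc, [1] * (1 << K))
    if scheme == "s":
        return per_set_pht[pc % NUM_SETS]
    raise ValueError(scheme)
```

Combining a first-level choice (G/P/S) with a second-level choice (g/p/s) yields the nine predictor types in the next section.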
3. The 9 Types of Two-Level Adaptive Branch Predictors
| Predictor Type | First-Level (History) | Second-Level (Pattern Table) | Characteristics |
|---|---|---|---|
| GAg | Global (G) | Global (g) | Simple and compact, but suffers high aliasing. |
| GAp | Global (G) | Per-Address (p) | Reduces aliasing while using less storage than PAp. |
| GAs | Global (G) | Per-Set (s) | Balances aliasing and storage, but still shares history globally. |
| PAg | Per-Address (P) | Global (g) | Separate history for each branch, but a shared prediction table. |
| PAp | Per-Address (P) | Per-Address (p) | Highest accuracy, but largest storage cost. |
| PAs | Per-Address (P) | Per-Set (s) | Compromise between PAp and PAg. |
| SAg | Per-Set (S) | Global (g) | Reduces aliasing but has limited correlation tracking. |
| SAp | Per-Set (S) | Per-Address (p) | Tracks history per set, but uses separate tables per branch. |
| SAs | Per-Set (S) | Per-Set (s) | Balanced tradeoff between aliasing and storage. |
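Putting the pieces together, the simplest variant, GAg, can be sketched as a small class (an illustrative toy; the 4-bit history and counter initial values are assumptions):

```python
class GAg:
    """One global BHR indexing one global PHT of 2-bit saturating counters."""

    def __init__(self, k=4):
        self.k = k
        self.bhr = 0                  # global history, newest outcome at the LSB
        self.pht = [1] * (1 << k)     # counters start at "weakly not-taken"

    def predict(self, pc):
        # GAg ignores the branch address entirely: global history is the index.
        return self.pht[self.bhr] >= 2

    def update(self, pc, taken):
        c = self.pht[self.bhr]
        self.pht[self.bhr] = min(3, c + 1) if taken else max(0, c - 1)
        self.bhr = ((self.bhr << 1) | int(taken)) & ((1 << self.k) - 1)
```

Because `pc` is unused, two different branches that reach the same global history share (and can corrupt) the same counter, which is the aliasing weakness noted in the table.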
4. Why is Two-Level Prediction Called Adaptive?
The two-level adaptive predictor is adaptive because:
✔ It learns dynamic branch behavior over time.
✔ It adjusts to changing patterns in branch execution.
✔ It captures correlations between multiple branches, unlike single-level predictors that only track individual branches.
By using historical branch behavior, these predictors dynamically refine their predictions,
reducing mispredictions and improving performance in modern superscalar processors.
5. Real-World Use Cases
Many modern CPUs use variants of two-level adaptive prediction:
🔹 Intel Processors (Pentium Pro, Pentium 4, Core i7)
Use GAp & PAp-based hybrid predictors.
Integrated with gshare (global history XOR’d with branch address).
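The gshare indexing mentioned above is simple to state: XOR the global history with branch-address bits so that different branches with the same history land on different PHT entries. A sketch (the 10-bit index width and the 2-bit PC shift for word alignment are illustrative assumptions):

```python
K = 10                                   # PHT index width (assumption)

def gshare_index(pc, ghr):
    """Fold the branch address into the global history to spread out aliases."""
    return ((pc >> 2) ^ ghr) & ((1 << K) - 1)
```

Compared with plain GAg indexing (history alone), the XOR lets two branches that happen to share the same recent history still train separate counters most of the time.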
🔹 AMD Processors (Athlon, Opteron, Ryzen)
Implement tournament predictors combining GAp and PAp.
🔹 IBM POWER Processors
POWER4 & POWER5 used PAs and hybrid predictors.
🔹 Gaming Consoles & GPUs
Console CPU cores use similar two-level-style predictors; GPUs, by contrast, largely sidestep branch prediction, handling divergence with SIMT execution masks instead.
6. Summary
1. Two-level adaptive prediction improves accuracy by tracking historical branch patterns.
2. Nine different variations exist, balancing accuracy, storage, and aliasing reduction.
3. Global (G), Per-Address (P), and Per-Set (S) history tracking determines first-level behavior.
4. Global (g), Per-Address (p), and Per-Set (s) pattern tables define second-level behavior.
5. Adaptive predictors dynamically learn and adjust to changing program behavior, making them highly efficient for modern processors.
These predictors are widely used in high-performance CPUs and help maintain pipeline
efficiency in superscalar architectures. 🚀