Chapter 6: Probabilistic Analysis and Randomized Algorithms
This chapter introduces:
• What randomized algorithms are
• Why they are useful
• How to analyze them using probability
• Their time and space complexity
• Examples (Randomized QuickSort, Hashing, Max-Cut)
Randomized algorithms
• Randomized algorithms are algorithms that make
use of random choices.
In addition to the regular input, these algorithms
take random bits as a second input.
• They differ from deterministic algorithms because:
- They may produce different outputs for the
same input
- Their running time or result may depend on
randomness
• Randomization often simplifies design and
improves performance
Why Randomization?
Randomization is used because it brings major advantages:
1. Simplicity
Many deterministic algorithms are complex. The randomized
version is often shorter, easier, and avoids special-case handling.
2. Performance improvement
Randomized algorithms may avoid worst-case behaviors
common to deterministic ones.
Example: QuickSort is slow on sorted input — unless pivot is
chosen randomly.
3. Better average-case behavior
Randomization removes dependency on "bad" inputs chosen
intentionally or accidentally.
4. Applications in hard problems
For many NP-hard problems (Max-Cut), only randomized
approximation algorithms are known.
Randomized algorithms
Two Types of Randomized Algorithms
Monte Carlo Algorithms
•Always run in fixed time
•Might produce incorrect answers
•The probability of error can be made arbitrarily small by
repetition
•Example: randomized Min-Cut
Las Vegas Algorithms
•Always produce a correct answer
•Running time depends on luck (random choices)
•Expected time is polynomial
•Example: randomized QuickSort (pivot random but
result exact)
Probability Theory
Randomized algorithms are analyzed using
probability theory, especially:
Random variables
Quantities depending on random choices (running
time, number of satisfied clauses, etc.)
Expected value
E(X)=∑x⋅P(X=x)
Represents the average performance over all
possible random choices.
Probability of correctness
Ensures that even though randomness is used, results
are reliable.
Example
Expected Running Time
For randomized algorithms, expected running
time is key.
Example:
•QuickSort worst-case: O(n²)
•But expected time: O(n log n)
Because a random pivot typically splits the array
well
Monte Carlo algorithms
Expected Correctness
Monte Carlo algorithms:
•May fail
•But if repeated many times, the
probability of failure becomes extremely
small
Example:
•If error probability = 1/2
•After k repetitions, error = (1/2)^k
•20 repetitions → error ≈ 1 in 1,000,000
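The error amplification above is easy to check numerically; a minimal sketch, assuming each repetition fails independently with probability p:

```python
# Error probability of a Monte Carlo algorithm after k independent
# repetitions: each trial fails with probability p, so all k fail
# together with probability p^k.
def repeated_error(p, k):
    return p ** k

print(repeated_error(0.5, 1))    # 0.5
print(repeated_error(0.5, 20))   # ~9.54e-07, about 1 in a million
```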
Expectation of a Random Variable
This is the general definition of expectation:
E(X)=∑x⋅P(X=x)
Meaning:
•X is a random variable
•It can take different values x
•Each value occurs with probability P(X=x)
•The expectation is the weighted average of all possible
values
This is the general formula used everywhere.
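As a worked instance of the formula, consider a fair six-sided die (an illustrative example, not from the text):

```python
from fractions import Fraction

# E(X) = sum over x of x * P(X = x) for a fair six-sided die:
# each face x in {1, ..., 6} occurs with probability 1/6.
p = Fraction(1, 6)
expectation = sum(x * p for x in range(1, 7))
print(expectation)   # 7/2
```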
Expected Space
Expected Space of an Algorithm
Now apply the general formula to a specific random variable:
Let
•X = “space used by the algorithm”
•It can take different values:
•Space(1),Space(2),… depending on the random choices made
by the algorithm.
So:
E[Space] = ∑_i Space(i) ⋅ P(i)
This is the same formula, but with different variable names.
How to compute P(i) — The general rule
If an algorithm makes several independent
random choices:
P(i) = ∏_{j=1}^{k} P(choice_j)
Multiply the probability of every random
decision along the path.
Example 1 — Randomized QuickSort
You pick a pivot uniformly at random from the array.
• Suppose there are n elements
• Probability to choose any pivot = 1/n
If the algorithm makes 3 random pivot choices:
• Pivot 1 probability = 1/n
• Pivot 2 probability = 1/(n−1)
• Pivot 3 probability = 1/(n−2)
So:
P(i) = (1/n) ⋅ (1/(n−1)) ⋅ (1/(n−2))
This is the probability of that exact sequence of pivots.
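For a concrete number, the product rule can be evaluated for a hypothetical array of n = 5 elements:

```python
from fractions import Fraction

# Probability of one exact sequence of 3 random pivot choices
# when n = 5: multiply the probability of each independent choice.
n = 5
p = Fraction(1, n) * Fraction(1, n - 1) * Fraction(1, n - 2)
print(p)   # 1/60
```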
Example: QuickSort
Randomized QuickSort
Steps:
1. Pick a pivot randomly
2. Partition the array into elements < and > pivot
3. Apply the same method recursively
Why does randomness help?
Because choosing the pivot randomly avoids:
•Already sorted arrays
•Reverse sorted arrays
•Repeated worst-case splits
Expected performance:
T(n) = O(n log n)
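The three steps above can be sketched as follows (a minimal list-based version for clarity; in-place partitioning is the usual implementation):

```python
import random

# Randomized QuickSort: the output is always correctly sorted;
# only the running time depends on the random pivot choices.
def randomized_quicksort(a):
    if len(a) <= 1:
        return a
    pivot = random.choice(a)                    # step 1: random pivot
    less    = [x for x in a if x < pivot]       # step 2: partition
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    # step 3: recurse on both sides
    return randomized_quicksort(less) + equal + randomized_quicksort(greater)

print(randomized_quicksort([3, 1, 4, 1, 5, 9, 2, 6]))   # [1, 1, 2, 3, 4, 5, 6, 9]
```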
Randomized Hashing
Randomization is used to pick a hash function from a
family.
Benefits:
•Reduces chance of collision
•Ensures average-case performance stays good
•Search / insert / delete in O(1) expected time
Applications:
•Symbol tables
•Dictionaries
•Compiler implementation
•Databases
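One way to pick a hash function at random, sketched with the classic universal family h(x) = ((a·x + b) mod p) mod m; the parameters p and m below are illustrative choices, not from the text:

```python
import random

# Pick a hash function at random from the universal family
# h(x) = ((a*x + b) mod p) mod m, with p a prime larger than any key.
def random_hash(p=10**9 + 7, m=128):
    a = random.randrange(1, p)    # random nonzero multiplier
    b = random.randrange(0, p)    # random offset
    return lambda x: ((a * x + b) % p) % m

h = random_hash()
print(h(42))   # some bucket index in [0, 128)
```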
Randomized Min-Cut
Monte Carlo Example: Randomized Min-Cut
What is the minimum cut problem?
We have a graph (vertices connected by
edges).
The goal is to:
👉 separate the vertices into two groups
👉 by cutting the minimum number of edges
Randomized Min-Cut
This algorithm is randomized:
We choose an edge at random
We merge (contract) the two vertices connected by this
edge
We repeat until only 2 vertices remain
The edges remaining between these two vertices form a
cut
⚠️ The result may be incorrect, but if we repeat the
algorithm multiple times, the probability of finding the
true minimum cut increases.
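A minimal sketch of the contraction procedure, representing the graph as a list of edges (the graph used below is the A, B, C, D example from the next slides):

```python
import random

# Karger's randomized min-cut (Monte Carlo): contract a random edge
# until two super-vertices remain; the surviving edges form a cut.
def karger_cut(edges):
    edges = list(edges)
    vertices = {v for e in edges for v in e}
    while len(vertices) > 2:
        u, v = random.choice(edges)       # choose an edge at random
        vertices.discard(v)               # merge v into u
        edges = [(u if a == v else a, u if b == v else b)
                 for a, b in edges
                 if {a, b} != {u, v}]     # drop edges that become self-loops
    return edges                          # edges crossing the found cut

# Graph from the slides: A-B, A-C, A-D, B-D, C-D.
g = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("C", "D")]
best = min((karger_cut(g) for _ in range(100)), key=len)
print(len(best))   # very likely 2 here (isolating B or C)
```

Repeating the run and keeping the smallest cut found is exactly the amplification idea described above.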
Simple Example
Starting graph
Let’s suppose we have this graph:
A ----- B
| \     |
|  \    |
|   \   |
C ----- D
Simple Example
Random execution of the algorithm
Step 1: choose an edge at random
Example: we choose A–D
→ We merge A and D → new vertex: AD
The graph becomes:

     AD
    /  \
   B    C

(the former A–B and B–D edges now run between AD and B; the former A–C and C–D edges run between AD and C)
Step 2: choose another random edge
Suppose we choose AD–B
→ We merge AD and B → new vertex: ADB
Only 2 vertices remain:

   ADB ----- C
Simple Example: Min-Cut
The edges between ADB and C
form the cut that was found.
This is not always the minimum cut, but repeating
the algorithm makes finding the true minimum cut likely
Randomized Min-Cut
Its main use is to determine the weak point of a
network or the most “fragile” separation between
two parts of a graph. Here is a more detailed
explanation:
Imagine a water network with 5 reservoirs connected by
pipes with different capacities.
The minimum cut tells you which connections to remove
so that water can no longer flow from one side to the
other, while cutting the smallest total capacity.
If you want to protect the network, you reinforce the
pipes in the minimum cut.
Primality Algorithm
import math

def deterministic_test(n):
    # Trial division: always correct, but slow for large n.
    if n <= 1:
        return "composite"
    if n in (2, 3):
        return "prime"
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:
            return "composite"
    return "prime"
Fermat Primality Test
•A simple randomized algorithm to test if a number
n>1 is prime.
•Type: Monte Carlo algorithm
• Runs fast
• May return “probably prime” for a composite
number
• Error can be reduced by repetition
Fermat Primality Test
Idea
•Based on Fermat’s Little Theorem:
•If p is prime and 1<a<p then
a^{p-1} ≡1 (mod p)
•Conversely:
•If a^{n-1} ≢ 1 (mod n), then n is definitely composite.
•Randomness comes from choosing a random base a.
Algorithm (one trial)
Input: integer n>2, random base a
Steps:
1. Pick a random integer a in [2, n−2]
2. Compute a^{n-1} mod n
3. If a^{n-1} ≢ 1 (mod n) → COMPOSITE
4. Otherwise → probably prime for this trial
Note: One trial is a Monte Carlo test; it may
be wrong when n is composite
Repetition to Reduce Error
•Repeat the test k times with
independent random bases a_1, a_2,
…, a_k
•If any trial detects composite →
return COMPOSITE
•If all trials say “probably prime” →
return PROBABLY PRIME
Error probability:
Total error = p^k (where p is the error probability of a single trial)
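The whole test, including the k repetitions, fits in a few lines; a sketch using Python's built-in modular exponentiation pow(a, n-1, n):

```python
import random

# Fermat primality test (Monte Carlo): try k independent random bases.
# A base a with a^(n-1) mod n != 1 proves n composite; otherwise
# n is only "probably prime" (Carmichael numbers can fool every
# coprime base).
def fermat_test(n, k=20):
    if n <= 3:
        return n in (2, 3)
    for _ in range(k):
        a = random.randrange(2, n - 1)   # random base in [2, n-2]
        if pow(a, n - 1, n) != 1:
            return False                 # definitely composite
    return True                          # probably prime

print(fermat_test(15))   # False (composite)
print(fermat_test(97))   # True  (probably prime)
```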
Test n = 15, one trial, base a = 2:
1. Compute 2^14 mod 15 using repeated squaring:
   2^1 = 2
   2^2 = 4
   2^4 = 16 ≡ 1 (mod 15)
   2^8 = 1^2 ≡ 1 (mod 15)
   2^14 = 2^8 ⋅ 2^4 ⋅ 2^2 ≡ 1 ⋅ 1 ⋅ 4 = 4 (mod 15)
2. Since 4 ≢ 1 (mod 15) → COMPOSITE
✅ Fermat test successfully detects that 15
is composite.
Monte Carlo Principle
•Randomness: choosing base a randomly
•Error possible: may return “probably prime”
for some composite numbers (Carmichael
numbers)
•Repetition: multiple independent trials
reduce the error probability
•Fast: each trial runs in polynomial time
Fermat test
•Fermat test is simple, fast, Monte
Carlo
•One trial can detect some
composites, but may fail for special
numbers
•Use k independent trials to make
error probability very small
•Base principle:
a^{n-1} ≢ 1 (mod n) ⟹ n is composite
Criteria         Randomized (Fermat/Miller-Rabin)   Deterministic (Trial Division/AKS)
Speed            Fast                               Slow
Certainty        Probably correct                   Absolutely correct
Large numbers    Good                               Bad
Small numbers    OK                                 Excellent
Las Vegas Algorithms
Principle
A Las Vegas algorithm is a randomized algorithm
that:
•Always returns a correct answer
•But the running time is random
•The algorithm may keep trying random choices until
it succeeds
•Expected running time is usually efficient
📌 Monte Carlo vs Las Vegas (1-line comparison):
•Monte Carlo: fast but may be wrong
•Las Vegas: always correct but running time may
vary
Randomized Search for a Maximum
You have an unsorted array, and you want to find the
maximum element.
A Las Vegas algorithm keeps picking random elements
until it samples the maximum.
Algorithm (Las Vegas version)
1. Pick a random element from the array.
2. If it is the maximum → return it (correct result).
3. Otherwise → pick again randomly.
4. Repeat until the maximum is found.
✔️ Always correct
✔️ Time is random (expected O(n) picks, since each pick hits the maximum with probability at least 1/n)
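A sketch of this Las Vegas search; the true maximum is computed once here purely to play the role of the "is it the maximum?" check (in general that check is an O(n) scan):

```python
import random

# Las Vegas maximum search: sample random positions until the sampled
# element is the maximum. The answer is always correct; only the
# number of picks is random.
def las_vegas_max(a):
    target = max(a)          # stands in for the "is it the maximum?" check
    picks = 0
    while True:
        picks += 1
        if random.choice(a) == target:
            return target, picks

value, picks = las_vegas_max([7, 3, 9, 1, 5])
print(value)   # 9
```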
Randomized QuickSort
Randomized QuickSort is the classic Las Vegas
example.
Idea
•Choose the pivot randomly
•Sorting result is always correct
•But running time depends on the luck of the random
pivot
Why Las Vegas?
•✔️ Always outputs a correctly sorted array
•❗ Running time (best/average/worst) depends on the
random pivot choice
Strengths of Randomized Algorithms
1. Simplicity
Randomized solutions are often shorter
and easier.
2. Speed
Often faster expected performance.
3. Avoids worst-case scenarios
Randomness avoids adversarial input.
4. Useful for NP-hard problems
Good approximations for Max-Cut,
Max-SAT, etc.
5. Parallelism
Multiple executions run independently.
Weaknesses
1. Monte Carlo errors
Solutions may be incorrect.
2. Las Vegas unpredictability
Running time may vary widely.
3. Harder to analyze
Expectations and probabilities can be hard to
compute.
4. Non-repeatability
Same input can produce different output.
5. No guarantee of worst-case performance
Randomized algorithms often lack a strict worst-
case bound.
Conclusion
•Randomization is a powerful tool
•Expected time analysis is more meaningful than worst-case
•Monte Carlo algorithms trade correctness for speed
•Las Vegas algorithms trade running-time predictability for
correctness
•Key examples include QuickSort, Hashing, Max-Cut, and primality testing