How I achieved a 17x-400x performance improvement by switching from offset to cursor-based pagination in my mass notification server
When your Firebase push notification server needs to process millions of user tokens, every query matters. I learned this the hard way when my offset-based pagination started taking 30+ seconds to process deep pages. Here's how I achieved up to a 400x performance improvement with one architectural change—and the critical bug I almost shipped to production.
The Problem: Offset Pagination at Scale
My original implementation used TypeORM's skip() and take() methods—the standard offset-based pagination approach that works beautifully for small datasets but becomes a performance nightmare at scale.
Before Refactoring
// notification.service.ts - Original Implementation
async getUserTokensForNotification(offset: number, limit: number) {
  const users = await this.userRepository
    .createQueryBuilder('user')
    .select(['user.push_token', 'user.cell_phone', 'user.logindate'])
    .orderBy('user.id', 'ASC')
    .skip(offset) // OFFSET in SQL
    .take(limit) // LIMIT in SQL
    .getMany();
  return users;
}

// Processing loop
for (let page = 0; page < totalPages; page++) {
  const offset = page * BATCH_SIZE;
  const tokens = await getUserTokensForNotification(offset, BATCH_SIZE);
  await sendNotifications(tokens);
}
This generated SQL like:
SELECT push_token, cell_phone, logindate
FROM users
ORDER BY id ASC
LIMIT 1000 OFFSET 50000;
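One detail the loop above elides is where totalPages comes from. Presumably it came from a separate COUNT query, something along these lines (illustrative, not the original code):

```typescript
// Illustrative only: the original snippet doesn't show how totalPages was computed.
// A COUNT query is the usual approach (and is itself another full-table cost).
const totalUsers = await this.userRepository.count();
const totalPages = Math.ceil(totalUsers / BATCH_SIZE);
```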
Why This Hurts Performance
The fundamental issue with OFFSET is what database engineers call the row skipping problem. Here's what happens under the hood:
- Database scans rows from the beginning
- Counts and discards all rows up to the OFFSET value
- Returns only the next LIMIT rows
- For OFFSET 100000, LIMIT 1000, the database processes 101,000 rows but returns only 1,000
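To put a number on that waste: the work grows with offset + limit, not with limit. A quick sanity check in code (illustrative back-of-the-envelope, real plans vary):

```typescript
// Rough cost model: rows the database must walk to satisfy OFFSET/LIMIT.
function rowsTouchedByOffset(offset: number, limit: number): number {
  return offset + limit; // rows skipped and discarded + rows actually returned
}

console.log(rowsTouchedByOffset(100_000, 1_000)); // 101000 rows of work for 1,000 results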
Real performance data from my PostgreSQL instance with 1M user records:
| Scenario | Query Time | Slowdown Factor |
|---|---|---|
| OFFSET 0 | 0.025ms | baseline |
| OFFSET 10,000 | 3.138ms | 125x slower |
| OFFSET 50,000 | 16.933ms | 677x slower |
| OFFSET 100,000 | 30.166ms | 1,207x slower |
As you paginate deeper, performance degrades linearly. When processing millions of tokens for mass notifications, pages 100+ were taking 30-80 seconds each.
Additional Problems I Encountered
Beyond performance, offset pagination caused operational issues:
- Duplicate notifications: When new users registered during processing, existing users shifted between pages, causing some to receive duplicate notifications
- Missing users: Deleted accounts caused page shifts, skipping some users entirely
- Non-deterministic ordering: without a unique, stable sort key, rows that tie on the sort column can come back in a different order between requests (a cheap mitigation is sketched below)
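That last issue has a cheap partial mitigation even before switching strategies: make the ORDER BY deterministic by adding a unique tie-breaker column. A minimal sketch, assuming a logindate business sort key and the column names used elsewhere in this post (not the exact production schema):

```typescript
import { ObjectLiteral, Repository } from 'typeorm';

// Sketch: deterministic offset pagination via a unique tie-breaker in ORDER BY.
async function getPageDeterministically(
  userRepository: Repository<ObjectLiteral>,
  offset: number,
  limit: number
): Promise<ObjectLiteral[]> {
  return userRepository
    .createQueryBuilder('user')
    .select(['user.push_token', 'user.cell_phone', 'user.logindate'])
    .orderBy('user.logindate', 'ASC') // business sort key (can have ties)
    .addOrderBy('user.id', 'ASC')     // unique tie-breaker keeps page boundaries stable
    .skip(offset)
    .take(limit)
    .getMany();
}
```

This doesn't fix the performance problem, but it does keep page boundaries stable between requests.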
The Solution: Cursor-Based Pagination
Cursor pagination (also called "keyset pagination" or "seek method") eliminates row scanning by using indexed WHERE clauses instead of OFFSET.
After Refactoring
// notification.service.ts - Refactored Implementation
async getUserTokensForNotification(
  lastSeq?: number,
  limit: number = 1000
): Promise<{ users: User[]; nextCursor: number | null }> {
  const queryBuilder = this.userRepository
    .createQueryBuilder('user')
    .select(['user.seq', 'user.push_token', 'user.cell_phone', 'user.logindate'])
    .orderBy('user.seq', 'ASC')
    .limit(limit + 1); // Fetch one extra to detect hasMore

  // Cursor-based filtering: seek directly to position
  if (lastSeq) {
    queryBuilder.where('user.seq > :lastSeq', { lastSeq });
  }

  const users = await queryBuilder.getMany();
  const hasMore = users.length > limit;
  if (hasMore) {
    users.pop(); // Remove the extra item
  }

  return {
    users,
    nextCursor: hasMore ? users[users.length - 1].seq : null
  };
}
// Processing loop - now cursor-based
let cursor: number | null = null;
do {
  const result = await getUserTokensForNotification(cursor, BATCH_SIZE);
  await sendNotifications(result.users);
  cursor = result.nextCursor;
} while (cursor !== null);
The generated SQL now uses an indexed seek:
-- First page
SELECT seq, push_token, cell_phone, logindate
FROM users
ORDER BY seq ASC
LIMIT 1001;
-- Subsequent pages (lastSeq = 1000)
SELECT seq, push_token, cell_phone, logindate
FROM users
WHERE seq > 1000
ORDER BY seq ASC
LIMIT 1001;
Why This Works So Much Better
The magic is in the WHERE seq > :lastSeq clause. With a proper index on the seq column, the database can:
- Jump directly to the position using the index (O(log n) operation)
- Read forward from that position
- No row scanning or counting required
With an index on seq in place:
CREATE INDEX idx_users_seq ON users(seq ASC);
The database execution plan changed from a sequential scan to an index seek:
Before (OFFSET 100,000):
Execution Time: 30.166ms
- Sequential Scan on users
- Rows Scanned: 101,000
- Rows Returned: 1,000
After (Cursor with Index):
Execution Time: 0.027ms (1,117x faster!)
- Index Scan using idx_users_seq
- Index Cond: (seq > 100000)
- Rows Scanned: 1,001
- Rows Returned: 1,000
Performance became constant regardless of position in the dataset.
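If you want to verify the plan on your own data, one way is to grab the exact SQL and parameters TypeORM emits and run EXPLAIN ANALYZE on it in psql. A sketch, assuming a repository for a user entity with the columns used in this post:

```typescript
import { ObjectLiteral, Repository } from 'typeorm';

// Sketch: print the generated cursor query plus its bound parameters so you can
// run EXPLAIN ANALYZE on it yourself.
function printCursorQuery(userRepository: Repository<ObjectLiteral>, lastSeq: number): void {
  const qb = userRepository
    .createQueryBuilder('user')
    .select(['user.seq', 'user.push_token'])
    .where('user.seq > :lastSeq', { lastSeq })
    .orderBy('user.seq', 'ASC')
    .limit(1001);

  // getQueryAndParameters() returns [sql, parameters]; prefix the SQL with
  // EXPLAIN ANALYZE in psql to confirm an Index Scan instead of a Seq Scan.
  const [sql, params] = qb.getQueryAndParameters();
  console.log(sql, params);
}
```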
Implementation Pitfall: The Infinite Loop Bug
Just when I thought the refactoring was complete, I discovered a critical bug that could have caused an infinite loop in production. This happened when I combined cursor pagination with in-memory filtering for complex business logic.
The Bug: Post-Query Filtering Breaks Cursor Logic
My notification system needed to filter users based on their last login date—but the date format in the database was inconsistent (mix of Korean locale strings like "2025-10-30오후 5:28:28" and standard datetime formats). Parsing this in SQL would be complex and slow, so I decided to filter in-memory after fetching.
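That in-memory filter (filterByLoginDate below) isn't shown in the original code; here is a minimal sketch of what it could look like, assuming both date formats start with a YYYY-MM-DD prefix as in the example above ("오후" means PM):

```typescript
interface LoginFilterOptions {
  minDaysSinceLogin: number;
  maxDaysSinceLogin: number;
}

// Sketch of an in-memory logindate filter; the real helper may differ.
function filterByLoginDate<T extends { logindate: string | null }>(
  users: T[],
  { minDaysSinceLogin, maxDaysSinceLogin }: LoginFilterOptions
): T[] {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  return users.filter((user) => {
    if (!user.logindate) return false;
    // Both formats in the example start with the date, so take the first 10 chars.
    const parsed = new Date(user.logindate.slice(0, 10));
    if (Number.isNaN(parsed.getTime())) return false;
    const daysSinceLogin = Math.floor((Date.now() - parsed.getTime()) / MS_PER_DAY);
    return daysSinceLogin >= minDaysSinceLogin && daysSinceLogin <= maxDaysSinceLogin;
  });
}
```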
The Broken Implementation:
while (true) {
  // Fetch 10,000 records from database
  let usersInPage = await queryBuilder
    .where('user.seq > :lastSeq', { lastSeq })
    .orderBy('user.seq', 'ASC')
    .take(10000)
    .getMany();

  if (usersInPage.length === 0) break;

  // Apply in-memory filter (removes most records!)
  usersInPage = filterByLoginDate(usersInPage, {
    minDaysSinceLogin: 10,
    maxDaysSinceLogin: 30
  });
  // Only ~1,875 records remain after filtering

  // Collect tokens...
  for (const user of usersInPage) {
    tokens.add(user.push_token);
  }

  // 🔴 BUG: Using filtered array to update cursor!
  if (usersInPage.length > 0) {
    lastSeq = usersInPage[usersInPage.length - 1].seq;
  }
}
What Went Wrong: The Infinite Loop Mechanism
Here's the deadly sequence that caused infinite pagination:
1. First Query: WHERE seq > 0 → returns seq 1-10000 (10,000 records)
2. In-Memory Filter: apply the logindate conditions → only 1,875 records remain (sparse distribution: seq values like 1, 50, 100, 250, ..., 2000)
3. Cursor Update: lastSeq = 2000 ← using the filtered array's last seq!
4. Second Query: WHERE seq > 2000 → returns seq 2001-12000 (10,000 records)
5. ❌ Problem: seqs 2001-10000 were already processed in the first query but are being fetched again!
The cursor jumped backward in the actual dataset, causing overlapping queries:
Query 1: seq 1 ─────────────────> seq 10000    (fetch)
                 ↓ filter ↓
         seq 1, 50, 100, ..., 2000             (keep)
                              ↑
                        lastSeq = 2000 ❌

Query 2:      seq 2001 ──────────────> seq 12000   (fetch)
              ├──────── OVERLAP! ────────┤
              2001-10000 processed twice!

Query 3:                      seq 12001 ──────> ...
                              ├──── OVERLAP! ─────┤
...continues forever...
Real Production Logs:
[Job abc123] 1,875 records matched (Total: 1,068,158)
[Job abc123] 1,875 records matched (Total: 1,070,033)
[Job abc123] 1,875 records matched (Total: 1,071,908)
[Job abc123] 1,875 records matched (Total: 1,073,783)
...
[Job abc123] 1,875 records matched (Total: 1,096,283)
Notice how we're accumulating 1,875 records per iteration infinitely—the filtered count, not the fetched count. The cursor was stuck in a loop, re-processing the same dataset range over and over. We had processed over 1 million records when the actual target was only 65,679 unique users!
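In hindsight, a cheap invariant check would have surfaced this much sooner: the cursor should strictly advance on every iteration. A possible guard, with illustrative names (not code from the original system):

```typescript
// Sketch: assert that the pagination cursor strictly advances each iteration.
function makeCursorGuard() {
  let previousCursor = Number.NEGATIVE_INFINITY;
  return (nextCursor: number): void => {
    if (nextCursor <= previousCursor) {
      throw new Error(
        `Cursor did not advance (previous=${previousCursor}, next=${nextCursor}); ` +
          'possible overlap or infinite pagination loop'
      );
    }
    previousCursor = nextCursor;
  };
}

// Usage inside the batch loop (illustrative):
// const assertCursorAdvanced = makeCursorGuard();
// assertCursorAdvanced(lastSeq);
```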
The Fix: Capture Cursor Before Filtering
The solution is simple but critical: save the original cursor position before applying any in-memory filtering.
Corrected Implementation:
while (true) {
  // Fetch from database
  let usersInPage = await queryBuilder
    .where('user.seq > :lastSeq', { lastSeq })
    .orderBy('user.seq', 'ASC')
    .take(10000)
    .getMany();

  if (usersInPage.length === 0) break;

  // ✅ FIX: Capture the last seq BEFORE filtering
  const lastSeqInOriginalBatch = usersInPage[usersInPage.length - 1].seq;

  // Now safe to filter in-memory
  usersInPage = filterByLoginDate(usersInPage, {
    minDaysSinceLogin: 10,
    maxDaysSinceLogin: 30
  });

  // Collect tokens from filtered results
  for (const user of usersInPage) {
    tokens.add(user.push_token);
  }

  // ✅ Use the original batch's last seq for cursor
  lastSeq = lastSeqInOriginalBatch;

  await delay(100); // Brief delay to reduce DB load
}
Why This Works:
Now the cursor correctly advances through the entire dataset:
Query 1: seq 1 ───────────> seq 10000   (fetch)
                                  ↑
                         lastSeq = 10000 ✅

Query 2: seq 10001 ───────────> seq 20000
                                      ↑
                             lastSeq = 20000 ✅

Query 3: seq 20001 ──────> ...
No overlaps, no infinite loops. Each query picks up exactly where the previous one left off, regardless of how many records were filtered out.
Results After the Fix:
[Job abc123] 10,000 fetched → 1,875 matched (Total: 1,875)
[Job abc123] 10,000 fetched → 2,134 matched (Total: 4,009)
[Job abc123] 10,000 fetched → 1,956 matched (Total: 5,965)
...
[Job abc123] 6,234 fetched → 1,103 matched (Total: 65,679)
[Job abc123] 0 fetched → pagination complete ✅
Final: 65,679 unique users processed (matching SQL query result)
Key Lessons Learned
1. Cursor Must Always Reflect Database State, Not Filtered State
Your cursor should advance through the actual dataset, not your application's filtered view of it. The cursor is a database concept, not an application concept.
// ❌ Wrong: Cursor based on application logic
lastSeq = filteredResults[filteredResults.length - 1].seq;
// ✅ Correct: Cursor based on database scan position
lastSeq = originalResults[originalResults.length - 1].seq;
2. If You Must Filter In-Memory, Capture Cursor First
When combining cursor pagination with post-query filtering:
const originalBatch = await query.getMany();
// Assumes originalBatch is non-empty; break out of the loop before this point if it isn't.
const cursorValue = originalBatch[originalBatch.length - 1].cursorField;

// Now safe to mutate or filter
const filtered = applyBusinessLogic(originalBatch);

// Use original cursor, not filtered cursor
nextCursor = cursorValue;
3. Better: Push Filtering Into SQL Whenever Possible
In-memory filtering was a code smell. I later refactored to handle date parsing in SQL:
queryBuilder
  .where('user.seq > :lastSeq', { lastSeq })
  .andWhere('user.logindate IS NOT NULL')
  .andWhere(`
    DATEDIFF(DAY,
      CONVERT(DATE, LEFT(user.logindate, 10)),
      GETDATE()
    ) BETWEEN :minDays AND :maxDays
  `, { minDays: 10, maxDays: 30 });
This eliminated in-memory filtering entirely, improved performance further, and made the code more maintainable.
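As an aside, the snippet above uses SQL Server date functions (DATEDIFF, GETDATE, CONVERT). If you're on PostgreSQL (the earlier benchmark numbers came from a PostgreSQL instance), a roughly equivalent filter might look like this, assuming logindate always starts with an ISO YYYY-MM-DD prefix:

```typescript
// Sketch: PostgreSQL-flavored equivalent of the date filter above.
queryBuilder
  .where('user.seq > :lastSeq', { lastSeq })
  .andWhere('user.logindate IS NOT NULL')
  .andWhere(
    `(CURRENT_DATE - LEFT(user.logindate, 10)::date) BETWEEN :minDays AND :maxDays`,
    { minDays: 10, maxDays: 30 }
  );
```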
4. Add Defensive Pagination Limits
To catch infinite loop bugs early in development:
const MAX_ITERATIONS = 1000;
let iterations = 0;

while (true) {
  if (++iterations > MAX_ITERATIONS) {
    throw new Error(
      `Pagination exceeded ${MAX_ITERATIONS} iterations. Possible infinite loop!`
    );
  }
  // ... pagination logic
}
5. Monitor Cumulative Metrics, Not Just Per-Page Metrics
Track total records processed vs. expected:
console.log(`Processed: ${totalProcessed} | Expected: ~${expectedTotal}`);
if (totalProcessed > expectedTotal * 1.5) {
  logger.warn('Processing significantly more records than expected - investigate for duplicates');
}
Implementation Best Practices
1. Always Use a Unique, Sequential Column
I used seq (a sequential integer column) rather than id (UUID) because:
- Sequential integers are index-friendly
- Natural ordering enables efficient range scans
- UUIDs lack sequential properties
import { Column, Entity, Index, PrimaryGeneratedColumn } from 'typeorm';

@Entity()
export class User {
  @PrimaryGeneratedColumn('uuid')
  id: string;

  @Column()
  @Index() // Critical for performance!
  seq: number; // Auto-incrementing sequence

  @Column()
  push_token: string;
}
2. Handle the Edge Case: First Page
Notice my code handles the optional cursor:
if (lastSeq) {
  queryBuilder.where('user.seq > :lastSeq', { lastSeq });
}
When lastSeq is null or undefined, skip the WHERE clause to fetch from the beginning.
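One small caveat: a plain truthiness check also skips the WHERE clause when lastSeq is 0. If 0 can be a legitimate seq value in your schema, an explicit null check is safer (a minor variation, not the original code):

```typescript
// Variation: treat only null/undefined as "first page", so a seq of 0 still works.
if (lastSeq != null) {
  queryBuilder.where('user.seq > :lastSeq', { lastSeq });
}
```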
3. Fetch N+1 Items to Detect hasMore
Request limit + 1 items:
.limit(limit + 1); // Request one extra
Then check if you got more than requested:
const hasMore = users.length > limit;
if (hasMore) {
  users.pop(); // Remove the extra item
}
This tells us whether there's a next page without a separate COUNT query.
4. Return Structured Results
Our refactored method returns both data and pagination state:
return {
  users, // The actual data
  nextCursor: hasMore ? users[users.length - 1].seq : null // Next page cursor
};
This makes the consuming code clean and the pagination state explicit.
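To make that pagination state explicit at the type level too, the result shape can be given a name. A small sketch (the interface name is mine, not from the original code; User is the entity shown above):

```typescript
// Sketch: an explicit type for the structured page result.
interface TokenPage {
  users: User[];              // the actual data for this batch
  nextCursor: number | null;  // seq to resume from, or null when done
}
```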
5. Process in a Simple Loop
The cursor-based loop is elegantly simple:
let cursor: number | null = null;
let processedCount = 0;

do {
  const result = await getUserTokensForNotification(cursor, BATCH_SIZE);

  // Process this batch
  await sendNotifications(result.users);
  processedCount += result.users.length;

  // Move cursor forward
  cursor = result.nextCursor;

  this.logger.log(`Processed ${processedCount} users so far...`);
} while (cursor !== null);

this.logger.log(`Completed: ${processedCount} total users processed`);
Performance Results
After deploying cursor pagination to production:
| Metric | Before (Offset) | After (Cursor) | Improvement |
|---|---|---|---|
| First page | 0.025ms | 0.025ms | Same |
| Page 50 (50K offset) | 16.933ms | 0.027ms | 627x faster |
| Page 100 (100K offset) | 30.166ms | 0.027ms | 1,117x faster |
| End-to-end (1M users) | ~45 minutes | ~2 minutes | 22x faster |
| Database CPU usage | 85% peak | 12% peak | 7x reduction |
| Infinite loop bug caught | N/A | Prevented | Avoided disaster |
The real-world impact was dramatic. Our daily batch notification job that previously took 45 minutes now completes in under 2 minutes. And by catching the infinite loop bug before production, we avoided what could have been a severe incident.
Trade-offs and When to Use Each Approach
Use Cursor Pagination When:
- ✅ Dataset is large (>100K records)
- ✅ Processing sequentially (batch jobs, exports)
- ✅ Performance is critical
- ✅ Real-time feeds or infinite scroll
- ✅ High write volume (avoids duplicate/skip issues)
- ✅ Need stable iteration (no duplicates/skips from concurrent writes)
Use Offset Pagination When:
- ✅ Small datasets (<10K records)
- ✅ Need page numbers or jump-to-page functionality
- ✅ Admin interfaces where UX expectations include page numbers
- ✅ Read-only or rarely-changing data
- ✅ Sorting by non-unique, non-indexed columns
Conclusion
Switching from offset to cursor pagination in my Firebase push notification server was one of the highest-impact refactorings I've made. The 1,000x+ performance improvement on deep pages and 22x faster end-to-end processing came from understanding a fundamental database principle: indexed seeks are orders of magnitude faster than sequential scans.
But the journey also taught me a critical lesson about cursor integrity: your cursor must always reflect the database's position, not your application's filtered view. Mixing cursor pagination with in-memory filtering nearly shipped an infinite loop to production that would have cost us thousands in unnecessary database load and failed notification deliveries.
For any system processing large datasets—whether sending mass notifications, exporting data, or building infinite scroll feeds—cursor-based pagination isn't just an optimization; it's a necessity for scalability. Just remember: capture your cursor before any filtering, push logic into SQL when possible, and always monitor for anomalous iteration counts.
In Part 2 of this series, I'll explore moving from cell-phone-based to push-token-based deduplication, and why understanding Firebase token lifecycle is critical for notification systems.
Key Takeaways
- Offset pagination degrades linearly with dataset size (O(n) where n = offset)
- Cursor pagination maintains constant performance via indexed seeks (O(log n))
- Always index your cursor column and use sequential/unique values
- Capture cursor position BEFORE any in-memory filtering to avoid infinite loops
- Monitor cumulative metrics to detect pagination bugs early
- Fetch N+1 items to detect hasMore without extra COUNT queries
- For batch processing at scale, cursor pagination is essential