@skypher commented Dec 21, 2025

Summary

This PR adds comprehensive OSS-Fuzz integration to pdf-lib, enabling continuous fuzzing through Google's infrastructure.

Fuzz Targets (10 total)

| Target | Description |
| --- | --- |
| pdf_parser | Main PDF parsing via PDFDocument.load() |
| pdf_modify | PDF modification, page addition, saving with round-trip validation |
| pdf_form | AcroForm field parsing and manipulation |
| jpeg_embed | JPEG image embedding |
| png_embed | PNG image embedding |
| stream_decode | Stream decoders (Flate, LZW, Ascii85, AsciiHex, RunLength) |
| object_parser | Direct PDFObjectParser targeting for high performance |
| pdf_string | PDFString/PDFHexString escape sequences and encoding |
| xref_stream | XRef stream parsing |
| page_embed | PDF page embedding/copying |
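
For readers unfamiliar with the shape of these targets, here is a minimal sketch of the general pattern: a function named `fuzz` receives raw bytes, swallows the errors a parser is expected to raise on malformed input, and lets anything else propagate so the fuzzer reports it. It does not import pdf-lib; `ParseError`, `parseDocument`, `isExpected`, and `runTarget` are hypothetical stand-ins, not code from this PR.

```typescript
// Hypothetical error type for "input is malformed", which is not a bug.
class ParseError extends Error {
  constructor(message: string) {
    super(message);
    this.name = "ParseError";
  }
}

// Hypothetical stand-in for an entry point such as PDFDocument.load().
function parseDocument(data: Uint8Array): number {
  const magic = "%PDF-";
  for (let i = 0; i < magic.length; i++) {
    if (data[i] !== magic.charCodeAt(i)) {
      throw new ParseError("missing %PDF- header");
    }
  }
  return data.length;
}

// Expected failures are identified by name so the check does not depend on
// how the class is transpiled.
function isExpected(err: unknown): boolean {
  return err instanceof Error && err.name === "ParseError";
}

// Run one parser on one input: ignore expected rejections, rethrow the rest.
function runTarget(parse: (data: Uint8Array) => unknown, data: Uint8Array): void {
  try {
    parse(data);
  } catch (err) {
    if (isExpected(err)) return;
    throw err; // unexpected error: surface it as a finding
  }
}

// In a real target file this function would be exported for the fuzzer.
function fuzz(data: Uint8Array): void {
  runTarget(parseDocument, data);
}
```

Keeping the "expected error" check narrow matters: a blanket catch would also hide the TypeErrors and RangeErrors that fuzzing is meant to find.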

Files Added

  • fuzz/*.fuzz.ts - 10 TypeScript fuzz targets
  • fuzz/*.options - Fuzzer resource limits (max_len, timeout)
  • fuzz/corpus/ - Seed corpora with valid and edge-case inputs
  • fuzz/dictionaries/ - PDF, JPEG, PNG dictionaries for guided fuzzing
  • fuzz/README.md - Documentation and usage instructions
  • oss-fuzz/ - OSS-Fuzz configuration (Dockerfile, build.sh, project.yaml)
  • .github/workflows/fuzz.yml - CI workflow for regression testing

Local Testing

npm install
npx esbuild fuzz/pdf_parser.fuzz.ts --bundle --platform=node --target=node18 --outfile=fuzz/pdf_parser.fuzz.js --format=cjs
npx jazzer fuzz/pdf_parser.fuzz.js --corpus fuzz/corpus/pdf_parser

Coverage

The current test suite achieves 87.56% line coverage and 98.55% parser coverage. These fuzz targets provide additional coverage through randomized input generation.

OSS-Fuzz Integration

A companion PR will be submitted to google/oss-fuzz with the project configuration files after this PR is merged.


This work prepares pdf-lib for inclusion in Google's OSS-Fuzz continuous fuzzing infrastructure, helping identify potential parsing bugs and edge cases.

- pdf_parser: Main PDF document parsing
- pdf_modify: PDF modification and saving
- pdf_form: AcroForm field parsing
- jpeg_embed/png_embed: Image embedding
- stream_decode: Flate/LZW/Ascii85/Hex/RunLength decoders
- object_parser: Individual PDF object parsing
- pdf_string: String escape sequence handling
- xref_stream: Cross-reference stream parsing
- page_embed: PDF page embedding

Includes:
- 10 TypeScript fuzz targets
- 3 dictionaries (PDF, JPEG, PNG tokens)
- 6 options files for resource limits
- 260+ seed corpus files
- GitHub Actions CI workflow for regression testing
- Documentation (fuzz/README.md)

- Add oss-fuzz/ directory with OSS-Fuzz config files (Dockerfile, build.sh, project.yaml)
- Add fuzzing dependencies to package.json devDependencies (@jazzer.js/core, esbuild, c8)
- Remove continue-on-error from regression test step for strict corpus validation
- Expand coverage job to run all 10 fuzzers instead of just pdf_parser
- Add round-trip validation to pdf_modify.fuzz.ts (load saved PDF to verify output)
- Update fuzz/README.md with correct paths and simplified setup
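
The round-trip validation mentioned above can be sketched as follows. This is an illustration of the technique only, using a toy document format rather than pdf-lib: `ToyDoc`, `save`, `load`, and `fuzzModify` are hypothetical names. The idea is that whatever the library writes after a modification must be readable by the library's own loader.

```typescript
// Toy document model standing in for a parsed PDF (hypothetical).
interface ToyDoc {
  pageCount: number;
}

// Serialize to bytes; the toy format is ASCII only.
function save(doc: ToyDoc): Uint8Array {
  const text = `%TOY\npages ${doc.pageCount}\n%%EOF\n`;
  return Uint8Array.from(text, (ch) => ch.charCodeAt(0));
}

// Parse bytes back into a document, rejecting anything malformed.
function load(bytes: Uint8Array): ToyDoc {
  const text = Array.from(bytes, (b) => String.fromCharCode(b)).join("");
  const match = /^%TOY\npages (\d{1,4})\n%%EOF\n$/.exec(text);
  if (!match) throw new Error("malformed toy document");
  return { pageCount: Number(match[1]) };
}

function fuzzModify(data: Uint8Array): void {
  let doc: ToyDoc;
  try {
    doc = load(data);
  } catch {
    return; // invalid input is expected and uninteresting
  }
  if (doc.pageCount >= 9999) return; // stay inside the toy format's 4-digit limit

  doc.pageCount += 1; // "modify": add a page
  const saved = save(doc);

  // Round trip: output we produced ourselves must load and match.
  const reloaded = load(saved);
  if (reloaded.pageCount !== doc.pageCount) {
    throw new Error("round-trip mismatch");
  }
}
```

Note that the second `load` is deliberately outside any try-catch: a failure there means the writer produced something the reader rejects, which is exactly the class of bug this check is for.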

Fuzz Target Logic Fixes:
- stream_decode.fuzz.ts: Clarify that decode() is intentional (triggers lazy decoding)
- pdf_string.fuzz.ts: Use latin1 encoding to test arbitrary hex characters properly
- xref_stream.fuzz.ts: Fix offset calculation - xrefOffset now points to start of XRef object
- object_parser.fuzz.ts: Target PDFObjectParser directly for 10x+ performance improvement
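
To illustrate the xref_stream offset fix, here is a minimal sketch of how a harness might wrap fuzz data in a PDF skeleton so that the `startxref` value is the byte offset of the first byte of the XRef stream object. The function name, the object numbers, and the skeleton itself are assumptions for illustration and are not taken from xref_stream.fuzz.ts.

```typescript
interface BuiltPdf {
  pdf: string;
  xrefOffset: number;
}

// Build a tiny PDF-like skeleton around an XRef stream object.
// All fixed parts are ASCII, so string length equals byte length up to the
// XRef object; the caller-supplied data comes after the offset is computed.
function buildPdfWithXrefStream(xrefDict: string, xrefData: string): BuiltPdf {
  const header = "%PDF-1.5\n";
  const catalog = "1 0 obj\n<< /Type /Catalog >>\nendobj\n";

  // Everything written so far precedes the XRef object, so the running
  // length is the offset of the "2" in "2 0 obj".
  const xrefOffset = header.length + catalog.length;

  const xrefObject =
    `2 0 obj\n${xrefDict}\nstream\n${xrefData}\nendstream\nendobj\n`;
  const trailer = `startxref\n${xrefOffset}\n%%EOF\n`;

  return { pdf: header + catalog + xrefObject + trailer, xrefOffset };
}
```

Computing the offset from the parts that precede the object, rather than searching the finished buffer, avoids off-by-N errors when the fuzz data happens to contain text that looks like an object header.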

Performance & Limits:
- page_embed.fuzz.ts: Align limit to 1MB to match page_embed.options
- Add missing options files: jpeg_embed, png_embed, pdf_string, xref_stream
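
As a sketch of what aligning the in-target limit with the options file means: the target rejects oversized inputs using the same number the fuzzer is configured with, so neither side wastes time on inputs the other would discard. The value below assumes "1MB" means 1048576 bytes and that the options file carries a matching `max_len` entry; `embedPages` and the `processed` counter are hypothetical stand-ins.

```typescript
// Assumed to mirror a line such as `max_len = 1048576` in page_embed.options.
const MAX_LEN = 1024 * 1024;

// Counts how many inputs reached the real work (for demonstration only).
let processed = 0;

// Hypothetical stand-in for the page-embedding logic under test.
function embedPages(data: Uint8Array): void {
  processed += data.length >= 0 ? 1 : 0;
}

function fuzz(data: Uint8Array): void {
  // Skip inputs larger than the configured limit instead of parsing them.
  if (data.length > MAX_LEN) return;
  embedPages(data);
}
```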

Code Quality:
- pdf_form.fuzz.ts: Refactor to use instanceof checks instead of try-catch blocks
- stream_decode.fuzz.ts: Add defensive check before calling decode()
- oss-fuzz/build.sh: Remove redundant npm install (deps now in package.json)
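
The instanceof refactor in pdf_form.fuzz.ts can be pictured with the sketch below. It uses hypothetical `TextField` and `CheckBox` classes in place of pdf-lib's own field classes, so the names and methods here are illustrative assumptions only. Dispatching on the runtime type means no try-catch is needed to probe what kind of field was parsed, and errors thrown by the field operations themselves are no longer swallowed.

```typescript
// Hypothetical field classes standing in for the library's form field types.
class TextField {
  constructor(public text: string) {}
  setText(value: string): void {
    this.text = value;
  }
}

class CheckBox {
  checked = false;
  check(): void {
    this.checked = true;
  }
}

type Field = TextField | CheckBox;

// Exercise a field according to its runtime type and report which branch ran.
function exerciseField(field: Field): string {
  if (field instanceof TextField) {
    field.setText("fuzz");
    return "text";
  }
  if (field instanceof CheckBox) {
    field.check();
    return "checkbox";
  }
  return "unknown";
}
```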

The fuzzing dependencies (@jazzer.js/core, esbuild) pull in a newer @types/node that is incompatible with the TypeScript 3.9.5 used by pdf-lib.

Rather than listing them in package.json devDependencies, install them separately:
- The GitHub Actions workflow installs them with --no-save
- The OSS-Fuzz build.sh installs them with --no-save
- The README is updated with manual install instructions