pText is a library for creating and manipulating PDF files in Python.
pText is a pure Python library to read, write and manipulate PDF documents. It represents a PDF document as a JSON-like data structure of nested lists, dictionaries and primitives (numbers, strings, booleans, etc.).
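To make "JSON-like" concrete, here is a minimal sketch that uses only plain Python dictionaries and lists. It does not call pText at all. The `document` variable is a hand-built, hypothetical stand-in for a parsed PDF, and the exact layout pText produces may differ. The key names (`Trailer`, `Info`, `Root`, `Pages`, `Kids`, `MediaBox`) are borrowed from the PDF specification for illustration only.

```python
import json

# Hypothetical, hand-built stand-in for a parsed PDF document.
# This is NOT produced by pText; it only illustrates the nesting idea.
document = {
    "Trailer": {
        "Info": {"Title": "Example", "Author": "Jane Doe"},
        "Root": {
            "Type": "Catalog",
            "Pages": {
                "Type": "Pages",
                "Count": 1,
                "Kids": [
                    {"Type": "Page", "MediaBox": [0, 0, 595, 842]},
                ],
            },
        },
    }
}

# Reading meta-information is ordinary dictionary access.
title = document["Trailer"]["Info"]["Title"]

# Changing meta-information is ordinary assignment.
document["Trailer"]["Info"]["Author"] = "John Doe"

# Pages sit in a nested list; primitives (numbers here) are the leaves.
first_page = document["Trailer"]["Root"]["Pages"]["Kids"][0]
page_width = first_page["MediaBox"][2] - first_page["MediaBox"][0]

# Because the structure is JSON-like, it survives a JSON round trip.
round_tripped = json.loads(json.dumps(document))
```

In a structure like this, reading or changing a value needs nothing beyond standard indexing, which is the appeal of the representation.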
This is currently a one-man project, so the focus will always be on supporting the more common use cases rather than the rare ones.
Most examples double as tests; you can find them in the 'tests' directory.
They include:
- reading a PDF and extracting meta-information
- changing meta-information
- extracting text from a PDF
- extracting images from a PDF
- changing images in a PDF
- adding annotations (notes, links, etc) to a PDF
- adding text to a PDF
- adding tables to a PDF
- adding lists to a PDF
- using a layout, and much more
pText is dual-licensed: it is available under the AGPL or under a commercial license.
AGPL is a free / open source software license. This doesn't mean the software is gratis!
Buying a license is mandatory as soon as you develop commercial activities that involve distributing pText inside your product, or deploying it on a network, without disclosing the source code of your own applications under the AGPL license. These activities include:
- offering paid services to customers as an application service provider (ASP)
- serving PDFs on the fly in the cloud or in a web application
- shipping pText with a closed source product
Contact sales for more info.
I would like to thank the following people for their contributions and advice regarding the development of pText:
- Benoît Lagae
- Michael Klink