A Python document management framework for generating documents (pdf, docx, etc.) and sending them to customers

Creates, merges, splits, and edits documents (mainly docx and pdf) and sends them by email. It was originally created for QR bill integration, but it is generic and can be used for much more.

Installation

Installation with pip:

$ pip install doc-workflow

Usage

From the command line:

$ docwf <path_to_json_config_file>
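The configuration examples in this README carry `#` comments, which standard JSON does not allow. Whether the command line tolerates comments is not stated here, so a comment-free file is the safe choice. The following is a minimal sketch that builds the configuration in Python and writes it as plain JSON; the helper `write_config` and the file name `config.json` are hypothetical and not part of doc-workflow.

```python
import json
from pathlib import Path


def write_config(config: dict, path: str) -> Path:
    """Serialize a workflow config dict to a JSON file and return its path."""
    target = Path(path)
    target.write_text(json.dumps(config, indent=4), encoding="utf-8")
    return target


config = {
    "globals": {
        "data": {"workbook": "source.xlsx", "sheet": "mailmergesheet"},
        "constants": {"language": "fr"},
    },
    "tasks": [
        {
            "active": 1,
            "name": "create bills",
            "task": {
                "type": "mailmerge",
                "input_docx": "templates/template_bill.docx",
                "output_docx": "bills/bill_{year}.docx",
            },
        }
    ],
}

# Hypothetical file name; pass the same path to the docwf command afterwards.
config_path = write_config(config, "config.json")
```

The resulting file can then be passed to the command line as `docwf config.json`.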

From Python:

from docwf import DocWorkflow

config_obj = {
    "globals": {
        "data": {
            "workbook": "source.xlsx",
            "sheet": "mailmergesheet",
        },
        "constants": {
            "language": "fr"
        }
    },
    "tasks": [
        {
            "active": 1, # you can activate/deactivate tasks
            "name": "create bills", # name for debug purpose
            "locals": {
                "data" : {
                    "sheet": "overridesheetfortask"
                },
                "key" : "value", # overrides global arguments for the task
            },
            "task": {
                "type": "myplugin", # or builtin plugins (see below)
                "task_dependent_argument": "value{param}",
            }
        },
    ]
}
# MyPluginClass is your own plugin class, defined or imported elsewhere
my_plugins = {
    "myplugin": MyPluginClass
}
DocWorkflow(config_obj, plugins=my_plugins).gen()
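The example above lets a task's "locals" override values from "globals". The sketch below illustrates one way such layering can work. It assumes nested dictionaries are merged key by key, so the task keeps the global "workbook" while replacing "sheet"; whether doc-workflow merges nested values or replaces them wholesale is not stated here. The function `merge_settings` is a hypothetical helper, not part of the library.

```python
from copy import deepcopy


def merge_settings(global_settings: dict, local_settings: dict) -> dict:
    """Return a copy of global_settings with local_settings layered on top.

    Nested dicts are merged key by key; any other value is replaced.
    """
    merged = deepcopy(global_settings)
    for key, value in local_settings.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


global_settings = {
    "data": {"workbook": "source.xlsx", "sheet": "mailmergesheet"},
    "constants": {"language": "fr"},
}
local_settings = {"data": {"sheet": "overridesheetfortask"}, "key": "value"}

# Settings as one task would see them under the deep-merge assumption.
effective = merge_settings(global_settings, local_settings)
```

The globals are left untouched, so other tasks still see the original sheet name.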

Typical workflow tasks

Assume the data is in source.xlsx, in the sheet named bills:

clientnr  email         send_email  total  reference  etc
1         c1@gmail.com  yes         1032   ref2022c1
2         c2@gmail.com  yes         1232   ref2022c2

Create bills from Word template

{
    "active": 1, # you can activate/deactivate tasks
    "name": "create bills", # name for debug purpose
    "task": {
        "type": "mailmerge",
        "input_docx": "templates/template_bill.docx",
        "output_docx": "bills/bill_{year}.docx" # output depends on the column year, it should be constant throughout all rows
    }
},
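File names such as "bills/bill_{year}.docx" contain placeholders that are filled from the columns of each row. The sketch below shows the idea with Python's `str.format`; that doc-workflow uses exactly this mechanism is an assumption, and the `year` values are invented for illustration because the sample sheet does not list that column explicitly. The helper `expand` is hypothetical.

```python
def expand(template: str, row: dict) -> str:
    """Fill {column} placeholders in template with values from one data row."""
    return template.format(**row)


# Illustrative rows; the year value is an assumption.
rows = [
    {"clientnr": 1, "year": 2022},
    {"clientnr": 2, "year": 2022},
]

# One output file per row: the client number makes each path unique.
per_client = [
    expand("bills/bills_{year}/bill_{year}_{clientnr}.pdf", row) for row in rows
]

# A single shared output: every row must expand to the same path,
# which is why the year column has to be constant across all rows.
shared = {expand("bills/bill_{year}.docx", row) for row in rows}
```

If the set `shared` held more than one path, the rows would disagree about where the single merged document belongs.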

Create pdf from the generated docx

This task uses the Word application (Mac/Windows). If the docx template has dynamic fields (IF, etc.), Word will ask for permission to update all fields before the document is saved as pdf.

{
    "name": "save pdf from docx (uses Word)",
    "task": {
        "type": "makepdf",
        "input_docx": "bills/bill_{year}.docx",
        "output_pdf": "bills/bill_{year}.pdf"
    }
},

Fill in QR codes

Adds QR codes to the bills, either by adding a page to each bill or by merging the QR bill into one of the pages.

{
    "name": "create qr bills",
    "locals": {
        "creditor": {
            "iban": "CH....",
            "name": "The Good Company",
            "pcode": "xyzt",
            "city": "Bern",
            "street": "Dorfstrasse 1"
        },
        "task_params": {
            "extra_infos": "reference", # fixed keys for bill reason ...
            "amount": "total"   # and the amount. With task_params you can create data entries out of existing columns
        }
    },
    "task": {
        "type": "qr",
        "merge_type": "merge", # or "append"
        "input_filename": "bills/bill_{year}.pdf",
        "delete_input": true, # delete the input filename after creating the output
        "pages": 2, # the number of pages per each bill
        "merge_pos": 2, # or "insert_pos" if "append"
        "output_filename": "bills/bill_{year}_with_qr.pdf"
    }
},
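The "task_params" entry creates new data entries out of existing columns: here "amount" is taken from the "total" column and "extra_infos" from "reference". A minimal sketch of that mapping follows; `apply_task_params` is a hypothetical helper and only illustrates the idea, not the library's implementation.

```python
def apply_task_params(row: dict, task_params: dict) -> dict:
    """Return a copy of row with extra keys copied from existing columns.

    task_params maps a new key (e.g. "amount") to a source column (e.g. "total").
    """
    enriched = dict(row)
    for new_key, source_column in task_params.items():
        enriched[new_key] = row[source_column]
    return enriched


row = {"clientnr": 1, "total": 1032, "reference": "ref2022c1"}
task_params = {"extra_infos": "reference", "amount": "total"}

# The QR task can now read the fixed keys "amount" and "extra_infos".
enriched = apply_task_params(row, task_params)
```

The original row is not modified, so other tasks still see only the sheet's own columns.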

Split the bills into separate pdf files

From one input to multiple outputs

{
    "name": "split bills",
    "task": {
        "type": "split_pdf",
        "input_filename": "bills/bill_{year}_with_qr.pdf",
        "pages": 2,
        "makedir": "bills/bills_{year}", # if the output directory doesn't exist, create it
        "output_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf" # output filename using unique name for each customer
    }
},
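With "pages": 2, each bill occupies two consecutive pages of the combined file. The sketch below shows how pages could be assigned to customers, assuming the bills appear in the same order as the rows of the sheet; that ordering is an assumption, and `page_range` is a hypothetical helper.

```python
def page_range(row_index: int, pages_per_bill: int) -> range:
    """Zero-based page indices in the combined PDF belonging to one row's bill."""
    start = row_index * pages_per_bill
    return range(start, start + pages_per_bill)


client_numbers = [1, 2]  # clientnr column from the sample sheet
pages_per_bill = 2       # the "pages" value from the task

# Map each client number to the pages that would go into its own file.
assignment = {
    clientnr: list(page_range(index, pages_per_bill))
    for index, clientnr in enumerate(client_numbers)
}
```

Under this assumption, client 1 receives the first two pages and client 2 the next two.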

Unify bills that are to be printed

This shows how to filter rows. The same split_pdf plugin is used, from multiple inputs to one output.

{
    "name": "unify bills for print",
    "filter": {"column": "send_email", "value": "no"},
    "task": {
        "type": "split_pdf",
        "input_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf",
        "delete_input": true,
        "pages": 2,
        "output_filename": "bills/bills_{year}_paper.pdf"
    }
},

Send the bills by email

{
    "name": "send emails",
    "locals": {
        "sender": {
            "email": "[email protected]",
            "name": "Info",
            "server": "smtp.gmail.com:587",
            "username": "[email protected]",
            "password": "strongpassword",
            "bcc": "[email protected]",
            "headers": {
                "Reply-To": "[email protected]"
            }
        },
    },
    "filter": {"column": "send_email", "value": "yes"},
    "task": {
        "type": "email",
        "recipient": "email", # the key/column name for the customer email
        "subject" : "Bill for year {year}", # can contain dynamic parts
        "body_template_file" : "templates/email_template.txt", # text template for the email body
        "attachments" : [ "bills/bills_{year}/bill_{year}_{clientnr}.pdf" ] # list of attachments
    }
},
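To make the fields of the email task concrete, the sketch below assembles one message per row with Python's standard `email` package. It only builds the message and does not send it, and attachments and bcc are left out for brevity. The addresses, the body text, and the helpers `build_message` and `split_server` are hypothetical; this is not how doc-workflow is confirmed to work internally.

```python
from email.message import EmailMessage
from email.utils import formataddr


def build_message(sender: dict, row: dict, subject: str, body: str) -> EmailMessage:
    """Assemble (but do not send) one email for a single data row."""
    message = EmailMessage()
    message["From"] = formataddr((sender["name"], sender["email"]))
    message["To"] = row["email"]
    message["Subject"] = subject.format(**row)
    # Extra headers such as Reply-To come from the sender settings.
    for header, value in sender.get("headers", {}).items():
        message[header] = value
    message.set_content(body.format(**row))
    return message


def split_server(server: str) -> tuple:
    """Split a "host:port" string such as "smtp.gmail.com:587"."""
    host, _, port = server.rpartition(":")
    return host, int(port)


# Placeholder sender values for illustration only.
sender = {
    "email": "info@example.com",
    "name": "Info",
    "server": "smtp.gmail.com:587",
    "headers": {"Reply-To": "replies@example.com"},
}
row = {"clientnr": 1, "email": "c1@gmail.com", "year": 2022}

message = build_message(
    sender, row, "Bill for year {year}", "Bill {year} for client {clientnr}."
)
host, port = split_server(sender["server"])
```

Building the message separately from sending it makes it easy to inspect the subject and headers before any mail leaves the machine.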

Watermark PDF files

Mark reminder bills

{
    "name": "save reminder",
    "filter": {"column": "reminder", "value": "yes"},
    "task": {
        "type": "watermark",
        "makedir": "bills/bills_{year}/reminders/",
        "watermark": "REMINDER",
        "input_filename": "bills/bills_{year}/bill_{year}_{clientnr}.pdf",
        "pages": 2,
        "output_filename": "bills/bills_{year}/reminders/bill_{year}_{clientnr}_reminder.pdf"
    }
},

Send reminder bills

{
    "name": "send reminder emails",
    "locals": {
        "sender": {
            ...
        },
    },
    "filter": [
        {"column": "send_email", "value": "yes"},
        {"column": "reminder", "value": "yes"}
    ],
    "task": {
        "type": "email",
        "recipient": "email", # the key/column name for the customer email
        "subject" : "Bill for year {year} (reminder)", # can contain dynamic parts
        "body_template_file" : "templates/reminder_email_template.txt", # text template for the email body
        "attachments" : [ "bills/bills_{year}/reminders/bill_{year}_{clientnr}_reminder.pdf" ] # list of attachments
    }
},
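The "filter" entry appears in two forms in these examples: a single condition, and a list of conditions. The sketch below treats a list as a logical AND, so a row must satisfy every condition; that reading is an assumption, since the combination rule is not spelled out here. The helper `row_matches`, the "reminder" values, and client 3 are invented for illustration.

```python
def row_matches(row: dict, filter_spec) -> bool:
    """Check one row against a single filter dict or a list of filter dicts.

    A list is treated as a logical AND, which is an assumption.
    """
    if isinstance(filter_spec, dict):
        conditions = [filter_spec]
    else:
        conditions = list(filter_spec)
    return all(row.get(c["column"]) == c["value"] for c in conditions)


rows = [
    {"clientnr": 1, "send_email": "yes", "reminder": "yes"},
    {"clientnr": 2, "send_email": "yes", "reminder": "no"},
    {"clientnr": 3, "send_email": "no", "reminder": "yes"},
]

single = {"column": "send_email", "value": "yes"}
combined = [
    {"column": "send_email", "value": "yes"},
    {"column": "reminder", "value": "yes"},
]

by_email = [r["clientnr"] for r in rows if row_matches(r, single)]
reminders_by_email = [r["clientnr"] for r in rows if row_matches(r, combined)]
```

Only rows that pass the filter are handed to the task, so the reminder email above would go to client 1 alone.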

Use Google Spreadsheets instead of Excel

To use Google Spreadsheets, you need a service account and its credentials as JSON. Follow the gspread tutorial on authenticating with a service account.

Change the “workbook” value:

"globals": {
    "data": {
        "workbook": "https://docs.google.com/spreadsheets/d/1u...",
        "sheet": "mailmergesheet",
        "credentials": {
            "type": "service_account",
            "project_id": "...",
            "private_key_id": "...",
            "private_key": "-----BEGIN PRIVATE KEY....\n-----END PRIVATE KEY-----\n",
            "client_email": "[email protected]",
            "client_id": "...",
            "auth_uri": "https://accounts.google.com/o/oauth2/auth",
            "token_uri": "https://oauth2.googleapis.com/token",
            "auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
            "client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/..."
        }
    },
    ...
}
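When the configuration is built in Python, the service account key does not have to be pasted inline. The sketch below reads the key file downloaded from Google and places its content under "credentials"; the helper `data_section` and the file path passed to it are hypothetical, and the layout simply mirrors the example above.

```python
import json
from pathlib import Path


def data_section(workbook_url: str, sheet: str, credentials_file: str) -> dict:
    """Build the "data" section, reading the service-account key from a file."""
    credentials = json.loads(Path(credentials_file).read_text(encoding="utf-8"))
    return {"workbook": workbook_url, "sheet": sheet, "credentials": credentials}
```

The returned dictionary can be used as the "data" value under "globals", which keeps the private key out of the configuration source.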

Export Google Spreadsheets in a PDF file

This only works with gspread-type data.

{
    "#import": ["inc/inc_workbook_gspread.json"],
    "name": "export sheets as pdf",
    "globals": {
        "printsheets_defaults" : {
            "gridlines": true,
            "printnotes": false
        }
    },
    "tasks": [
        {
            "active": 1,
            "name": "bill documents",
            "task": {
                "makedir": "bills/web",
                "type": "printsheets",
                "printsheets": [
                    {
                        "gid": "1571231333"
                    },
                    {
                        "gid": "291382312357"
                    },
                    {
                        "gid": "3712318114",
                        "portrait": false,
                        "printnotes": true
                    }
                ],
                "output_filename": "bills/web/heizung_unterlagen_{key_year}.pdf"
            }
        }
    ]
}
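In the example above, "printsheets_defaults" holds shared print settings, and individual entries in "printsheets" can set their own values. The sketch below assumes that per-sheet values take precedence over the defaults; this precedence is an assumption, and `sheet_options` is a hypothetical helper.

```python
def sheet_options(defaults: dict, sheet: dict) -> dict:
    """Combine shared print defaults with one sheet's own settings."""
    return {**defaults, **sheet}


defaults = {"gridlines": True, "printnotes": False}
printsheets = [
    {"gid": "1571231333"},
    {"gid": "3712318114", "portrait": False, "printnotes": True},
]

# Effective options per sheet: defaults first, then the sheet's overrides.
resolved = [sheet_options(defaults, sheet) for sheet in printsheets]
```

Under this assumption the first sheet prints with the defaults only, while the last one prints in landscape with notes enabled.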

Todo / Wish List

  • Create unit tests

  • Develop the command line to be able to run simple tasks directly

  • Create more advanced filters

  • Auto-magically create directories (remove the makedir argument)

Contributing

  • Fork the repository on GitHub and start hacking

  • Send a pull request with your changes

Credits

This repository is created and maintained by Iulian Ciorăscu.
