Architecture
============

pikepdf uses `pybind11 <https://github.com/pybind/pybind11>`_ to bind the
C++ interface of QPDF. pybind11 was selected after evaluating Cython, CFFI and
SWIG as possible binding solutions.

In addition to the bindings, pikepdf includes support code written in a mix of
C++ and Python, mainly to present a clean, Pythonic interface to the C++ library
and to implement higher-level functionality.

Internals
---------

Internally the package presents a module named ``pikepdf`` from which objects
can be imported. The C++ extension module is currently named ``pikepdf._qpdf``.
Users of ``pikepdf`` should not directly access ``_qpdf`` since it is an
internal interface.

In general, modules or objects whose names begin with an underscore are private
(although public functions may return such objects in some situations).
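The naming convention above can be checked mechanically. The following is a
minimal sketch, not part of pikepdf: ``is_private`` is a hypothetical helper
that treats any dotted name with a single-underscore-prefixed component as
private, and leaves double-underscore ("dunder") names alone.

```python
def is_private(dotted_name):
    """Return True if any component of a dotted name looks private.

    Hypothetical helper for illustration only; pikepdf does not provide it.
    """
    for part in dotted_name.split("."):
        is_dunder = part.startswith("__") and part.endswith("__")
        if part.startswith("_") and not is_dunder:
            return True
    return False


# The C++ extension module is private; the top-level package is not.
extension_is_private = is_private("pikepdf._qpdf")
package_is_private = is_private("pikepdf")
```

A check like this could be used in a linter rule or code review script to flag
direct imports of ``pikepdf._qpdf`` in application code.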

Thread safety
-------------

Because of the global interpreter lock (GIL), it is safe to read pikepdf
objects across Python threads. Also because of the GIL, there may not be much
performance gain from doing so.

If one or more threads will be modifying pikepdf objects, you will have to
coordinate read and write access with a :class:`threading.Lock`.
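The locking pattern can be sketched as follows. To keep the example runnable
without pikepdf installed, a plain ``dict`` stands in for the shared pikepdf
object; the names ``shared_doc``, ``doc_lock``, ``add_pages`` and
``read_page_count`` are assumptions made for illustration. The point is only
that every reader and writer takes the same lock.

```python
import threading

# Stand-in for a shared pikepdf object (an assumption for this sketch).
# In real code this might be an open pikepdf.Pdf.
shared_doc = {"page_count": 0}

# A single lock guards every read and write of the shared object.
doc_lock = threading.Lock()


def add_pages(n):
    """Writer: mutate the shared object only while holding the lock."""
    for _ in range(n):
        with doc_lock:
            shared_doc["page_count"] += 1


def read_page_count():
    """Reader: also takes the lock, because writers may be active."""
    with doc_lock:
        return shared_doc["page_count"]


threads = [threading.Thread(target=add_pages, args=(1000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = read_page_count()
```

Using ``with doc_lock:`` rather than explicit ``acquire()`` and ``release()``
calls ensures the lock is released even if the guarded code raises.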

It is not currently possible to pickle pikepdf objects or marshal them across
process boundaries (as would be required to use pikepdf in
:mod:`multiprocessing`). If this were implemented, it would not be much more
efficient than saving a full PDF and sending it to another process.
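The "save a full PDF and send it" approach might look like the sketch below.
The helper names ``pdf_to_bytes``, ``pdf_from_bytes`` and ``count_pages`` are
hypothetical; the sketch assumes only that ``Pdf.save`` and ``pikepdf.open``
accept a binary stream. ``bytes`` objects can be pickled, so the serialized PDF
can cross a process boundary even though the pikepdf object cannot.

```python
import io


def pdf_to_bytes(pdf):
    """Serialize a PDF object to bytes by saving it to an in-memory stream."""
    buffer = io.BytesIO()
    pdf.save(buffer)
    return buffer.getvalue()


def pdf_from_bytes(data):
    """Reopen a serialized PDF; intended to run in the receiving process."""
    import pikepdf  # imported lazily so the module loads without pikepdf

    return pikepdf.open(io.BytesIO(data))


def count_pages(data):
    """Example worker: accepts bytes and returns a plain, picklable result."""
    with pdf_from_bytes(data) as pdf:
        return len(pdf.pages)
```

A parent process could then pass ``pdf_to_bytes(pdf)`` to a worker function such
as ``count_pages`` through :mod:`multiprocessing`. Each transfer pays the cost
of a full save and a full reparse.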

File handles
------------

Because of technical limitations in underlying libraries, pikepdf keeps the
source PDF file open when content is copied from it to another PDF, even when
all Python variables pointing to the source are removed. If a PDF is being
assembled from many sources, then all of those sources are held open in memory.