# libpipeline, a pipeline manipulation library

Git repository: https://gitlab.com/cjwatson/libpipeline

libpipeline is a C library for setting up and running pipelines of
processes without involving shell command-line parsing, which is often
error-prone and insecure.  This relieves programmers of the need to
laboriously construct pipelines using lower-level primitives such as fork(2)
and execve(2).

Full programmers' documentation may be found using `man libpipeline`, and
the [project homepage](https://nongnu.org/libpipeline/) has more background.

## Installation

If you need to install libpipeline starting from source code, then you will
need these separate packages installed before configuring libpipeline in
order to run its test suite:

 * [pkg-config](https://www.freedesktop.org/wiki/Software/pkg-config)
 * [check >= 0.9.10](https://libcheck.github.io/check/)

See the INSTALL file for general installation instructions.

## Using the library

When the author took over [man-db](https://nongnu.org/man-db) in 2001, one
of the major problems that became evident after maintaining it for a while
was the way it handled subprocesses.  The nature of man and friends means
that it spends a lot of time calling sequences of programs such as `zsoelim
< input-file | tbl | nroff -mandoc -Tutf8`.  Back then, it was using C
library facilities such as `system` and `popen` for all this, and there were
several bugs where those functions were being called with untrusted input as
arguments without properly escaping metacharacters.  Of course it was
possible to chase around every such call inserting appropriate escaping
functions, but this was always bound to be error-prone.  One of the tasks
that rapidly became important was therefore arranging to start subprocesses
in a way that was fundamentally immune to this kind of bug.

In higher-level languages, there are usually standard constructs which are
safer than just passing a command line to the shell.  For example, in Perl
you can use something like this to invoke a program with arguments without
the interference of the shell:

```perl
system($command, $arg1, $arg2, ...)
```

[perlipc(1)](https://perldoc.perl.org/perlipc) describes various facilities
for connecting processes together.

In Python, the
[subprocess](https://docs.python.org/3/library/subprocess.html) module
allows you to create pipelines easily and safely (as long as you remember
the [SIGPIPE
gotcha](https://www.chiark.greenend.org.uk/~cjwatson/blog/python-sigpipe.html)).

By contrast, C has the `fork` and `execve` primitives, but assembling these
to construct full-blown pipelines correctly is difficult and error-prone, so
many programmers don't bother and use the simple but unsafe library
facilities instead.
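
To give a sense of the bookkeeping involved, here is a minimal sketch of a
two-command pipeline built by hand from pipe(2), fork(2) and execvp(3).  It
is not taken from libpipeline or man-db; `run_two_stage` is a hypothetical
name, and `execvp` is used in place of `execve` only so that commands are
looked up on `PATH`.  Error handling is deliberately thin.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run "left | right" without a shell.  Returns the exit status of
 * `right`, or -1 on failure.  Hypothetical helper, not part of
 * libpipeline. */
int
run_two_stage (char *const left[], char *const right[])
{
	int fds[2];
	pid_t lpid, rpid;
	int status;

	if (pipe (fds) < 0)
		return -1;

	lpid = fork ();
	if (lpid < 0) {
		close (fds[0]);
		close (fds[1]);
		return -1;
	}
	if (lpid == 0) {
		/* Left child: stdout becomes the write end of the pipe. */
		dup2 (fds[1], STDOUT_FILENO);
		close (fds[0]);
		close (fds[1]);
		execvp (left[0], left);
		_exit (127);	/* only reached if exec failed */
	}

	rpid = fork ();
	if (rpid < 0) {
		close (fds[0]);
		close (fds[1]);
		waitpid (lpid, NULL, 0);
		return -1;
	}
	if (rpid == 0) {
		/* Right child: stdin becomes the read end of the pipe. */
		dup2 (fds[0], STDIN_FILENO);
		close (fds[0]);
		close (fds[1]);
		execvp (right[0], right);
		_exit (127);	/* only reached if exec failed */
	}

	/* The parent must close both ends, or the right-hand command
	 * never sees end-of-file. */
	close (fds[0]);
	close (fds[1]);

	waitpid (lpid, NULL, 0);
	if (waitpid (rpid, &status, 0) < 0)
		return -1;
	return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

Even this short version has to remember to close every unused descriptor in
every process, and it still ignores input redirection, pipelines longer than
two commands, and signal handling.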

libpipeline solves this problem.  In the following examples, function names
starting with `pipecmd_` or `pipeline_` are real functions in the library,
while any other function names are pseudocode.

Constructing the simplified example pipeline from the start of this section
using this library looks like this:

```c
pipeline *p;
int status;

p = pipeline_new ();
pipeline_want_infile (p, "input-file");
pipeline_command_args (p, "zsoelim", NULL);
pipeline_command_args (p, "tbl", NULL);
pipeline_command_args (p, "nroff", "-mandoc", "-Tutf8", NULL);
status = pipeline_run (p);
```

You might want to construct a command more dynamically:

```c
pipecmd *manconv = pipecmd_new_args ("manconv", "-f", from_code,
                                     "-t", "UTF-8", NULL);
if (quiet)
	pipecmd_arg (manconv, "-q");
pipeline_command (p, manconv);
```

Perhaps you want an environment variable set only while running a certain
command:

```c
pipecmd *less = pipecmd_new ("less");
pipecmd_setenv (less, "LESSCHARSET", lesscharset);
```

You might find yourself needing to pass the output of one pipeline to
several other pipelines, in a "tee" arrangement:

```c
pipeline *source, *sink1, *sink2;

source = make_source ();
sink1 = make_sink1 ();
sink2 = make_sink2 ();
pipeline_connect (source, sink1, sink2, NULL);
/* Pump data among these pipelines until there's nothing left. */
pipeline_pump (source, sink1, sink2, NULL);
pipeline_free (sink2);
pipeline_free (sink1);
pipeline_free (source);
```
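
For intuition only, the data flow in a "tee" arrangement can be sketched
with plain file descriptors.  This is not how `pipeline_pump` is
implemented, and it is much less capable (it blocks on each write, for
instance); `tee_fd` and `write_all` are hypothetical names.

```c
#include <errno.h>
#include <unistd.h>

/* Write all of buf to fd, retrying on partial writes and EINTR.
 * Returns 0 on success, -1 on error. */
static int
write_all (int fd, const char *buf, size_t len)
{
	while (len > 0) {
		ssize_t n = write (fd, buf, len);
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		buf += n;
		len -= (size_t) n;
	}
	return 0;
}

/* Copy everything readable from src to both sink1 and sink2.
 * Returns 0 once src reaches end-of-file, -1 on error. */
int
tee_fd (int src, int sink1, int sink2)
{
	char buf[4096];

	for (;;) {
		ssize_t n = read (src, buf, sizeof buf);
		if (n == 0)
			return 0;
		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (write_all (sink1, buf, (size_t) n) < 0 ||
		    write_all (sink2, buf, (size_t) n) < 0)
			return -1;
	}
}
```

A loop like this stalls as soon as one sink stops reading, which is part of
the reason to leave the pumping to the library.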

Maybe one of your commands is actually an in-process function, rather than
an external program:

```c
pipecmd *inproc = pipecmd_new_function ("in-process", &func, NULL, NULL);
pipeline_command (p, inproc);
```
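
As a minimal sketch of what such a function might look like, the
hypothetical `upcase_func` below upper-cases everything that passes through
it.  It assumes that the callback takes a single `void *` argument (the
`data` pointer passed as the last argument above) and that libpipeline
arranges for its standard input and output to be connected to the
neighbouring commands.  Neither `upcase_func` nor `upcase_stream` is part of
libpipeline.

```c
#include <ctype.h>
#include <stdio.h>

/* Copy `in` to `out`, upper-casing each character on the way.
 * Hypothetical helper, kept separate so it can be tested on any
 * pair of streams. */
void
upcase_stream (FILE *in, FILE *out)
{
	int c;

	while ((c = getc (in)) != EOF)
		putc (toupper (c), out);
	fflush (out);
}

/* Callback in the assumed shape for pipecmd_new_function: it
 * receives the opaque data pointer and filters stdin to stdout. */
void
upcase_func (void *data)
{
	(void) data;	/* unused in this sketch */
	upcase_stream (stdin, stdout);
}
```

It would then be registered with something like `pipecmd_new_function
("upcase", &upcase_func, NULL, NULL)`.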

Sometimes your program needs to consume the output of a pipeline, rather
than sending it all to some other subprocess:

```c
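pipeline *p = make_pipeline ();
const char *line;

/* Ask for the pipeline's output to be readable by this process. */
pipeline_want_out (p, -1);
pipeline_start (p);
/* Look at the first line without consuming it; NULL means no output. */
line = pipeline_peekline (p);
if (line && strstr (line, "coding: UTF-8"))
	printf ("Unicode text follows:\n");
while ((line = pipeline_readline (p)) != NULL)
	printf ("  %s", line);
/* Reap the child processes before freeing the pipeline. */
pipeline_wait (p);
pipeline_free (p);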
```

## Building programs with libpipeline

libpipeline supplies a pkg-config file which lists appropriate compiler and
linker flags for building programs using it.  The output of `pkg-config
--cflags libpipeline` should be passed to the compiler (typically via
`CFLAGS`) and the output of `pkg-config --libs libpipeline` should be passed
to the linker (typically via `LDLIBS` or `LIBS`, so that the library follows
your object files on the link line).

If your program uses the GNU Autotools, then you can put this in
configure.ac:

```
PKG_CHECK_MODULES([libpipeline], [libpipeline])
```

... and this in the appropriate Makefile.am (replacing 'program' with the
Automake-canonicalised name for your program):

```make
AM_CFLAGS = $(libpipeline_CFLAGS)
program_LDADD = $(libpipeline_LIBS)
```

The details may vary for particular build systems, but this should be a
reasonable start.

When building with GCC, you should use at least the `-Wformat` option
(included in `-Wall`) to ensure that the 'sentinel' function attribute is
checked.  This means that GCC will produce a warning if your program calls
any of the several libpipeline functions that require a trailing NULL
without passing that trailing NULL.
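
As an illustration of the attribute itself, here is a small sketch of a
sentinel-checked variadic function.  `count_args` is hypothetical and not
part of libpipeline; it only shows the calling convention that functions
such as `pipeline_command_args` rely on.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical function, not part of libpipeline.  The attribute asks
 * GCC (and Clang) to warn about calls whose last argument is not a
 * null pointer. */
int count_args (const char *first, ...) __attribute__ ((sentinel));

/* Count the string arguments up to, but not including, the
 * terminating NULL. */
int
count_args (const char *first, ...)
{
	va_list ap;
	const char *arg;
	int n = 0;

	va_start (ap, first);
	for (arg = first; arg != NULL; arg = va_arg (ap, const char *))
		n++;
	va_end (ap);
	return n;
}
```

`count_args ("nroff", "-mandoc", NULL)` returns 2.  A call such as
`count_args ("nroff", "-mandoc")` still compiles, but GCC with `-Wformat`
should warn about the missing sentinel, and at run time the loop would read
past the end of the argument list.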

## Copyright and licensing

Copyright (C) 1994 Markus Armbruster.
Copyright (C) 1989, 1990, 1991, 1992, 2000, 2001, 2002, 2003, 2004, 2005,
              2006, 2007, 2008, 2009, 2010
              Free Software Foundation, Inc.
Copyright (C) 2003-2020 Colin Watson.

libpipeline is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 3 of the License, or (at
your option) any later version.

libpipeline is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with libpipeline; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301
USA.

## Note on GPL versions

(This note is informative, and if it conflicts with the terms of the licence
then the licence is correct.  See the full text of the licence in the
COPYING file for precise details.)

The core source code of libpipeline is licensed under GPL v2 or later.
However, libpipeline incorporates parts of the Gnulib portability library,
copyrighted by the Free Software Foundation and others, and much of Gnulib
is distributed under GPL v3 or later.  This means that libpipeline as a
whole falls under the terms of the GPL v3 or later.  Unless you take special
pains to remove the GPL v3 portions, you must therefore follow the terms and
conditions of the GPL v3 or later when distributing libpipeline itself, or
distributing code linked against it.

Note that this does not require that your own source code be licensed under
the GPL v3, contrary to popular misunderstanding.  However, you must be
prepared to distribute your work as a whole under the terms of the GPL v3 or
later, which requires that your licence be compatible with the GPL v3.  See
https://www.gnu.org/licenses/license-list.html#GPLCompatibleLicenses if you
need advice on compatibility.

The GPL mainly restricts distribution ("conveying", in the specific language
of GPL v3), and is careful not to restrict private use.  Therefore, you may
write programs for your own use that use libpipeline without needing to
license them under GPL v3-compatible terms.  If you distribute these
programs to others, then you must take care to use compatible licensing.

## Credits

Thanks to Scott James Remnant for code review, Ian Jackson for an extensive
design review, and Kees Cook and Matthias Klose for helpful conversations.

## Bug reporting

You can [report bugs on
GitLab](https://gitlab.com/cjwatson/libpipeline/-/issues), or see [bugs from
before the migration to
GitLab](https://savannah.nongnu.org/bugs/?group=libpipeline).