Re: [Groff] pdfmom grep (was parallel text processing)
From: Peter Schaffter
Subject: Re: [Groff] pdfmom grep (was parallel text processing)
Date: Sat, 9 Sep 2017 17:47:46 -0400
User-agent: Mutt/1.5.24 (2015-08-30)
Ralph --
On Sat, Sep 09, 2017, Ralph Corderoy wrote:
> I think you're smuggling a -k or -K through to the first groff that
> pdfmom runs. Here's its -Tpdf pipeline again.
>
> groff -Tpdf -dPDF.EXPORT=1 -mom -z $cmdstring 2>&1 |
> grep '^.ds' |
> groff -Tpdf -mom - $preconv $cmdstring
The pipeline in the current pdfmom is actually
groff -Tpdf -dLABEL.REFS=1 -mom -z $preconv $cmdstring 2>&1 |
grep '^\\. *ds' |
groff -Tpdf -dPDF.EXPORT=1 -dLABEL.REFS=1 -mom -z - $preconv $cmdstring 2>&1 |
grep '^\\. *ds' |
groff -Tpdf -mom $preconv - $cmdstring
> The problem is grep seeing invalid UTF-8 and thus deciding stdin is
> binary. A preconv(1) would turn your UTF-8 troff source into
> ISO-8859-1, and any non-ASCII characters in that would probably be
> invalid UTF-8. But pdfmom has tried to spot -k or -K in its arguments
> and arrange for them to be moved from $cmdstring to $preconv and so used
> only by the second groff. If its simplistic argv[] parsing has failed,
> because you've -xyzk for example, then your -k remains in $cmdstring and
> affects the first groff.
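To make the failure mode Ralph describes concrete, here is a minimal sketch of "simplistic" flag splitting. It is not pdfmom's actual code: the function name split_preconv_flags is hypothetical, and the assumption is simply that only arguments that look exactly like -k or begin with -K get moved into $preconv.

```shell
# Hypothetical sketch, NOT pdfmom's real parsing: move -k / -K<enc>
# into $preconv and leave every other argument in $cmdstring.
split_preconv_flags() {
    preconv=
    cmdstring=
    for arg in "$@"; do
        case $arg in
            -k|-K*) preconv="${preconv:+$preconv }$arg" ;;
            *)      cmdstring="${cmdstring:+$cmdstring }$arg" ;;
        esac
    done
}

# A standalone -k is recognised and moved.
split_preconv_flags -k camus.mom
echo "preconv=[$preconv] cmdstring=[$cmdstring]"

# A bundled -xyzk is not recognised, so the k stays in $cmdstring.
split_preconv_flags -xyzk camus.mom
echo "preconv=[$preconv] cmdstring=[$cmdstring]"
```

With a parser of this shape, the first call leaves only camus.mom in $cmdstring, while the second leaves -xyzk there untouched, which is the situation Ralph suspects.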
I wish that were the case, but consider this:
***pdfmom pipeline entered literally at the command line
groff -Tpdf -dLABEL.REFS=1 -mom -z -k camus.mom 2>&1 | \
grep '^\. *ds' | \
groff -Tpdf -dPDF.EXPORT=1 -dLABEL.REFS=1 -mom -z -k - camus.mom 2>&1 | \
grep '^\. *ds' | \
groff -Tpdf -mom -k - camus.mom > camus.pdf
- grep does not report a binary file hit
***pdfmom itself at the command line
pdfmom -k camus.mom > camus.pdf
- grep reports a binary file hit
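For anyone wanting to check what grep is being fed at a given stage, the following is a rough sketch of testing a byte stream for UTF-8 validity. It assumes an iconv that exits non-zero on an invalid input sequence (as glibc's does); the helper name is_valid_utf8 and the sample .ds line are made up for illustration. The last command uses -a, a GNU/BSD grep option that forces input to be treated as text.

```shell
# Returns 0 if stdin is valid UTF-8, non-zero otherwise.
# Assumes iconv fails on an illegal input sequence.
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# "e acute" encoded as UTF-8 (two bytes, octal 303 251).
printf '.ds title caf\303\251\n' | is_valid_utf8 &&
    echo "UTF-8 input: valid"

# The same character as a single ISO-8859-1 byte (octal 351).
printf '.ds title caf\351\n' | is_valid_utf8 ||
    echo "Latin-1 input: not valid UTF-8"

# With -a, grep treats the stream as text whatever its encoding,
# so the .ds line is still counted as a match.
printf '.ds title caf\351\n' | grep -a -c '^\. *ds'
```

Piping the output of the first groff stage through a check like this would show whether the bytes grep receives differ between the two invocations.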
strace on 'pdfmom -k camus.mom > camus.pdf' produces
3225 execve("/usr/local/bin/pdfmom", ["pdfmom", "-k", "camus.mom"], [/* 86
vars */]) = 0
3226 execve("/bin/sh", ["sh", "-c", "groff -Tpdf -dLABEL.REFS=1 -mom "...],
[/* 86 vars */]) = 0
3227 execve("/usr/local/bin/groff", ["groff", "-Tpdf", "-dLABEL.REFS=1",
"-mom", "-z", "-k", "camus.mom"], [/* 86 vars */]) = 0
3228 execve("/bin/grep", ["grep", "^\\. *ds"], [/* 86 vars */]) = 0
3229 execve("/usr/local/bin/groff", ["groff", "-Tpdf", "-dPDF.EXPORT=1",
"-dLABEL.REFS=1", "-mom", "-z", "-", "-k", "camus.mom"], [/* 86 vars */]
<unfinished ...>
3230 execve("/bin/grep", ["grep", "^\\. *ds"], [/* 86 vars */] <unfinished
...>
3229 <... execve resumed> ) = 0
3230 <... execve resumed> ) = 0
3231 execve("/usr/local/bin/groff", ["groff", "-Tpdf", "-mom", "-k", "-",
"camus.mom"], [/* 86 vars */]) = 0
3232 execve("/usr/local/bin/preconv", ["preconv", "-", "camus.mom"], [/* 87
vars */]) = 0
3233 execve("/usr/local/bin/troff", ["troff", "-dPDF.EXPORT=1",
"-dLABEL.REFS=1", "-mom", "-z", "-Tpdf"], [/* 87 vars */]) = 0
3234 execve("/usr/local/bin/preconv", ["preconv", "camus.mom"], [/* 87 vars
*/]) = 0
3235 execve("/usr/local/bin/troff", ["troff", "-dLABEL.REFS=1", "-mom",
"-z", "-Tpdf"], [/* 87 vars */]) = 0
3234 +++ exited with 0 +++
3227 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3234,
si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
3237 execve("/usr/local/bin/troff", ["troff", "-mom", "-Tpdf"], [/* 87 vars
*/]) = 0
3236 execve("/usr/local/bin/preconv", ["preconv", "-", "camus.mom"], [/* 87
vars */] <unfinished ...>
3238 execve("/usr/local/bin/gropdf", ["gropdf"], [/* 87 vars */]) = 0
3236 <... execve resumed> ) = 0
3235 +++ exited with 0 +++
3227 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3235,
si_uid=1000, si_status=0, si_utime=8, si_stime=0} ---
3227 +++ exited with 0 +++
3226 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3227,
si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
3228 +++ exited with 1 +++
3226 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3228,
si_uid=1000, si_status=1, si_utime=0, si_stime=0} ---
3232 +++ exited with 0 +++
3229 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3232,
si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
3230 +++ exited with 0 +++
3226 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3230,
si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
3236 +++ exited with 0 +++
3231 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3236,
si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
3233 --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=3233,
si_uid=1000} ---
3233 +++ killed by SIGPIPE +++
3229 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_KILLED, si_pid=3233,
si_uid=1000, si_status=SIGPIPE, si_utime=7, si_stime=0} ---
3229 +++ exited with 0 +++
3226 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3229,
si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
3237 +++ exited with 0 +++
3231 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3237,
si_uid=1000, si_status=0, si_utime=8, si_stime=0} ---
3238 +++ exited with 0 +++
3231 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3238,
si_uid=1000, si_status=0, si_utime=11, si_stime=0} ---
3231 +++ exited with 0 +++
3226 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3231,
si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
3226 +++ exited with 0 +++
3225 --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3226,
si_uid=1000, si_status=0, si_utime=0, si_stime=0} ---
3225 +++ exited with 0 +++
Unless my eyesight is worse than I think (very possible), it looks
as if pdfmom is processing its pipeline identically to the long
version at the command line, where the reinvocations of preconv(1)
(via the repetitions of the -k flag) aren't doing any harm. Yet the
binary file match shows up when the file is processed with pdfmom.
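To spare the eyesight, the comparison could be done mechanically. The sketch below assumes a trace saved with something like "strace -f -e trace=execve -o pdfmom.trace pdfmom -k camus.mom" (the file name is arbitrary) and that each execve call sits on one unwrapped line; the helper name execve_argv is hypothetical. It reduces every execve line to the PID and the argv list, so the output from two runs can be compared by eye or with diff.

```shell
# Print "PID [argv...]" for every execve line read from stdin.
# Assumes strace -f style lines: PID execve("path", [argv], [env]) ...
execve_argv() {
    sed -n 's/^\([0-9][0-9]*\) *execve("[^"]*", \(\[[^]]*\]\).*/\1 \2/p'
}

# Example with two lines in the format shown above.
printf '%s\n' \
    '3228 execve("/bin/grep", ["grep", "^\\. *ds"], [/* 86 vars */]) = 0' \
    '3228 +++ exited with 1 +++' |
    execve_argv
```

Lines that are not execve calls (exits, signals) are dropped. Calls reported as unfinished still match, since only the path and argv are needed.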
--
Peter Schaffter
http://www.schaffter.ca