groff mom + make + ghostscript + ImageMagick == success?
It's been a long time coming, but I am finally completely sick of LaTeX and LibreOffice, so I wanted to challenge myself and see if I could write a job application (the artifact is a PDF) using only groff (with the mom macro set) and some other basic utilities such as make, ImageMagick and Ghostscript to put everything together. In the end it did work, and in my humble opinion it looks decent enough. So I'd like to share how I did it here. Please note I am no expert at this, but seeing the lack of real examples out there, I guess this isn't going to hurt.
I am only going to explain how to use all tooling around groff used to create the final artifacts, as this is in my opinion quite tricky. What goes into the .mom files will be explained on another page in future, as otherwise this page will get even bigger than it already is.
Prerequisites:
You are expected to know GNU make
to understand how all of this
stuff is compiled together, and to have Ghostscript and groff (with the mom
macro package) installed. Later I'm also going to do some image
conversion via ImageMagick, but that is quite trivial to use.
Obviously you need to be running some sort of UNIX-like operating
system on which all of this will be done. Then again, why else would you
be looking at this page?
First the basics. The job application has the sections you would expect:
- Cover Letter
- CV
- Credentials (University Certificate / Certificates / work reviews etc.)
All of this stuff was done in separate folders, each with a different Makefile so I could work and compile and inspect each of these things separately and merge all the sub-artifacts together at the end.
The Cover Letter + CV
My current cover letter is one .mom file compiled with the following Makefile:
OUT=cover.pdf

$(OUT): cover.mom
	pdfmom -k < $^ > $@.temp
	gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=$@ $@.temp
	rm $@.temp
The output of calling make
to build this should be:
pdfmom -k < cover.mom > cover.pdf.temp
troff: pdfmom-RHxff:11: can't transparently output node at top level
gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=cover.pdf cover.pdf.temp
rm cover.pdf.temp
Some explanations seem in order. I am using the pdfmom command instead of
using groff directly. pdfmom -k < cover.mom > cover.pdf.temp
reads the cover.mom document via stdin (the less-than sign) and writes the
resulting data to stdout, which is then redirected into cover.pdf.temp (via
the greater-than sign). cover.pdf.temp
is already a PDF document but must be processed further.
The -k
option makes sure UTF-8 encoded (or other) .mom files are properly supported via the
implicit preconv
command. If you would like you can use preconv
explicitly, if you have a file which is specially encoded. I'm not going
to, as I strictly create all text-files in UTF-8.
Note: If you are German like me don't forget to pass the groff parameter
-m de
(Alte Rechtschreibung) or -m den
(Neue Rechtschreibung) to tell groff which language
the .mom file is written in, as otherwise things like hyphenation
aren't going to work properly. See the groff_tmac man page.
So the command in the Makefile for German documents should be
pdfmom -m de -k < $^ > $@.temp
If you are creating an English document don't worry about it.
As indicated in the pdfmom
man page simply ignore the following line:
troff: pdfmom-RHxff:11: can't transparently output node at top
level
Finally, let's get to:
gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=cover.pdf cover.pdf.temp
The gs command is Ghostscript, which does... well, it does
very many complicated things to PDF/PS files. The -q option simply says to be
quiet; -dNOPAUSE and -dBATCH tell Ghostscript to run
non-interactively, meaning it will not pause or ask you questions (the
last thing we need). -sDEVICE=pdfwrite selects PDF as the output format and
-sOutputFile names the resulting file. -dPDFSETTINGS=/prepress will do special
processing; in this case it will mainly embed the fonts in the PDF so it
can be printed properly by most printers.
Just to illustrate this, let's check which fonts are embedded in the temp and final files:
$cover> pdffonts cover.pdf.temp
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
Times-Roman                          Type 1            Custom           no  no  yes      5  0
$cover> pdffonts cover.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
ACRXYS+Times-Roman                   Type 1C           Custom           yes yes yes      9  0
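If you want a script or Makefile to act on this check, here is a minimal sketch that filters pdffonts output for fonts whose emb column says no. The function name unembedded_fonts is made up for this example, and it assumes the column layout shown above, where emb is the fifth field counted from the right.

```shell
# List fonts that pdffonts reports as not embedded.
# Reads pdffonts output on stdin; unembedded_fonts is a hypothetical name.
unembedded_fonts() {
    # Skip the two header lines. Counting from the right, the last two
    # fields are the object ID, preceded by uni, sub and emb.
    # Only the first word of the font name is printed.
    awk 'NR > 2 && NF >= 8 && $(NF - 4) == "no" { print $1 }'
}

# Usage, assuming pdffonts is installed:
#   pdffonts cover.pdf.temp | unembedded_fonts
```

Fed the two listings above, this should print Times-Roman for cover.pdf.temp and nothing for cover.pdf.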
The CV is done exactly the same way, except in my case it was a bit more tricky as I wanted to include an image. This was actually more of a hassle than I realized, so I will put it into the groff tutorial page when I have time.
Credentials
The idea of the credentials is a bit different than the CV/Cover Letter. Basically
I scan in all my certificates via xsane
and convert
each of the resulting images into a separate 8-bit PNM file. This PNM file is then
downscaled to a PDF file with the right page-size via ImageMagick. As the scanning
software already scanned the document in (more or less) the right aspect, the conversion
loss should be minimal.
If you want, you can scan directly into 8-bit PDF files (see file->info in xsane).
You _could_ skip the conversion via ImageMagick,
but it does some things like proper downsampling and putting things into the
right aspect ratio. Play with some settings to find the best options for your
individual needs.
If you want to do some OCR stuff to get everything into text and skip the hassle of working with images, go ahead, but I didn't bother. I think I'll try it next time I do a project like this.
If you are going the ImageMagick route consider this Makefile:
PDFS=work0000.pdf work0001.pdf uni0000.pdf uni0001.pdf uni0002.pdf uni0003.pdf
OUT=credentials.pdf

%.pdf: %.pnm
	convert -page A4 -density 72 $^ $@

$(OUT): $(PDFS)
	gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=$@ $^ \
	-c "[ /Title (My Credentials) /DOCINFO pdfmark"

clean:
	rm -f $(PDFS) $(OUT)
The gs
command is the same as elsewhere, except that by specifying multiple
PDFs in sequence, it will merge all of them together. Obviously replace
the variable PDFS
with whatever input files you have.
If you already have PDFs, simply remove the %.pdf: %.pnm pattern rule from the Makefile.
Otherwise, the output of calling make should be:
convert -page A4 -density 72 work0000.pnm work0000.pdf
convert -page A4 -density 72 work0001.pnm work0001.pdf
convert -page A4 -density 72 uni0000.pnm uni0000.pdf
convert -page A4 -density 72 uni0001.pnm uni0001.pdf
convert -page A4 -density 72 uni0002.pnm uni0002.pdf
convert -page A4 -density 72 uni0003.pnm uni0003.pdf
gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=credentials.pdf \
work0000.pdf work0001.pdf uni0000.pdf uni0001.pdf uni0002.pdf uni0003.pdf \
-c "[ /Title (My Credentials) /DOCINFO pdfmark"
Note: Make sure the -page value matches the paper size used in the country you are applying in.
If you are in America it may be prudent to use letter instead of A4.
If you ever see any pages of different sizes in the resulting PDF make sure
that all of the PDFs are in the correct size.
You can check this via the pdfinfo
utility.
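A minimal sketch of that check, assuming pdfinfo prints a line starting with "Page size:" (page_size is a hypothetical helper name). As far as I know, pdfinfo without -f/-l only reports the size of the first page, which is enough for single-page scans.

```shell
# Extract the "Page size" value from pdfinfo output read on stdin.
# page_size is a made-up helper name for this sketch.
page_size() {
    sed -n 's/^Page size: *//p'
}

# Usage, assuming pdfinfo is installed:
#   for f in *.pdf; do printf '%s: %s\n' "$f" "$(pdfinfo "$f" | page_size)"; done
```

Any file whose line differs from the others is the one to convert again.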
You can choose whatever -density you want; judge the result by eye.
If it looks too blurry, consider raising the density in convert
or scanning again at a higher resolution in xsane.
Remember: garbage in - garbage out.
The -c
parameter passes a snippet of PostScript to Ghostscript, here a pdfmark that sets the title,
as otherwise the last title instruction from the PDFs you merged is used,
which might look bad.
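One detail to watch: the title sits inside a PostScript string, where parentheses and backslashes have to be escaped with a backslash if they are to be taken literally. The following is a minimal sketch of a helper that builds the pdfmark argument; pdfmark_title is a hypothetical name, not something gs provides.

```shell
# Build the pdfmark snippet for a given document title.
# pdfmark_title is a made-up helper name for this sketch.
pdfmark_title() {
    # Escape backslashes and parentheses for use in a PostScript string.
    escaped=$(printf '%s' "$1" | sed 's/[\\()]/\\&/g')
    printf '[ /Title (%s) /DOCINFO pdfmark\n' "$escaped"
}

# Usage, with the same gs options as in the Makefile:
#   gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite \
#      -sOutputFile=credentials.pdf work0000.pdf work0001.pdf \
#      -c "$(pdfmark_title 'My Credentials')"
```

For a plain title such as My Credentials the output is identical to the string used in the Makefile.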
I had previously used pdfunite, but for some reason zathura always
complained about some broken xref something or other, so, seeing that I
had Ghostscript installed anyway, I just used gs instead. pdftk also works,
but I really despise its strange command-line syntax. Oh, and it
needs to start a JRE every time it wants to do something,
and I'd rather not install Java on my machine.
Merging it all together
Finally I merged everything together. I wanted two versions, as depending on the employer / platform they either want the cover letter separate or everything merged into one. The following Makefile, which is located in the parent folder containing the three subdirectories, does the trick:
ARTS_IN = cover/cover.pdf cv/cv.pdf credentials/credentials.pdf
ARTS_NOCOVER_IN = cv/cv.pdf credentials/credentials.pdf
ARTS_OUT = complete.pdf
ARTS_NOCOVER_OUT = complete_nocover.pdf

default: $(ARTS_OUT) $(ARTS_NOCOVER_OUT)

.PHONY: clean

%.pdf:
	$(MAKE) -C $(dir $@)

$(ARTS_NOCOVER_OUT): $(ARTS_NOCOVER_IN)
	gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=$@ $^ -c "[ /Title (CV - Credentials) /DOCINFO pdfmark"

$(ARTS_OUT): $(ARTS_IN)
	gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=$@ $^ -c "[ /Title (Cover Letter - CV - Credentials) /DOCINFO pdfmark"

clean:
	rm -f $(ARTS_OUT) $(ARTS_NOCOVER_OUT)
The %.pdf:
rule simply tells make to switch to the directory
the target is placed in and run make there
(the -C option does the directory change).
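To make this concrete, here is a small sketch that prints the sub-make call this rule expands to for each sub-artifact. show_submakes is a made-up name; the trailing slash mirrors what make's $(dir ...) function returns.

```shell
# Print the sub-make invocation the %.pdf rule would run for each target.
# show_submakes is a hypothetical helper, not part of make.
show_submakes() {
    for target in "$@"; do
        # $(dir cover/cover.pdf) in make yields "cover/"
        printf 'make -C %s/\n' "$(dirname "$target")"
    done
}

show_submakes cover/cover.pdf cv/cv.pdf credentials/credentials.pdf
# prints:
#   make -C cover/
#   make -C cv/
#   make -C credentials/
```

One caveat, as far as I understand make: because the rule has no prerequisites, it only fires when a sub-PDF is missing, so after editing a .mom file you may need to run make in that subdirectory yourself.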
As before, we set the title explicitly via pdfmark.
make will automatically build any dependencies it needs to create
the final two artifacts (one with a cover letter, one without).
The Size is right
Most places here in German-speaking Europe have a limit of 10 megs (and accept only PDF). My resulting artifact was around 4.4 megs, so everything was well within bounds. If the resulting artifact is too big, it is probably because of the images you scanned in, so rescan them at a lower resolution or use OCR.
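If you would rather have a script tell you than check by hand, a minimal sketch of a size check follows. size_ok is a hypothetical helper, and the default limit assumes that "10 megs" means 10 MiB; pass a different byte count as the second argument if the portal states otherwise.

```shell
# Succeed if the file is no larger than the limit (in bytes).
# size_ok is a made-up name; the default limit of 10 MiB is an assumption.
size_ok() {
    file=$1
    limit=${2:-$((10 * 1024 * 1024))}
    # The arithmetic expansion strips any padding wc may add.
    bytes=$(($(wc -c < "$file")))
    [ "$bytes" -le "$limit" ]
}

# Usage:
#   size_ok complete.pdf || echo "complete.pdf is too big" >&2
```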
Personal Conclusion
At the end of the day you may be asking: was it worth it? For me personally: Yeah, it's pretty good. I've been switching practically everything to groff mom from communications to government agencies to writing cooking recipes or contract cancelations. These all do not require any special formatting and are basically all plain letters with a header and some minor formatting here and there. Next step: Using "edit" inside dosbox with an old line-printer.