groff mom + make + ghostscript + ImageMagick == success?

It's been a long time coming, but I am finally completely sick of LaTeX and LibreOffice, so I wanted to challenge myself and see if I could write a job application (the artifact is a PDF) using only groff (with the mom macro set) and some other basic utilities such as make, ImageMagick and Ghostscript to put everything together. In the end it did work and, in my humble opinion, looks decent enough, so I'd like to share how I did it here. Please note I am no expert at this, but seeing the lack of real examples out there, I guess this isn't going to hurt.

I am only going to explain how to use the tooling around groff to create the final artifacts, as this is in my opinion quite tricky. What goes into the .mom files will be explained on another page in the future, as otherwise this page would get even bigger than it already is.

Prerequisites: It's expected that you know GNU make to understand how all of this stuff is compiled together and you have ghostscript, groff (and the mom macro packages) installed. Later I'm also going to do some image conversion via ImageMagick, but it is quite trivial to use. Obviously you need to be running some sort of UNIX-like operating system on which all of this will be done. Then again, why else would you be looking at this page?

First the basics. The job application has the sections you would expect: a cover letter, a CV and the credentials.

All of this stuff was done in separate folders, each with a different Makefile so I could work and compile and inspect each of these things separately and merge all the sub-artifacts together at the end.

The Cover Letter + CV

My current cover letter is one .mom file compiled with the following Makefile:


OUT=cover.pdf

$(OUT): cover.mom
	pdfmom -k < $^ > $@.temp
	gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=$@ $@.temp
	rm $@.temp

The output of calling make to build this should be:


pdfmom -k < cover.mom > cover.pdf.temp
troff: pdfmom-RHxff:11: can't transparently output node at top level
gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=cover.pdf cover.pdf.temp
rm cover.pdf.temp

Some explanations seem in order. I am using the pdfmom command instead of using groff directly. pdfmom -k < cover.mom > cover.pdf.temp reads the cover.mom document via stdin (the less-than sign) and writes the resulting data to stdout, which is then redirected into cover.pdf.temp (via the greater-than sign). cover.pdf.temp is already a PDF document but must be processed further. The -k option makes sure UTF-8 encoded (or otherwise encoded) .mom files are properly supported by implicitly running the preconv command. If you like, you can call preconv explicitly when you have a file in some special encoding. I'm not going to, as I strictly create all text files in UTF-8.

Note: If you are German like me, don't forget to pass the groff parameter -m de (traditional orthography, "Alte Rechtschreibung") or -m den (reformed orthography, "Neue Rechtschreibung") to tell groff which language the .mom file is written in, as otherwise things like hyphenation aren't going to work properly. See the man page groff_tmac. So the command in the Makefile for German documents should be


	pdfmom -m de -k < $^ > $@.temp

If you are creating an English document don't worry about it.

As indicated in the pdfmom man page simply ignore the following line: troff: pdfmom-RHxff:11: can't transparently output node at top level

Finally let's get to: gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=cover.pdf cover.pdf.temp. The gs command is Ghostscript, which does... well, it does very many complicated things to PDF/PS files. The -q option simply says to be quiet; -dNOPAUSE and -dBATCH tell Ghostscript to run non-interactively, meaning it will not pause or ask you questions (the last thing we need). -sDEVICE=pdfwrite selects PDF as the output format and -sOutputFile names the resulting file. -dPDFSETTINGS=/prepress does some special processing; in this case it mainly embeds the fonts in the PDF so it can be printed properly by most printers.

Just to illustrate this, let's check which fonts are embedded in the temp file and in the final file:


$cover> pdffonts cover.pdf.temp
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
Times-Roman                          Type 1            Custom           no  no  yes      5  0

$cover> pdffonts cover.pdf
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
ACRXYS+Times-Roman                   Type 1C           Custom           yes yes yes      9  0
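
If you don't want to eyeball the "emb" column every time, the check can be scripted. The following is only a rough sketch: the function name unembedded_fonts is made up for this page, and it assumes the pdffonts column layout shown above and font names without spaces.

```shell
# Reads `pdffonts` output on stdin and prints the name of every font whose
# "emb" column says "no". Columns are counted from the right, because the
# "type" column may itself contain spaces (e.g. "Type 1").
# Assumption: font names contain no spaces, so the name is the first field.
unembedded_fonts() {
    awk 'NR > 2 && NF >= 7 && $(NF - 4) == "no" { print $1 }'
}

# Usage (needs pdffonts installed):
#   pdffonts cover.pdf | unembedded_fonts
```

If this prints nothing for cover.pdf, every font listed by pdffonts is embedded.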

The CV is done exactly the same way, except in my case it was a bit more tricky as I wanted to include an image. This was actually more of a hassle than I expected, so I will put it into the groff tutorial page when I have time.

Credentials

The idea of the credentials is a bit different from the CV/Cover Letter. Basically I scan in all my certificates via xsane and save each of the resulting images as a separate 8-bit PNM file. Each PNM file is then downscaled and converted into a PDF file with the right page size via ImageMagick. As the scanning software already scanned the document in (more or less) the right aspect ratio, the conversion loss should be minimal.

If you want you can directly scan into 8-bit PDF files (see file->info in xsane). You _could_ skip the conversion via ImageMagick, but it does some things like proper downsampling and putting things into the right aspect ratio. Play with some settings to find the best options for your individual needs.

If you want to do some OCR stuff to get everything into text and skip the hassle of working with images, go ahead, but I didn't bother. I think I'll try it next time I do a project like this.

If you are going the ImageMagick route consider this Makefile:


PDFS=work0000.pdf work0001.pdf uni0000.pdf uni0001.pdf uni0002.pdf uni0003.pdf
OUT=credentials.pdf

%.pdf: %.pnm
	convert -page A4 -density 72 $^ $@

$(OUT): $(PDFS)
	gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=$@ $^ \
	-c "[ /Title (My Credentials) /DOCINFO pdfmark"
        
clean:
	rm -f $(PDFS) $(OUT)

The gs command is the same as elsewhere, except that by specifying multiple PDFs in sequence, it will merge all of them together in the order given. Obviously replace the variable PDFS with whatever input files you have. If you already have PDFs, simply remove the Makefile pattern rule %.pdf: %.pnm

Output on the console:

convert -page A4 -density 72 work0000.pnm work0000.pdf
convert -page A4 -density 72 work0001.pnm work0001.pdf
convert -page A4 -density 72 uni0000.pnm uni0000.pdf
convert -page A4 -density 72 uni0001.pnm uni0001.pdf
convert -page A4 -density 72 uni0002.pnm uni0002.pdf
convert -page A4 -density 72 uni0003.pnm uni0003.pdf
gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=credentials.pdf \
work0000.pdf work0001.pdf uni0000.pdf uni0001.pdf uni0002.pdf uni0003.pdf \
-c "[ /Title (My Credentials) /DOCINFO pdfmark"

Note: Make sure the -page fits with whatever country you are applying in. If you are in America it may be prudent to use letter instead of A4. If you ever see any pages of different sizes in the resulting PDF make sure that all of the PDFs are in the correct size. You can check this via the pdfinfo utility.
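
To compare the page sizes without reading the full pdfinfo output for each file, something like the following might help. It is just a sketch: page_size is a name I made up, and it assumes pdfinfo prints a line of the form "Page size: 595 x 842 pts (A4)".

```shell
# Reads `pdfinfo` output on stdin and prints only the page dimensions,
# e.g. "595 x 842".
# Assumption: the line looks like "Page size:      595 x 842 pts (A4)".
page_size() {
    awk '/^Page size:/ { print $3, $4, $5 }'
}

# Usage (needs pdfinfo installed), one line per single-page PDF:
#   for f in *.pdf; do printf '%s: ' "$f"; pdfinfo "$f" | page_size; done
```

Any file that reports a different size than the rest is the one to convert again.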

You can choose whatever -density you want. Pick the value that looks right in the resulting PDF; I recommend trusting your eyes on this one. If it looks too blurry, consider upping the density in convert or scanning again with a higher resolution in xsane. Remember: garbage in, garbage out.

The -c parameter changes the title via pdfmark, as otherwise the last title instruction from the pdfs you merged is used, which might look bad.
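
If you want to confirm that the title really ended up in the merged file, pdfinfo can show it. A small sketch, assuming pdfinfo prints a "Title:" line; the function name pdf_title is just something I picked here.

```shell
# Reads `pdfinfo` output on stdin and prints the document title, if any.
# Assumption: the title is reported as "Title:          <text>".
pdf_title() {
    sed -n 's/^Title:[[:space:]]*//p'
}

# Usage (needs pdfinfo installed):
#   pdfinfo credentials.pdf | pdf_title
```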

I had previously used pdfunite, but for some reason zathura always complained about some broken xref something or other, so, seeing that I had Ghostscript installed anyway, I just used gs instead. pdftk also works, but I just really despise its strange command-line syntax. Oh, and it needs to start a JRE every time it wants to do something... and I'd rather not install Java on my machine.

Merging it all together

Finally I merged everything together. I wanted two versions, as depending on the employer / platform they either want the cover letter separate or everything merged into one. The following Makefile, which is located in the parent folder containing the three subdirectories, does the trick:


ARTS_IN = cover/cover.pdf cv/cv.pdf credentials/credentials.pdf
ARTS_NOCOVER_IN = cv/cv.pdf credentials/credentials.pdf

ARTS_OUT = complete.pdf
ARTS_NOCOVER_OUT = complete_nocover.pdf

default: $(ARTS_OUT) $(ARTS_NOCOVER_OUT)

.PHONY: clean

%.pdf:
	$(MAKE) -C $(dir $@)

$(ARTS_NOCOVER_OUT): $(ARTS_NOCOVER_IN)
	gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=$@ $^ -c "[ /Title (CV - Credentials) /DOCINFO pdfmark"

$(ARTS_OUT): $(ARTS_IN)
	gs -q -dNOPAUSE -dBATCH -dPDFSETTINGS=/prepress -sDEVICE=pdfwrite -sOutputFile=$@ $^ -c "[ /Title (Cover Letter - CV - Credentials) /DOCINFO pdfmark"

clean: 
	rm -f $(ARTS_OUT) $(ARTS_NOCOVER_OUT)

The %.pdf: pattern rule simply says: switch to the directory the target lives in and run make there (the -C option does the directory change). As before we set the title explicitly via pdfmark. make will automatically build any dependencies it needs to create the final two artifacts (one with a cover letter, one without).

The Size is right

Most places here in German-speaking Europe have a limit of 10 megs (and accept only PDF). My resulting artifact was around 4.4 megs, so everything was well within bounds. If the resulting artifact is too big it probably is because of the images you scanned in. So rescan it with a lower resolution or use OCR.
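
A quick way to check the result against such a limit before uploading is sketched below. The name check_size is made up, and I am assuming that "10 megs" means 10 * 1024 * 1024 bytes; a platform may well count differently.

```shell
# Assumption: the upload limit is 10 * 1024 * 1024 bytes.
LIMIT_BYTES=$((10 * 1024 * 1024))

# Prints "ok" if the file is no bigger than the limit, "too big" otherwise.
# $1 is the file, $2 an optional limit in bytes (defaults to LIMIT_BYTES).
check_size() {
    # The arithmetic expansion strips any padding some wc versions add.
    size=$(( $(wc -c < "$1") ))
    if [ "$size" -le "${2:-$LIMIT_BYTES}" ]; then
        echo "ok"
    else
        echo "too big"
    fi
}

# Usage:
#   check_size complete.pdf
```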

Personal Conclusion

At the end of the day you may be asking: was it worth it? For me personally: Yeah, it's pretty good. I've been switching practically everything to groff mom from communications to government agencies to writing cooking recipes or contract cancelations. These all do not require any special formatting and are basically all plain letters with a header and some minor formatting here and there. Next step: Using "edit" inside dosbox with an old line-printer.