Including external tables for markdown/pandoc
I use pandoc for all sorts of workflows. From one report, we can create a PDF, docx, or HTML page.
One difficulty, though, is managing tabular data that is dynamic or
likely to change. In its basic form, the markdown format doesn’t
have an “include” concept like that of LaTeX. But you can easily work
around this shortcoming using awk.
What I like to do is something similar to this:
This is a sample paragraph introducing the table.
TAB table1.md
More prose....
I essentially use TAB as a placeholder for where I want the contents
of a separate file to go.
Replacing that line with the contents of the file can then easily be
done using a one-line awk script.
awk '/TAB/ { system("cat " $2); next } { print }' my-document.md
If you don’t know awk, you should, but here is the basic gist of the
script.
awk scans the input line by line. If a line matches the text “TAB”, awk
runs the commands in the first set of braces. The first command calls the cat
program, which prints the contents of files to standard output. The file
that we want printed out is the second field ($2) of the line. awk
splits fields by whitespace by default, so make sure you don’t have
spaces in your filename (which hopefully you don’t do, right?).
The next statement just moves on to processing the next line, so that
the placeholder line itself never reaches the second set of braces, where
the print command simply prints every other line unchanged.
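To see the script from start to finish, here is a small self-contained run. The file contents are made up for illustration. It also swaps in a slightly stricter pattern than the one above, which is my own variation and not from the awk manual: a plain /TAB/ match also fires on any prose line that happens to contain "TAB", so this version only reacts to lines made up of exactly the placeholder and a filename.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A hypothetical table file, as another program might generate it.
cat > table1.md <<'EOF'
| Name | Count |
|------|-------|
| foo  | 3     |
EOF

# A hypothetical document containing a TAB placeholder.
cat > my-document.md <<'EOF'
This is a sample paragraph introducing the table.

TAB table1.md

More prose....
EOF

# Replace only lines that consist of exactly "TAB <filename>".
expanded=$(awk '$1 == "TAB" && NF == 2 { system("cat " $2); next } { print }' my-document.md)
printf '%s\n' "$expanded"
```

The placeholder line is gone from the output and the three table lines sit in its place, with the surrounding prose untouched.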
So this allows a different program to have the sole job of just creating the data for the table, leaving the actual insertion of the contents to this short script.
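As one sketch of such a generator, the snippet below turns a small CSV file into a pipe table that the placeholder can then pull in. The file names and data are hypothetical, and it assumes simple CSV with no quoted fields or embedded commas.

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical input data; in practice another program would produce this.
cat > data.csv <<'EOF'
Name,Count
foo,3
bar,5
EOF

# Turn each CSV row into a pipe-table row, and emit the
# header separator right after the first row.
awk -F, '
{
    row = "|"
    for (i = 1; i <= NF; i++) row = row " " $i " |"
    print row
}
NR == 1 {
    sep = "|"
    for (i = 1; i <= NF; i++) sep = sep "---|"
    print sep
}' data.csv > table1.md

cat table1.md
```

Whenever the CSV changes, rerunning this regenerates table1.md, and the document itself never has to be edited.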
Tie it all together in a Makefile and you’re on your way to building
some cool documents!
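A Makefile would express these steps as rules with dependencies. As a rough shell sketch of the same pipeline, with hypothetical file names and assuming pandoc is on the PATH, the build could look like this:

```shell
# Work in a scratch directory so nothing real gets overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical inputs standing in for a real report and a generated table.
printf '| A | B |\n|---|---|\n| 1 | 2 |\n' > table1.md
printf 'Intro paragraph.\n\nTAB table1.md\n\nMore prose....\n' > my-document.md

# Step 1: expand the TAB placeholders into an intermediate file.
awk '/TAB/ { system("cat " $2); next } { print }' my-document.md > expanded.md

# Step 2: hand the expanded markdown to pandoc, once per output format.
# Skipped when pandoc is not installed, so the sketch still runs.
if command -v pandoc >/dev/null 2>&1; then
    pandoc expanded.md -o report.html
    pandoc expanded.md -o report.docx
fi
```

Keeping the expanded file as a separate intermediate makes it easy to inspect what pandoc actually receives, and in a Makefile it would be a natural target that depends on both the document and the table files.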
Disclaimer: I first came across this in the awk manual, which everyone
should peruse.