using pandoc filters to create graphs with hakyll
When you want to convert one document format into an other, Pandoc is your friend. Hakyll is using it for converting Markdown into HTML. Once installed (eg. via cabal / stack) you can call pandoc from command line.
$ echo "# test" | pandoc -t native
[Header 1 ("test",[],[]) [Str "test"]]
This simple example shows the native format. A list of definitions can be found at Hackage. Every document format read is converted into this native format. It is the pandoc internal representation of the document.
$ echo "# test" | pandoc -w html
<h1 id="test">test</h1>
You can get a html output as well. A pandoc filter can be used to inject a custom behavior between reading and writing a document. This feature is needed to write filters to work with Hakyll
At first I have to look, how to get a graph, more precisely the graph visualization into the Haskell world. Due to my input comes from a markdown document, it will be plain text. The simple approach is to call the external dot
process with this String
and read the result. If a library is needed for further implementation this part can be switched out.
import System.Process
graph = "digraph { a -> b; b -> c; a -> c; }"
main :: IO()
main = do
svg <- readProcess "dot" ["-Tsvg"] graph
putStr svg
In this example you can pipe a String
to an external process and get a result as a String IO
.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.1//EN"
"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd">
<!-- Generated by graphviz version 2.38.0 (20140413.2041)
-->
<!-- Title: %3 Pages: 1 -->
<svg width="89pt" height="188pt"
viewBox="0.00 0.00 89.00 188.00" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<g id="graph0" class="graph" transform="scale(1 1) rotate(0) translate(4 184)">
<title>%3</title>
<polygon fill="white" stroke="none" points="-4,4 -4,-184 85,-184 85,4 -4,4"/>
...
<g id="edge2" class="edge"><title>b->c</title>
<path fill="none" stroke="black" d="M33.3986,-72.411C36.5136,-64.3352 40.3337,-54.4312 43.8346,-45.3547"/>
<polygon fill="black" stroke="black" points="47.1265,-46.5458 47.4597,-35.9562 40.5955,-44.0267 47.1265,-46.5458"/>
</g>
</g>
</svg>
This looks promising. I choose svg
, because it can be easily integrated into a html document.
At first a create a simple environment for testing. I use a index.html
as a simple template,
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>hakyll test</title>
</head>
<body>
<div id="content">
$body$
</div>
</body>
</html>
and a index.markdown
with some test data.
# hallo
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam
```
codeblock
```
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam
```{lang="dot"}
digraph graphName { a -> b; b -> c; a -> c; }
```
A hakyll main function can look as following.
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
main :: IO ()
main = hakyll $ do
match "index.markdown" $ do
route $ setExtension "html"
compile $ pandocCompiler
>>= loadAndApplyTemplate "template.html" defaultContext
match "template.html" $ compile templateCompiler
This function will create a single index.html
in the output folder. The interesting part here is the pandocCompiler
. There is a derived compiler pandocCompilerWithTransform
which allows you to specify a transformation for the given content. Given the type signature pandocCompilerWithTransform :: ReaderOptions -> WriterOptions -> (Pandoc -> Pandoc) -> Compiler (Item String)
, I have an entry point for the filter. I need something that takes a Pandoc
and returns a Pandoc
.
graphViz :: Pandoc -> Pandoc
graphViz = walk codeBlock
codeBlock :: Block -> Block
codeBlock (CodeBlock _ contents) = Para [Str contents]
codeBlock x = x
The walk function is used to do something with a specified Pandoc structure. For the filter it is a CodeBlock
to look for. This example converts all CodeBlock
s into paragraphs.
At this point I need the String
representation of the dot lang graph.
svg :: String -> String
svg contents = unsafePerformIO $ readProcess "dot" ["-Tsvg"] contents
unsafePerformIO is a kind of ‘backdoor’. It should be used only with care.
With the new walker,
codeBlock :: Block -> Block
codeBlock cb@(CodeBlock (id, classes, namevals) contents) =
case lookup "lang" namevals of
Just f -> RawBlock (Format "html") $ svg contents
nothing -> cb
codeBlock x = x
I can call the custom compiler with the graphViz
function.
pandocPostCompiler :: Compiler (Item String)
pandocPostCompiler = pandocCompilerWithTransform
defaultHakyllReaderOptions
defaultHakyllWriterOptions
graphViz
Putting it all together
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
import Text.Pandoc
import Text.Pandoc.Walk ( walk )
import System.Process ( readProcess )
import System.IO.Unsafe ( unsafePerformIO )
main :: IO ()
main = hakyll $ do
match "index.markdown" $ do
route $ setExtension "html"
compile $ pandocPostCompiler
>>= loadAndApplyTemplate "template.html" defaultContext
match "template.html" $ compile templateCompiler
pandocPostCompiler :: Compiler (Item String)
pandocPostCompiler = pandocCompilerWithTransform
defaultHakyllReaderOptions
defaultHakyllWriterOptions
graphViz
graphViz :: Pandoc -> Pandoc
graphViz = walk codeBlock
codeBlock :: Block -> Block
codeBlock cb@(CodeBlock (id, classes, namevals) contents) =
case lookup "lang" namevals of
Just f -> RawBlock (Format "html") $ svg contents
nothing -> cb
codeBlock x = x
svg :: String -> String
svg contents = unsafePerformIO $ readProcess "dot" ["-Tsvg"] contents
This code transforms a markdown document into html and converts all codeblocks with a lang
tag into a svg version of the given graph. At this point, I don’t use the value of lang
. It is possible to implement a different behaviour for other tags or different values.