Making Jekyll play nice with MathJax: switching from Redcarpet to Pandoc

2016-12-14

In the process of writing a math-heavy blog post, I ran into several problems with the existing Jekyll configuration. I had set up Jekyll to use Redcarpet as the Markdown renderer, but it simply does not play well with MathJax: it will screw up \ and & inside the MathJax code.

It is said that Kramdown does better with MathJax, but it doesn’t support syntax highlighting on fenced code blocks and its MathJax syntax is very non-standard: $$ for inline math is just weird.

As a last resort, I decided to integrate Jekyll with Pandoc, which is arguably the Swiss army knife of markup formats. It is unopinionated and has a lot of flexibility in what it can do.

Fortunately, there is already a jekyll-pandoc plugin to do this. It was as simple as running gem install jekyll-pandoc and then tweaking the configuration file:

markdown: Pandoc
gems:
- …
- jekyll-pandoc
pandoc:
  extensions:
  - mathjax

I had wanted to integrate with Pygments as well, but Pandoc’s highlight-kate isn’t too bad and I didn’t feel like adding another complication to the process, so I will just stick with it unless I run into something I don’t like.

To make the colors show up, I just needed to write some CSS akin to the default themes (see Kate docs for details on the categories):

code.sourceCode span.co { color: …; } /* comment */
code.sourceCode span.al { color: …; } /* alert (warning) */
code.sourceCode span.er { color: …; } /* error (wrong_syntax) */
code.sourceCode span.bn { color: …; } /* base_n (nondecimal) */
code.sourceCode span.dv { color: …; } /* decimal_value */
code.sourceCode span.fl { color: …; } /* float */
code.sourceCode span.ch { color: …; } /* character */
code.sourceCode span.st { color: …; } /* string */
code.sourceCode span.dt { color: …; } /* data_type */
code.sourceCode span.fu { color: …; } /* function */
code.sourceCode span.im { color: …; } /* import */
code.sourceCode span.cf { color: …; } /* control_flow */
code.sourceCode span.kw { color: …; } /* keyword */
code.sourceCode span.pp { color: …; } /* preprocessor */
code.sourceCode span.re { color: …; } /* region_marker */
code.sourceCode span.ot { color: …; } /* other */

Hopefully none of the older posts broke. If so, let me know!

Comments

Computational interpretation of double negation elimination

2016-12-11

Today’s shower thought is: Is there a way to interpret double negation elimination as a program?

((a -> Void) -> Void) -> a

(Here, Void denotes the empty type.)

At first glance, it seems preposterous: how does one expect to produce a return value of a completely arbitrary type a merely from a function of type (a -> Void) -> Void, which doesn’t even return a value of type a?

A hint comes from Peirce’s law, also known as call-with-current-continuation, which has a similar form:

callCC :: ((a -> Cont r) -> Cont a) -> Cont a

(We use Cont to allow the code to be tested in a Haskell program.)

Peirce’s law gives the program full control over its own state (i.e. stack frame). The application of callCC to a callee g :: (a -> Cont r) -> Cont a bestows upon it the power of a nonlocal jump:

example :: Cont String
example =
  callCC $ \ yield -> do
    yield "escaped"
    pure "stayed"

This would return "escaped" rather than "stayed". It’s as if the inner function performed a goto/return/break/longjmp to the outer scope, passing the result along the way.

Turns out doubleNegationElimination can be expressed in terms of callCC:

doubleNegationElimination :: ((a -> Cont Void) -> Cont Void) -> Cont a
doubleNegationElimination doubleNegation =
  callCC $ \ yield -> do
    void <- doubleNegation $ \ x ->
      yield x -- goodbye!
    absurd void

From this, we can see the loophole: by passing in a continuation a -> Cont Void into the doubleNegation :: (a -> Cont Void) -> Cont Void, the program can capture and send the result to the caller through a non-local jump. The remaining part involving absurd is just to satisfy the type checker.

For example, if the doubleNegation was constructed like this:

doubleNegation :: (String -> Cont Void) -> Cont Void
doubleNegation negation = negation "strings do exist"

then doubleNegationElimination would be able to manifest the evidence "strings do exist" by capturing the value and then immediately jumping to the outer scope, avoiding the crash that would have arisen from evaluating a Void.

On a sidenote, callCC is crazy powerful. You can implement Python-style generators with them!

import Control.Monad.Cont (MonadCont, MonadIO, (<=<),
                           callCC, forever, liftIO, runContT)
import Data.Foldable (for_)
import Data.IORef (IORef)
import Data.Void (Void, absurd)
import qualified Data.IORef as IO

main :: IO ()
main = runContT generatorExample pure

generatorExample :: (MonadCont m, MonadIO m) => m ()
generatorExample = callCC $ \ stop -> do
  g <- newGenerator [0 .. 9 :: Integer]
  forever $ do
    result <- g
    case result of
      Nothing -> stop ()
      Just x  -> liftIO (print x)

newGenerator :: (MonadCont m, MonadIO m) => [a] -> m (m (Maybe a))
newGenerator xs = callCC' $ \ ret -> do
  -- rNext stores the function used to jump back into the generator
  rNext <- newIORef ($ Nothing)
  -- rYield stores the function used to jump out of the generator
  rYield <- newIORef <=< callCC' $ \ next -> do
    writeIORef rNext next
    ret $ do
      -- the generator itself is just a function that saves the current
      -- continuation and jumps back into the generator via rNext
      callCC' $ \ yield -> do
        readIORef rNext >>= ($ yield)
  for_ xs $ \ i -> do
    -- save the current continuation then jump out of the generator
    writeIORef rYield <=< callCC' $ \ next -> do
      writeIORef rNext next
      readIORef rYield >>= ($ Just i)
  readIORef rYield >>= ($ Nothing)

-- | This is just double negation elimination under a different name, which
-- turns out to be a little more convenient than 'callCC' here.
callCC' :: MonadCont m => ((a -> m Void) -> m Void) -> m a
callCC' action = callCC ((absurd <$>) . action)

newIORef :: MonadIO m => a -> m (IORef a)
newIORef value = liftIO (IO.newIORef value)

readIORef :: MonadIO m => IORef a -> m a
readIORef ref = liftIO (IO.readIORef ref)

writeIORef :: MonadIO m => IORef a -> a -> m ()
writeIORef ref value = liftIO (IO.writeIORef ref value)

Comments

The need to associate function pointers with environments

2016-09-21

A friend once asked me a question like this:

void register_event_handler(void *f_ctx, void (*f)(void *ctx));
I’m a little confused about the purpose of f_ctx. Here, f is a handler function that gets called when the event is triggered, and f_ctx is − according to the documentation – some pointer argument that gets passed to f whenever it gets called. Why do we need f_ctx? Wouldn’t f alone suffice?

This is a trick for low-level languages like C where functions are represented using a raw function pointer, which does not store an enclosing environment (sometimes called a context). It is not needed in higher-level languages with support for first-class functions, such as Python, as these languages allow functions to be nested inside other functions and will automatically store the enclosing environment within the function objects in a combination called a closure.

The need for an environment pointer f_ctx arises when you want to write a function that depends on external parameters not known at compile time. The f_ctx parameter allows you to smuggle these external parameters into f however you like.

It might be best to illustrate this with an example. Consider a 1-dimensional numerical integrator like this:

double integrate_1(
    double (*f)(double x), /* function to be integrated */
    double x1,
    double x2
);

This works fine if you know the complete form of the function f ahead of time. But what if this is not the case – what if the function requires parameters? Say we want to calculate the gamma function using an integral:

double integrand(double x)
{
    double t = /* where do we get "t" from?? */;
    return pow(x, t - 1.0) * exp(-x);
}

double gamma_function(double t)
{
    /* how do we send the value of "t" into "integrand"? */
    return integrate_1(&integrand, 0.0, INFINITY) / M_PI;
}

Using integrand_1 there are only three ways to do this:

Store t into a global variable, sacrificing thread safety. It would be bad to simultaneously call gamma_function from different threads as they will both attempt to use the same global variable.
Use a thread-local variable, a feature not available until C11. At least it is thread-safe now, but it is still not reentrant.
Write raw machine code to create an integrand on the fly. This can be implemented in a thread-safe and reentrant manner, but it is both inefficient, unportable, and inhibits compiler optimizations.

However, if the numerical integrator were to be re-designed like this:

double integrate_2(
    double (*f)(void *f_ctx, double x),
    void *f_ctx, /* passed into every invocation of "f" */
    double x1,
    double x2
);

Then there is a much simpler solution that avoids all of these problems:

double integrand(void *ctx, double x)
{
    double t = *(double *)ctx;
    return pow(x, t - 1.0) * exp(-x);
}

double gamma_function(double t)
{
    return integrate_2(&integrand, &t, 0.0, INFINITY) / M_PI;
}

This is thread-safe, reentrant, efficient, and portable.

As mentioned earlier, this problem does not exist in languages like Python where functions can be nested inside other functions (or rather, it is automatically taken care of by the language itself):

def gamma_function(t):
    def integrand(x):
        return x ** (t - 1.0) * math.exp(-x)
    return integrate(integrand, 0.0, math.inf) / math.pi

Comments