Comments (7)

Use case for metaprogramming: generating serialization/deserialization routines from a format description.
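A minimal sketch of this idea (`FieldTy` and `genSerializer` are invented names for illustration): the format description is an ordinary value, and a Template Haskell function turns it into serialization code at compile time.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module GenSerial where

import Language.Haskell.TH

-- Toy format description: a record is a sequence of typed fields.
data FieldTy = FInt | FString

-- Build a lambda taking one argument per field and concatenating a
-- tagged encoding of each (a stand-in for a real wire format).
genSerializer :: [FieldTy] -> Q Exp
genSerializer fmt = do
  xs <- mapM (const (newName "x")) fmt
  let encode FInt    x = [| "i:" ++ show $(varE x) |]
      encode FString x = [| "s:" ++ $(varE x) |]
      fields = zipWith encode fmt xs
      body   = foldr (\e acc -> [| $e ++ ";" ++ $acc |]) [| "" |] fields
  lamE (map varP xs) body
```

A splice site in another module, e.g. `serialize = $(genSerializer [FInt, FString])`, then gets a serializer generated at compile time; deserialization would be derived the same way from the same description.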


We typically distinguish compile time and runtime:

  • compile time: happens on the developer’s machine (host)

  • runtime: happens on the user’s machine (target)

Those are potentially different machines, and when the platform differs, we call it cross-compilation (we cannot assume that machine-sized ints are the same size, or that the same libraries are available, etc).

At first glance, it seems that we only need two stages for this setup (e.g. the meta-level and the object-level stages of 2LTT). However, if we use the compiler for our language as a library, then at least three stages arise:

  • stage -1: code that is generated on the host and runs on the host (e.g. a Template Haskell program)

  • stage 0: code that is generated on the host but runs on the target (e.g. a Haskell program)

  • stage 1: code that is generated on the target and runs on the target (e.g. a program generated with the GHC API and dynamically loaded)

This could be useful when implementing a browser in Haskell that supports JavaScript.

  • A parser generator would be at stage -1 (to transform a grammar for JS into a parser for JS). The parser generator itself is also written in Haskell and runs via TH.

  • The browser itself would be at stage 0 (creates a window, loads the website from a URL, etc). This code is written in Haskell.

  • The JavaScript code on the website could be evaluated directly. But we could also transpile it into Haskell and load it with the GHC API. This generated code would be at stage 1.

So we have Haskell code (stage -1) that generates Haskell code (stage 0) that generates Haskell code (stage 1).
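Stage 1 can be sketched as follows, assuming the `ghc` and `ghc-paths` packages (exact imports and signatures vary between GHC versions, so treat this as an outline rather than working code): compile a string of generated Haskell at runtime and run the result.

```haskell
-- Sketch of stage 1: compiling and loading generated Haskell at runtime.
module RuntimeLoad where

import GHC
import GHC.Paths (libdir)
import Unsafe.Coerce (unsafeCoerce)

-- Compile an expression (e.g. the Haskell we transpiled from JS) in the
-- context of Prelude and run it. unsafeCoerce is the usual escape hatch
-- here; a real implementation would check the type via the API first.
evalGenerated :: String -> IO a
evalGenerated src = runGhc (Just libdir) $ do
  dflags <- getSessionDynFlags
  _ <- setSessionDynFlags dflags
  setContext [IIDecl (simpleImportDecl (mkModuleName "Prelude"))]
  hv <- compileExpr src
  pure (unsafeCoerce hv)
```

Used as, say, `evalGenerated "1 + 1 :: Int" :: IO Int`, this is the stage-1 analogue of a TH splice: the generator and the generated code both live on the target machine.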

Now the question is: could we implement a stage-aware type checker?

For the parser generator, we’d like to use typed Template Haskell, so that the parser it outputs is guaranteed to be well-typed. We never want to see type errors in generated code. If there are errors in our grammar, we want to catch them when checking the grammar, not when compiling the generated parser.
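The typed-TH interface for this could look roughly as follows (`Grammar` and `genParser` are invented names, and the generator body is elided). The point is the type: `Code Q` (template-haskell ≥ 2.17) guarantees that whatever `genParser` splices is a well-typed parser.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module ParserGen where

import Language.Haskell.TH (Code, Q)

-- Invented stand-in for a grammar description; production rules elided.
data Grammar tok ast = Grammar

-- All grammar checking runs in Q, before splicing; type errors can only
-- come from the grammar itself, never from the emitted parser code.
genParser :: Grammar tok ast -> Code Q ([tok] -> Either String ast)
genParser _ = [|| \_ -> Left "generator body elided" ||]
```

A user would write `parse = $$(genParser jsGrammar)` and get a parser whose well-typedness was established before any code was generated.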

Likewise, the output of a JavaScript transpiler must be well-typed. We never want the user to see type errors in generated code. If there are errors in the JavaScript that we received, we want to catch them when transpiling to Haskell, not when we compile the output of the transpiler.

In other words, all type checking and elaboration should happen before staging.


How does this relate to Racket’s phases?


We’d like to type-check the user’s program before macro expansion, so that error messages refer to the code the user wrote, not the code generated by the macro.


The obvious use case for metaprogramming is parser generators: we want to specify a context-free grammar and derive a state machine from it.


I always wondered — are there no other ways to do this?


What do you mean? There are many ways to do parsing without metaprogramming (megaparsec and Earley come to mind as different approaches to that).

But with parser generators, much of the work of analyzing and optimizing the grammar is done at compile time.


I mean “describe the grammar in such a way that it can be analyzed e.g. only once (at the program start) but without resorting to metaprogramming”.

Or is analyzing once at the program start still going to be too expensive?


It’s not only about the cost. There might be issues in the grammar (e.g. shift/reduce conflicts), and I’d like to see them reported at compile time.
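Concretely, a TH-based generator can reject a bad grammar by failing in the Q monad, so the error surfaces at compile time. In this sketch, `Grammar`, `conflicts`, and `emitParser` are invented stubs; real conflict detection would build the LR automaton.

```haskell
{-# LANGUAGE TemplateHaskell #-}
module ConflictCheck where

import Language.Haskell.TH

data Grammar = Grammar

conflicts :: Grammar -> [String]
conflicts _ = []  -- stub: a real analysis reports shift/reduce conflicts

emitParser :: Grammar -> Q Exp
emitParser _ = [| \s -> s |]  -- stub for the actual table-driven parser

-- A bad grammar fails while compiling the splice site, so the grammar
-- author sees the conflict report, never an error in generated code.
genParser :: Grammar -> Q Exp
genParser g = case conflicts g of
  []  -> emitParser g
  cs  -> fail ("shift/reduce conflicts:\n" ++ unlines cs)
```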


As to the performance overhead: you also miss optimization opportunities if you postpone the analysis to runtime.

Once a parser generator has done its job, the rest of the compiler pipeline can optimize the generated code further. If you do it at program start, you’d need an optimizing JIT for that (which adds overhead of its own).


Aha, thanks