Parsing with Derivatives: A Functional Pearl

Matthew Might, David Darais, Daniel Spiewak (ICFP 2011)

[link]

I think this is the first “Functional Pearl” from ICFP that I read. The idea is simple, and is the kind of idea that is easy to verify each step when you read it, and once it’s over you say: “wait, what just happened?”. What I mean by that is that you read the fundamental ideas and don’t even realize that they are actually solving a complex problem, so simple they are.

I only read the recognition part, but it seems parsing follows without too much work. The derivative of a formal language $L$ with respect to a character $c$ is the language of all $l$ such that $c + l \in L$. The utility of this simple (apparently classical) construct becomes clear once you look at the constructors of regular languages and see that it’s very simple to compute, and once you have it recognition is basically solved. Extending it to context-free languages is not trivial: this is the meat of the paper, as I understand. But it can be solved with three simple modifications that seem like “hacks” when you read them, but then you realize that the combination of the three – laziness, memoization and fixed points – actually solve the problem.

The paper follows along a concrete implementation of the ideas in Racket, which I really appreciate. I’ve been using Racket in a new project, and it’s quite enjoyable.

I might come back to actually read the parsing part, but I can already call sections 1-4 a pearl.

Gabriel Poesia
Computer Science PhD student