Sunday, February 1, 2015

LISP Core in Less Than 100 Lines of Haskell

A couple of months ago, I built a LISP core engine in about 100 lines of Haskell code (of course, just for the fun of it). I thoroughly enjoyed the challenge, and with the help of techniques from the awesome Programming Languages MOOC I was taking, I felt ready to take it on. In conclusion, it was truly a fun project and I thought I'd share some things I learned along the way.

LISP's Core Primitives are Easily Implemented Recursively


In the simplest terms, the LISP core engine I built is simply a function that operates, with a specific environment (variable bindings) in mind, on the LISP syntax tree. Because of this I was able to implement primitives like label (aka define) simply with pattern matching and the cons (: in Haskell) operator!

To show how simple these primitives were to define, here's eq? (checks equality)
aux env (AList (Atom "eq?":a:b:_)) =
  if aux env a == aux env b then aTruthyValue else theFalseyValue
Note that aux is the name of the helper function called by the executor.

Haskell Code Can Be Very Elegant


There aren't many languages in which code as elegant as this can be written:
executeText env = execute env.parseMany.tokenize
In my opinion, it doesn't get much more straight forward than this. The dots may throw C-style programmers, but to a Haskell programmer they're pure gold.

Additionally refactoring in Haskell is extremely powerful. Consider the evolution of my bind function (takes names and values and adds them to an environment). It started as this:
bind [] [] env = env
bind (name:names) (value:values) env = bind names values ((name, value):env)
then was transformed to this:
bind names values = (++) $ zip names values
and ended up as:
bind names = (++) . zip names
In Haskell, it seems, there is a function for everything. Honestly, there's probably even one out there that can replace my bind function, I just haven't found it yet!

The Macro System is Pretty Cool


At the moment, the macro system in this LISP is pretty tight. Macros are written in Haskell code (I would like to eventually add support in the language) and are provided with (1) and "eval" which encapsulates the current environment and can run code and (2) a list of the arguments passed to the macro.

All registered macros are simply added to a list, and are thus loosely coupled to the actual core engine (but tightly coupled to the AST since they are essentially generating new portions of it).

Here's an example of how I defined if, in terms of the cond primitive:
structuralMacros = [(
  "if", \_ (p:a:b:_) -> AList [Atom "cond", p, a, Atom "1", b]),
  ...]

Parting Words


If you're interested in the full code, it's on github. Also the project has a REPL whose build step is at the bottom of "README.md."