The best kittens, technology, and video games blog in the world.

Friday, April 20, 2007

Pattern matching for RLisp

claws by dotpolka from flickr (CC-NC-ND)One of the reasons I had for creating RLisp was a desire for a platform on which I could experiment with Lisp-style macros, and where they would do something useful. All other Lisps are either gross like Common Lisp or lack full macro system like Scheme. And in either case they don't have things that we take for granted nowadays in Ruby, Python or even Perl. Now that RLisp exists, and is even decently fast, I finally got to playing with macros. Some observations:

  • It's very difficult. Much more difficult than plain metaprogramming by code generation.
  • Macros are very intolerant - a typo or thinko in macro definition usually results in a completely uninformative error message. And thinkos are quite common.
  • I was never biten by macro hygiene issues. Obviously I (gensym) an identifier if I need one. It seems that to have hygiene problems the macro would need to use global variable, which was shadowed in local scope. This doesn't seem likely to happen.
Probably the most complicated macro I've written so far is pattern matching macro (match ...). It provides pattern matching syntax similar to OCaml's (SML's, Haskell's, and so on):
(defun arithmetics (expr)
 (match expr
   (x 'plus y)  (+ (arithmetics x) (arithmetics y))
   (x 'minus y) (- (arithmetics x) (arithmetics y))
   (x 'times y) (* (arithmetics x) (arithmetics y))
   (x 'div y)   (/ (arithmetics x) (arithmetics y))
   z z
 )
)
(arithmetics '5.0) ; => 5.0
(arithmetics '(1 plus 2)) ; => 3
(arithmetics '((3 times 4) plus 5)) ; => 17
The macro is built in stages. The first stage simply protects against evaluating the expression more than once.
(defmacro match (val . cases)
 (let tmp (gensym))
 `(do
   (let ,tmp ,val)
   (match-2 ,tmp ,@cases)
 )
)
(match-2 ...) does the same thing as (match ...), except that it knows its first argument is a temporary variable, so it doesn't have to worry about evaluating it multiple times. (match-2 ...) builds a tree (if (tmp matches pattern 1) (code 1) (if (tmp matches pattern 2) (code 2) ...)), possibly with a default value if all patterns fail.
(defmacro match-2 (tmp . cases)
 (if [cases empty?]
   `nil
   (if (== [cases size] 1)
      (hd cases)
      `(if (match-3 ,tmp ,(nth cases 0))
         ,(nth cases 1)
         (match-2 ,tmp ,@(ntl cases 2))
      )
   )
 )
)
(match-3 ...) generates code which matches a single value against a single pattern. There's a cool part - in RLisp (let foo bar) sets foo variable in the nearest enclosing scope (typically a function), so we don't have to do any extra work to get our patterns bind values to identifiers.
(defmacro match-3 (tmp pattern)
 (if [pattern is_a? Array]
   (if [pattern == ()] `[,tmp == ,pattern]
     (if [(hd pattern) == 'quote]
       `[,tmp == ',(nth pattern 1)]
       `(bool-and
          [,tmp is_a? Array]
          [[,tmp size] == ,[pattern size]]
          ,@[[Range new 0 [[pattern size] - 1]] map & (fn (i)
            (let tmpi (gensym))
            `(do
               (let ,tmpi [,tmp get ,i])
               (match-3 ,tmpi ,(nth pattern i))
            )
          )]
       )
     )
   )
   (if [pattern is_a? Symbol]
     `(do (let ,pattern ,tmp) true)
     `[,tmp == ,pattern]
   )
 )
)
(match-3 ...) isn't very pretty. It handles the following patterns:
  • Symbol - bind value to identifier
  • Array (quote symbol) - check if value equals symbol
  • Empty array - check if value is an empty array too
  • Other array - check if value is an array or the right size. If it is, check each of its elements against each element of pattern array.
  • Anything else - check if value equals pattern.
It would be nice to handle variable-sized lists with patterns like (foo bar . rest) . It shouldn't be that hard to implement it. (bool-and ...) is simply (and ...) except that it returns true or false, instead of the last value or nil. (and ...) would work here too, but I wanted simpler version for unit testing (match ...).
(defmacro bool-and args
 (if (empty? args)
   true
   (if [[args size] == 1]
     (hd args)
     `(if ,(hd args)
       (bool-and ,@(tl args))
       false
     )
   )
 )
)
I'm still undecided when to send messages like [args empty?] and when to call functions like (empty? args) and on other such details.

3 comments:

Fabien said...

Notice that once you've got robust pattern matching, you're quite close to have hygienic macros:

- to avoid variable capture, just perform an alpha conversion of variables declared in a macro's quoted code

- to avoid change of open variables meaning, collect all of them in a hashtable at macro declaration time, and replace these variables by hashtable indexing in the maacro's body.

Both operations are rather easy to implement with pattern matching. I implemented the first half in metalua (http://metalua.luaforge.net), through pattern matching.

I didn't bother with the second half, because I want, by design, to get macros clearly separated from regular code. That's one of the ways in which metalua is not just Lisp-with-syntax :)

taw said...

Fabien: I think I'll try to build optional Scheme-like macro system on top of default CL-like defmacro.

Anonymous said...

schemd does not lack a full macro system.
syntax-rules is not the only game in town; most modern schemes implement Dyvbig's syntax-case which allows you to do full macros with arbitrary symbol injection...