Macros are powerful metaprogramming tools, but they can be difficult to use well. In this article, I share several tips for creating correct, legible, and useful macros in Clojure.
Macros, in the Lisp sense, are in many ways the ultimate metaprogramming tool. In Lisp, not only is the parse tree explicit in the source code (which is the point of the parentheses), but the code can be treated simply as data. Macros, which transform code, are thus, in Lisp, merely transforming data. And functional programming techniques excel at transforming data.
Despite the convenience of an explicit parse tree, it can be difficult to develop correct and maintainable macros. Hence these tips, gleaned from personal experience and a study of macrology (if you will) from books by Paul Graham and Doug Hoyte.
I assume that you know what a macro is, and basically how it works. Most of these tips are geared for people writing macros. Sorry. If you’d like to see a more introductory post on macros in Clojure, please let me know!
This is a list of suggestions. You can call them "rules" if you want. But know that the goal of these suggestions is not mindless conformity, but informed understanding. Please read in order to understand, and hopefully you will internalize a "wisdom of macros" that will guide you about when, why, and how to use this powerful tool well. At that point, you will understand that, for every single one of these suggestions (I checked), there are situations where you might not want to follow their advice. OK, without further ado…
Tip 1: Don’t use a macro unless you have to.
Prefer functions over macros, for two reasons.
- Functions are values, so they can be passed as parameters to functions (think `map` and `apply`) and returned from functions. Macros are not values and can't be used in this way.
- Functions are simpler. Don't use a complex tool when a simple one will do.
Sometimes, however, a function cannot do what’s desired. Here are some things that only macros can do:
- conditional evaluation of parameters (like the body of a `when`)
- multiple evaluation of parameters (like the body of a `for`)
- binding (like in the bindings vector of a `let`)
- accessing parameter expressions before they are evaluated (like `->` and destructuring)
- DSLs (since you control when and how things are evaluated)
- accessing the caller's context--but don't do this. It's usually a bad idea.
Note that these things are impossible as such with functions. If you’re not convinced, try implementing
if as a function…then watch it fail when I call it like so:
That said, if you’re willing to add some indirection, you can accomplish a similar thing with functions by wrapping code whose evaluation needs to be controlled in a function, and then calling the function-ified code when necessary. In that case, a call like this could work:
But in that case, there is noisy syntactic overhead in every call. Macros can eliminate this overhead. Whether the pain of overhead or the pain of macros is greater is a judgment call.
Tip 2: Include an example expansion after the macro.
Unless you’re an expert macro-er, and you’re the only one consuming a macro, you will probably want to document for others what the macro is doing. Docstrings are great, but often even better is an example transformation, stored in a comment after the macro definition. Examples often convey the process better than the code itself does.
So, say I wrote a macro called
numeric-if, which is a 3-way
if with different branches for whether the number is negative, zero, or positive. Right after my macro definition, I include an example expansion in a comment, like the following. Consider, as you read this, how simple it is to understand what the macro is doing.
1 2 3 4 5 6 7 8 9 10
Tip 3: Append
-expr to each parameter of a macro.
One of the most important properties of a macro, and perhaps the key way in which they differ from functions, is that whereas functions get values as parameters, macros get expressions. Having clarity about this distinction is vital to macro development. Therefore, name the parameters to your macros "something-expression" instead of just "something".
numeric-if example above, instead of this:
1 2 3
1 2 3
Tip 4: Capture your assumptions in
A macro is transforming the caller’s source code. If you make assumptions about that code,
assert those assumptions. This makes life easier for the caller. Consider that you call a macro and get an exception somewhere inside the macro call. You have to debug not the code you wrote, but the code that the macro transformed. Don’t you want to have confidence that the macro didn’t introduce the problem? Therefore, give the same provision for those who call your macros.
Note that you can assert about the form of a parameter as well as the value of a parameter. The former will probably be an assert statement outside of the macro expansion–potentially even in the macro’s preconditions (yes, you can have preconditions for a macro). The latter will be an assertion in the transformed code returned by the macro.
Tip 5: Use the backtick operator.
A macro simply returns code-as-data. You can construct this code however you like. If you love a challenge, you can do it with basic Clojure constructs like
quote, and perhaps even
gensym (see next section). But there is a simpler (and more legible) way: use the backtick operator, also called the syntax quote operator.
The syntax quote operator is like the quote operator in that it evaluates to the raw, unevaluated form instead of the value that form would normally have. However, the syntax quote differs from the regular quote because you can unquote inside it using the
\~@ operators. (You can also use the auto-gensym operator inside a syntax quoted form–see the next section.)
This is a very convenient way to build an S-expression, which is a major part of what macros do.
Tip 6: Don’t conflict with caller’s names.
Remember how macros can access the caller’s context, and how that’s a bad idea? It turns out that you can accidentally access or pollute your caller’s environment if you’re not careful. This is called "variable capture". A basic way to avoid this problem is to always use a "gensym", or generated symbol, for any names that you define in your expansion. Generated symbols are guaranteed not to conflict with other symbols in your code.
The manual way to do this is to call
gensym outside of your expansion to get the name, then use that name in your expansion. A more convenient way, assuming you’re using the syntax quote operator, is to use the
# operator, or "auto-gensym" suffix. Then, you can just say
name# to get a gensym. Any references to
name# inside the same level of syntax quoting will refer to the same gensym. Even if the caller already defined their own binding
name#, it won’t conflict.
Here’s an example. Notice the
number# inside the
let binding. That’s a gensym.
1 2 3 4 5 6 7 8
Tip 7: Avoid evaluating an argument multiple times
Unless it’s specifically a feature of your macro (like
for in Clojure), it’s usually a source of great surprise to a user if you evaluate one of their parameters multiple times. Their expression might have side effects, and it borders on incorrect to call them twice. So don’t do that. Instead, evaluate once, saving the value as a binding with a
let block in your expansion.
For example, say we defined this macro (which really should be a function, of course):
1 2 3 4
seqable-expr twice. So if a user calls the macro like so:
…then they would see "Averaging!" printed twice. Not good. Instead, write your macro like so:
1 2 3 4 5
Tip 8: Use
macroexpand-1 to debug your expansion.
If you need to debug your expansion,
macroexpand-1 is useful. Call it like this:
Tip 9: For complicated macros, build the S-expression in helper function(s)
Recall that in all Lisp languages, code can be simply treated as data. A function call, for example, is just a list, perhaps with nested data structures inside it. This property, called homoiconicity, is one of the reasons Lisp languages are excellent.
Because of homoiconicity, you are free to write functions to do the transformation of code, because code is just data. You can still use the syntax quote and other operators–it’s not like those are restricted to be used inside a
defmacro call. Remember that macros are just functions that (a) get unevaluated parameter expressions instead of evaluated parameter values, and (b) return code-as-data, which is then implicitly evaluated. There’s no reason why a macro couldn’t simply call a function to do all the heavy lifting.
This is especially useful if you’re doing a fairly complicated transformation, in which case you can break out the transformation into separate pieces, each handled by a helper function. Divide and conquer. Or if several macros share a common sub-transformation.
Another benefit is that you can test each function separately. Just remember to quote the parameters.
Macros are a powerful tool and a rich subject for study. Few programming constructs can boast such power, and the structure of Lisp allows macros to blend almost seamlessly into the language itself. However, they can be tricky to get right, especially in a legible way. These guidelines will serve beginning and intermediate macro programmers well. Advanced programmers will know when to ignore the guides.