June 7, 2021
Macros are a powerful and interesting feature of the Julia programming language, but they can also be confusing. Users coming from Python, Matlab or R have usually not come in contact with similar constructs before, and macros require a different way of thinking about code. This article is meant as a simple introduction, after which you should be able to judge better when the use of macros is appropriate and how to get around some of the most common gotchas.
Macros change existing source code or generate entirely new code. They are not some kind of more powerful function that unlocks secret abilities of Julia; they are just a way to automatically write code that you could have written out by hand anyway. The question is only whether writing that code by hand is practical, not whether it’s possible. Often, we can save users a lot of work by hiding boilerplate code inside our macro that they would otherwise need to write themselves.
Still, it’s good advice, especially for beginners, to think hard about whether macros are the right tool for the job, or whether run-of-the-mill functions serve the same purpose. Often, functions are preferable because macro magic puts a cognitive burden on the user: it makes it harder to reason about what code does. Before understanding the code, users have to understand the transformation that the macro is doing, which often goes hand in hand with non-standard syntax. That is, unless they are ok with their code having unintended consequences.
Some of the magic of macros derives from the fact that they don’t just generate some predefined code; rather, they take the code they are applied to and transform it in useful ways. Variable names are one of the fundamental mechanisms by which we make code understandable for humans. In principle, you could replace every identifier in a working piece of code with something random, and it would still work.
The computer doesn’t care about the names, only humans do. But functions run after the code has been transformed into lower-level representations, and names are lost at that point.
For example, in this code snippet, there is no way for the author of the function to know what the user named their variable. The function just receives a value, and as far as it is concerned, that value is named x.
function show_value(x)
    println("The value you passed is ", x)
end
orange = "sweet"
apple = "sour"
show_value(orange)
show_value(apple)
The value you passed is sweet
The value you passed is sour
Any information about what the user wrote is lost, as the function only knows “sweet” and “sour” were passed. If we want to incorporate the information contained in the variable names, we need a macro.
macro show_value(variable)
    quote
        println("The ", $(string(variable)), " you passed is ", $(esc(variable)))
    end
end
@show_value(orange)
@show_value(apple)
The orange you passed is sweet
The apple you passed is sour
You probably know a macro that works very similarly to this one, which is @show.
Note that it doesn’t make a difference here if we use parentheses for the macros or not. That’s a feature of Julia’s syntax which makes some macros more tidy to write. This is especially true if the macro precedes a for block or some other multi-line expression.
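As a small illustration of both points, the built-in @show and @elapsed macros can be used as follows. The variable names are only for illustration.

```julia
orange = "sweet"

# Both forms are the same macro call; each prints: orange = "sweet"
@show orange
@show(orange)

# Without parentheses, a macro can be applied to a whole multi-line expression.
# @elapsed runs the loop and returns the time it took in seconds.
seconds = @elapsed for i in 1:3
    sqrt(i)
end
```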
Let’s look at our macro in more detail. Even though it’s short, it has a few interesting aspects to it.
First of all, a macro runs before any code is executed. Therefore, you never have access to any runtime values in a macro. That’s something that trips many beginners up, but is crucial to understand. All the logic in the macro has to happen only using the information you can get from the expressions that the macro is applied to.
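One way to convince yourself of this is a tiny macro that only reports what kind of object it receives. This is a sketch with a hypothetical macro name, @argtype, which is not part of Julia.

```julia
# Hypothetical helper: reports the type of the object the macro body receives.
macro argtype(x)
    # This runs at expansion time; the resulting string is spliced into the code.
    string(typeof(x))
end

a = 1
@argtype(a)       # "Symbol": the macro sees the name, not the value 1
@argtype(a + 1)   # "Expr": a call expression, not the value 2
```

Neither call ever sees the number stored in a, because the macro has finished its work before a is looked up.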
One good step to understand what’s going on with an expression is to dump it. You can use Meta.@dump for that.
In our case, it’s not very interesting:
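A call along these lines shows it, assuming orange is defined as in the earlier snippet:

```julia
orange = "sweet"

# Meta.@dump shows the structure of the expression it is given, not its value
Meta.@dump orange        # prints: Symbol orange

# The same object can be inspected without a macro by quoting the name
dump(:orange)            # prints: Symbol orange
```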
As you can see, the expression orange contains only the Symbol orange. So that is what our macro gets as input, just :orange. But, again, there is no runtime information about it being "sweet".
Inside the macro, a quote expression is constructed. A quote with source code inside returns an expression object that describes this code. The expression we return from a macro is spliced into the place where the macro call happens, as if you really had written the macro result there. That’s the reason why a macro can’t technically do more than any old Julia code.
We can see the code that the macro call results in by using another helper macro, @macroexpand.
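A self-contained sketch of that call, with the macro repeated so the snippet runs on its own:

```julia
macro show_value(variable)
    quote
        println("The ", $(string(variable)), " you passed is ", $(esc(variable)))
    end
end

# Expansion only rewrites code, so `orange` does not even have to exist yet.
# The result is an ordinary Expr that can be printed or inspected.
expanded = @macroexpand @show_value(orange)
```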
quote
    #= In[3]:3 =#
    Main.println("The ", "orange", " you passed is ", orange)
end
You can see that, ignoring line number and module information, the macro created a function call as if we had written println("The ", "orange", " you passed is ", orange).
Therefore, let’s look at where the two oranges come from.
The first one is "orange", which is a string literal. We achieved this with the expression $(string(variable)) inside the macro. Remember that variable holds the Symbol :orange when the macro is called. We convert that to a string and then place that string into the quoted expression using the interpolation symbol $. This is how we can print out a sentence that references the user’s chosen variable name.
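The same two steps can be tried outside of a macro. This sketch assumes the local names variable, name and ex, which are only for illustration:

```julia
variable = :orange          # what the macro receives for @show_value(orange)

name = string(variable)     # "orange"

# `$` splices the string into the quoted expression as a literal
ex = :(println("The ", $name, " you passed is "))
```

The resulting ex is the same expression you would get from quoting println("The ", "orange", " you passed is ") directly.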
The other orange is just a normal variable name. It was created with the interpolation expression $(esc(variable)). The esc stands for escape and is another part of macros that is hard to understand for beginners.
To explain why esc is needed, let’s look at a macro that leaves it out. In this example we define the macro in a separate module (because any macro you’d put in a package would not be in the Main module either):
module SomeModule
    export @show_value_no_esc
    macro show_value_no_esc(variable)
        quote
            println("The ", $(string(variable)), " you passed is ", $variable)
        end
    end
end
using .SomeModule
try
    @show_value_no_esc(orange)
catch e
    sprint(showerror, e)
end
"UndefVarError: `orange` not defined"
The code errors because there is no variable orange. But there should be, we interpolated it right there! Let’s look at the macro output with @macroexpand again:
quote
    #= In[7]:5 =#
    Main.SomeModule.println("The ", "orange", " you passed is ", Main.SomeModule.orange)
end
Ok, so the variable looked up is actually SomeModule.orange, and of course we didn’t define a variable with that name in SomeModule. The reason this happens is that macros do often need to reference values from whatever module they were defined in (for example, to add a helper function, that also lives in that module, to the user’s code). Any variable name used in the created expression is looked up in the macro’s parent module by default.
The other reason is that it is potentially dangerous to just change or create variables in user space in a macro that knows nothing about what’s going on there.
Imagine the writer of the macro and the user as two people who know nothing about each other. They only interface via the small snippet of code passed to the macro. So, obviously, the macro shouldn’t mess around with the user’s variables.
In theory, a macro could insert things like my_variable = nothing or empty!(some_array) in the place where it’s used. But imagine the user already has a my_variable and it happens to hold the result of a computation that ran for hours. As the macro writer doesn’t know anything about the variables the user has created, all macro-created variables are by default scoped to the macro’s module to avoid conflicts.
Here’s a short example of bad escaping, with a macro that is not really supposed to do anything:
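A minimal sketch of such a macro could look like this. The name @bad_noop is hypothetical, and the escaped assignment inside it is exactly the kind of thing a macro should avoid.

```julia
# Hypothetical macro that is meant to simply pass the expression through.
macro bad_noop(expr)
    quote
        # BAD: escaping a name the user never passed in writes into the caller's scope
        $(esc(:temp_variable)) = nothing
        # Fine: the user's own expression is escaped so it runs in their scope
        $(esc(expr))
    end
end

temp_variable = "result of a computation that ran for hours"
@bad_noop 1 + 1
temp_variable   # now `nothing`
```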
Whoops, the temp_variable was overwritten by the macro, and this can happen with badly written macros.
But still, in order to access the value of the user’s variable orange, we need to escape the use of that symbol in our generated expression. Escaping the variable could be summarized as saying “treat this variable like a variable the user has written themselves”.
As a rule of thumb, macros should only ever escape variables that they know about because they were passed to the macro. These are the variables that the user potentially wants to have changed by the macro, or at least they are aware that they could be subject to change.
Here you can see another example, where there is both a user and a module orange:
module AnotherModule
    export @show_value_user_and_module
    orange = "bitter"
    macro show_value_user_and_module(variable)
        quote
            println("The ", $(string(variable)), " you passed is ", $(esc(variable)),
                " and the one from the module is ", $variable)
        end
    end
end
using .AnotherModule
@show_value_user_and_module orange
The orange you passed is sweet and the one from the module is bitter
Even though we could already see some interesting macro properties, maybe you didn’t start reading this article to learn about printing users their own variable names back (even though that is a very user friendly behavior in general, and many R users like their non-standard evaluation a lot for this reason).
Usually, you want to modify the expression you receive, or build a new one with it, to achieve some functional purpose. Sometimes, macros are used to define domain specific languages or DSLs, that allow users to specify complex things with simple, yet non-standard expressions.
A good example of this are the formulas from StatsModels.jl, where @formula(y ~ x) is a nice shortcut to create a formula object that you could in principle build yourself without a macro, but with much more typing.
Let’s try to write a small useful macro that transforms a real expression!
An issue some Julia users face once in a while is that the fill function’s argument is executed once, and then the whole vector is filled with that result. Let’s say we want a vector of 5 three-element random vectors.
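The naive attempt would be a call like the following (the variable name filled is only for illustration):

```julia
# rand(3) is evaluated once, before fill is called,
# so all five slots refer to the very same vector
filled = fill(rand(3), 5)

# `===` checks object identity, not just equal contents
all(v -> v === filled[1], filled)   # true
```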
5-element Vector{Vector{Float64}}:
[0.9039841821218482, 0.3644507708511703, 0.8407975367191795]
[0.9039841821218482, 0.3644507708511703, 0.8407975367191795]
[0.9039841821218482, 0.3644507708511703, 0.8407975367191795]
[0.9039841821218482, 0.3644507708511703, 0.8407975367191795]
[0.9039841821218482, 0.3644507708511703, 0.8407975367191795]
As you can see, every vector is the same, which we don’t want. A way to get our desired result is with a list comprehension:
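A comprehension of this shape does the job (the variable name vectors is only for illustration):

```julia
# The body rand(3) is evaluated anew in every iteration;
# `_` signals that the loop variable itself is not used
vectors = [rand(3) for _ in 1:5]

# Each element is its own, separately allocated vector
vectors[1] === vectors[2]   # false
```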
5-element Vector{Vector{Float64}}:
[0.8957367680265885, 0.249510539903665, 0.8937505345144938]
[0.9314953138974689, 0.25762719951419766, 0.15016666804238443]
[0.27431927403854617, 0.16696200453179322, 0.18711308743600819]
[0.15334416897527525, 0.09403493703533272, 0.9900757734923491]
[0.520051568171863, 0.06984731462064464, 0.5529582754578751]
This works, but the fill syntax is so nice and short in comparison. Also it gets even worse if you are iterating multiple dimensions in nested for loops, while you can always write fill(rand(3), 3, 4, 5).
So can we write a macro that makes a list comprehension expression out of a call like @fill(rand(3), 5), so that the first argument is executed anew in each iteration? Let’s try it!
The first step is always to understand what expression you’re even trying to build. We already use two iterators here, to see how multiple iterators are handled in the resulting expression:
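The target expression can be dumped with Meta.@dump, or quoted and inspected field by field. The variable name ex is only for illustration.

```julia
# Prints the nested Expr tree of the comprehension without running it
Meta.@dump [rand(3) for _ in 1:5, _ in 1:3]

# Quoting gives the same Expr as an object we can poke at
ex = :([rand(3) for _ in 1:5, _ in 1:3])
ex.head            # :comprehension
ex.args[1].head    # :generator
```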
Expr
  head: Symbol comprehension
  args: Array{Any}((1,))
    1: Expr
      head: Symbol generator
      args: Array{Any}((3,))
        1: Expr
          head: Symbol call
          args: Array{Any}((2,))
            1: Symbol rand
            2: Int64 3
        2: Expr
          head: Symbol =
          args: Array{Any}((2,))
            1: Symbol _
            2: Expr
              head: Symbol call
              args: Array{Any}((3,))
                1: Symbol :
                2: Int64 1
                3: Int64 5
        3: Expr
          head: Symbol =
          args: Array{Any}((2,))
            1: Symbol _
            2: Expr
              head: Symbol call
              args: Array{Any}((3,))
                1: Symbol :
                2: Int64 1
                3: Int64 3
Aha, now we actually see some real expressions. Every Expr object has a head that stores what kind of expression it is, and a vector called args which contains all arguments to that expression.
We can see that a list comprehension is made by making an Expr where the head is :comprehension. There’s only one argument to that expression, which is a :generator expression. This one in turn is assembled from the expression being called in each iteration, and the iteration expressions _ = 1:5 and _ = 1:3.
We want to use the syntax @fill(rand(3), sizes...), so we need to think how we can transform those two arguments into the expression we want.
Here, we’ll build the Expr by hand, instead of writing one big quote. Sometimes that is easier; it also depends on what you find more readable. Expressions with a lot of quoting and interpolating can be hard to understand. I usually prefer quote ... end over the equivalent :(...) just because I can parse words a bit better than parentheses.
Here we go:
For each size argument, we make one of the iterator expressions that we saw in the dump above. We escape each size variable s because those are the arguments that the user will write themselves, and they need to resolve correctly in their scope later.
The comprehension expression then receives the first argument escaped because that expression also needs to run as-is in the user’s scope.
macro fill(exp, sizes...)
    iterator_expressions = map(sizes) do s
        Expr(
            :(=),
            :_,
            quote 1:$(esc(s)) end
        )
    end
    Expr(
        :comprehension,
        esc(exp),
        iterator_expressions...
    )
end
@fill (macro with 1 method)
Let’s try it out:
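A call could look like the sketch below. The macro is repeated in compact form so the snippet runs on its own; this restatement wraps the arguments in an explicit :generator expression, mirroring the dump shown earlier, and the variable name vectors is only for illustration.

```julia
# Compact restatement of @fill, using the :generator form seen in the dump
macro fill(exp, sizes...)
    iterators = [Expr(:(=), :_, :(1:$(esc(s)))) for s in sizes]
    Expr(:comprehension, Expr(:generator, esc(exp), iterators...))
end

# rand(3) now runs once per element instead of once overall
vectors = @fill(rand(3), 5)
```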
5-element Vector{Vector{Float64}}:
[0.23155776289714214, 0.16506850108672544, 0.6975576110978176]
[0.9321190744533363, 0.7284798945126126, 0.8736131864599121]
[0.36504661751333844, 0.022691708746904182, 0.5288865087865801]
[0.8521086358547397, 0.2895922988729154, 0.1456070184114313]
[0.8720484237531486, 0.21008612789745829, 0.5082683499697537]
A good check if you’ve escaped correctly is to pass expressions that reference some local variables. The call will error if you’ve forgotten to escape any of them:
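One way to run that check is to call the macro inside a function, where the arguments are local variables. This is a sketch: the function and variable names are hypothetical, and the macro is repeated in compact form so the snippet runs on its own.

```julia
# Compact restatement of @fill, using the :generator form seen in the dump
macro fill(exp, sizes...)
    iterators = [Expr(:(=), :_, :(1:$(esc(s)))) for s in sizes]
    Expr(:comprehension, Expr(:generator, esc(exp), iterators...))
end

function random_vectors()
    # Local variables, visible only inside this function. Without esc,
    # the macro would look these names up in its own module instead.
    inner_length = 3
    outer_length = 5
    @fill(rand(inner_length), outer_length)
end

vectors = random_vectors()
```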
5-element Vector{Vector{Float64}}:
[0.5917157732709782, 0.8755225218366083, 0.4032454673315716]
[0.6111142681907994, 0.06564154230386965, 0.4591276163618556]
[0.6521749060007972, 0.26064420556063617, 0.2085548309582519]
[0.64546352947466, 0.485066158128166, 0.20739815115743743]
[0.8058584624008102, 0.9168621696136154, 0.4280967575910163]
This works fine! It should also work with more size arguments, we’ll generate only random scalars so the printout is manageable:
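With two size arguments, the call might look like this. The macro is again repeated in compact form so the snippet runs on its own, and the variable name matrix is only for illustration.

```julia
# Compact restatement of @fill, using the :generator form seen in the dump
macro fill(exp, sizes...)
    iterators = [Expr(:(=), :_, :(1:$(esc(s)))) for s in sizes]
    Expr(:comprehension, Expr(:generator, esc(exp), iterators...))
end

# Two sizes produce two iterators, and therefore a two-dimensional result
matrix = @fill(rand(), 5, 3)
size(matrix)   # (5, 3)
```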
5×3 Matrix{Float64}:
0.144371 0.952823 0.331021
0.289901 0.00354794 0.822716
0.233857 0.810845 0.933876
0.665742 0.988701 0.133996
0.0806734 0.270098 0.825293
Even though this particular example is contrived for simplicity (we could just use rand(5, 3) of course), compare it to the alternative list comprehension syntax:
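Written out by hand, the equivalent comprehension needs one iterator per dimension (the variable name matrix is only for illustration):

```julia
# One `_ in 1:n` clause per dimension; compare with @fill(rand(), 5, 3)
matrix = [rand() for _ in 1:5, _ in 1:3]
size(matrix)   # (5, 3)
```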
As you can see, macros can be a gain in syntax clarity, and they offer a powerful way to interact with the user’s source code.
Just remember that a reader also needs to understand what’s happening. In our example, rand() is not just executed once but many times, which is non-standard behavior for something resembling a function call. This code-reasoning overhead must always be weighed against the convenience of shorter syntax.
I hope you have learned a thing or two about macros and are encouraged to play around with them yourself. Usually, good ideas for macros only present themselves after interacting with Julia for a while, so if you are a beginner, give it time and become proficient with normal functions first.