Macros have types!
Unseemly has all of its core features, but it’s still a long way from being practical. There’s still a lot of stuff to implement and iron out.
Unseemly is the first language able to safely typecheck all macros before expansion.
Typically, macro-based languages are untyped,
and programmers in typed languages are rightly reluctant to use macros,
because macros can make type errors incomprehensible.
In Unseemly, the code that macros generate is automatically typesafe,
as long as the code the programmer writes passes typechecking,
so it can have the best of both worlds.
(This has historically been difficult,
but recent research has cleared a path.)
If you want to implement a typed language, and the types are pretty normal,
you can write the whole language as Unseemly macros.
Not only is that faster than writing the language from scratch
(you get the typechecker for free!),
but Unseemly-based languages get to share libraries and even some tooling.
(Tools like text editor support and a REPL should be shareable.)
Unseemly is utterly barebones,
but it has the tools to grow a language.
I like to divide the design of programming languages into two main families.
It’s not the only valid taxonomy,
but it appeals to me.
One family, the typed languages,
includes the MLs and Haskell, as well as C++, Java, Rust, and so on.
Programmers in those languages use type systems
both to describe data they are interested in and to express invariants.
The other, smaller, family is macro-based languages.
These are mostly direct descendants of Lisp, like Scheme and Racket.
(If you squint, the dynamic metaprogramming systems of Ruby and JavaScript
make them cousins of this family.)
Programmers in those languages use metaprogramming to
abstract over surface syntax, control flow, and binding.
But if you write in a typed language,
you almost certainly hear the advice to use the macro system sparingly.
And Lisps (usually) lack a type system altogether.
While type errors respect function boundaries,
as soon as a macro is involved,
you have to wade through the macro-generated code
to figure out why the macro went wrong,
even if the error had nothing to do with the macro!
And type errors are the user interface of a typed language;
they have to be comprehensible!
If the type system holds you responsible for generated code,
code generation is not an abstraction.
Languages descended from ML often have names with “ML” in them.
Languages descended from Scheme often have names suggesting improper behavior.
Unseemly is both a Scheme and an ML.
In Unseemly, macros have types!
Type errors respect macros the same way they respect functions,
so macros feel like part of the language.
Unseemly’s macro system is procedural, hygienic,
and has access to syntax quotation,
just like Scheme’s.
Unseemly’s type system is algebraic, generic,
and has access to pattern-matching,
just like ML’s.
Almost everything about Unseemly has been stolen
from older, more respectable languages.
The one new thing, which Unseemly needs to make macro types work,
is called binding annotations.
When you define a macro that binds names (like a lambda or a let),
you have to specify in the syntax what the binders are and where they’re bound.
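To make the idea concrete, here is a minimal sketch in Python rather than Unseemly. It is not Unseemly's actual implementation: the tuple encoding of terms, the BINDING_ANNOTATIONS table, and the typecheck helper are all invented for illustration. The point is that once a form declares which part is a binder and which part it scopes over, a typechecker can extend the environment correctly without expanding the form first.

```python
# Hypothetical term encoding (not Unseemly's real data structures):
#   ("lit", 3) | ("var", "x") | ("plus", a, b) | ("let", name, value_expr, body_expr)

# Hypothetical binding annotations, one entry per binding form:
#   binder:   index of the part holding the bound name
#   bound_to: index of the expression whose type the name receives
#   scope:    index of the part in which the name is visible
BINDING_ANNOTATIONS = {
    "let": {"binder": 1, "bound_to": 2, "scope": 3},
}

def typecheck(term, env=None):
    env = env or {}
    tag = term[0]
    if tag == "lit":
        return "Int"
    if tag == "var":
        if term[1] not in env:
            raise TypeError(f"unbound variable {term[1]!r}")
        return env[term[1]]
    if tag == "plus":
        for part in term[1:]:
            if typecheck(part, env) != "Int":
                raise TypeError("plus expects Int operands")
        return "Int"
    if tag in BINDING_ANNOTATIONS:
        ann = BINDING_ANNOTATIONS[tag]
        # The bound expression is checked in the outer environment,
        # so the new name is not yet in scope here.
        bound_type = typecheck(term[ann["bound_to"]], env)
        # Only the annotated scope sees the new binding.
        inner_env = {**env, term[ann["binder"]]: bound_type}
        return typecheck(term[ann["scope"]], inner_env)
    raise TypeError(f"unknown form {tag!r}")
```

With this table, typecheck(("let", "x", ("lit", 1), ("plus", ("var", "x"), ("lit", 2)))) succeeds, while a let whose value mentions its own binder is rejected as unbound, because the annotation says the name is visible only in the body.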
Macros are an ergonomic way to specify source-to-source translations.
In other words, they’re a great way to write a compiler.
In macro-based languages,
the language is typically composed of a small set of “core” forms,
and almost every language feature the user interacts with
is a macro that expands to those forms.
Language features can be built layer-by-layer, and, since they live in libraries,
they can be versioned separately from the core language.
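As a rough illustration of that layering (in Python, not Unseemly, and with every name below made up for the sketch), here is a tiny expander in which if is a macro over a core match form, and unless is a second-layer macro defined in terms of if. Expansion repeats until only core forms remain.

```python
# Hypothetical tuple-based syntax trees; none of this is Unseemly's API.
MACROS = {}

def macro(name):
    """Register an expansion function under a surface-form name."""
    def register(fn):
        MACROS[name] = fn
        return fn
    return register

@macro("if")
def expand_if(cond, then_e, else_e):
    # Surface `if` becomes the core `match` form.
    return ("match", cond, [("True", then_e), ("False", else_e)])

@macro("unless")
def expand_unless(cond, body, otherwise):
    # A macro layered on another macro: `unless` is defined via `if`.
    return ("if", cond, otherwise, body)

def expand(term):
    """Rewrite macro uses until only core forms remain."""
    if not isinstance(term, tuple):
        return term
    head, *args = term
    if head in MACROS:
        return expand(MACROS[head](*args))
    return (head, *[expand_part(a) for a in args])

def expand_part(part):
    # Match arms are stored as a list of (pattern, expr) pairs.
    if isinstance(part, list):
        return [(pat, expand(e)) for pat, e in part]
    return expand(part)
```

Because unless only knows about if, it could live in a separate library and be versioned independently of both if and the core.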
Compilers have a reputation for being hard to write. This is basically wrong.
It’s true that writing everything from the tokenizer to the assembly code generator
for a complex language without using any outside libraries
could take years.
But that calculation puts assembly language on an unearned pedestal.
If you’re a programmer,
you can already write a compiler to some normal language (instead of assembly),
if you’re willing to spend a month or two mucking around with strings.
Unseemly (like other macro-based languages)
doesn’t exist to make writing compilers easier; it’s already not that hard.
Unseemly makes it less tedious (with type-safe syntax quotation),
and gives you more of the goodies (type checking, pattern-matching, parsing)
that you shouldn’t have to reimplement.
For example, in order to implement Unseemly,
I needed to write a fairly complicated typechecker.
I’m not an expert in types,
so I just copied the rules out of the brick wall book.
Now I’m a non-expert with a typechecker, and with Unseemly, you can be, too!
If you write a library in one Unseemly-backed language,
in most cases, programmers in other Unseemly-backed languages
will be able to use your library without a foreign function interface.
(This is like the relationship between Clojure and Java.)
This is why Unseemly’s type system is more normal-looking
than the rest of the language;
library users should be able to read type signatures.
With a shared type system, libraries can be language-agnostic.
Because Unseemly macros (and their associated changes to syntax) are scoped,
it’s possible for multiple languages to coexist on equal footing in the same file.
Using syntax quotation rather than strings to embed code
prevents problems like SQL injection
and means that the Unseemly auto-formatter (uh, once someone writes one)
will format the quoted code.
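The injection point can be sketched in Python (not Unseemly; the node classes and function names here are hypothetical, and a real syntax-quotation system does much more). When embedded code is a tree, an interpolated value is spliced in as a single literal node, so it cannot change the shape of the surrounding code the way string concatenation can.

```python
from dataclasses import dataclass

# Hypothetical query syntax tree, invented for this sketch.
@dataclass(frozen=True)
class StrLit:
    value: str

@dataclass(frozen=True)
class Eq:
    column: str
    rhs: object

@dataclass(frozen=True)
class Select:
    table: str
    where: Eq

def query_by_string(name):
    # String templating: the input becomes part of the code itself.
    return "SELECT * FROM users WHERE name = '" + name + "'"

def query_by_quotation(name):
    # Quotation-style: the input is spliced in as exactly one literal node.
    return Select("users", Eq("name", StrLit(name)))

def render(node):
    """Turn the tree back into text, escaping literals in one place."""
    if isinstance(node, StrLit):
        return "'" + node.value.replace("'", "''") + "'"
    if isinstance(node, Eq):
        return f"{node.column} = {render(node.rhs)}"
    if isinstance(node, Select):
        return f"SELECT * FROM {node.table} WHERE {render(node.where)}"
    raise TypeError(f"not a query node: {node!r}")

hostile = "x' OR '1'='1"
```

Here query_by_string(hostile) yields a query with an extra OR clause, while query_by_quotation(hostile) still contains one Eq node whose right-hand side is a plain string literal. A formatter that understands the tree can also pretty-print it, which is not possible for an opaque string.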
Okay, I’ve been postponing showing you a code sample
because Unseemly’s syntax is bats.
It looks like I was trying to come up with
a grand unified theory of syntax from first principles,
which, embarrassingly, I was.
I also didn’t worry too much about how it looked, because
the syntax (even the tokenizer!) can be completely rewritten with macros.
And if there was anything I thought a macro could do,
I’ve omitted it from the core language.
Here’s a program to take the factorial of 5:
((fix .[again: [ -> [ Int -> Int ]] .
    .[n: Int .
        match (zero? n) {
            +[True]+ => one
            +[False]+ => (times n ((again) (minus n one)))
        }
    ].
].) five)
It’s really hard to read, because Unseemly doesn’t have if statements
or recursive function definitions.
But we can fix that!
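As a reading aid for the program above, here is a rough Python rendering. It assumes fix behaves like a standard fixed-point combinator over a zero-argument function, which is what the type [ -> [ Int -> Int ]] of again suggests. The Python fix below is a sketch written for this explanation, not Unseemly's definition.

```python
def fix(make):
    # `make` receives `again`, a zero-argument function that returns
    # the finished recursive function when called.
    def again():
        return make(again)
    return again()

# Mirrors the Unseemly program: `(again)` becomes `again()`,
# and the two-armed match on (zero? n) becomes a conditional.
fact = fix(lambda again: lambda n: 1 if n == 0 else n * again()(n - 1))
```

Calling fact(5) multiplies 5 by the result of again()(4), and so on down to the base case at zero.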
Here’s what that if macro looks like,
though the density of weird new syntax may make it hard to read:
extend_syntax
    Expr ::=also forall T . '{
        [
            lit ,{ DefaultToken }, = 'if'
            cond := ( ,{ Expr<Bool> }, )
            lit ,{ DefaultToken }, = 'then'
            then_e := ( ,{ Expr<T> }, )
            lit ,{ DefaultToken }, = 'else'
            else_e := ( ,{ Expr<T> }, )
        ]
    }' conditional -> .{
        '[Expr | match ,[cond], {
            +[True]+ => ,[then_e],
            +[False]+ => ,[else_e], } ]' }. ;
in
⋮
That’s not a lot of code, considering that it handles
the syntax, the typing, and the actual behavior of if!
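The typing part is carried by the annotations in the grammar: the condition is an Expr<Bool>, and both branches are an Expr<T> for the same T. A minimal Python sketch of what checking that rule on the unexpanded form might look like follows. The term encoding and the type_of helper are hypothetical, and Unseemly derives this from the macro definition rather than from hand-written code like this.

```python
# Hypothetical mini-checker; the term encoding is invented for this sketch.
#   ("bool", True) | ("int", 3) | ("var", "x") | ("if", cond, then_e, else_e)
def type_of(term, env):
    tag = term[0]
    if tag == "bool":
        return "Bool"
    if tag == "int":
        return "Int"
    if tag == "var":
        return env[term[1]]
    if tag == "if":
        _, cond, then_e, else_e = term
        # cond := Expr<Bool>
        if type_of(cond, env) != "Bool":
            raise TypeError("if: condition must be Bool")
        # then_e := Expr<T> fixes T; else_e := Expr<T> must agree with it.
        t = type_of(then_e, env)
        else_t = type_of(else_e, env)
        if else_t != t:
            raise TypeError(f"if: branches disagree ({t} vs {else_t})")
        return t
    raise TypeError(f"unknown form {tag!r}")
```

Note that the error messages talk about if, the form the programmer wrote, and never mention the match it would expand to.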
Then we don’t need to use match to implement a factorial function:
((fix .[again: [ -> [ Int -> Int ]] .
    .[n: Int .
        if (zero? n) then one else (times n ((again) (minus n one)))
    ].
].) five)
Here’s what it could look like if we added function definitions:
letfn (fact n: Int) -> Int =
    if (zero? n) then one else (times n (fact (minus n one))) ;
in (fact five)
…and numeric literals:
letfn (fact n: Int) -> Int =
    if (equals? n 0) then one else (times n (fact (minus n one))) ;
in (fact 5)
…and binary math operators:
letfn (fact n: Int) -> Int =
    if n == 0 then 1 else n * (fact n - 1)
in (fact 5)
Another layer of macros can impose a C-like or Scheme-like or ML-like syntax,
add comments, more literals, and convenience features.
It shouldn’t take much code to get a basic language off the ground.