Embedding, deep and shallow

Not long ago, I wanted to better understand the notions of “deeply” and “shallowly” embedded languages. Here I mean “embedded” in the sense of, say, “domain-specific language embedded in Haskell”, although it doesn’t actually matter to me what the host language is, or whether the embedded language is “domain-specific”.

My understanding of the difference had been something like this:

  • A shallow embedding involves piggybacking on the ASTs of the host language. A program in the embedded language consists of calls to functions that the author of the embedded language has defined, and the host language’s ASTs are the only ASTs that there are.

  • In a deep embedding, the author of the embedded language comes up with their own AST type. A program in the embedded language gets parsed into an AST that, while being a host-language data structure, may or may not be wildly different from the host language’s ASTs.1

I asked a few colleagues from IU if they thought this understanding was more or less correct, and although nobody exactly said “no”, the distinction between deep and shallow embeddings took a turn for the much less clear-cut. Ken Shan wrote:

I think a deep embedding is where a program in the embedded language is represented by a value in the host language. Here “represent” means that, for each operation that I want to perform on programs in the embedded language (such as running them, printing them, compiling them, listing the names of free and bound variables in them), I can write a function in the host language that takes such a value and performs the operation.

This way of looking at things seems a lot more general than the way I’m defining “deep embedding” above. A “value in the host language” could be an AST, but it could also be something else. Ken went on to say:

So, the more operations I want to perform, the fewer embeddings count as deep. If all I want to do is to run the program, then a shallow embedding counts as a deep embedding!

I don’t have a good response to this other than “whooooooa”, but in any case, it seems like the notions of shallow embedding and deep embedding are a lot less distinct than I had thought. Whether a language’s embedding is shallow or deep seems analogous to the question of whether a language is “high-level” or “low-level”. It might be either, depending on where you’re standing and what you want to do.

Wren Romano expanded on that idea:

A (shallowly) embedded language is when you take some library and — as a client — you come to think of it less as a “library” and more as a “language”. This is a squishy boundary, but I think we get far more agreement over when it happens than is entirely reasonable. (In the same sense as mathematics being unreasonably effective for science.) Part of what it is to be a language is that the interface belies a deep-seated conceptual structure that makes it feel natural and holistic, makes it feel like a “language”.

A deeply embedded language is when the same thing happens, but on the server side: when we stop thinking so much about providing an API and start thinking about providing a language — with all the question that language designers start to ask: about optimizations, desugarings, analyses, syntactic representations, and all that. This too is a very squishy boundary: when is it that you stop thinking about “your library code” and start thinking about “your compiler”? Is it when you no longer directly interpret the grammar someone gave your parser, but instead perform analysis on the grammar in order to optimize the transitions between parser states? Is it when you work to recover shared subexpressions and add dead-code elimination passes? Is it when you develop multiple semantic interpretations? Is it when you add a type system?

And Cameron Swords added:

I can imagine a library that users start to think of as a language, only to have the library designer start working on it that way. What started out as a library may grow into a shallow EDSL and from there grow into a deep EDSL. In all of this, though, the exact moment when we consider this project one thing or another seems more a matter of consensus than anything else.

I’m writing this post because although I see a lot of discussion online about shallow and deep embeddings, it doesn’t seem to capture or account for what Ken, Wren, and Cam are saying here. In fact, I see people concluding that deep embeddings are better, which seems to me about as nonsensical as saying that high-level languages are better than low-level languages. Better for what? “High-level” or “low-level” compared to what?

Thanks to Ken, Wren, Cam, and Dan Piponi for discussing the ideas in this post with me.

  1. One consequence of defining “deep embedding” this way is that we can think of every language as an embedded language! A standalone language can be considered deeply embedded in whatever language its compiler is written in.