Interpreting Tagless Final DSLs with Eff

by Andreas Hartmann
Tags: Functional Programming

Domain-specific languages are a powerful tool for structuring programs. In this article, we focus on the challenge of composing programs from multiple DSLs. We present an efficient and extensible approach by combining two complementing patterns: Tagless Final and free monads, specifically the Eff monad.

La Sagrada Familia, Barcelona, Spain Photo by Claudio Testa on Unsplash

If you're not familiar with the concepts of monads and DSLs in functional programming, I recommend reading the following two articles to help you get started:

Combining DSLs

Since most real-world programs have to deal with a whole array of concerns, we need a way to utilize multiple DSLs in a single program. If we want to combine multiple DSLs in a single for-comprehension, the functions of all involved DSLs must use the same result monad.

In many cases it is possible to use a very expressive monad (like scalaz.Task) for this purpose, but this approach has several drawbacks:

  • It violates the principle of least power – the more expressive a monad, the higher the risk of mistakes.
  • It is not extensible: Committing to a specific monad bears the risk that this particular monad will not be sufficient for all future requirements.

Another approach is to transform all involved DSL monads into the target monad using monad transformers. But this technique requires to specifically address each involved DSL and can become quite cumbersome.

Currently two popular alternatives exist, which we will discuss in this article: free monads, specifically the Eff monad, and Tagless Final. We choose the Tagless Final pattern to model our DSLs, since it is easier to get started with and requires less boilerplate.

Our Example

In our example code we will model two DSLs. The first one is called authn and provides functions for registering and authenticating users:

def register(email: String @@ EmailAddress, password: String):
    Either[RegistrationError, User]

def authn(email: String @@ EmailAddress, password: String):
    Either[AuthnError, User]

In case you're wondering, String @@ EmailAddress is a tagged type, denoting that email is a string with the sole purpose of modeling an e-mail address.

The second DSL is called email and provides a function to send an e-mail to a user:

def send(to: String @@ EmailAddress, subject: String, body: String): Unit

You find the source code for the example application on GitHub.

The package structure looks as follows. We are coupling our code based on functional design, meaning that code with common functionality goes in the same package.

ch.becompany
  authn                     Authentication functionality
    domain                  Authentication domain code
    Dsl                     Authentication DSL
  email                     E-mail functionality
    Dsl                     E-mail DSL
  shapelessext              Extensions to the shapeless library
  shared                    Shared code
    domain                  Shared domain code
  Main.scala                Our main application

To run the example, execute the following command in the console:

sbt run

Modeling DSLs with Tagless Final

With the tagless final technique, we model our DSLs as traits which take a container F[_] as type parameter. At this moment, it's actually not required that F is a monad. Let's model our authentication and e-mail libraries in this style:

ch.becompany.authn.Dsl

package ch.becompany.authn

trait Dsl[F[_]] {

  def register(email: String @@ EmailAddress, password: String):
      F[Either[RegistrationError, User]]

  def authn(email: String @@ EmailAddress, password: String):
      F[Either[AuthnError, User]]

}

ch.becompany.email.Dsl

package ch.becompany.email

trait Dsl[F[_]] {

  def send(to: String @@ EmailAddress, subject: String, body: String): F[Unit]

}

We see that the return values of all DSL functions are wrapped in the container F. This allows us to combine both DSLs by using the same type for F. In the following program, the compiler infers the type parameter for both DSLs from the return type F[…] of the registerAndLogin function.

ch.becompany.Main

package ch.becompany

object Main extends App {

  def registerAndLogin[F[_] : Monad](implicit authnDsl: AuthnDsl[F],
                                     emailDsl: EmailDsl[F]):
      F[Either[AuthnError, User]] = {

    val email = tag[EmailAddress]("john@doe.com")
    val password = "swordfish"

    for {
      _ <- authnDsl.register(email, password)
      _ <- emailDsl.send(email, "Hello", "Thank you for registering")
      authenticated <- authnDsl.authn(email, password)
    } yield authenticated
  }

}

The function signature ensures that F is a monad (by requiring the existence of an implicit value of the type Monad[F]). Therefore the functions of both DSLs can be used in for-comprehensions. Even at this stage, we don't specifiy a concrete type for F; the registerAndLogin function could actually be part of a higher-level DSL.

Besides the Monad instance, the function requires two additional parameters: An instance for the authentication DSL and one for the e-mail DSL, each typed with the common type F. The parameters are declared as implicit to allow automatic resolution by the compiler; we will look into this in detail when we talk about interpreters.

Interpreters

Now that we have defined the syntax of our DSLs in the respective traits, we have to implement the semantics. With the tagless final technique, this is done in interpreters. For each DSL, multiple interpreters can exist; each of them targeting a specific type. If an interpreter doesn't require any parameters it can be modelled as a type classes, so it can be automatically resolved by the compiler when a DSL is used with the respective target type. In our example we implement the interpreters in the companion objects of their DSLs, thereby supporting the implicit resolution mechanism of the compiler.

The type class approach doesn't work if for interpreters that need parameters; in this case the interpreter instances have to be created manually (unless you use non-typeclass implicit parameters, which is generally [discouraged](wartremover TODO)). With manual creation, it is also possible to instantiate multiple interpreters for a target type at runtime; e.g. to implement a network-based service with multiple connections.

In the beginning we will choose target types which make it easy to test the concepts in a simple, self-contained program. In a real-world scenario, you would probably follow the same approach: Start with providing easy-to-use interpreters for your DSLs which can be utilized in test cases. This approach is comparable to implementing mocks, with the difference that our interpreters are full-featured implementations of the DSL.

Later on we can proceed to implementing interpreters for more sophisticated target types covering additional concerns like concurrency and side-effects.

In a first step, we will first provide interpreters for the plain target types. At this stage, the DSLs cannot be combined in a single program, since the target types are not compatible. In the second step, we will lift the target types into the Eff monad, which will serve as a common denominator and make it possible to use both DSLs in the same program.

Interpreter for the Authentication DSL

For the authentication DSL interpreter, we provide a simple implementation which stores users in a list. We utilize the State monad for passing the user repository from one call to the next. Note that the stateInterpreter is declared using the implicit modifier, which makes it visible to the compiler when a DSL interpreter for the UserRepositoryState target type is requested.

ch.becompany.authn.Dsl

package ch.becompany.authn

object Dsl {

  type UserRepository = List[User]

  type UserRepositoryState[A] = State[UserRepository, A]

  implicit def stateInterpreter: Dsl[UserRepositoryState] =
    new Dsl[UserRepositoryState] {

      override def register(email: String @@ EmailAddress, password: String):
          UserRepositoryState[Either[RegistrationError, User]] =
        State { users =>
          if (users.exists(_.email === email))
            (users, RegistrationError("User already exists").asLeft)
          else {
            val user = User(email, password)
            (users :+ user, user.asRight)
          }
        }

      override def authn(email: String @@ EmailAddress, password: String):
          UserRepositoryState[Either[AuthnError, User]] =
        State.inspect(_
          .find(user => user.email === email && user.password === password)
          .toRight(AuthnError("Authentication failed")))
    }

}

Interpreter for the E-Mail DSL

We use the Writer monad as the target type for the e-mail DSL interpreter, allowing us to log the sent e-mails. We call the resulting Writer[Email, A] type MailQueue[A].

package ch.becompany.email

object Dsl {

  final case class Email(to: String @@ EmailAddress, subject: String, body: String)

  type MailQueue[A] = Writer[Email, A]

  implicit lazy val writerInterpreter: Dsl[MailQueue] =
    new Dsl[MailQueue] {
      override def send(to: @@[String, EmailAddress], subject: String, body: String):
          MailQueue[Unit] =
        Writer.tell(Email(to, subject, body))
    }

}

Unifying interpreters Using the Eff Monad

For our interpreters, we have chosen to use a target type based on the specific requirements of the respective interpreter. But as we know, we can't combine the State and the Writer monad, so we have to find a common denominator for both types. One way to achieve this goal is using free monads, specifically the Eff monad.

A Brief Introduction Into the Eff Monad

The Eff monad is based on the principle of free monads (read these pages for more information). When using free monads, the program is expressed as an abstract syntax tree and recursively computed using an interpreter. The key differences between free monads and the tagless final style are explained in this article.

Programs written in the Eff monad can be composed from a set of DSLs. In the Eff nomenclature, these DSLs are called effects. The set of effects which are available when implementing a program is called the effect stack. R is typically used as the type variable for the effect stack. Eff[R, A] is the type of an Eff program with the stack R and the return type A. Later on we will see how a concrete stack type is constructed.

Tagless Final and Eff combined

When combining Tagless Final and Eff, the program is executed in the following stages:

  1. The program is called with a specific target type.
  2. The program returns an expression for a specific target type, e.g. .

The effect stack for our example application will consist of two effects – a State effect, which is emitted by the authentication DSL interpreter, and a Writer effect, which is emitted by the e-mail DSL interpreter. To actually run the resulting Eff program, an interpreter for each involved effect is required. The Eff library conveniently ships with effect implementations (including interpreters) for the most common datatypes from the cats library, including State and Writer, so we do not have to implement any custom Eff interpreters for our example application.

Eff Interpreter for the Authentication DSL

Instead of hard-coding our Eff interpreter for the State monad, we implement a function to transform any interpreter targeting a generic type F into a corresponding interpreter targeting Eff[R, ?]. We are using the kind-projector compiler plugin to transform the two-parameter type constructor Eff[_, _] into the single-parameter type constructor Eff[_, ?] so that it can be used in place of the single-parameter type constructor F[_]. You can think of this mechanism as currying for type parameters: Eff[R, ?] requires one more type parameter to become a concrete type.

The implementation is straightforward: For each function of the DSL, the result of the interpretation is sent into the Eff effect stack. While this approach requires some boilerplate, it has the advantage that the effInterpreter function can be used to transform an arbitrary interpreter for the DSL into a corresponding Eff interpreter.

You probably wonder what the implicit parameter evidence: F |= R means. |= is a shorthand for MemberIn, denoting that the effect F is a member of the effect stack R. When the compiler encounters an implicit parameter of this type, it verifies that the function is called with an effect stack R which contains F, the target type of the input interpreter. This way it is ensured that the resulting Eff program can be interpreted later on.

ch.becompany.authn.Dsl

  private def effInterpreter[R, F[_]](interpreter: Dsl[F])
                                     (implicit evidence: F |= R): Dsl[Eff[R, ?]] =
    new Dsl[Eff[R, ?]] {

      override def register(email: String @@ EmailAddress, password: String):
          Eff[R, Either[RegistrationError, User]] =
        Eff.send(interpreter.register(email, password))

      override def authn(email: String @@ EmailAddress, password: String):
          Eff[R, Either[AuthnError, User]] =
        Eff.send(interpreter.authn(email, password))

    }

Now we implement a function which generates an Eff interpreter for our UserRepositoryState type. First, we declare the type _userRepositoryState[R] as a shorthand for UserRepositoryState |= R. This way, we can conveniently use the context bound notation to declare an implicit parameter as evidence that UserRepositoryState is a member of the effect stack R.

ch.becompany.authn.Dsl

  type _userRepositoryState[R] = UserRepositoryState |= R

  implicit def effStateInterpreter[R : _userRepositoryState]: Dsl[Eff[R, ?]] =
    effInterpreter[R, UserRepositoryState](stateInterpreter)

}

The Eff interpreter for the e-mail DSL is implemented in exactly the same way, so there's no need to cover it here.

The effInterpreter function contains some boilerplate – the result of each DSL function has to be transformed from the UserRepositoryState monad the Eff monad. The diesel library provides a macro to eliminate this boilerplate, but at the time of writing it was not applicable to target monads with additional type parameters, as the R in our [Eff[R, ?]].

Interpreting the Eff Program

Now that we have provided Eff interpreters for both our DSLs, we can instantiate our program in the Eff monad and execute it.

To refresh our memory, here's our program again. The program is using the Tagless Final style; the only restriction being that the type F is a monad:

ch.becompany.Main

package ch.becompany

object Main extends App {

  def registerAndLogin[F[_] : Monad](implicit authnDsl: AuthnDsl[F],
                                     emailDsl: EmailDsl[F]):
      F[Either[AuthnError, User]] = {

    val email = tag[EmailAddress]("john@doe.com")
    val password = "swordfish"

    for {
      _ <- authnDsl.register(email, password)
      _ <- emailDsl.send(email, "Hello", "Thank you for registering")
      authenticated <- authnDsl.authn(email, password)
    } yield authenticated
  }

Before we can run our program, we need to define a concrete Eff effect stack. We want to use the UserRepositoryState and MailQueue effects, so we construct a two-member stack containing these effects:

  type Stack = Fx.fx2[UserRepositoryState, MailQueue]

Now we instantiate the program with this type. The value program has the type Eff[Stack, Either[AuthnError, User]], which means that it is an Eff program with the stack Stack and the return type Either[AuthnError, User].

  val program = registerAndLogin[Eff[Stack, ?]]

Each effect provides one or many interpreters. By importing the Eff syntax helpers, the interpreter functions can be invoked directly on the program. By convention the method names start with run, followed by the name of the effect. Each interpreter call reduces the effect stack of the program by removing its effect. Ultimately, the return value can be extracted from the resulting Eff program, which doesn't contain any effects anymore, using the run method.

Some interpreters require parameters. Some interpreters produce values, which are appended to the return value of the program in form of a tuple. The runState requires passing the initial state, in our case an empty user repository; the result state is emitted as part of the return value. We use runWriterFold with a ListFold to run the Writer interpreter; this way we receive a list of Emails as part of the return value. Since both interpreters augment the return value, the result type takes the form of two nested tuples.

  val result = program
    .runState(List.empty[User])
    .runWriterFold(ListFold[Email])
    .run

  val ((authenticated, users), emails) = result

  println("Authentiated: " + authenticated)
  println("Users: " + users)
  println("E-Mails: " + emails)
}

Finally we can run our program using sbt run, and we can examine its output – the authenticated user (the main return value), the list of users in our user repository, and the list of e-mails sent.

Authentiated: Right(User(john@doe.com,swordfish))
Users: List(User(john@doe.com,swordfish))
E-Mails: List(Email(john@doe.com,Hello,Thank you for registering))

Characteristics and Limitations of This Approach

Complexity: From my experience, introducing the Eff monad can prove challenging for teams which are not familiar with the underlying functional programming concepts. There's a lot going on behind the scenes, and much of the functionality is based on implicit resolution, which is notorious for emitting not particularly helpful error messages. If you don't necessarily require the specific features of the Eff monad, consider sticking with plain Tagless Final.

Performance: In contrast to Tagless Final, where programs are expressions, an Eff program is an algebraic data type (ADT) composed of nested case class instances. You might want to keep in mind that this comes with a certain runtime overhead.

Boilerplate: If you can't (or don't want to) use macros like those provided by the diesel library, some boilerplate is required when implementing Eff interpreters on the basis of existing interpreters for a different target monad. When you are directly targeting all interpreters to the Eff monad, this boilerplate disappears.

Conclusion

The combination of the Tagless Final pattern provides a straightforward, scalable solution for composing programs from multiple layers of DSLs with different target monads. In contrast to a pure Eff-based approach, the business code can be implemented independent from any specific library.

Source Code

Libraries

Further Reading