Getting Func-ey Part 3 - Typeclasses

20 minute read Published: 2020-01-15

Intro to FP series:

The posts are originally based on a series of 5 presentations I ran at my company in an attempt to get in front of some anti-Scala sentiment that pervaded much of the org at the time. The org suffered from the typical problem of hiring Java devs and then telling them to write to Scala. It assumes people are familiar with Scala syntax for the basics.

tldr;

Typeclasses have an unfortunate name and have nothing to do with classes. Your language doesn't event need to have classes in order to have typeclasses (see Rust and Haskell). Scala has typeclasses via implicits. The syntax is a little crufty. Why typeclasses?

It's better to think of typeclasses as type capabilities or type behaviors as the class part is misleading.

Typeclasses

A typeclass in Scala is a trait that:

There should only be one implementation of a typeclass for a given type. This is known as typeclass coherence. Scala will let you define more than one, but if they are both in scope it will be a compile error. Having more than one is generally considered bad typeclass design. After a few examples we will discuss why we want to try to adhere to this particular design decision as much as possible.

A typeclass has a typeclass definition and one or more implementations for different types:

trait Show[T] {
  def debugInfo(v: T): String
}

This is a simple typeclass for giving a type the capability of showing some sort of prettified debugInfo, maybe for logging purposes when things go awry.

Scala does not have first class support for typeclasses, but if you look at languages that do have first class support for typeclasses you will see that the definition of a typeclass is quite similar:

// rust
pub trait Show {
  fn debugInfo(&self) -> String;
}

impl Show for string {
  fn debugInfo(&self) -> String {
    self
  }
}
-- haskell
class Show a where
  debugInfo :: a -> String 

instance Show String where 
  debugInfo s = s 

There is always this separation of definition and implementation.

You see typeclasses everywhere in Scala. For example, in the standard library from scala.math.Numeric defines the following two typeclasses:

trait Ordering[T] {
  def compare(x: T, y: T): Int
  def lt(x: T, y: T): Boolean = compare(x, y) < 0
  def gt(x: T, y: T): Boolean = compare(x, y) > 0
}

trait Numeric[T] extends Ordering[T] {
  def plus(x: T, y: T): T
  def times(x: T, y: T): T
  def negate(x: T): T
  def zero: T
  def abs(x: T): T = if (lt(x, zero)) negate(x) else x
}

If you use Play JSON Read / Write / Format or Circe Decoder/Encoders surprise! these are typeclasses and you are implementing a typeclass for your type. In scala it's common for serialization/deserialization libraries to use typeclasses.

Comparison to interfaces

People will say that they are like interfaces. There is some overlap but they are different. I don't think the details matter that much but if you are interested, you can check out the following:

The main differences are separation of implementation (we'll get to this), return type polymorphism (not important for this post), and typeclass composition

Records of Functions

If you google around you will eventually hit a bunch of people talking about typeclasses vs records of functions. Record of functions is a haskell term and whenever you see it think class + methods:

class UserService(db: DBConnection) {
  def getUser(id: UserId): Task[User]
  def getUserPermissions(userId: UserId): Task[Permissions]
  def updateUser(id: UserId, name: String): Task[Unit] 
}

// and as a record of functions, in haskell style,
// you would never write it this way in scala 
case class UserService(
  getUser: DBConnection => UserId => Task[User],
  getUserPermissions: DbConnection => UserId => Task[Permissions],
  updateUser: DbConnection => UserId => String => Task[Unit]
)

Since functions are just types, a record of functions is a data type of functions. From the above, the two are essentially equivalent and how to make something that feels sort of like class/object in Haskell. In scala we would just use a normal class or a case class with methods.

Whatever you can do manually with a record of functions, you can do with a typeclass which leads to a bit of design paralysis of when to use a typeclass vs a record of functions. We'll discuss this in this post but I do suggest reading Noel Welsh's excellent overview of this topic

Using a typeclass

Scala does not have first class support for typeclasses but we can do them via implicits. There is quite a lot of cruft and ceremony around this compared to other languages. We'll work through the bare bones syntax first and then look at some sugar to clean it up:

def stuff[T](t: T)(implicit N: Numeric[T]): T = {
  import N._
  times(negate(abs(t)), t)
}

There is a lot going on here:

This works but it's pretty crufty isn't it? The implicit part is longer than the rest of the function definition! Importing something into the middle of a function feels weird. Cleaning this up with context bound:

def signOfTheTimes[T: Numeric](t: T): T = {
    val ev = implicitly[Numeric[T]]
    ev.times(ev.negate(ev.abs(t)), t)
}

This reads a bit better

We can clean this up by introducing a summoning implicit on the companion object to the typeclass:

object Numeric {
  def apply[T](implicit numeric: Numeric[T]): Numeric[T] = numeric
}

This looks hairy but it's honestly just annoying boilerplate your memorize (or get a library to derive for you).

With the summoning implicit in place it looks like:

```scala
def signOfTheTimes[T: Numeric](t: T): T = {
  val N = Numeric[T]
  import N._
  times(negate(abs(t)), t)
}

Which is on it's way to getting better but it still kind of sucks. We sort of have inside out methods. To improve this further it's commin to have an ops object on the typeclass companion objects. If you poke around FP libraries you'll find this pattern is very common. So let's introduce this for Numeric:

object Numeric {
  def apply[T](implicit numeric: Numeric[T]): Numeric[T] = numeric

  object ops {
    implicit class NumericOps[T](t: T)(implicit N: Numeric[T]): {
      def +(o: T): T = N.plus(t, o)
      def *(o: T): T = N.times(t, o)
      ...
      def abs: T = N.abs(t)
      ..
    }
  }
}

With all this in place, and the write imports at the top of our file our function now looks like this:

import Numeric.ops._
def signOfTheTimes[T: Numeric](t: T): T = -(t.abs) * t

This is great! As a user of the Numeric typeclass this reads well and we get all of our functionality without weird scala syntax. The downside is whoever defines the typeclass has to go through all the ceremony of putting together the companion object with the ops and the summoning implicit. The boilerplate can add up. Thank your maintainers of your FP libraries for doing this sort of stuff for you.

A larger example with implementing a typeclass for a type

We will look at a solution to the Fizzbuzz problem using typeclasses. We will do all the work so that the following code snippet works:

import Monoid._  // import our typeclass and it's implementations 

val fizz: Int => Option[String] = x => if (x % 3 == 0) Some("fizz") else None
val buzz: Int => Option[String] = x => if (x % 5 == 0) Some("buzz") else None
val bazz: Int => Option[String] = x => if (x % 7 == 0) Some("bazz") else None

val funcs = List(fizz, buzz, bazz)

val fizzbuzzbazz = foldRight(funcs)
  
val fbbOrInt: Int => String = { i =>
  (fizzbuzzbazz(i) getOrElse i.toString) + ","
}

// map our function on a list
val strings: List[String] = (1 until 100).toList map fbbOrInt
println(foldRight(strings))
}

This works! If it looks weird that is ok. It blew my mind the first time I used scala to do this. We're going to peel back the curtain and implement the typeclasses to make this work.

First, we'll define a typeclass for combining two values of the same type. This is called a Monoid. You can read more about them in scala with cats. These show up all over the place which is why it's a common typeclass example. So let's define it as a typeclass. Remember a typeclass in Scala is a trait that:

trait Monoid[A] {
    // it has an identity 
    def empty: A 
    // it has an associative combine operation
    def combine(x: A, y: A): A
}

That's it! Now let's look at some implementations of Monoid for some basic types:

implicit val stringMonoid: Monoid[String] = new Monoid[String] {
    val empty = ""
    val combine(x: String, y: String): String = x + y 
}

implicit val intMonoid: Monoid[Int] = new Monoid[Int] {
    val empty = 0
    def combine(x: Int, y: Int): Int = x + y
}

implicit def listMonoid[A] = new Monoid[List[A]] {
    def empty = List() // Nil in practice, being clear here 
    def combine(x: List[A], y: List[A]): List[A] = x ++ y
}

It turns out you've been using monoids all along and now you have a name for them. Remember: this is a has a relationship, not an is a relationship. Int has a Monoid. Int is not a subtype of Monoid. It should be clear that we could define Monoid for library types that we don't own. This is very powerful!

Typeclass composition

Here is where typeclasses start to get interesting and deviate strongly from interfaces. We've defined the above monoids, but what if I wanted a Monoid for Option? I would have to implement stringOptionMonoid and intOptionMonoid and listOptionMonoid and so on. This sucks. It turns out you can get the compiler to do the work for you:

implicit def optionMonoid[A](implicit am: Monoid[A]): Monoid[Option[A]] = {
  val o = new Monoid[Option[A]] {
    val empty = None
    def combine(x: Option[A], y: Option[A]): Option[A] = (x, y) match {
      case (None, None)       => None
      case (Some(_), None)    => x
      case (None, Some(_))    => y
      case (Some(x), Some(y)) => Some(am.combine(x,y)) // use the A monoid to 
                                                       // combine two A's
    }
  }
}

This is wild! We don't need to define Monoid[Option[Int]]! As long as the thing inside of the Option has a Monoid, we can make a Monoid for any Option[A]. This is a huge boilerplate improvement.

Typeclass comprehensibility

FP programmers will rant about lawless typeclasses when you google around online. Typeclass laws are a statement about properties that must hold for a typeclass to be valid for the types that implement it. It turns out a Monoid has laws:

The laws help us implement systems with many type classes since it helps us know what to expect. This is important because the compiler will be making functions for you that you never see. In the above optionMonoid case, I may never see the option[Int] monoid code myself. If I messed up the implementation and some other function was relying on the monoid to behave correctly, it might explode with a bug. This bug will be difficult to track down since I don't really observe the function myself. Similarly, this is why people only want one typeclass implementation for a given type. It can get very confusing with subtle bugs if multiple implementations existed and the behavior changed drastically depending on what an import at the top of a file was.

Noel Welsh's article goes into this and the tldr is:

Back to our fizzbuzz example

More typeclass composition! Just like we can have a Monoid for Options we can define a monoid for functions. This sounds a little weird so let's break it down. If I have a function A => B and B has a Monoid, then I have a monoid for any function A => B. In code:

implicit def functionMonoid[A, B](implicit bm: Monoid[B]): Monoid[A => B] =
  new Monoid[A => B] {

    def empty: A => B = _ => bm.empty

    def combine(x: A => B, y: A => B): A => B = { a =>
      bm.combine(x(a), y(a))
    }
  }

Stated another way: given a monoid for B I can give you a monoid for functions returning B.

There is a natural association with folding and lists of monoids as monoids define everything you need in a fold: the initial value and a way to combine values. We can write a generic version of fold that works on monoids as follows:

implicit def foldRight[A](la: List[A])(implicit am: Monoid[A]): A = {
  la.foldRight(am.empty)(am.combine)
}

Remember that foldRight is just constructor replacement. If you peek under the hood in Scala that's not what it will look like for reasons of stack safety, performance, etc. But it's the right mental model to have in your head:

List(1,2,3,4) == 1 :: 2 :: 3 :: 4 :: Nil // this is valid scala

foldRight is just constructor replacement (::) with some f and replacing Nil with the initial value:

List(1,2,3,4).foldRight(empty((a, acc)) => a `f` acc)

So `1 :: 2 :: 3 : 4 :: Nil becomes

1 `f` 2 `f` 3 `f` 4 `f` empty // not showing the brackets 

e.g. List(1,2,3,4).foldRight(0)(_ + _) is:

1 + (2 + (3 + (4 + 0))) 
// remember the constructor replacement
1 :: 2 :: 3 :: 4 :: Nil 

We now have all the pieces for our fizzbuzz example to work! The weirdest line is folding a list of functions. It's worth sitting down with a pencil and paper to work out what the constructor replacement looks like in this example as it will help you understand all the pieces better: what is the fn being applied through the foldMethod, the function A => B monoid, all the way to the Option[String] monoid. The fizzbuzz example is as follows:

val fizz: Int => Option[String] = x => if(x % 3 == 0) Some("fizz") else None
val buzz: Int => Option[String] = x => if(x % 5 == 0) Some("buzz") else None
val bazz: Int => Option[String] = x => if(x % 7 == 0) Some("bazz") elze none 

val funcs: List[Int => Option[String]] = List(fizz, buzz, bazz)

// we can combine our functions. 
// this works because we can find an option monoid for strings 
// since we have an option monoid for strings, then we can find a monoid 
// for Int => Option[String] 

val fizzbuzzbazz: Int => Option[String] = fold(funcs)

// handle the None such
val fbbOrInt: Int => String = { i => 
  (fizzbuzzbazz(i).getOrElse(i.toString) + ",")
}

// map our function on a list 
val strings: List[String] = (1 until 100).toList map fbbOrInt 
println(fold(strings)) // concatenate into a single string 

Note that the foldRight method we developed here shows up in cats as the combineAll function. The cats library also gives you the implementation for Monoids for all the standard data types.

Typeclasses and Principle of Least Power

If we take a step back and look at type classes through the lens of Part 1 - Data / Types / Referential Transparency / Value Prop we can use typeclasses to both describe the relationships between stuff and restrain the context. This opens up a much more useful world to be used with generics. We can start thinking about generic types with capabilities. For example:

trait HttpClient {
  def get[A](url: Url)(implicit reads: Reads[A]): Task[A]
  // or
  def get[A: Reads](url: Url): Task[A]
}

We can create a http client that does parsing into some A case class. We don't care about what exactly A is, it's generic. But we know something special about A we can use in the implementation. We know that A implements the typeclass Reads.

If we think about Part 2 - programming with effects we saw examples for Functor, Monad, Applicative. These are just type classes on an effect! We could define implementations for our own types.

Say I'm making a static site generator and I have some concept of a post that has a body. It might look something like:

case class post(date: Date, tags: List[Tags], status: Status, body: String)

But what is the string body? It might first be string. Some later time we may parse it as Markdown and then generate HTML from it. This sounds like an excellent use case for generics:

case class Post[A](date: Date, tags: List[Tags], status: Status, body: A)

object Post {
  def rawPost(date: Date, 
    tags: List[Tags],
    status: Status, 
    body: String): Post[String] = 
      Post(date, tags, status, body)
}

And because we like our users and want to give them flexibility down the road: any type that has a single generic parameter can easily be made into a functor! We just need to implement the Functor typeclass on our Post type

object Post {

  def rawPost(...): Post[String] = ...
  
  implicit val postFunctor = new Functor[Post] {
    override def map[A, B](fa: Post[A])(f: A => B): Post[B] = 
      fa(body = f(fa.body))
  } 
}

Now the user can map the body into whatever they want. From String -> Markdown or Markdown -> Html or String -> Option[Html].

Finally, we can use typeclasses to constraint effects to the principle of least power. The less we know about something, the less we can do with something. If we are in some generic context / effect F[_] we know way less about F than what we would know about List or Task. We can't take the head of it. We can't recoverWith or timeoutTo like we could with Task. The only thing our program knows about F is what we give it via capabilities. The way we do this is with typeclasses. Perhaps I want to give the ability to chain computations together with flatMap but I don't want them to be able to raise or recover from errors:


class thing[F[_] : Monad](...) {
  def doStuff(...): F[Stuff] = for {
    x <- ..
    y <- ..
    z <- ..
    } yield Stuff(x,y,z)
}

We don't know anything about F other than it has a monad. This means we have map (since ever Monad is a functor) and pure and flatMap. Because we have flatMap we can use a for comprehension. But we have no way of knowing if this F has an error channel like Future or Task does. For all we know the caller of Thing is constructing this with List. We've given F just enough power to do it's job.

We could constraint F further. We could give it more capabilities. We do this by introducing more typeclasses. For instance we could give F the ability to raise or recover from errors by constraining it to the MonadError typeclass.

type MonadThrow[F[_]] = MonadError[F, Throwable]

class Thing[F[_] : MonadThrow](...) {
  def doStuff(....): F[Stuff] = {
    val res =  for {
      x <- ..
      y <- ..
      z <- ..
      } yield Stuff(x,y,z)
    res.recoverWith { 
      case NonFatal(e) => ...
    }
}

We've gained the ability to recover from errors!

This is a huge tool in our toolbox because we can restrict the capabilities of a class/function. Maybe you only want to do error handling in one place so your only constraint is Monad for the majority of your program except the once place that handles recovering from errors (perhaps for logging purposes, or to return json). You can make sure people don't do arbitrary error masking by restricting what they can do! There is an entire world to key into here and this is the very tip of the iceberg: for instance you can differentiate between Tasks and CancellableTasks by giving F different capabilities. Maybe it makes no sense for Task[A] to be cancelled, so at a code level we make it impossible to cancel Task A. We can be assured that no one is going to cancel task[A] on us unless they also change the capabilities of the class (which would likely be subject to code review). Maybe Task[B] is always cancellable, so we give it that capability. In our program we are just in some F and our main Method is suppling Task, or IO, or Zio to make this possible. We will explore this topic further in part 5.

Conclusion

There is some hairy syntax to learn but typeclasses are a powerful design tool. Just using them doesn't guarantee success: they can be abused, the syntax can be confusing, it can be hard to tell what typeclass derivation is doing if there is a bug. But they can make your code wonderfully ergonomic! You can add behavior to classes you don't own. They are worth learning about.

Personally, I key into a world of typeclasses (from cats, cats-effect) far more than I end up designing my own typeclasses. Hopefully this has demystified some of what has been going on. The following resources on typeclasses can be handy: