Intro to FP series:
- Part 1 - Data / Types / Referential Transparency / Value prop
- Part 2 - Programming with effects
- Part 3 - Type classes (
this post
) - Part 4 - Practical effect manipulation with traverse andfriends
- based on a workshop repository you can work through right now:
- the workshop questions
- the workshop solutions
- Part 5 - Basics of Final-tagless / ZIO
The posts are originally based on a series of 5 presentations I ran at my company in an attempt to get in front of some anti-Scala sentiment that pervaded much of the org at the time. The org suffered from the typical problem of hiring Java devs and then telling them to write to Scala. It assumes people are familiar with Scala syntax for the basics.
tldr;
Typeclasses have an unfortunate name and have nothing to do with classes. Your language doesn't event need to have classes in order to have typeclasses (see Rust and Haskell). Scala has typeclasses via implicits. The syntax is a little crufty. Why typeclasses?
- They allow use to extend libraries with new functionality (one solution to the
expression problem)
- e.g. I can add functionality to a data type that I don't own without having to change that data type
- They allow for ad-hoc polymorphism without any sort of hierarchy/inheritance between the types that implement the typeclass
- Typeclasses themselves can have an OO hierarchy
It's better to think of typeclasses as type capabilities or type behaviors as the class part is misleading.
Typeclasses
A typeclass in Scala is a trait that:
- holds no state (duh, it's a trait)
- has a type parameter (can be a higher-kinded type parameter too)
- has at least one abstract method
- may contain generalized methods
- may extend other typeclasses
There should only be one implementation of a typeclass for a given type. This is known as typeclass coherence. Scala will let you define more than one, but if they are both in scope it will be a compile error. Having more than one is generally considered bad typeclass design. After a few examples we will discuss why we want to try to adhere to this particular design decision as much as possible.
A typeclass has a typeclass definition and one or more implementations for different types:
trait Show[T] { def debugInfo(v: T): String }
This is a simple typeclass for giving a type the capability of showing some sort of prettified debugInfo, maybe for logging purposes when things go awry.
Scala does not have first class support for typeclasses, but if you look at languages that do have first class support for typeclasses you will see that the definition of a typeclass is quite similar:
// rust pub trait Show { fn debugInfo(&self) -> String; } impl Show for string { fn debugInfo(&self) -> String { self } }
-- haskell class Show a where debugInfo :: a -> String instance Show String where debugInfo s = s
There is always this separation of definition and implementation.
You see typeclasses everywhere in Scala. For example, in the standard library from
scala.math.Numeric
defines the following two typeclasses:
trait Ordering[T] { def compare(x: T, y: T): Int def lt(x: T, y: T): Boolean = compare(x, y) < 0 def gt(x: T, y: T): Boolean = compare(x, y) > 0 } trait Numeric[T] extends Ordering[T] { def plus(x: T, y: T): T def times(x: T, y: T): T def negate(x: T): T def zero: T def abs(x: T): T = if (lt(x, zero)) negate(x) else x }
If you use Play JSON Read / Write / Format or Circe Decoder/Encoders surprise! these are typeclasses and you are implementing a typeclass for your type. In scala it's common for serialization/deserialization libraries to use typeclasses.
Comparison to interfaces
People will say that they are like interfaces. There is some overlap but they are different. I don't think the details matter that much but if you are interested, you can check out the following:
- typeclasses are nothing like interfaces
- how do typeclasses differ from interfaces
- typeclasses in translation
The main differences are separation of implementation (we'll get to this), return type polymorphism (not important for this post), and typeclass composition
Records of Functions
If you google around you will eventually hit a bunch of people talking about
typeclasses vs records of functions. Record of functions is a haskell term and
whenever you see it think class + methods
:
class UserService(db: DBConnection) { def getUser(id: UserId): Task[User] def getUserPermissions(userId: UserId): Task[Permissions] def updateUser(id: UserId, name: String): Task[Unit] } // and as a record of functions, in haskell style, // you would never write it this way in scala case class UserService( getUser: DBConnection => UserId => Task[User], getUserPermissions: DbConnection => UserId => Task[Permissions], updateUser: DbConnection => UserId => String => Task[Unit] )
Since functions are just types, a record of functions is a data type of functions. From the above, the two are essentially equivalent and how to make something that feels sort of like class/object in Haskell. In scala we would just use a normal class or a case class with methods.
Whatever you can do manually with a record of functions, you can do with a typeclass which leads to a bit of design paralysis of when to use a typeclass vs a record of functions. We'll discuss this in this post but I do suggest reading Noel Welsh's excellent overview of this topic
Using a typeclass
Scala does not have first class support for typeclasses but we can do them via implicits. There is quite a lot of cruft and ceremony around this compared to other languages. We'll work through the bare bones syntax first and then look at some sugar to clean it up:
def stuff[T](t: T)(implicit N: Numeric[T]): T = { import N._ times(negate(abs(t)), t) }
There is a lot going on here:
- we take in some type
T
- we don't care what it is, as long as
T
implements theNumeric
typeclass- that's what
implicit N: Numeric[T]
means - in order to use the functions from
Numeric
we have toimport
those functions in order to use them
- that's what
- we don't care what it is, as long as
This works but it's pretty crufty isn't it? The implicit part is longer than the rest of the function definition! Importing something into the middle of a function feels weird. Cleaning this up with context bound:
def signOfTheTimes[T: Numeric](t: T): T = { val ev = implicitly[Numeric[T]] ev.times(ev.negate(ev.abs(t)), t) }
This reads a bit better
- the signature now reads: give me any
T
that has a numeric- important distinction here! This is not subclassing.
- I'm not asking for any
T
that is a subclass of Numeric- that would be a view bound for f-bounded polymorphism
- If I wanted any T that was a subtype of another type that would be
T <: Numeric
- I'm not asking for any
- important distinction here! This is not subclassing.
- gone is the weird import
- but gained is this weird
implicitly
syntax in order to get Numerics functions bound to something
- but gained is this weird
We can clean this up by introducing a summoning implicit on the companion object to the typeclass:
object Numeric { def apply[T](implicit numeric: Numeric[T]): Numeric[T] = numeric }
This looks hairy but it's honestly just annoying boilerplate your memorize (or get a library to derive for you).
With the summoning implicit in place it looks like:
```scala def signOfTheTimes[T: Numeric](t: T): T = { val N = Numeric[T] import N._ times(negate(abs(t)), t) }
Which is on it's way to getting better but it still kind of sucks. We sort of
have inside out methods. To improve this further it's commin to have an ops
object on the typeclass companion objects. If you poke around FP libraries
you'll find this pattern is very common. So let's introduce this for Numeric:
object Numeric { def apply[T](implicit numeric: Numeric[T]): Numeric[T] = numeric object ops { implicit class NumericOps[T](t: T)(implicit N: Numeric[T]): { def +(o: T): T = N.plus(t, o) def *(o: T): T = N.times(t, o) ... def abs: T = N.abs(t) .. } } }
With all this in place, and the write imports at the top of our file our function now looks like this:
import Numeric.ops._ def signOfTheTimes[T: Numeric](t: T): T = -(t.abs) * t
This is great! As a user of the Numeric
typeclass this reads well and we get
all of our functionality without weird scala syntax. The downside is whoever
defines the typeclass has to go through all the ceremony of putting together the
companion object with the ops and the summoning implicit. The boilerplate can
add up. Thank your maintainers of your FP libraries for doing this sort of stuff
for you.
A larger example with implementing a typeclass for a type
We will look at a solution to the Fizzbuzz problem using typeclasses. We will do all the work so that the following code snippet works:
import Monoid._ // import our typeclass and it's implementations val fizz: Int => Option[String] = x => if (x % 3 == 0) Some("fizz") else None val buzz: Int => Option[String] = x => if (x % 5 == 0) Some("buzz") else None val bazz: Int => Option[String] = x => if (x % 7 == 0) Some("bazz") else None val funcs = List(fizz, buzz, bazz) val fizzbuzzbazz = foldRight(funcs) val fbbOrInt: Int => String = { i => (fizzbuzzbazz(i) getOrElse i.toString) + "," } // map our function on a list val strings: List[String] = (1 until 100).toList map fbbOrInt println(foldRight(strings)) }
This works! If it looks weird that is ok. It blew my mind the first time I used scala to do this. We're going to peel back the curtain and implement the typeclasses to make this work.
First, we'll define a typeclass for combining two values of the same type. This is called a Monoid. You can read more about them in scala with cats. These show up all over the place which is why it's a common typeclass example. So let's define it as a typeclass. Remember a typeclass in Scala is a trait that:
- holds no state (duh, it's a trait)
- has a type parameter (can be a higher-kinded type parameter too)
- has at least one abstract method
- may contain generalized methods
- may extend other typeclasses
trait Monoid[A] { // it has an identity def empty: A // it has an associative combine operation def combine(x: A, y: A): A }
That's it! Now let's look at some implementations of Monoid
for some basic types:
implicit val stringMonoid: Monoid[String] = new Monoid[String] { val empty = "" val combine(x: String, y: String): String = x + y } implicit val intMonoid: Monoid[Int] = new Monoid[Int] { val empty = 0 def combine(x: Int, y: Int): Int = x + y } implicit def listMonoid[A] = new Monoid[List[A]] { def empty = List() // Nil in practice, being clear here def combine(x: List[A], y: List[A]): List[A] = x ++ y }
It turns out you've been using monoids all along and now you have a name for
them. Remember: this is a has a
relationship, not an is a
relationship.
Int
has a Monoid. Int is not a subtype of Monoid. It should be clear that
we could define Monoid
for library types that we don't own. This is very
powerful!
Typeclass composition
Here is where typeclasses start to get interesting and deviate strongly from
interfaces. We've defined the above monoids, but what if I wanted a Monoid for
Option
? I would have to implement stringOptionMonoid
and intOptionMonoid
and listOptionMonoid
and so on. This sucks. It turns out you can get the
compiler to do the work for you:
implicit def optionMonoid[A](implicit am: Monoid[A]): Monoid[Option[A]] = { val o = new Monoid[Option[A]] { val empty = None def combine(x: Option[A], y: Option[A]): Option[A] = (x, y) match { case (None, None) => None case (Some(_), None) => x case (None, Some(_)) => y case (Some(x), Some(y)) => Some(am.combine(x,y)) // use the A monoid to // combine two A's } } }
This is wild! We don't need to define Monoid[Option[Int]]
! As long as the
thing inside of the Option has a Monoid, we can make a Monoid for any
Option[A]
. This is a huge boilerplate improvement.
Typeclass comprehensibility
FP programmers will rant about lawless typeclasses
when you google around
online. Typeclass laws are a statement about properties that must hold for a
typeclass to be valid for the types that implement it. It turns out a Monoid has
laws:
- the combine operation must be associative:
combine(x, combine(y,z)) == combine(combine(x,y), z)
- the identity must be communutative:
combine(a, identity) == a == combine(identity, a)
The laws help us implement systems with many type classes since it helps us
know what to expect. This is important because the compiler will be making
functions for you that you never see. In the above optionMonoid
case, I may
never see the option[Int]
monoid code myself. If I messed up the
implementation and some other function was relying on the monoid to behave
correctly, it might explode with a bug. This bug will be difficult to track down
since I don't really observe the function myself. Similarly, this is why people
only want one typeclass implementation for a given type. It can get very
confusing with subtle bugs if multiple implementations existed and the behavior changed
drastically depending on what an import at the top of a file was.
Noel Welsh's article goes into this and the tldr is:
- if automatic composition will be useful then consider a typeclass
- if semantics are not well defined then don't use a typeclass
- this is where people talk about typeclass laws
- Monoid has laws
- if multiple instances are likely a typeclass is probably the wrong choice
- if it will make your code substantially easier to use a typeclass, it might be the right choice
- API ergonomics
Back to our fizzbuzz example
More typeclass composition! Just like we can have a Monoid for Options we can
define a monoid for functions. This sounds a little weird so let's break it
down. If I have a function A => B
and B
has a Monoid, then I have a monoid
for any function A => B
. In code:
implicit def functionMonoid[A, B](implicit bm: Monoid[B]): Monoid[A => B] = new Monoid[A => B] { def empty: A => B = _ => bm.empty def combine(x: A => B, y: A => B): A => B = { a => bm.combine(x(a), y(a)) } }
Stated another way: given a monoid for B I can give you a monoid for functions returning B.
There is a natural association with folding and lists of monoids as monoids define everything you need in a fold: the initial value and a way to combine values. We can write a generic version of fold that works on monoids as follows:
implicit def foldRight[A](la: List[A])(implicit am: Monoid[A]): A = { la.foldRight(am.empty)(am.combine) }
Remember that foldRight is just constructor replacement. If you peek under the hood in Scala that's not what it will look like for reasons of stack safety, performance, etc. But it's the right mental model to have in your head:
List(1,2,3,4) == 1 :: 2 :: 3 :: 4 :: Nil // this is valid scala
foldRight
is just constructor replacement (::
) with some f
and replacing
Nil
with the initial value:
List(1,2,3,4).foldRight(empty((a, acc)) => a `f` acc)
So `1 :: 2 :: 3 : 4 :: Nil becomes
1 `f` 2 `f` 3 `f` 4 `f` empty // not showing the brackets
e.g. List(1,2,3,4).foldRight(0)(_ + _)
is:
1 + (2 + (3 + (4 + 0))) // remember the constructor replacement 1 :: 2 :: 3 :: 4 :: Nil
We now have all the pieces for our fizzbuzz example to work! The weirdest line
is folding a list of functions. It's worth sitting down with a pencil and paper
to work out what the constructor replacement looks like in this example as it
will help you understand all the pieces better: what is the fn being applied
through the foldMethod, the function A => B
monoid, all the way to the
Option[String]
monoid. The fizzbuzz example is as follows:
val fizz: Int => Option[String] = x => if(x % 3 == 0) Some("fizz") else None val buzz: Int => Option[String] = x => if(x % 5 == 0) Some("buzz") else None val bazz: Int => Option[String] = x => if(x % 7 == 0) Some("bazz") elze none val funcs: List[Int => Option[String]] = List(fizz, buzz, bazz) // we can combine our functions. // this works because we can find an option monoid for strings // since we have an option monoid for strings, then we can find a monoid // for Int => Option[String] val fizzbuzzbazz: Int => Option[String] = fold(funcs) // handle the None such val fbbOrInt: Int => String = { i => (fizzbuzzbazz(i).getOrElse(i.toString) + ",") } // map our function on a list val strings: List[String] = (1 until 100).toList map fbbOrInt println(fold(strings)) // concatenate into a single string
Note that the foldRight
method we developed here shows up in cats as the
combineAll
function. The cats library also gives you the implementation for
Monoids for all the standard data types.
Typeclasses and Principle of Least Power
If we take a step back and look at type classes through the lens of Part 1 - Data / Types / Referential Transparency / Value Prop we can use typeclasses to both describe the relationships between stuff and restrain the context. This opens up a much more useful world to be used with generics. We can start thinking about generic types with capabilities. For example:
trait HttpClient { def get[A](url: Url)(implicit reads: Reads[A]): Task[A] // or def get[A: Reads](url: Url): Task[A] }
We can create a http client that does parsing into some A
case class. We
don't care about what exactly A
is, it's generic. But we know something
special about A
we can use in the implementation. We know that A implements the
typeclass Reads
.
If we think about Part 2 - programming with effects we saw examples for Functor, Monad, Applicative. These are just type classes on an effect! We could define implementations for our own types.
Say I'm making a static site generator and I have some concept of a post that has a body. It might look something like:
case class post(date: Date, tags: List[Tags], status: Status, body: String)
But what is the string body? It might first be string. Some later time we may parse it as Markdown and then generate HTML from it. This sounds like an excellent use case for generics:
case class Post[A](date: Date, tags: List[Tags], status: Status, body: A) object Post { def rawPost(date: Date, tags: List[Tags], status: Status, body: String): Post[String] = Post(date, tags, status, body) }
And because we like our users and want to give them flexibility down the road: any type that has a single generic parameter can easily be made into a functor! We just need to implement the Functor typeclass on our Post type
object Post { def rawPost(...): Post[String] = ... implicit val postFunctor = new Functor[Post] { override def map[A, B](fa: Post[A])(f: A => B): Post[B] = fa(body = f(fa.body)) } }
Now the user can map the body into whatever they want. From String -> Markdown
or Markdown -> Html
or String -> Option[Html]
.
Finally, we can use typeclasses to constraint effects to the principle of least
power. The less we know about something, the less we can do with something. If
we are in some generic context / effect F[_]
we know way less about F
than
what we would know about List
or Task
. We can't take the head of it. We
can't recoverWith
or timeoutTo
like we could with Task
. The only thing
our program knows about F is what we give it via capabilities. The way we do
this is with typeclasses. Perhaps I want to give the ability to chain
computations together with flatMap but I don't want them to be able to raise or
recover from errors:
class thing[F[_] : Monad](...) { def doStuff(...): F[Stuff] = for { x <- .. y <- .. z <- .. } yield Stuff(x,y,z) }
We don't know anything about F other than it has a monad. This means we have
map
(since ever Monad is a functor) and pure
and flatMap
. Because we have
flatMap
we can use a for comprehension. But we have no way of knowing if this
F has an error channel like Future
or Task
does. For all we know the caller
of Thing is constructing this with List
. We've given F
just enough power to
do it's job.
We could constraint F further. We could give it more capabilities. We do this by introducing more typeclasses. For instance we could give F the ability to raise or recover from errors by constraining it to the MonadError typeclass.
type MonadThrow[F[_]] = MonadError[F, Throwable] class Thing[F[_] : MonadThrow](...) { def doStuff(....): F[Stuff] = { val res = for { x <- .. y <- .. z <- .. } yield Stuff(x,y,z) res.recoverWith { case NonFatal(e) => ... } }
We've gained the ability to recover from errors!
This is a huge tool in our
toolbox because we can restrict the capabilities of a class/function. Maybe
you only want to do error handling in one place so your only constraint is
Monad
for the majority of your program except the once place that handles
recovering from errors (perhaps for logging purposes, or to return json). You
can make sure people don't do arbitrary error masking by restricting what they
can do! There is an entire world to key into here and this is the very tip of
the iceberg: for instance you can differentiate between Tasks
and
CancellableTasks
by giving F different capabilities. Maybe it makes no sense
for Task[A] to be cancelled, so at a code level we make it impossible to cancel
Task A. We can be assured that no one is going to cancel task[A] on us unless
they also change the capabilities of the class (which would likely be subject to
code review). Maybe Task[B] is always cancellable, so we give it that
capability. In our program we are just in some F
and our main Method is
suppling Task, or IO, or Zio to make this possible. We will explore this topic
further in part 5.
Conclusion
There is some hairy syntax to learn but typeclasses are a powerful design tool. Just using them doesn't guarantee success: they can be abused, the syntax can be confusing, it can be hard to tell what typeclass derivation is doing if there is a bug. But they can make your code wonderfully ergonomic! You can add behavior to classes you don't own. They are worth learning about.
Personally, I key into a world of typeclasses (from cats, cats-effect) far more than I end up designing my own typeclasses. Hopefully this has demystified some of what has been going on. The following resources on typeclasses can be handy:
- Essential Scala by Welsh and Gurnell (free)
- Scala With Cats by Welsh and Gurnell (free)
- FP for mortalz w/ scalaz has a great chapter on typeclasses and libraries for doing automatic typeclass derivation (shapeless, magnolia, kittens, others).