WARNING: I play a little fast and loose with the terminology here. This talk is aimed at beginners and they don't need all the details on how the sausage is made. Some of my descriptions are hand-wavy so it's enough to get going and productive.
- Part 1 - Data / Types / Referential Transparency / Value prop
- Part 2 - Programming with effects (
this post
) - Part 3 - Typeclasses
- Part 4 - Practical effect manipulation with traverse and friends
- based on a workshop repository you can work through
- the workshop questions
- the workshop solutions
- Part 5 - Basics of Final-tagless / ZIO
They are originally based on a series of 5 presentations I ran at my company in an attempt to get in front of some anti-Scala sentiment that pervaded much of the org at the time. The org suffered from the typical problem of hiring Java devs and then telling them to write to Scala. It assumes someone at least has a few months of Scala programming experience.
Recap from Part 1
We want to live in a world where we structure programs like:
------------------------------
| |
| ---------------- |
| | | |
| | PURE | |
| | FUNCTIONS | |
| |________________| |
| |
| Side-effecting functions |
|____________________________|
outside world / program
boundary
where side-effecting functions are anything that is not a pure function - talking to the outside world, updating DBs, etc
Remember: this is a good idea in any programming paradigm but we strive to make it more explicit in FP.
Remember that a Pure Function is:
- deterministic (the same inputs return the same outputs)
- total (not partial)
- no mutation (other than local mutation that does not escape fn)
- no exceptions
- no nulls
- no reflection
- no side-effect
The benefit of this is what is called Referential Transparency which allows us to replace a variable with it's bound expression. We want to do this because it gives us local reasoning which means less surface area to grok. Code is easier to understand and read. This means that code is easier to review. Code is easier to onboard people into because they don't have to go stepping into code paths to figure out what is going on. Instead, everything is in front of you.
All the machinery of FP comes from wanting to maintain referential transparency as much as possible because gain a ton of ability to reason about our programs (even concurrent programs), which is a huge win). Some of this shows up in our type signatures (because of our rich, expressive type system). These busy type signatures are not noise but important indicators of behavior.
Pure functions are good and all, but we know there are impure things we have to deal with:
- Partiality?
- Exceptions?
- Nondeterminism?
- Dependency injection?
- Mutable state?
- Side-effects (a program of pure functions does nothing, we need to talk to the outside world)
Effects
All of these things are an effect (or a context or a box or whatever makes sense in your head). They compute some sort of answer with some extra stuff associated with them. The extra stuff is what we call an effect.
They all have the same Shape F[A]
type F[A] = Option[A] type F[A] = Either[E, A] // for any fixed E type F[A] = List[A] type F[A] = Reader[E, A] // for any type E type F[A] = State[S, A] // for any type S // intuition: this extends to other "effects" type F[A] = Future[A] type F[A] = Task[A]
What is an effect? It's whatever distinguishes F[A]
from A
. Rob Norris has
an excellent talk on this topic
that is worth watching. I have another post where we look in detail at a
generic function that operates on any effect to see how F[A]
differs from A
here
Programming with Effects
So say we have bought into effects. Let's run with referential transparency being a good idea, even if you are not sold. We want to maintain this property and we want to always be signalling intent with our types as much as possible (to help with readability, to act as a weak form of documentation, etc).
Once we start doing this we end up with lots of things inside of an effect. So now that everything is in an effect / context / box:
- how do I operate on things inside of a context / effect?
- how do I put things inside of a context / effect?
- What if I'm nested inside of a context / effect
- what if both contexts / effects are the same?
- what if both contexts / effects are different?
- What if the thing inside my context has known properties?
- what if I want to go abstract (go generic) over my context?
This post is all about the answers to these. The answers to these questions are nuts and bolts manipulation of effects and have huge implications on the behavior of your program and how to reason about them.
I like to use the analogy of linear algebra when I give this presentation. When
you start out in undergrad linear algebra class it's all about matrix
manipulation by hand. Row reducing something. If you are very unlucky, computing
a hessian. But after a few weeks suddenly you are in this abstract world
of some m x n matrix A that is square and invertible. You don't manipulate
numbers directly anymore, you are manipulating symbols and abilities. Some of
functional programming begins to feel like this once you start picking the
effects you want and smooshing them together. What happens in F[A]
that is
different from A
is due to the behavior of F. It seems daunting at first but
there are a few behaviors you learn initially that dominate your work and you
get familiar with them extremely quickly.
How do I operate on things stuck inside of a context
If I have an F[A]
and I really want a B
, the operation to turn an F[A]
into an F[B]
is called an operation we call map
. If you are coming from
Java or Python, map is not just a replacement for a loop w/o explicit indexing:
- we are applying something through/around some context we don't care about
- we are applying something through/around some effect we don't care about
- we are applying something through/around some structure we don't care about
- and we are preserving that structure
It has a name we call Functor, which you can find out more about in Chapter 3 of Scala with Cats which has a simple definition:
trait Functor[F[_]] { def map[A, B](x: F[A](f: A => B): F[B] }
You are probably very familiar with Functors even if you never knew the name
before. Option, Either, Future, Task, Try, etc., all have a map
method and
have an instance of Functor
for their type.
They all have the same shape:
map :: (a -> b) -> Maybe a -> Maybe b map :: (a -> b) -> List a -> List b map :: (a -> b) -> Either e a -> Either e b map :: (a -> b) -> F a -> F b
The first line reads: given a function of a -> b
and a Maybe a
then map
will give you back a Maybe b
A functor is sort of like a magic box (context, effect, whatever) that you can only inspect by means of functions
- you can never unwrap a functor (you can't get A out of
F[A]
just usingFunctor
) - you can't smoosh two functors together
- if you have
F[F[A]]
functor isn't going to get you anF[A]
or even anA
- if you have
F[A]
+F[A]
, functor isn't going to work either
- if you have
- but if you map some function
g
over your functor:- you turn it into a new function that operates on whatever is contained in the functor
- this is called lifting since we lift the function g to operate at a higher level of abstraction
List(1,2,3,4).map(x => x + 1)
Our anonymous function here is a function of Int -> Int
and we've lifted it.
Now the whole thing is a function of List[Int] -> List[Int]
.
A more full definition of Functor should make this clear when we derive the
Lift
method:
trait Functor[F[_]] { // this we have to define def map[A, B](fa: F[A])(f: A => B): F[B] // this we can derive from the first, this sort of derivation // is common // we can lift any function from A -> B to one // that is `F[A] -> F[B]` def lift[A, B](f: A => B): F[A] => F[B] = fa => map(fa)(f) }
- Functor is also a typeclass (a later talk)
- We can constraint a program to require Functor, which we will talk about at the end of this talk
tldr: use Functor
's map
method when you have some effect F[A]
and you want
to change the inside w/o changing the structure.
Functors Compose
It turns out two different functors compose! If you find yourself inside
List[Either[String, Future[A]]]
to do anything to change that A into a B
you need something like
.map(_.map(_.map(futureA => { stuff })
Which sucks. Luckily, since Functors compose:
import cats.Functor import cats.implicits._ val listOption = List(Some(1), None, Some(2)) // listOption: List[Option[Int]] = List(Some(1), None, Some(2)) // Through Functor#compose Functor[List].compose[Option].map(listOption)(_ + 1) // res1: List[Option[Int]] = List(Some(2), None, Some(3))
We have a new property! If effects F and G have Functors, then F[G[_]]
or
G[F[_]]
is also a functor. These properties are important and tend to be
described as laws
when people talk about type classes. You rarely compose
Functors in practice but these laws are useful properties you eventually
internalize so you can work abstractly while still relying on referential
transparency to ensure everything will work out without any funny behavior. If
you do need to compose Functors, look into the nested type in
cats
What I'm nested inside of a context?
And that context is the same
This is the dreaded monad and I don't want to write a monad tutorial so I'll keep it short.
- we are mapping some function in some context / effect / whatever
- but the function we are mapping also returns the same context
For this to all work, we need some way to take a normal value and put it in a context:
- this is called
pure
orunit
orpoint
depending on the language/library - this is just
A -> M[A]
- this is just
Int -> Option[Int]
orString -> Future[String]
- we also need a
flatten
operation to goF[F[A]] -> F[A]
- this is called
join
orflatten
- this is just
Option[Option[String]]
->Option[String]
- this is called
- alternatively, instead of flatten, we need a way to sequence the effects
- this is called
bind
or>>=
orflatMap
- this is called
def bind[A, B](x: F[A]):(f: A => F[B]): F[B]
This flattening/joining or binding really means unwrapping myself one level of the context. This gives rise to ordering or sequencing that people talk about when talking about Monads. The bind operator or the flatten call are dependent on the previous value. For example, look at Future in Scala's standard library:
def flatMap[S](f: T => Future[S])(implicit executor: ExecutionContext): Future[S] = transformWith { case Success(s) => f(s) // 1 case Failure(_) => this.asInstanceOf[Future[S]] // 2 }
But we can only get to 1 or 2 during evaluation of our program. In A.flatMap(a => B(a))
, looking at the above, in order to know if A was successful or not it
would have had to have been evaluated to know which pattern branch to take. If
we inline the flatMap
call to it's expression, we get A.transformWith(a => ..)
That transformWith
is being called on has to have been evaluated by the
time you hit this piece of code. This is where that sequencing aspect comes
from: what happens next depends on what happened before. You see this less
obviously in all implementations of bind / flatMap:
implicit val optionMonad: Monad[Option] = new Monad[Option] { def point[A](x: A) = Some(x) def bind[A, B](x: Option[A])(f: A => Option[B]): Option[B] = x match { case Some(y) => f(y) // again there is a sequencing here, // I have unwrapped one layer of Option case None => None }
This sequencing aspect of a Monad meshes with the effect. For many things like
Option, Either, and to some extent Future / Task / IO / ZIO you also get
something back like exceptions. When Future[A]
evaluates, it is either
(eventually) a Success holding a value of A or a Failure holding a throwable.
If you squint at it, it's kind of like a partial function
And if our effect has a monad and behaves this way, we can compose a bunch of partial functions together. FlatMap isn't function composition, but it sort of is composition in a way. We want to chain together our success cases and short-circuit if something has gone wrong. Scott Wlaschin calls this Railroad Oriented Programming
def Validate(request: Request): Future[A]
def Update(..): Future[B]
def Send(..): Future[SuccessfulResponse]
___________________________________________
| A single fn representing the use case |
| |
request -> | Validate -> Update -> Send -> | -> Success
| | | | |
|______|________________|_____________|_____|
| | |
---------------->------------->--------> Failure
We use Monad (flatMap) to sequence together Validate -> Update -> Send
. If any
one of these functions fail, we short-circuit. If they all succeed, we end up
with some SuccessfulResponse (eventually).
This is a great way to program. Once you wrap your head around Monad you can put
the plumbing logic to the background and focus on your business logic. This is
huge for readability. The alternative we are familiar with is endless Try {] Catch {}
(or if Err = Nil
if you are a Go person):
string UpdateCustomerWithErrorHandling() { var request = receiveRequest(); var isUpdated = validatedRequest(request); if (!isValidated) { return "Request is not valid" } canonicalizeEmail(request); try { var result = db.updateDbFromRequest(request); if (!result) { return "Customer record not found" } } catch { return "DB error: customer record not updated" } if (!smptServer.sendEmail(request.email))) { log.Error "customer email not sent" } return "OK"; }
This example is a little contrived but certainly representative of code I see in Java, Python, PHP, etc. There is only 6 lines of business logic but 12 lines of error handling. This sucks.
If we look at railway-oriented programming using monads in FP the story is much nicer:
def parseInt(str: String): Option[Int] = scala.util.Try(str.toInt).toOption def divide(a: Int, b: Int): Option[Int] = if(b == 0) None else Some(a / b) def stringDivideBy(aStr: String, bStr: String): Option[Int] = parseInt(aStr).flatMap { aNum => parseInt(bStr).flatMap { bNum => divide(aNum, bNum) } }
Our types our handling our plumbing. Once we internalize Monad, in the same way as anyone who has done linear algebra internalizes say, what an invertible matrix means, then our plumbing disappears to the background and our logic comes to the front.
There is one problem though. This nesting of the flatMap
sucks. Above, we are
only making two calls and already we have a pyramid of doom
Luckily there is some better syntax Scala stole from Haskell. If we translated
something similar to the above to Haskell using>>=
instead of flatMap
in
the language it looks very similar with the same nesting problem:
validatePersonn name age = validateName name >>= \name' -> validateage age >>= \age' -> return (Person name' age')
Do notation to the rescue, which is syntax sugar that translates to the above but is much easier to read:
validatePerson name age = do name' <- validateName name age' <- validateAge age return (Person name' age')
Scala stole a weakened form of this for for compreshions
. Unfortunately they
included the word for
which forever confuses Java developers. A for
comprehension has nothing to do with a for loop
. Scala does not have for
loops (but does have while loops). Don't think about them like loops.
for { x <- a y <- b z <- c } yield f(x,y,z) // translates to a.flatMap(x => b.flatMap(y => c.map(z => f(x,y,z))))
This lets us write cleaner code that almost looks imperative in nature (do x, then y, then z and if that all worked then apply the function f to all my variables. The Scala compiler does the translation for you.
If your for comprehension explodes with type errors it probably means you've
lined something up wrong. If you translate it by hand to the expanded flatMap
form you can usually spot your mistake. Remember, whatever effect a
has above,
sets the effect for the whole for comprehension. b
and c
must have the same
effect. if a returns Future[A]
then b
and c
must also return
Future[something]
. Remember flatten: you need to Future[Future[B]] -> Future[B]
. Monad doesn't tell us how to Future[Option[B]] -> Option[B]
More than just railway oriented programming
Monads aren't just about short-circuiting failure, although this is probably the first place you encounter them since the Option / Either / Future / Task types are so useful. Many other contexts / effects / boxes have Monads but do not have any concept of failure and thus no concept of short-circuiting. You will eventually discover things like Reader, Writer, Par, and others that have this but it's not important right now. Monads are a huge general idea and it's ok to understand them at a basic level when you start out since this plumbing of sequencing with short-circuited failure is so useful.
What if I'm nested inside two contexts
The story so far:
- I have some function and I want to apply something through some structure
- without affecting the structure
- I have some function I want to apply through some structure
- but that function also returns the same structure
- I need to destroy some structure to flatten my structure
- this destruction causes the dependent sequencing we're seeing
- this sequencing means that in a chain of operations, each operation depends on the one before this
If we look at the types we'll see that map
and flatMap
are very close:
map :: (a -> b) -> f a -> f b .. FUNCTOR flatMap :: (a -> f b) -> f a -> f b .. MONAD (>>=)
in Monad
, we had a chain of effectful computations where each link of the
chain depended on the previous value. But what if I'm in a chain of effectful
computations that have no relationship with each other? What if I have
independent rather than dependent computations?
map :: (a -> b) -> f a -> f b .. FUNCTOR <*> :: f (a -> b) -> f a -> f b .. APPLICATIVE flatMap :: (a -> f b) -> f a -> f b .. MONAD (>>=)
What's going on here? Well, we are applying some function through some context / effect / box / structure except the function is also stuck inside some context
That gives us <*>
or ap
for something we call Applicative
. We often don't
deal with ap
directly but it happens behind the scenes in many of the
libraries we use. E.g. the Validated type from
Cats
This is weird to think about in scala and you deal with this sort of
applicative
style
more often in Haskell due to many more curried functions being present. The
intuition you want in scala for how this works is more along the lines of zip
or product
:
trait Applicative[F[_]] extends Functor[F] { def product[A, B](fa: F[A], fb: F[B]): F[(A, B)] def pure[A](a: A): F[A] }
- Given a
Task[KafkaAuditLog]
and aTask[Response]
that have nothing to do with each other, I can run them both and give you aTask[(KafkaAuditLog, Response)]
- intuition: product, composing multiple independent effects
- intuition: concurrency
So why do we care about this sort of background property that an effect might have?
Concurrent computations
Suppose I want to aggregate data from 2 remote services and serve responses as fast as I can. If I use Monad, I have to hit the first service and (if successful) hit the second service. This is not what I want:
def loadUser: Task[User] def loadData: Task[Data] case class Payload(user: User, data: Data)
// sequentially for { user <- loadUser data <- loadData } yield Payload(user, data)
But applicative to the rescue. We can use mapN
to compute the independent
effects:
(loadUser, loadData).mapN { case(user, data) => Payload(user, data) }
What exactly Applicative does depends on the effect. For our Task
effect, it
will run loadUser and loadData concurrently on two different threads. If List
was the effect, you would get all combinations of the values in each list.
There is some overhead to learn.
Inverting containers
It's not uncommon to end up holding on to a List[Future[A]]
in scala. Perhaps
you had a list of requests and you mapped some HTTP request function over the
list and now you're stuck with this type. Or perhaps you have a
List[Option[Int]]
. A typeclass called Traverse
comes to the rescue:
trait Traverse[F[_]] { def traverse[G[_]: Applicative, A, B](fa: F[A])(f: A => G[B]): G[F[B]] }
The details aren't important other than noticing that there is an Applicative
in there. If an effect has a Traverse
then it has a sequence
method which is
insanely useful as it allows you to turn two containers inside out. We can turn
that List[Future[A]] -> Future[List[A]]
! It's not important how this works
right now, but the intuition is around the product
method earlier.
- much like we have a bunch of effects:
- type F[A] = Option[A]
- type F[A] = Either[E, A] // for any fixed E
- type F[A] = List[A]
- type F[A] = Reader[E, A] // for any type E
- type F[A] = Writer[W, A] // for any type W
- type F[A] = State[S, A] // for any type S
- we may have nested types, e.g. List[Future[A]]
- aka we may have F[G[A]]
- we want to flip these inside out, to get G[F[A]]
- we use
sequence
, which depends onApplicative
for this - this is the secret sauce for dealing with hairy types in Scala
What if the thing inside my context has known properties?
This falls inline with other ways to manipulate things inside of a context
similar to mapping
. The details aren't important, but if you find yourself
constantly pattern matching to unpack contexts just to put them back together
again there are probably simpler things you can be doing. For instance I could
have a monadic program summing a bunch of optional values (or returning None).
Given three values:
val v1: Option[Int] = Some(1) val v2: Option[Int] = Some(10) val v3: Option[Int] = None
val program2: Option[Int] = for { i <- v1 ii <- v2 iii <- v3 } yield i + ii + iii // None
Or, since I know Int
is a monoid I can lean on the fact that Option
has a
monoid if the thing inside the effect is also a monoid. How do I know this? you
pick it up with time:
val program1: Option[Int] = v1 |+| v2 |+| v3 // Some(11)
There is a ton of these little combinators available and you discover them
eventually. Some useful ones worth looking into are combineAll
and foldMap
.
What if I'm nested inside a context
And those contexts aren't the same
And those contexts are monads
Now that we've gotten a taste for effects/contexts, you will run into more cases
with nested contexts like F[G[_]]
or Future[Either[Throwable, Result]]
(which is extremely common: you've made a request to a remote API that might
fail with a parsing error or succeed with a Result). So now we're stuck inside
of Future[Either[..]]
. If we want to do something with that Result
we have
to resort to the following:
def lookupUser(id): Future[Either[Error, String]] = { ... } def lookupUserName(id: Long): Future[Either[Error, String]] = for { maybeUser <- lookupUser(id) } yield { for { user <- MaybeUser } yield user.name }
This is looking ugly and hard to read even for this simple example! We have to
nest for comprehensions! This sucks. We know if we were two Functors
inside
of each other we could do this but it turns out there is no way to generically
combine two monads to flatMap
works.;
BUT! If you know what one of the contexts is, you can get into Monad
Transformers. You might
have seen something like EitherT
floating around in your code base:
outer monad | | right of either | | case class EitherT[F[_], E, A](stack: F[Either[E, A]]) { | left of either
- This is very similar to
Future[Either[E, A]]
- It knows how to jump straight to doing something with
A
def lookupUser(id): Future[Either[Error, String]] = { ... } def lookupUserName(id: Long): EitherT[Future, Error, String]] = for { user <- EitherT(lookupUser(id)) } yield user.name
And now we are back to our easy to read, railway oriented code world!. A monad
transformer knows about the context in between. We can unwrap a monad
transformer using EitherT(..).value
. They are extremely useful and worth
reading up on.
What if I want to abstract over the context?
A what now? Now we are far into the realm of type systems. You can't do this in
a language like F# or Elm for instance. If you've ever jumped into library code
and seeing F[_]
that's what it is. Since our context / effects / box /
whatever are just a type, well we can go generic on them like any other type.
class AirlineService[F[_] : Functor](airlineRepo: AirlineRepository[F]) { def baggagePolicy(airlineName: AirlineName): F[Either[ValidationError, Airline]] = airlineRepo.findAirline(airlineName).map { airline => airline.toRight[ValidationError](AirlineNotFound(airlineName)) }
We are definitely not in OOP land anymore. Scala has something called Higher Kinded Types.
- we are use to types like Int, String, User, etc.
- turns out there is something called Kinds, which types do have
- the intuition is the type of types, or type constructors
- the kind of
List
is* -> *
- the kind of
List[Int]
is*
where* = Int
- the kind of
If you dig into the scala repl and use the kind command k
to get the kind of a type:
scala> :k List
List's kind of F[+A]
scala> k: List[Int]
List[Int]'s kind is A
List[+A]
takes a type parameterA
List
is not a valid type but is a type constructor waiting for an A
That * -> *
should have triggered something. It feels like the function syntax
I've been using all through this series
Kinds are like types for types. There is a weird syntax when talking about
this. In previous versions, the scala compiler used the same * -> *
notation
that Haskell does. You can think of *
as a hole in the type
- kinds describe the number of holes in a type
- the Kind of an ordinary type like
Int
orChar
is*
(zero types) - the kind of a unary type constructor like
Maybe
orList
is* -> *
(one hole)- you provide a type to fill the hole and you get a Maybe A!
- the kind of a binary type constructor like
Either
is* -> * -> *
(two holes)
So we can distinguish regular types (no holes) and type constructors which are Higher Kinded Types.
List // type constructor, takes on parameter, has one hole List[A] // type, produced by applying a type
There is a close analogy with functions and values
math.abs // function, takes one parameter math.abs(a) // value, produced by supplying the parameter a
When we abstract on our effect it lets us get really generic.
case class User(name: String, id: Long, age: Double) object Thing { def doThing[F[_]: Monad](a: F[String], b: F[Long], c: F[Double]): F[User] = for { aa <- a bb <- b cc <- c } yield User(aa, bb, cc) }
The F[_]
means I'm going generic on my effect. We've applied a constraint to
F[_]
stating that whatever it is, it has a Monad. Inside of doThing, the only thing we know
about F
is that is has pure
, map
, and flatMap
(and thus for
comprehensions). We have applied that FP principle of least power to constrain
what is possible. We don't know that F
is a list so I can't take the head of
the list or do some other indexing manipulation. We don't know that F
is
option so I can't fold
on it to get out some value with a default. We just
know it has a Monad.
- I can supply
Future[String]
,Future[Long]
,Future[Double]
and get out aFuture[User]
- I can supply
Option[String]
,Option[Long]
,Option[Double]
and get out anOption[User]
This is not a useful function and you probably would never do something like this specific example; however, most of the libraries you are using under the hood rely on higher-kinded types and these ultra-generic programs to work ergonomically.
What the above is very useful is for illustrating effects. Remember that
effects all have the same Shape F[A]
type F[A] = Option[A] type F[A] = Either[E, A] // for any fixed E type F[A] = List[A] type F[A] = Reader[E, A] // for any type E type F[A] = State[S, A] // for any type S // intuition: this extends to other "effects" type F[A] = Future[A] type F[A] = Task[A]
and the effect is whatever distinguishes F[A] from A
We can use the doThing
above to see how different effects add something extra
to the computation when getting a User
. This is the topic of an entire post
I've written about here