This post started out as “Applicative Functors for Non-Haskell Developers.”
That was an ambitious title.
Instead, it’s turned out to be something more like “Applicative Functors for Curious Haskell Novices,” or, more accurately, “Applicative Functors for Ian Henry, Two Years Ago.” But that’s not as catchy.
Specifically, this post assumes the following prior Haskell knowledge:
- That you understand currying.
- That you understand the function
map
. - That you have a vague notion of what a typeclass is.
- That you can read Haskell type signatures, including, if the need arises, signatures with typeclasses.
- You understand, informally, how
(+)
and+
are different.
This post inevitably assumes more than that, but it does not mean to.
If that does not describe you: read on, brave soul, and let me know what you think of all this!
Consider this:
Prelude> [1, 2, 3]
[1,2,3]
We’re familiar with map
, and we know that succ
is the successor function (an imperative language might call it increment
), so we’re comfortable with the following:
Prelude> map succ [1, 2, 3]
[2,3,4]
Straightforward. But what if we map a “binary” function?
Prelude> map (+) [1, 2, 3]
We get a list of (partially-applied) functions. If Haskell were really smart, it might print them like this:
[(1 +), (2 +), (3 +)]
But as it stands there’s no way to stringify functions, so that instead spits out an error. Still, we know what’s going on.
What if we want to call those functions? That’s straightforward:
Prelude> let fns = map (+) [1, 2, 3]
Prelude> map (\fn -> fn 10) fns
[11,12,13]
Or, equivalently:
Prelude> map ($ 10) fns
[11,12,13]
But we have a list of functions… what if we want to call them with a list of values?
What would that mean?
My first instinct would be to call the first function with the first value, the second function with the second value, etc. But we could also call each function with each value, getting a little cartesian product thing going on.
The first approach – zipping the lists together and calling each function with the value at the same index – is nice if the lists are the same length. But if one is shorter than the other, we’ll have to ignore either some functions or some values. This is how zip
behaves, so we’re used to it, but it’s nonetheless not perfect.
So the second approach is somehow nicer to me. It always uses all of the values and all of the functions, and it’s sort of indiscriminate in the way it behaves: every function gets every value.
There are certainly other, less sensible things we could do: call every function with the first value in the list, call the first function on every value an infinite number of times, do nothing and always return the empty list, reverse the values and then do the zip application, etc. But I don’t want to talk about those just yet.
Here’s my crack at the first approach:
Prelude> map (\(fn, x) -> fn x) (zip fns [10, 20, 30])
[11,22,33]
Or, more briefly:
Prelude> zipWith ($) fns [10, 20, 30]
[11,22,33]
And we could write the second one like this:
Prelude> [fn x | fn <- fns, x <- [10, 20, 30]]
[11,21,31,12,22,32,13,23,33]
Which is all well and good.
Remember this, because we’ll come back to it.
Do that to me one more time
Let’s take a step back. Forget about lists of functions and lists of values – let’s talk about Maybe
for a second.
Prelude> Just 5
Just 5
With a list of values, we could map succ
over the list and get another list of values.
It makes sense to do the same thing with Maybe
values: we could imagine turning Just 5
into Just 6
using the successor function. The function that does that is called fmap
:
Prelude> fmap succ (Just 5)
Just 6
fmap
is startlingly close to map
, and, in fact, is the same function. But map
only works on lists, and fmap
works on lots of different things: lists, maybes, functions (“mapping” their return values), and countless others.
These things have something in common: they “contain values” which “can be mapped.” Lists contain zero or more values. Maybe
s contain zero or one values. Functions “contain” any number of values – anything the function is able to return. There are plenty more examples of “things that can be mapped,” and the container analogy doesn’t make sense for all of them, but we’re gonna stick with these for now. It’s good enough to get us started.
In Haskell, this common nature is expressed by a typeclass. Specifically, Functor
. The name comes from category theory, and we’re stuck with it. So when you see “fmap
,” think “functor map.”
So back to Maybe
: let’s do the same exercise as before.
Prelude> fmap succ (Just 5)
Just 6
Prelude> let maybeFn = fmap (+) (Just 5)
GHCi can’t display maybeFn
, but I would render it as Just (5 +)
– it’s Just
the function which adds 5
to its input.
As before, we can invoke this with a value:
Prelude> fmap (\fn -> fn 10) maybeFn
Just 15
Or:
Prelude> fmap ($ 10) maybeFn
Just 15
But what if we want to invoke it not with an Int
, but with a Maybe Int
? That is, to “invoke” this “Maybe
function” with a “Maybe
value.” How do we do that?
In the list case, we zipped, or we used list comprehensions. In the Maybe
case, both of those approaches collapse into the same thing: invoking every function with every value isn’t hard when there’s only one of each. But there’s no zip
analogue for Maybe
,1 and it doesn’t make sense to use a list comprehension when we don’t have any lists,2 so it’s difficult to write a one-liner for this. We kinda need pattern matching.
But you know where this is going: if we can use the normal (+)
function to add one list of numbers to another list of numbers, and we want to do essentially the same thing but with Maybe
numbers instead, it would be nice to express that common functionality explicitly. And as we are in Haskell, our mechanism for expressing common behavior between distinct types is called a typeclass.
Coming clean
We already saw the Functor
typeclass, and we’re getting close to seeing the next one: it’s called Applicative
in Haskell code, and its full name in English is “applicative functor.”
But we haven’t actually gotten there yet. In fact, we have only seen one half of Applicative
: the idea of combining two “fancy values” together using one normal function. By two “fancy values” I mean two Maybe Int
s or two [Int]
s or even two String -> Int
s, and by “normal function” I mean something like Int -> Int
– a function that takes a Plain Old Int
™.
That operation turns out to be the function (<*>)
– that is, the operator <*>
– which sometimes goes by the name ap
. You can mentally pronounce (<*>)
like “app” while you’re reading. I pronounce it “spaceship” because I read it long before I knew what it was, but I’m trying to correct that.
Using the <*>
operator, we can write things like:
Prelude> Just succ <*> Just 10
Just 11
Prelude> [(+ 1), (+ 2), (+ 3)] <*> [10, 20, 30]
[11,21,31,12,22,32,13,23,33]
Prelude> fmap (+) (Just 10) <*> Just 5
Just 15
It’s exactly what we’ve been doing ad-hoc this whole time, but now we have a name for it. Importantly, as with fmap
, the name is the same across types, reflecting the fact that it really is the same operation with a different implementation each time.
Defining (<*>)
for different types is rather straightforward – we’ve already done so once – and in fact we can define it a little more neatly without using any “powerful” type-specific functions like zip
, but instead just pattern matching, constructors, and fmap
.
We’re allowed to use fmap
because we’re going to define Applicative
as an extension of Functor
– Functor
plus some additional powers. We can’t make a type an instance of Applicative
without making it an instance of Functor
first, so fmap
will always be available to us.
(Just fn) <*> x = fmap fn x
Nothing <*> _ = Nothing
-- the zip version of lists
(fn:fns) <*> (x:xs) = (fn x):(fns <*> xs)
_ <*> _ = []
In order to define the cartesian product version of list application, I find that I want to reach for (++)
, but only to shorten the code – it is not defined in terms of anything fancy itself:
-- the cartesian product version of lists
(fn:fns) <*> xs = (fmap fn xs) ++ (fns <*> xs)
_ <*> _ = []
But I could’ve just stuck with the list comprehension approach:
fns <*> xs = [fn x | fn <- fns, x <- xs]
That’s how the standard library defines (<*>)
for lists. But using a list comprehension here feels a little bit like cheating (to me).
Now that we have (<*>)
, we can return to our examples and see how they might actually look:
Prelude> fns <*> [10, 20, 30]
[11,21,31,12,22,32,13,23,33]
Prelude> maybeFn <*> Just 10
Just 15
Not bad!
But are these implementations really the best implementations they could be?
Implementationpol
Judge with me for a moment:
One way that we could “add” two Maybe Int
s together would be to say that the result is always Nothing
, regardless of the input. That has the right type – it turns two Maybe Int
s into a single Maybe Int
– but I judge it to be an unreasonable implementation. No adding takes place! 4
“plus” 0
would be the same as 4
“plus” 1
! That’s nonsense.
And there are nonsensical ways to add two lists of Int
s as well: I would say that it is unreasonable to claim that [1, 2, 3]
“plus” [10, 11, 12]
is [11, 11, 11, 11, 11, 11...]
, or [1, 2, 3]
, or any number of other type-correct ways you could define it.
But the reason I say that that’s “unreasonable” is that it feels wrong to me. And that’s not satisfying: what makes me feel that way? It’s as if it’s not really following the spirit of those functions, even if it’s obeying the letter of their type signatures.
Now, I would be much happier – and I hope you would too – if we could somehow formalize this distinction between “reasonable” and “unreasonable” implementations.
And it turns out we can.
Laying down the law
Let’s go back to Functor
, because it’s a simpler place to start, and because in order to have a reasonable Applicative
instance we first need a reasonable Functor
instance upon which to build our temple.
So we want to be able to distinguish “reasonable” definitions of fmap
, like fmap succ [10, 20, 30] = [11, 21, 31]
from “unreasonable” ones, like fmap succ [1, 2, 3] = []
or fmap succ (Just 5) = Just 5
.
I’m going to classify these two types of incorrect as implementations which change the “shape” that they’re mapping over, instead of just the values, and functions which change the values incorrectly, without “respecting” the function that they’re given.
When we’re talking about the list type, the first type of incorrectness – structural incorrectness – would mean changing the number or order of elements in some way. To rule out implementations like that, we can say that any “reasonable” fmap
implementation will obey the following law:
∀ x :: Functor f => f a
fmap id x ≡ id x
That is, for any fancy value you can come up with, mapping the identity function over it shouldn’t change anything. As long as this law holds, the fmap
implementation can only change the values in the structure, not the structure itself.
The second type of incorrect implementation is actually ruled out by Haskell’s type system, as long as we shut our eyes and pretend that seq
doesn’t exist. If you try to write an implementation like this:
fmap fn x = x
It simply won’t compile, because x
is the wrong type for the right-hand side.
In fact, given the type signature for fmap
:
fmap :: Functor f => (a -> b) -> f a -> f b
-- specialized to lists
fmap :: (a -> b) -> [a] -> [b]
-- specialized to Maybe
fmap :: (a -> b) -> Maybe a -> Maybe b
The only way you can get a b
value is by pulling it out of the function you were handed – by respecting the function, in other words.
Thus, knowing fmap id ≡ id
is sufficient to ensure a reasonable fmap
implementation.
If we want to be more precise and consider the case where
seq
can be used to sniff outundefined
values, then we can add another law to further restrict implementations offmap
. But this post is more interested in building intuition than being completely rigorous, so I’ll leave it at that.
Reasonable Applicative
s
Okay. So fmap
preserves the “shape” of the structure that it maps over. But the (<*>)
function in Applicative
actually takes two “structures:” one containing the function, and another containing the function’s argument-to-be.
(<*>)
can’t exactly “preserve” both structures, because it’s collapsing them together into a single result, but it seems like it should somehow “respect” the shape of both of them. We can probably write a law (or perhaps one law for each side) that guarantees that for us:
uhh...
But how? It would be nice to say that if id
is your function, and you “call” it with a fancy value, you should get your exact input back. The “shape” of your input shouldn’t change (because that would be an unreasonable (<*>)
implementation) and any values that it might contain3 shouldn’t change at all (because id
is not capable of changing values, and using anything else would not typecheck4). Something like this:
∀ x :: Applicative f => f a
id <*> x ≡ x
But that doesn’t make sense, because (<*>)
is all about fancy functions and fancy values, and id
is simply not a fancy function. It’s a plain function. We want to say something like this:
-- cartesian product version
∀ x :: [a]
[id] <*> x ≡ x
-- zip version
∀ x :: [a]
[id, id, id...] <*> x ≡ x
Or like this:
∀ x :: Maybe a
Just id <*> x ≡ x
But we have no way to express that idea generally about all Applicative
s. We have no way to get from id
to “fancy id
.”
The pure
ity test
So let’s make one.
More precisely, we want a function that does just what I did above: take a function (in that case id
) and “wrap” it in a way that doesn’t impose any more structure than necessary. Yes, that’s an incredibly fuzzy statement. I’ll refine it.
And heck, why limit its input to a function type? We can have it wrap any value. That gives it a signature like this:
pure :: Applicative f => a -> f a
-- specialized to lists
pure :: a -> [a]
-- specialized to Maybe
pure :: a -> Maybe a
And notice: we’re making things harder on ourselves! We’re trying to ensure that our implementation of (<*>)
is reasonable, but now we have to ensure that our implementation of pure
is reasonable too.
But I’m afraid there’s no other way: we simply don’t have the ability to say anything useful about (<*>)
without pure
. So we soldier on, and now we’re able to express the law we were talking about before in the most general way:
∀ x :: Applicative f => f a
pure id <*> x ≡ x
Before I said that we can’t hope to “preserve” the structure of our input, because we have two inputs that we’re collapsing into one output.
But if one of those inputs is pure
, it’s sort of like multiplying by 1
: we can preserve the structure of the other side. So while generally (<*>)
will collapse two structures into one, we’d like pure
to be as “structureless” as possible, so that when it comes time to “collapse it,” it doesn’t change the other structure at all.
And notice that that idea is expressed in the above law: not only are we using id
to keep us from changing the value, we’re also using pure
to keep us from changing the structure.
Crystal clear right okay let’s keep going.
Outlaws abound
We’ve taken the first step towards defining what it means to be a “reasonable” implementation of (<*>)
. And we’ve already ruled out a ton of nonsense implementations, like this one for the list type:
_ <*> _ = []
Or this one:
(fn:_) <*> (x:_) = repeat (fn x)
_ <*> _ = []
But we haven’t ruled out all nonsense instances. For example: what if we have an unreasonable implementation of (<*>)
and an unreasonable implementation of pure
, such that the two cancel each other out and we end up satisfying the law? Sticking with lists, I could come up with:
pure x = [x, x]
(fn:_) <*> x = fmap fn x
_ <*> _ = []
That implementation certainly satisfies pure id <*> x ≡ x
:
pure id <*> x
-- definition of pure:
[id, id] <*> x
-- definition of (<*>):
fmap id x
-- Functor law:
x
But it doesn’t seem reasonable to me. (<*>)
does not look at its entire input, and pure
makes an arbitrary choice: why does it double its input? Why not triple it? Arbitrary choices smell unreasonable to me.
Hopefully this isn’t surprising: if you want to solve for two variables, you’ve got to have two equations. So what’s our second law going to be?
Before, I said that we wanted to respect the “shape” of both structures. The first law ensures that we preserve the “shape” of the structure holding the value, by fixing the structure around the function to be the “pure
” structure – but now let’s turn that around, and fix the right-hand side:
∀ g :: Applicative f => f (a -> b)
∀ x :: a
g <*> pure x ≡ ???
It’s not immediately obvious what we can say about this friend equation. If it were perfectly symmetric with the previous law, we’d expect g <*> pure x ≡ g
, but of course that doesn’t make sense – (<*>)
is not symmetric, and we need only look at the types to see how silly that statement is:
g :: Applicative f => f (a -> b)
x :: a
g <*> pure x :: f b
So how else might we arrive at something with the type f b
?
Well, we’re trying to express the intent of the Applicative
class. And a large part of the intent is that we, you know, actually call the function(s) on the left-hand side of <*>
. We don’t ignore it, or call it twice, or anything weird like that.
So it would be nice to somehow capture that idea with in law form: given a pure argument, we “just” want to invoke the left-hand side with that argument – for some definition of invoke that makes sense (e.g. for the list type, we really mean invoke all of the functions on the left-hand side).
Now, it’s not at all obvious how we write that. But if we’re very clever, we might be mulling over function application, and we might start thinking about functions that “do” function application. Functions like (\fn -> fn x)
, for example.
And if we’re clever enough, we might have this thought: how could we feed our g
from above into a function like (\fn -> fn x)
?
Hmm. It’s not an easy question, but fortunately someone smarter than me already thought about it and came up with the answer: what if we move g
to the other side of the <*>
operator?
??? <*> g
It’s totally fine to say that our “value” is a function – so long as our “function” takes a function as its input. Which is precisely what the function-applying function from above does!
pure (\fn -> fn x) <*> g
And putting it all together, we have this as our next law (forgiving, I hope, the heavy-handed derivation):
∀ g :: Applicative f => f (a -> b)
∀ x :: a
g <*> pure x ≡ pure (\fn -> fn x) <*> g
-- or, canonically:
g <*> pure x ≡ pure ($ x) <*> g
Now, does that rule out our conspiring implementation?
It does! We can prove so by counter-example, choosing [id, id, id]
as our g
:
-- left hand side: -- right hand side:
g <*> pure x pure ($ x) <*> g
-- subtituting g: -- subtituting g:
[id, id, id] <*> pure x pure ($ x) <*> [id, id, id]
-- definition of pure: -- definition of pure:
[id, id, id] <*> [x, x] [($ x), ($ x)] <*> [id, id, id]
-- definition of (<*>): -- definition of (<*>):
fmap id [x, x] fmap ($ x) [id, id, id]
-- Functor law: -- definition of fmap:
[x, x] [id $ x, id $ x, id $ x]
-- definition of ($):
[id x, id x, id x]
-- definition of id:
[x, x, x]
Now, we ask ourselves: are these two laws sufficient to rule out all “unreasonable” implementations?
I don’t know
I really don’t. Applicative
is generally presented with a total of four laws.5 But are some of them redundant, if we don’t care about differences in strictness? If we pretend that seq
isn’t part of the language, is it possible to construct an Applicative
instance that satisfies the two laws presented here without satisfying the others?
I don’t know. I can’t come up with one, but that means little.
After all, Functor
is often presented with two laws – but ignoring differences in strictness, you can derive the second law from the first using the “free theorem” that arises from the type signature of fmap
.
So we have to wonder if it’s possible to use a similar trick to derive the other Applicative
laws. And if we happen to stumble upon an answer, let me know!
For the record, those other two laws are:
∀ f :: a -> b
∀ x :: a
pure fn <*> pure x ≡ pure (fn x)
And:
∀ u :: Applicative f => f (a -> b)
∀ v :: Applicative f => f (a -> b)
∀ w :: Applicative f => f a
pure (.) <*> u <*> v <*> w ≡ u <*> (v <*> w)
Which are not hard to find elsewhere. However, as I can’t find a counter-example that they can be used to restrict, I find it hard to motivate further discussion of them here.
Closing thoughts, feelings
Until I wrote this post, I didn’t really understand why pure
was part of Applicative
. It seemed to me that you could imagine a typeclass with only one of pure
or (<*>)
, sitting somewhere in between Functor
and Applicative
in power.
But when I started thinking about the laws, started distinguishing between reasonable and unreasonable implementations of the typeclass, I realized that you can’t really have one without the other.
I’m not claiming that the reason pure
is a function in Applicative
– the original motivation – is to define these laws. It could have been, but pure
is an extremely useful function to have, so it could have arisen in unison with (<*>)
.6
But if you try to gain intuition for Applicative
as “like Functor
, but you can map with functions of multiple arguments,” then it seems like you don’t really need pure
. After all, you can always write something like:
fmap ($ 10) (fmap (+) (Just 5))
Instead of:
pure (+) <*> Just 5 <*> pure 10
It’s ugly, but it works.
Now, there are certainly other, richer interpretations of the Applicative
typeclass than this one. And it’s possible that pure
has a more vital place within them. But as a programmer, I think that the first time I found myself using the Applicative
typeclass was like this – lifting curried functions so I could call them with fancy arguments.
And it’s hard to talk about other interpretations without using the M-word.
So I’ll save those for another day.
-
There is an alternative, equivalent formulation of
Applicative
based on generalizingzip
into multiple types. Shockingly – criminally – the typeclass this gives rise to is not namedZipplicative
. ↩︎ -
Actually, if we
:set -XMonadComprehensions
, we can write it identically:[fn x | fn <- maybeFn, x <- Just 10]
. After all, why should lists get special syntax? ↩︎ -
“Containing values” is intuitive when we’re dealing with
[a]
andMaybe a
, but remember that it can mean more exotic things like “values that can be returned from functions” or “values which can be produced by interacting with the outside world.” So in the function case we might interpret that statement as “it should continue to return the same values for the same input.” ↩︎ -
Obligatory ⊥ qualification. ↩︎
-
Or four-and-a-half laws, if you count
fmap ≡ (<*>) . pure
↩︎ -
Or rather, given the history, in unison with
(>>=)
. ↩︎