This post started out as “Applicative Functors for Non-Haskell Developers.”
That was an ambitious title.
Instead, it’s turned out to be something more like “Applicative Functors for Curious Haskell Novices,” or, more accurately, “Applicative Functors for Ian Henry, Two Years Ago.” But that’s not as catchy.
Specifically, this post assumes the following prior Haskell knowledge:
- That you understand currying.
- That you understand the function
- That you have a vague notion of what a typeclass is.
- That you can read Haskell type signatures, including, if the need arises, signatures with typeclasses.
- You understand, informally, how
This post inevitably assumes more than that, but it does not mean to.
If that does not describe you: read on, brave soul, and let me know what you think of all this!
Prelude> [1, 2, 3] [1,2,3]
We’re familiar with
map, and we know that
succ is the successor function (an imperative language might call it
increment), so we’re comfortable with the following:
Prelude> map succ [1, 2, 3] [2,3,4]
Straightforward. But what if we map a “binary” function?
Prelude> map (+) [1, 2, 3]
We get a list of (partially-applied) functions. If Haskell were really smart, it might print them like this:
[(1 +), (2 +), (3 +)]
But as it stands there’s no way to stringify functions, so that instead spits out an error. Still, we know what’s going on.
What if we want to call those functions? That’s straightforward:
Prelude> let fns = map (+) [1, 2, 3] Prelude> map (\fn -> fn 10) fns [11,12,13]
Prelude> map ($ 10) fns [11,12,13]
But we have a list of functions… what if we want to call them with a list of values?
What would that mean?
My first instinct would be to call the first function with the first value, the second function with the second value, etc. But we could also call each function with each value, getting a little cartesian product thing going on.
The first approach – zipping the lists together and calling each function with the value at the same index – is nice if the lists are the same length. But if one is shorter than the other, we’ll have to ignore either some functions or some values. This is how
zip behaves, so we’re used to it, but it’s nonetheless not perfect.
So the second approach is somehow nicer to me. It always uses all of the values and all of the functions, and it’s sort of indiscriminate in the way it behaves: every function gets every value.
There are certainly other, less sensible things we could do: call every function with the first value in the list, call the first function on every value an infinite number of times, do nothing and always return the empty list, reverse the values and then do the zip application, etc. But I don’t want to talk about those just yet.
Here’s my crack at the first approach:
Prelude> map (\(fn, x) -> fn x) (zip fns [10, 20, 30]) [11,22,33]
Or, more briefly:
Prelude> zipWith ($) fns [10, 20, 30] [11,22,33]
And we could write the second one like this:
Prelude> [fn x | fn <- fns, x <- [10, 20, 30]] [11,21,31,12,22,32,13,23,33]
Which is all well and good.
Remember this, because we’ll come back to it.
Do that to me one more time
Let’s take a step back. Forget about lists of functions and lists of values – let’s talk about
Maybe for a second.
Prelude> Just 5 Just 5
With a list of values, we could map
succ over the list and get another list of values.
It makes sense to do the same thing with
Maybe values: we could imagine turning
Just 5 into
Just 6 using the successor function. The function that does that is called
Prelude> fmap succ (Just 5) Just 6
fmap is startlingly close to
map, and, in fact, is the same function. But
map only works on lists, and
fmap works on lots of different things: lists, maybes, functions (“mapping” their return values), and countless others.
These things have something in common: they “contain values” which “can be mapped.” Lists contain zero or more values.
Maybes contain zero or one values. Functions “contain” any number of values – anything the function is able to return. There are plenty more examples of “things that can be mapped,” and the container analogy doesn’t make sense for all of them, but we’re gonna stick with these for now. It’s good enough to get us started.
In Haskell, this common nature is expressed by a typeclass. Specifically,
Functor. The name comes from category theory, and we’re stuck with it. So when you see “
fmap,” think “functor map.”
So back to
Maybe: let’s do the same exercise as before.
Prelude> fmap succ (Just 5) Just 6 Prelude> let maybeFn = fmap (+) (Just 5)
GHCi can’t display
maybeFn, but I would render it as
Just (5 +) – it’s
Just the function which adds
5 to its input.
As before, we can invoke this with a value:
Prelude> fmap (\fn -> fn 10) maybeFn Just 15
Prelude> fmap ($ 10) maybeFn Just 15
But what if we want to invoke it not with an
Int, but with a
Maybe Int? That is, to “invoke” this “
Maybe function” with a “
Maybe value.” How do we do that?
In the list case, we zipped, or we used list comprehensions. In the
Maybe case, both of those approaches collapse into the same thing: invoking every function with every value isn’t hard when there’s only one of each. But there’s no
zip analogue for
Maybe,1 and it doesn’t make sense to use a list comprehension when we don’t have any lists,2 so it’s difficult to write a one-liner for this. We kinda need pattern matching.
But you know where this is going: if we can use the normal
(+) function to add one list of numbers to another list of numbers, and we want to do essentially the same thing but with
Maybe numbers instead, it would be nice to express that common functionality explicitly. And as we are in Haskell, our mechanism for expressing common behavior between distinct types is called a typeclass.
We already saw the
Functor typeclass, and we’re getting close to seeing the next one: it’s called
Applicative in Haskell code, and its full name in English is “applicative functor.”
But we haven’t actually gotten there yet. In fact, we have only seen one half of
Applicative: the idea of combining two “fancy values” together using one normal function. By two “fancy values” I mean two
Maybe Ints or two
[Int]s or even two
String -> Ints, and by “normal function” I mean something like
Int -> Int – a function that takes a Plain Old
That operation turns out to be the function
(<*>) – that is, the operator
<*> – which sometimes goes by the name
ap. You can mentally pronounce
(<*>) like “app” while you’re reading. I pronounce it “spaceship” because I read it long before I knew what it was, but I’m trying to correct that.
<*> operator, we can write things like:
Prelude> Just succ <*> Just 10 Just 11 Prelude> [(+ 1), (+ 2), (+ 3)] <*> [10, 20, 30] [11,21,31,12,22,32,13,23,33] Prelude> fmap (+) (Just 10) <*> Just 5 Just 15
It’s exactly what we’ve been doing ad-hoc this whole time, but now we have a name for it. Importantly, as with
fmap, the name is the same across types, reflecting the fact that it really is the same operation with a different implementation each time.
(<*>) for different types is rather straightforward – we’ve already done so once – and in fact we can define it a little more neatly without using any “powerful” type-specific functions like
zip, but instead just pattern matching, constructors, and
We’re allowed to use
fmap because we’re going to define
Applicative as an extension of
Functor plus some additional powers. We can’t make a type an instance of
Applicative without making it an instance of
Functor first, so
fmap will always be available to us.
(Just fn) <*> x = fmap fn x Nothing <*> _ = Nothing -- the zip version of lists (fn:fns) <*> (x:xs) = (fn x):(fns <*> xs) _ <*> _ = 
In order to define the cartesian product version of list application, I find that I want to reach for
(++), but only to shorten the code – it is not defined in terms of anything fancy itself:
-- the cartesian product version of lists (fn:fns) <*> xs = (fmap fn xs) ++ (fns <*> xs) _ <*> _ = 
But I could’ve just stuck with the list comprehension approach:
fns <*> xs = [fn x | fn <- fns, x <- xs]
That’s how the standard library defines
(<*>) for lists. But using a list comprehension here feels a little bit like cheating (to me).
Now that we have
(<*>), we can return to our examples and see how they might actually look:
Prelude> fns <*> [10, 20, 30] [11,21,31,12,22,32,13,23,33] Prelude> maybeFn <*> Just 10 Just 15
But are these implementations really the best implementations they could be?
Judge with me for a moment:
One way that we could “add” two
Maybe Ints together would be to say that the result is always
Nothing, regardless of the input. That has the right type – it turns two
Maybe Ints into a single
Maybe Int – but I judge it to be an unreasonable implementation. No adding takes place!
0 would be the same as
1! That’s nonsense.
And there are nonsensical ways to add two lists of
Ints as well: I would say that it is unreasonable to claim that
[1, 2, 3] “plus”
[10, 11, 12] is
[11, 11, 11, 11, 11, 11...], or
[1, 2, 3], or any number of other type-correct ways you could define it.
But the reason I say that that’s “unreasonable” is that it feels wrong to me. And that’s not satisfying: what makes me feel that way? It’s as if it’s not really following the spirit of those functions, even if it’s obeying the letter of their type signatures.
Now, I would be much happier – and I hope you would too – if we could somehow formalize this distinction between “reasonable” and “unreasonable” implementations.
And it turns out we can.
Laying down the law
Let’s go back to
Functor, because it’s a simpler place to start, and because in order to have a reasonable
Applicative instance we first need a reasonable
Functor instance upon which to build our temple.
So we want to be able to distinguish “reasonable” definitions of
fmap succ [10, 20, 30] = [11, 21, 31] from “unreasonable” ones, like
fmap succ [1, 2, 3] =  or
fmap succ (Just 5) = Just 5.
I’m going to classify these two types of incorrect as implementations which change the “shape” that they’re mapping over, instead of just the values, and functions which change the values incorrectly, without “respecting” the function that they’re given.
When we’re talking about the list type, the first type of incorrectness – structural incorrectness – would mean changing the number or order of elements in some way. To rule out implementations like that, we can say that any “reasonable”
fmap implementation will obey the following law:
∀ x :: Functor f => f a fmap id x ≡ id x
That is, for any fancy value you can come up with, mapping the identity function over it shouldn’t change anything. As long as this law holds, the
fmap implementation can only change the values in the structure, not the structure itself.
The second type of incorrect implementation is actually ruled out by Haskell’s type system, as long as we shut our eyes and pretend that
seq doesn’t exist. If you try to write an implementation like this:
fmap fn x = x
It simply won’t compile, because
x is the wrong type for the right-hand side.
In fact, given the type signature for
fmap :: Functor f => (a -> b) -> f a -> f b -- specialized to lists fmap :: (a -> b) -> [a] -> [b] -- specialized to Maybe fmap :: (a -> b) -> Maybe a -> Maybe b
The only way you can get a
b value is by pulling it out of the function you were handed – by respecting the function, in other words.
fmap id ≡ id is sufficient to ensure a reasonable
If we want to be more precise and consider the case where
seqcan be used to sniff out
undefinedvalues, then we can add another law to further restrict implementations of
fmap. But this post is more interested in building intuition than being completely rigorous, so I’ll leave it at that.
fmap preserves the “shape” of the structure that it maps over. But the
(<*>) function in
Applicative actually takes two “structures:” one containing the function, and another containing the function’s argument-to-be.
(<*>) can’t exactly “preserve” both structures, because it’s collapsing them together into a single result, but it seems like it should somehow “respect” the shape of both of them. We can probably write a law (or perhaps one law for each side) that guarantees that for us:
But how? It would be nice to say that if
id is your function, and you “call” it with a fancy value, you should get your exact input back. The “shape” of your input shouldn’t change (because that would be an unreasonable
(<*>) implementation) and any values that it might contain3 shouldn’t change at all (because
id is not capable of changing values, and using anything else would not typecheck4). Something like this:
∀ x :: Applicative f => f a id <*> x ≡ x
But that doesn’t make sense, because
(<*>) is all about fancy functions and fancy values, and
id is simply not a fancy function. It’s a plain function. We want to say something like this:
-- cartesian product version ∀ x :: [a] [id] <*> x ≡ x
-- zip version ∀ x :: [a] [id, id, id...] <*> x ≡ x
Or like this:
∀ x :: Maybe a Just id <*> x ≡ x
But we have no way to express that idea generally about all
Applicatives. We have no way to get from
id to “fancy
So let’s make one.
More precisely, we want a function that does just what I did above: take a function (in that case
id) and “wrap” it in a way that doesn’t impose any more structure than necessary. Yes, that’s an incredibly fuzzy statement. I’ll refine it.
And heck, why limit its input to a function type? We can have it wrap any value. That gives it a signature like this:
pure :: Applicative f => a -> f a -- specialized to lists pure :: a -> [a] -- specialized to Maybe pure :: a -> Maybe a
And notice: we’re making things harder on ourselves! We’re trying to ensure that our implementation of
(<*>) is reasonable, but now we have to ensure that our implementation of
pure is reasonable too.
But I’m afraid there’s no other way: we simply don’t have the ability to say anything useful about
pure. So we soldier on, and now we’re able to express the law we were talking about before in the most general way:
∀ x :: Applicative f => f a pure id <*> x ≡ x
Before I said that we can’t hope to “preserve” the structure of our input, because we have two inputs that we’re collapsing into one output.
But if one of those inputs is
pure, it’s sort of like multiplying by
1: we can preserve the structure of the other side. So while generally
(<*>) will collapse two structures into one, we’d like
pure to be as “structureless” as possible, so that when it comes time to “collapse it,” it doesn’t change the other structure at all.
And notice that that idea is expressed in the above law: not only are we using
id to keep us from changing the value, we’re also using
pure to keep us from changing the structure.
Crystal clear right okay let’s keep going.
We’ve taken the first step towards defining what it means to be a “reasonable” implementation of
(<*>). And we’ve already ruled out a ton of nonsense implementations, like this one for the list type:
_ <*> _ = 
Or this one:
(fn:_) <*> (x:_) = repeat (fn x) _ <*> _ = 
But we haven’t ruled out all nonsense instances. For example: what if we have an unreasonable implementation of
(<*>) and an unreasonable implementation of
pure, such that the two cancel each other out and we end up satisfying the law? Sticking with lists, I could come up with:
pure x = [x, x] (fn:_) <*> x = fmap fn x _ <*> _ = 
That implementation certainly satisfies
pure id <*> x ≡ x:
pure id <*> x -- definition of pure: [id, id] <*> x -- definition of (<*>): fmap id x -- Functor law: x
But it doesn’t seem reasonable to me.
(<*>) does not look at its entire input, and
pure makes an arbitrary choice: why does it double its input? Why not triple it? Arbitrary choices smell unreasonable to me.
Hopefully this isn’t surprising: if you want to solve for two variables, you’ve got to have two equations. So what’s our second law going to be?
Before, I said that we wanted to respect the “shape” of both structures. The first law ensures that we preserve the “shape” of the structure holding the value, by fixing the structure around the function to be the “
pure” structure – but now let’s turn that around, and fix the right-hand side:
∀ g :: Applicative f => f (a -> b) ∀ x :: a g <*> pure x ≡ ???
It’s not immediately obvious what we can say about this friend equation. If it were perfectly symmetric with the previous law, we’d expect
g <*> pure x ≡ g, but of course that doesn’t make sense –
(<*>) is not symmetric, and we need only look at the types to see how silly that statement is:
g :: Applicative f => f (a -> b) x :: a g <*> pure x :: f b
So how else might we arrive at something with the type
Well, we’re trying to express the intent of the
Applicative class. And a large part of the intent is that we, you know, actually call the function(s) on the left-hand side of
<*>. We don’t ignore it, or call it twice, or anything weird like that.
So it would be nice to somehow capture that idea with in law form: given a pure argument, we “just” want to invoke the left-hand side with that argument – for some definition of invoke that makes sense (e.g. for the list type, we really mean invoke all of the functions on the left-hand side).
Now, it’s not at all obvious how we write that. But if we’re very clever, we might be mulling over function application, and we might start thinking about functions that “do” function application. Functions like
(\fn -> fn x), for example.
And if we’re clever enough, we might have this thought: how could we feed our
g from above into a function like
(\fn -> fn x)?
Hmm. It’s not an easy question, but fortunately someone smarter than me already thought about it and came up with the answer: what if we move
g to the other side of the
??? <*> g
It’s totally fine to say that our “value” is a function – so long as our “function” takes a function as its input. Which is precisely what the function-applying function from above does!
pure (\fn -> fn x) <*> g
And putting it all together, we have this as our next law (forgiving, I hope, the heavy-handed derivation):
∀ g :: Applicative f => f (a -> b) ∀ x :: a g <*> pure x ≡ pure (\fn -> fn x) <*> g -- or, canonically: g <*> pure x ≡ pure ($ x) <*> g
Now, does that rule out our conspiring implementation?
It does! We can prove so by counter-example, choosing
[id, id, id] as our
-- left hand side: -- right hand side: g <*> pure x pure ($ x) <*> g -- subtituting g: -- subtituting g: [id, id, id] <*> pure x pure ($ x) <*> [id, id, id] -- definition of pure: -- definition of pure: [id, id, id] <*> [x, x] [($ x), ($ x)] <*> [id, id, id] -- definition of (<*>): -- definition of (<*>): fmap id [x, x] fmap ($ x) [id, id, id] -- Functor law: -- definition of fmap: [x, x] [id $ x, id $ x, id $ x] -- definition of ($): [id x, id x, id x] -- definition of id: [x, x, x]
Now, we ask ourselves: are these two laws sufficient to rule out all “unreasonable” implementations?
I don’t know
I really don’t.
Applicative is generally presented with a total of four laws.5 But are some of them redundant, if we don’t care about differences in strictness? If we pretend that
seq isn’t part of the language, is it possible to construct an
Applicative instance that satisfies the two laws presented here without satisfying the others?
I don’t know. I can’t come up with one, but that means little.
Functor is often presented with two laws – but ignoring differences in strictness, you can derive the second law from the first using the “free theorem” that arises from the type signature of
So we have to wonder if it’s possible to use a similar trick to derive the other
Applicative laws. And if we happen to stumble upon an answer, let me know!
For the record, those other two laws are:
∀ f :: a -> b ∀ x :: a pure fn <*> pure x ≡ pure (fn x)
∀ u :: Applicative f => f (a -> b) ∀ v :: Applicative f => f (a -> b) ∀ w :: Applicative f => f a pure (.) <*> u <*> v <*> w ≡ u <*> (v <*> w)
Which are not hard to find elsewhere. However, as I can’t find a counter-example that they can be used to restrict, I find it hard to motivate further discussion of them here.
Closing thoughts, feelings
Until I wrote this post, I didn’t really understand why
pure was part of
Applicative. It seemed to me that you could imagine a typeclass with only one of
(<*>), sitting somewhere in between
Applicative in power.
But when I started thinking about the laws, started distinguishing between reasonable and unreasonable implementations of the typeclass, I realized that you can’t really have one without the other.
I’m not claiming that the reason
pure is a function in
Applicative – the original motivation – is to define these laws. It could have been, but
pure is an extremely useful function to have, so it could have arisen in unison with
But if you try to gain intuition for
Applicative as “like
Functor, but you can map with functions of multiple arguments,” then it seems like you don’t really need
pure. After all, you can always write something like:
fmap ($ 10) (fmap (+) (Just 5))
pure (+) <*> Just 5 <*> pure 10
It’s ugly, but it works.
Now, there are certainly other, richer interpretations of the
Applicative typeclass than this one. And it’s possible that
pure has a more vital place within them. But as a programmer, I think that the first time I found myself using the
Applicative typeclass was like this – lifting curried functions so I could call them with fancy arguments.
And it’s hard to talk about other interpretations without using the M-word.
So I’ll save those for another day.
Actually, if we
:set -XMonadComprehensions, we can write it identically:
[fn x | fn <- maybeFn, x <- Just 10]. After all, why should lists get special syntax? ↩︎
“Containing values” is intuitive when we’re dealing with
Maybe a, but remember that it can mean more exotic things like “values that can be returned from functions” or “values which can be produced by interacting with the outside world.” So in the function case we might interpret that statement as “it should continue to return the same values for the same input.” ↩︎
Obligatory ⊥ qualification. ↩︎
Or four-and-a-half laws, if you count
fmap ≡ (<*>) . pure↩︎
Or rather, given the history, in unison with