Haskell Functors in Detail: An in-depth tutorial/reference about functors

 In this hands-on tutorial/reference, I will be explaining Functors and all of its major instances:

  • Either
  • List
  • Maybe
  • IO
  • Function ((->) r)
  • Tuple ((,) a))

Functor is the easiest of the three big topics: FunctorApplicative, and Monad. Let's get started.
I assume you know the map function. For example:
ghci> map (+1) [1 .. 5]
[2, 3, 4, 5, 6]
All it did, is take every element from the list [1 .. 5], and replace it with (+1) applied to that element. Pretty simple right? Now try replacing map with fmap like so: fmap (+1) [1 .. 5]. You should get  [2,3,4,5,6] just as before. So what is the use of fmap when we already have map? Now try this (this uses Maybe; if you don't know what it is, looking at the documentation might help):
ghci> map (+1) (Just 10)
I'm sure many of you will expect this to raise an error: it does. You should get an error saying that it expected a [b], but got a Maybe Int-- and it is totally justified.
 
Now, let's try the same thing with fmap:
ghci> fmap (+1) (Just 10)
Just 11
Woah! What happened? Well, all fmap did is apply (+1) to the 10 in Just 10. That's it. So, fmap is a sort of generalization of map: it can be applied to more types other than a list. Let's compare map and fmap's types:
map :: (a -> b) -> [a] -> [b]
fmap :: Functor f => (a -> b) -> f a -> f b
You need to know typeclasses for this part! If you don't know what they are or need a refresher, look here. If you ignore the Functor f part, you'll notice instant similarities:
  • map takes a function of type (a -> b) as the first argument; fmap does too!
  • map takes a structure (a list in this case) of type a; fmap also takes a structure (the structure has to be an instance of Functor) of type a too!
  • map returns a structure (a list in this case) of type bfmap also takes a structure (the structure has to be an instance of Functor) of type b too!
If the last two points don't make too much sense, don't fret, it'll all make sense in a minute. map's type signature can be rewritten as so:
map :: (a -> b) -> [] a -> [] b
This is because [] is a data constructor. Now, what if we add Functor f as the type constraint, and replace [] with f? We get fmap's type signature! All fmap does, is take a function of type (a -> b), and an instance of Functor, returning an instance of Functor. Let's take a look at the Functor typeclass and its main instances:
class Functor (f :: * -> *) where
  fmap :: (a -> b) -> f a -> f b
  (<$) :: a -> f b -> f a

instance Functor (Either a) -- Defined in `Data.Either'
instance Functor [] -- Defined in `GHC.Base'
instance Functor Maybe -- Defined in `GHC.Base'
instance Functor IO -- Defined in `GHC.Base'
instance Functor ((->) r) -- Defined in `GHC.Base'
instance Functor ((,) a) -- Defined in `GHC.Base'
We'll ignore (<$) for now: we'll get back to it later. We'll also ignore (f :: * -> *) for now; we'll get back to it in a moment. If you know what that is ahead of time (unless you already know what that is), read up on kinds and higher-kinded types. Now, we'll take a closer look at each of those instances.
 

Either

If you don't know what Either is, looking at the documentation could be helpful. Let us start off by looking at the instance declaration:
instance Functor (Either a) where
    fmap f (Right x) = Right (f x)
    fmap f (Left x) = Left x
Pretty simple right? Not really. If you take a closer look, you'll notice that the instance of Functor we are making is not for Either, but Either a. This has to do with higher kinded types. I'll explain it a bit. Either a is a partially applied data constructor. This is required because the (f :: * -> *) we saw previously means that every instance of Functor has to only take in one argument. The problem is Either takes in two arguments, a and b. So we are partially applying Either to a (getting Either a), so that Either takes in only one argument: b. This was a very brief explanation, so it would be helpful to read up on kinds.
 
Now that is out of the way, we can focus on the real stuff. Try running this in ghci:
ghci> fmap (+1) (Right 10)
Right 11
ghci> fmap (+1) (Left "error!")
Left "error!"
Let us ignore the last one for now. Remember the fmap declaration for Right? Here it is:
fmap f (Right x) = Right (f x)
This is pretty straight-forward. All it does is apply (+1) to the 10 in Right 10. Now let's look at the last one. Again, we'll take a look at its definition:
fmap f (Left x) = Left x
As you can see, the function is never applied to the value in Left! Why would they do this? They do this for a number of reasons, but the most relevant is the fact that Either is usually used to return errors. Right is the result if the function succeded (funny, right?), and Left is the error, usually explaining why the error happened. So if an error is returned using Left, you wouldn't want to apply the function to the error! That concludes the explanation for the Either functor!

List

This is trivial. fmap for lists is the exact same as map. If we try to do this over an empty list, we'll get an empty list back.

Maybe

If you don't know what Maybe is, looking at the documentation might help. We've previously seen a demonstration of the Maybe functor:
ghci> fmap (+1) (Just 10)
Just 11
As we've stated, the only thing that is happening is (+1) is being applied to the 10 in Just 10. Now let's try doing the same thing to Nothing:
ghci> fmap (+1) Nothing
Nothing
As expected, we get Nothing back because there is nothing to run (+1) on.

IO

Now we're getting to the interesting stuff. This is not as simple as the previous ones, so you should probably read up on monads (if you haven't already). Here is the instance declaration:
instance Functor IO where
    fmap f x = x >>= (pure . f)
If you have a good understanding of monads, this should make a lot of sense. If not, I'll explain. x is something wrapped in IO (so something like IO Int). What >>= does is it takes the Int in IO Int and passes it to (pure . f). So what does (pure . f) do? It applies f to the argument passed, then wraps it up in IO (if f returns an Int, then (pure . f) returns an IO Int). Let's see an example:
ghci> fmap (++ "!!!") getLine
foobar
"foobar!!!"
(++ "!!!") takes a String, and returns it with !!! appended to it. getLine reads from stdin (like Python's input, or C++'s std::cin), and returns an IO String. So the value inside of getLine ( getLine returns IO String, so the value is the String) is passed to (++ "!!!"). Then, it is wrapped back up in IO, and returned as an IO String. Pretty simple when you think about it.

Function ((->) r)

This may seem complicated but in reality, it is extremely simple; In fact, you've been doing this for quite some time now! Let's specialize fmap's type signature to ((->) r):
fmap :: (a -> b) -> (r -> a) -> r -> b
Looks familiar? It should because it is the exact same as the composition function (.):
(.) :: (b -> c) -> (a -> b) -> a -> c
This means that fmap (+1) (+5) is the same as (+1) . (+5)!

Tuple ((,) a)

Here is another interesting one. Let's look at the instance declaration:
instance Functor ((,) a) where
    fmap f (x,y) = (x, f y)
As you can see, (,) is partially applied, just like Either was. Again, this is because (,) takes in two arguments instead of the required one argument. A consequence of this is the function can only be applied to the second item in the tuple. Let's see an example:
ghci> fmap (+1) (5, 10)
(5,11)
As you can see (+1) was only applied to the second element on (5, 10), hence (5, 11).

Bonus!

Now we are going to talk about a few cool Functor stuff. Remember the <$ operator? If you don't here it is (where f is constrained by Functor):
(<$) :: a -> f b -> f a
So what does this do? It takes two arguments, one of type a, and an instance of Functor applied to type b. Let's see a few examples:
ghci> 10 <$ [1 .. 5]
[10,10,10,10,10]
ghci> 10 <$ Just 'a'
Just 10
As we can see, everything inside the structure is replaced with the first argument to <$. It does not matter what type is inside the structure, it is still replaced with the first argument. <$ can be implemented in terms of fmap:
(<$) = fmap . const
This is really straightforward, but let me explain. Let's reuse 10 <$ [1 .. 5] as our example. Starting off, 10 is replaced with const 10 (const takes in two arguments; no matter what the second argument is, it returns the first one). After that const 10 is used as the mapping function. So 10 <$ [1 .. 5] could have be rewritten as:
ghci> fmap (const 10) [1 .. 5]
[10,10,10,10,10]
Next up, we have an operator for fmap! It is the <$> operator (looks a lot like the <$ operator). This means this:
ghci> fmap (+1) (Just 10)
Just 11
Could be replaced with this:
ghci> (+1) <$> (Just 10)
Just 11
How cool is that?

Functor Laws

Now, we will discuss the functor laws. They are very intuitive, so I won't have to say much. This is pretty important in order to have a solid understanding of Functors, and write good code with them. Functors obey two laws:
  1. Identity morphisms: This is trivial. fmap id will return the exact same thing as id
  2. Composition of morphisms: Also very simple. All it says is that fmap f . fmap g will return the exact same thing as fmap (f . g)

Conclusion

I hope you now know a thing or two (or a ton) about Functors! Thanks a lot for reading this post!

Popular posts from this blog

Haskell: GHC's error messages explained