In this chapter, we are going to have a closer look at the computations we typically perform with container types: parameterized data types like `List`, `Maybe`, or `Identity`, holding zero or more values of the parameter's type. Many of these functions are recursive in nature, so we start with a discourse about recursion in general, and about tail recursion as an important optimization technique in particular. Most recursive functions in this part will describe pure iterations over lists.

It is recursive functions for which totality is hard to determine, so we will next have a quick look at the totality checker and learn when it will refuse to accept a function as total and what to do about it.

Finally, we will start looking for common patterns in the recursive functions from the first part and will eventually introduce a new interface for consuming container types: interface `Foldable`.
```idris
module Tutorial.Folds

import Data.List1
import Data.Maybe
import Data.Vect
import Debug.Trace

%default total
```
In this section, we are going to have a closer look at recursion in general and at tail recursion in particular.
Recursive functions are functions that call themselves to repeat a task or calculation until a certain aborting condition (called the *base case*) holds. Please note that it is recursive functions that make it hard to verify totality: non-recursive functions that are covering (they cover all possible cases in their pattern matches) are automatically total if they only invoke other total functions.
Here is an example of a recursive function: it generates a list of the given length, filling it with identical values:

```idris
replicateList : Nat -> a -> List a
replicateList 0     _ = []
replicateList (S k) x = x :: replicateList k x
```
As you can see (this module has the `%default total` pragma at the top), this function is provably total. Idris verifies that the `Nat` argument gets strictly smaller in each recursive call, and that therefore the function must eventually come to an end. Of course, we can do the same thing for `Vect`, where we can even show that the length of the resulting vector matches the given natural number:
```idris
replicateVect : (n : Nat) -> a -> Vect n a
replicateVect 0     _ = []
replicateVect (S k) x = x :: replicateVect k x
```
While we often use recursion to create values of data types like `List` or `Vect`, we also use recursion when we consume such values. For instance, here is a function for calculating the length of a list:
```idris
len : List a -> Nat
len []        = 0
len (_ :: xs) = 1 + len xs
```
Again, Idris can verify that `len` is total, as the list we pass in the recursive case is strictly smaller than the original list argument.
But when is a recursive function non-total? Here is an example: the following function creates a sequence of values until the given generation function (`gen`) returns a `Nothing`. Note how we use a state value (of generic type `s`) and use `gen` to calculate a value together with the next state:
```idris
covering
unfold : (gen : s -> Maybe (s,a)) -> s -> List a
unfold gen vs = case gen vs of
  Just (vs',va) => va :: unfold gen vs'
  Nothing       => []
```
With `unfold`, Idris can't verify that any of its arguments is converging towards the base case. It therefore rightfully refuses to accept that `unfold` is total. And indeed, the following function produces an infinite list (so please, don't try to inspect this at the REPL, as doing so will consume all your computer's memory):
```idris
fiboHelper : (Nat,Nat) -> ((Nat,Nat),Nat)
fiboHelper (f0,f1) = ((f1, f0 + f1), f0)

covering
fibonacci : List Nat
fibonacci = unfold (Just . fiboHelper) (1,1)
```
In order to safely create a (finite) sequence of Fibonacci numbers, we need to make sure the function generating the sequence will stop after a finite number of steps, for instance by limiting the length of the list:
```idris
unfoldTot : Nat -> (gen : s -> Maybe (s,a)) -> s -> List a
unfoldTot 0     _   _  = []
unfoldTot (S k) gen vs = case gen vs of
  Just (vs',va) => va :: unfoldTot k gen vs'
  Nothing       => []

fibonacciN : Nat -> List Nat
fibonacciN n = unfoldTot n (Just . fiboHelper) (1,1)
```
In order to demonstrate what tail recursion is about, we require the following `main` function:
```idris
main : IO ()
main = printLn . len $ replicateList 10000 10
```
If you have Node.js installed on your system, you might try the following experiment. Compile and run this module using the Node.js backend of Idris instead of the default Chez Scheme backend and run the resulting JavaScript source file with the Node.js binary:
```sh
$ idris2 --cg node -o test.js --find-ipkg src/Tutorial/Folds.md
$ node build/exec/test.js
```
Node.js will fail with the following error message and a lengthy stack trace: `RangeError: Maximum call stack size exceeded`.
What's going on here? How can it be that `main` fails with an exception although it is provably total?

First, remember that a function being total means that it will eventually produce a value of the given type in a finite amount of time, given enough resources like computer memory. Here, `main` hasn't been given enough resources, as Node.js has a very small size limit on its call stack. The call stack can be thought of as a stack data structure (first in, last out), where nested function calls are put. In case of recursive functions, the stack size increases by one with every recursive function call. In case of our `main` function, we create and consume a list of length 10'000, so the call stack will hold at least 10'000 function calls before they are being invoked and the stack's size is reduced again. This exceeds Node.js's stack size limit by far, hence the overflow error.
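The growth of the call stack can be visualized by unfolding a call to `len` by hand: every recursive invocation is suspended until the inner call returns, so the pending additions pile up, one stack frame per list element. A sketch of the evaluation (a hand-written trace, not code from this module):

```idris
-- len [1,2,3]
-- = 1 + len [2,3]
-- = 1 + (1 + len [3])
-- = 1 + (1 + (1 + len []))
-- = 1 + (1 + (1 + 0))
-- = 3
```

With a list of length 10'000, ten thousand such suspended frames are live at once before the first addition can be carried out.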
Now, before we look at a way to circumvent this issue, please note that this is a very serious and limiting source of bugs when using the JavaScript backends of Idris. In Idris, having no access to control structures like `for` or `while` loops, we always have to resort to recursion in order to describe iterative computations. Luckily (or should I say "unfortunately", since otherwise this issue would already have been addressed in all seriousness), the Scheme backends don't have this issue, as their stack size limit is much larger and they perform all kinds of optimizations internally to prevent the call stack from overflowing.
A recursive function is said to be *tail recursive* if all recursive calls occur in tail position: as the last function call in a (sub)expression. For instance, the following version of `len` is tail recursive:
```idris
lenOnto : Nat -> List a -> Nat
lenOnto k []        = k
lenOnto k (_ :: xs) = lenOnto (k + 1) xs
```
Compare this to `len` as defined above: there, the last function call is an invocation of operator `(+)`, and the recursive call happens in one of its arguments:

```idris
len (_ :: xs) = 1 + len xs
```
We can use `lenOnto` as a utility to implement a tail recursive version of `len` without the additional `Nat` argument:
```idris
lenTR : List a -> Nat
lenTR = lenOnto 0
```
This is a common pattern when writing tail recursive functions: we typically add an additional function argument for accumulating intermediary results, which is then passed on explicitly at each recursive call. For instance, here is a tail recursive version of `replicateList`:
```idris
replicateListTR : Nat -> a -> List a
replicateListTR n v = go Nil n
  where go : List a -> Nat -> List a
        go xs 0     = xs
        go xs (S k) = go (v :: xs) k
```
The big advantage of tail recursive functions is that they can easily be converted to efficient, imperative loops by the Idris compiler and are thus *stack safe*: recursive function calls are not added to the call stack, thus avoiding the dreaded stack overflow errors.
```idris
main1 : IO ()
main1 = printLn . lenTR $ replicateListTR 10000 10
```
We can again run `main1` using the Node.js backend. This time, we use slightly different syntax to execute a function other than `main` (remember: the dollar prefix is only there to distinguish a terminal command from its output; it is not part of the command you enter in a terminal session):
```sh
$ idris2 --cg node --exec main1 --find-ipkg src/Tutorial/Folds.md
10000
```
As you can see, this time the computation finished without overflowing the call stack.
Tail recursive functions are allowed to consist of (possibly nested) pattern matches, with recursive calls at tail position in several of the branches. Here is an example:
```idris
countTR : (a -> Bool) -> List a -> Nat
countTR p = go 0
  where go : Nat -> List a -> Nat
        go k []        = k
        go k (x :: xs) = case p x of
          True  => go (S k) xs
          False => go k xs
```
Note how each invocation of `go` is in tail position in its branch of the case expression.
It is sometimes convenient to implement several related functions, which call each other recursively. In Idris, unlike in many other programming languages, a function must be declared in a source file before it can be called by other functions, as in general a function's implementation must be available during type checking (because Idris has dependent types). There are two ways around this, which actually result in the same internal representation in the compiler. Our first option is to write down the functions' declarations first with the implementations following after. Here's a silly example:
```idris
even : Nat -> Bool

odd : Nat -> Bool

even 0     = True
even (S k) = odd k

odd 0     = False
odd (S k) = even k
```
As you can see, function `even` is allowed to call function `odd` in its implementation, since `odd` has already been declared (but not yet implemented).
If you're like me and want to keep declarations and implementations next to each other, you can introduce a `mutual` block, which has the same effect. Like with other code blocks, functions in a mutual block must all be indented by the same amount of whitespace:
```idris
mutual
  even' : Nat -> Bool
  even' 0     = True
  even' (S k) = odd' k

  odd' : Nat -> Bool
  odd' 0     = False
  odd' (S k) = even' k
```
Just like with single recursive functions, mutually recursive functions can be optimized to imperative loops if all recursive calls occur in tail position. This is the case with functions `even` and `odd`, as can again be verified at the Node.js backend:
```idris
main2 : IO ()
main2 =  printLn (even 100000)
      >> printLn (odd 100000)
```
```sh
$ idris2 --cg node --exec main2 --find-ipkg src/Tutorial/Folds.md
True
False
```
In this section, we learned about several important aspects of recursion and totality checking, which are summarized here:
- In pure functional programming, recursion is the way to implement iterative procedures.
- Recursive functions pass the totality checker if it can verify that one of the arguments is getting strictly smaller in every recursive function call.
- Arbitrary recursion can lead to stack overflow exceptions on backends with small stack size limits.
- The JavaScript backends of Idris perform mutual tail call optimization: tail recursive functions are converted to stack safe, imperative loops.
Note that not all Idris backends you will come across in the wild perform tail call optimization. Please check the corresponding documentation.
Note also that most recursive functions in the core libraries (*prelude* and *base*) do not yet make use of tail recursion. There is an important reason for this: in many cases, non-tail recursive functions are easier to use in compile-time proofs, as they unify more naturally than their tail recursive counterparts. Compile-time proofs are an important aspect of programming in Idris (as we will see in later chapters), so there is a compromise to be made between what performs well at runtime and what works well at compile time. Eventually, the way to go might be to provide two implementations for most recursive functions, with a transform rule telling the compiler to use the optimized version at runtime whenever programmers use the non-optimized version in their code. Such transform rules have, for instance, already been written for functions `pack` and `unpack` (which use `fastPack` and `fastUnpack` at runtime; see the corresponding rules in the following source file).
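As a small illustration of this point (the following names are mine, not from the tutorial or the Prelude): since `len (x :: xs)` reduces to `S (len xs)` by mere definitional unfolding, statements about `len` often hold by `Refl`, while the analogous statement about `lenTR` does not reduce this way:

```idris
-- `len` unfolds definitionally, so this proof holds by `Refl`:
lenCons : (x : a) -> (xs : List a) -> len (x :: xs) = S (len xs)
lenCons _ _ = Refl

-- No such proof by `Refl` exists for `lenTR`, since
-- `lenTR (x :: xs)` reduces to `lenOnto 1 xs` rather than
-- to `S (lenTR xs)`.
```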
In these exercises you are going to implement several recursive functions. Make sure to use tail recursion whenever possible and quickly verify the correct behavior of all functions at the REPL.
1. Implement functions `anyList` and `allList`, which return `True` if any element (or all elements, in case of `allList`) in a list fulfills the given predicate:

   ```idris
   anyList : (a -> Bool) -> List a -> Bool

   allList : (a -> Bool) -> List a -> Bool
   ```
2. Implement function `findList`, which returns the first value (if any) fulfilling the given predicate:

   ```idris
   findList : (a -> Bool) -> List a -> Maybe a
   ```
3. Implement function `collectList`, which returns the first value (if any) for which the given function returns a `Just`:

   ```idris
   collectList : (a -> Maybe b) -> List a -> Maybe b
   ```

   Implement `lookupList` in terms of `collectList`:

   ```idris
   lookupList : Eq a => a -> List (a,b) -> Maybe b
   ```
4. For functions like `map` or `filter`, which must loop over a list without affecting the order of elements, it is harder to write a tail recursive implementation. The safest way to do so is by using a `SnocList` (a reverse kind of list that's built from head to tail instead of from tail to head) to accumulate intermediate results. Its two constructors are `Lin` and `(:<)` (called the *snoc* operator). Module `Data.SnocList` exports two tail recursive operators called *fish* and *chips* (`(<><)` and `(<>>)`) for going from `SnocList` to `List` and vice versa. Have a look at the types of all new data constructors and operators before continuing with the exercise.

   Implement a tail recursive version of `map` for `List` by using a `SnocList` to reassemble the mapped list. Then use the chips operator with a `Nil` argument to convert the `SnocList` back to a `List` at the end.

   ```idris
   mapTR : (a -> b) -> List a -> List b
   ```
5. Implement a tail recursive version of `filter`, which only keeps those values in a list which fulfill the given predicate. Use the same technique as described in exercise 4.

   ```idris
   filterTR : (a -> Bool) -> List a -> List a
   ```
6. Implement a tail recursive version of `mapMaybe`, which only keeps those values in a list for which the given function argument returns a `Just`:

   ```idris
   mapMaybeTR : (a -> Maybe b) -> List a -> List b
   ```

   Implement `catMaybesTR` in terms of `mapMaybeTR`:

   ```idris
   catMaybesTR : List (Maybe a) -> List a
   ```
7. Implement a tail recursive version of list concatenation:

   ```idris
   concatTR : List a -> List a -> List a
   ```
8. Implement tail recursive versions of bind and `join` for `List`:

   ```idris
   bindTR : List a -> (a -> List b) -> List b

   joinTR : List (List a) -> List a
   ```
The totality checker in Idris verifies that at least one (possibly erased!) argument in a recursive call converges towards a base case. For instance, with natural numbers, if the base case is zero (corresponding to data constructor `Z`), and we continue with `k` after pattern matching on `S k`, Idris can derive from `Nat`'s constructors that `k` is strictly smaller than `S k`, and therefore the recursive call must converge towards the base case. Exactly the same reasoning is used when pattern matching on a list and continuing only with its tail in the recursive call.
While this works in many cases, it doesn't always go as expected. Below, I'll show you a couple of examples where totality checking fails, although we know that the functions in question are definitely total.

Idris doesn't know anything about the internal structure of primitive data types. So the following function, although obviously total, will not be accepted by the totality checker:
```idris
covering
replicatePrim : Bits32 -> a -> List a
replicatePrim 0 v = []
replicatePrim x v = v :: replicatePrim (x - 1) v
```
Unlike with natural numbers (`Nat`), which are defined as an inductive data type and are only converted to integer primitives during compilation, Idris can't tell that `x - 1` is strictly smaller than `x`, and so it fails to verify that this must converge towards the base case. (The reason is that `x - 1` is implemented in terms of primitive function `prim__sub_Bits32`, which is built into the compiler and must be implemented by each backend individually. The totality checker knows about data types, constructors, and functions defined in Idris, but not about (primitive) functions and foreign functions implemented at the backends. While it is theoretically possible to also define and use laws for primitive and foreign functions, this hasn't yet been done for most of them.)
Since non-totality is highly contagious (all functions invoking a partial function are themselves considered partial by the totality checker), there is utility function `assert_smaller`, which we can use to convince the totality checker and still annotate our functions with the `total` keyword:
```idris
replicatePrim' : Bits32 -> a -> List a
replicatePrim' 0 v = []
replicatePrim' x v = v :: replicatePrim' (assert_smaller x $ x - 1) v
```
Please note, though, that whenever you use `assert_smaller` to silence the totality checker, the burden of proving totality rests on your shoulders. Failing to do so can lead to arbitrary and unpredictable program behavior (which is the default with most other programming languages).
Below - as a demonstration - is a simple proof of `Void`. `Void` is an uninhabited type: a type with no values. Proving `Void` means that we implement a function, accepted by the totality checker, which returns a value of type `Void`, although this is supposed to be impossible as there is no such value. Doing so allows us to completely disable the type system together with all the guarantees it provides. Here's the code and its dire consequences:
```idris
-- In order to prove `Void`, we just loop forever, using
-- `assert_smaller` to silence the totality checker.
proofOfVoid : Bits8 -> Void
proofOfVoid n = proofOfVoid (assert_smaller n n)

-- From a value of type `Void`, anything follows!
-- This function is safe and total, as there is no
-- value of type `Void`!
exFalsoQuodlibet : Void -> a
exFalsoQuodlibet _ impossible

-- By passing our proof of void to `exFalsoQuodlibet`
-- (exported by the *Prelude* by the name of `void`), we
-- can coerce any value to a value of any other type.
-- This renders type checking completely useless, as
-- we can freely convert between values of different
-- types.
coerce : a -> b
coerce _ = exFalsoQuodlibet (proofOfVoid 0)

-- Finally, we invoke `putStrLn` with a number instead
-- of a string. `coerce` allows us to do just that.
pain : IO ()
pain = putStrLn $ coerce 0
```
Please take a moment to marvel at the provably total function `coerce`: it claims to convert any value to a value of any other type. And it is completely safe, as it only uses total functions in its implementation. The problem is - of course - that `proofOfVoid` should never ever have been a total function.

In `pain` we use `coerce` to conjure a string from an integer. In the end, we get what we deserve: the program crashes with an error. While things could have been much worse, it can still be quite time consuming and annoying to localize the source of such an error.
```sh
$ idris2 --cg node --exec pain --find-ipkg src/Tutorial/Folds.md
ERROR: No clauses
```
So, with a single thoughtless placement of `assert_smaller` we wrought havoc within our pure and total codebase, sacrificing totality and type safety in one fell swoop. Therefore: use at your own risk!
Note: I do not expect you to understand all the dark magic at work in the code above. I'll explain the details in due time in another chapter.
Second note: *Ex falso quodlibet*, also called the principle of explosion, is a law in classical logic: from a contradiction, any statement can be proven. In our case, the contradiction was our proof of `Void`: the claim that we wrote a total function producing such a value, although `Void` is an uninhabited type. You can verify this by inspecting `Void` at the REPL with `:doc Void`: it has no data constructors.
Below is an implementation of a rose tree. Rose trees can represent search paths in computer algorithms, for instance in graph theory.
```idris
record Tree a where
  constructor Node
  value  : a
  forest : List (Tree a)

Forest : Type -> Type
Forest = List . Tree
```
We could try and compute the size of such a tree as follows:
```idris
covering
size : Tree a -> Nat
size (Node _ forest) = S . sum $ map size forest
```
In the code above, the recursive call happens within `map`. We know that we are using only subtrees in the recursive calls (since we know how `map` is implemented for `List`), but Idris can't know this (teaching a totality checker how to figure this out on its own seems to be an open research question). So it will refuse to accept the function as being total.
There are two ways to handle the case above. If we don't mind writing a bit of otherwise unneeded boilerplate code, we can use explicit recursion. In fact, since we often also work with search forests, this is the preferable way here.
```idris
mutual
  treeSize : Tree a -> Nat
  treeSize (Node _ forest) = S $ forestSize forest

  forestSize : Forest a -> Nat
  forestSize []        = 0
  forestSize (x :: xs) = treeSize x + forestSize xs
```
In the case above, Idris can verify that we don't blow up our trees behind its back as we are explicit about what happens in each recursive step. This is the safe, preferable way of going about this, especially if you are new to the language and totality checking in general.
However, sometimes the solution presented above is just too cumbersome to write. For instance, here is an implementation of `Show` for rose trees:
```idris
Show a => Show (Tree a) where
  showPrec p (Node v ts) =
    assert_total $ showCon p "Node" (showArg v ++ showArg ts)
```
In this case, we'd have to manually reimplement `Show` for lists of trees: a tedious task, and error-prone on its own. Instead, we resort to using the mighty sledgehammer of totality checking: `assert_total`. Needless to say, this comes with the same risks as `assert_smaller`, so be very careful.
Implement the following functions in a provably total way without "cheating". Note: It is not necessary to implement these in a tail recursive way.
1. Implement function `depth` for rose trees. It should return the maximal number of `Node` constructors from the current node to the farthest child node. For instance, the current node should be at depth one, all its direct child nodes at depth two, their immediate child nodes at depth three, and so on.

2. Implement interface `Eq` for rose trees.

3. Implement interface `Functor` for rose trees.

4. For the fun of it: implement interface `Show` for rose trees.

5. In order not to forget how to program with dependent types, implement function `treeToVect` for converting a rose tree to a vector of the correct size.

   Hint: Make sure to follow the same recursion scheme as in the implementation of `treeSize`. Otherwise, this might be very hard to get to work.
When looking back at all the exercises we solved in the section about recursion, most tail recursive functions on lists were of the following pattern: Iterate over all list elements from head to tail while passing along some state for accumulating intermediate results. At the end of the list, return the final state or convert it with an additional function call.
This is functional programming, and we'd like to abstract over such recurring patterns. In order to tail recursively iterate over a list, all we need is an accumulator function and some initial state. But what should be the type of the accumulator? Well, it combines the current state with the list's next element and returns an updated state: `state -> elem -> state`. Surely, we can come up with a higher-order function to encapsulate this behavior:
```idris
leftFold : (acc : state -> el -> state) -> (st : state) -> List el -> state
leftFold _   st []        = st
leftFold acc st (x :: xs) = leftFold acc (acc st x) xs
```
We call this function a *left fold*, as it iterates over the list from left to right (head to tail), collapsing (or *folding*) the list until just a single value remains. This new value might still be a list or other container type, but the original list has been consumed from head to tail. Note how `leftFold` is tail recursive, and therefore all functions implemented in terms of `leftFold` are tail recursive (and thus stack safe!) as well.

Here are a few examples:
```idris
sumLF : Num a => List a -> a
sumLF = leftFold (+) 0

reverseLF : List a -> List a
reverseLF = leftFold (flip (::)) Nil

-- this is more natural than `reverseLF`!
toSnocListLF : List a -> SnocList a
toSnocListLF = leftFold (:<) Lin
```
The example functions we implemented in terms of `leftFold` always had to traverse the whole list, as every single element was required to compute the result. This is not always necessary, however. For instance, if you look at `findList` from the exercises, we could abort iterating over the list as soon as our search was successful. It is not possible to implement this more efficient behavior in terms of `leftFold`: there, the result will only be returned when our pattern match reaches the `Nil` case.
Interestingly, there is another, non-tail recursive fold, which reflects the list structure more naturally and which we can use for breaking out early from an iteration. We call this a *right fold*. Here is its implementation:
```idris
rightFold : (acc : el -> state -> state) -> state -> List el -> state
rightFold acc st []        = st
rightFold acc st (x :: xs) = acc x (rightFold acc st xs)
```
Now, it might not be immediately obvious how this differs from `leftFold`. In order to see this, we will first have to talk about lazy evaluation.
For some computations, it is not necessary to evaluate all function arguments in order to return a result. For instance, consider boolean operator `(&&)`: if the first argument evaluates to `False`, we already know that the result is `False` without even looking at the second argument. In such a case, we don't want to unnecessarily evaluate the second argument, as this might include a lengthy computation. Consider the following REPL session:
```repl
Tutorial.Folds> False && (length [1..10000000000] > 100)
False
```
If the second argument were evaluated, this computation would most certainly blow up your computer's memory, or at least take a very long time to run to completion. However, in this case, the result `False` is printed immediately. If you look at the type of `(&&)`, you'll see the following:
```repl
Tutorial.Folds> :t (&&)
Prelude.&& : Bool -> Lazy Bool -> Bool
```
As you can see, the second argument is wrapped in a `Lazy` type constructor. This is a built-in type, and the details are handled by Idris automatically most of the time. For instance, when passing arguments to `(&&)`, we don't have to manually wrap the values in some data constructor.
A lazy function argument will only be evaluated at the moment it is required in the function's implementation, for instance, because it is being pattern matched on, or because it is being passed as a strict argument to another function. In the implementation of `(&&)`, the pattern match happens on the first argument, so the second will only be evaluated if the first argument is `True` and the second is returned as the function's (strict) result.
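The same mechanism can be reproduced in a definition of our own. Here is a hypothetical `lazyOr` mirroring `(&&)` (the name is mine, not from the Prelude): since we pattern match only on the first argument, the lazy second argument is never evaluated when the first one is `True`:

```idris
lazyOr : Bool -> Lazy Bool -> Bool
lazyOr True  _ = True  -- the second argument is never evaluated
lazyOr False b = b     -- forced here, as it is returned as the strict result
```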
There are two utility functions for working with lazy evaluation: function `delay` wraps a value in the `Lazy` data type. Note that the argument of `delay` is strict, so the following might take several seconds to print its result:
```repl
Tutorial.Folds> False && (delay $ length [1..10000] > 100)
False
```
In addition, there is function `force`, which forces evaluation of a `Lazy` value.
We will now learn how to make use of `rightFold` and lazy evaluation to implement folds which can break out of an iteration early. Note that in the implementation of `rightFold` the result of folding over the remainder of the list is passed as an argument to the accumulator (instead of the result of invoking the accumulator being used in the recursive call):

```idris
rightFold acc st (x :: xs) = acc x (rightFold acc st xs)
```
If the second argument of `acc` were lazily evaluated, it would be possible to abort the computation of `acc`'s result without having to iterate to the end of the list:
```idris
foldHead : List a -> Maybe a
foldHead = force . rightFold first Nothing
  where first : a -> Lazy (Maybe a) -> Lazy (Maybe a)
        first v _ = Just v
```
Note how Idris takes care of the bookkeeping of laziness most of the time. (It doesn't handle the curried invocation of `rightFold` correctly, though, so we either must pass on the list argument of `foldHead` explicitly, or compose the curried function with `force` to get the types right.)
In order to verify that this works correctly, we need a debugging utility called `trace` from module `Debug.Trace`. This "function" allows us to print debugging messages to the console at certain points in our pure code. Please note that this is for debugging purposes only and should never be left lying around in production code, as, strictly speaking, printing stuff to the console breaks referential transparency.
Here is an adjusted version of `foldHead`, which prints "folded" to standard output every time utility function `first` is invoked:
```idris
foldHeadTraced : List a -> Maybe a
foldHeadTraced = force . rightFold first Nothing
  where first : a -> Lazy (Maybe a) -> Lazy (Maybe a)
        first v _ = trace "folded" (Just v)
```
In order to test this at the REPL, we need to know that `trace` uses `unsafePerformIO` internally and therefore will not reduce during evaluation. We have to resort to the `:exec` command to see this in action at the REPL:
```repl
Tutorial.Folds> :exec printLn $ foldHeadTraced [1..10]
folded
Just 1
```
As you can see, although the list holds ten elements, `first` is only called once, resulting in a considerable increase in efficiency.
Let's see what happens if we change the implementation of `first` to use strict evaluation:
```idris
foldHeadTracedStrict : List a -> Maybe a
foldHeadTracedStrict = rightFold first Nothing
  where first : a -> Maybe a -> Maybe a
        first v _ = trace "folded" (Just v)
```
Although we don't use the second argument in the implementation of `first`, it is still being evaluated before evaluating the body of `first`, because Idris - unlike Haskell! - defaults to strict semantics. Here's how this behaves at the REPL:
```repl
Tutorial.Folds> :exec printLn $ foldHeadTracedStrict [1..10]
folded
folded
folded
folded
folded
folded
folded
folded
folded
folded
Just 1
```
While this technique can sometimes lead to very elegant code, always remember that `rightFold` is not stack safe in the general case. So, unless your accumulator is guaranteed to return a result after only a few iterations, consider implementing your function tail recursively with an explicit pattern match. Your code will be slightly more verbose, but with the guaranteed benefit of stack safety.
Left and right folds share a common pattern: in both cases, we start with an initial state value and use an accumulator function for combining the current state with the current element. This principle of combining values after starting from an initial value lies at the heart of an interface we've already learned about: `Monoid`. It therefore makes sense to fold a list over a monoid:
```idris
foldMapList : Monoid m => (a -> m) -> List a -> m
foldMapList f = leftFold (\vm,va => vm <+> f va) neutral
```
Note how, with `foldMapList`, we no longer need to pass an accumulator function. All we need is a conversion from the element type to a type with an implementation of `Monoid`. As we have already seen in the chapter about interfaces, there are many monoids in functional programming, and therefore, `foldMapList` is an incredibly useful function.
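For instance, using the `Monoid` implementation for `String` from the Prelude (concatenation, with the empty string as neutral element), a hypothetical helper for rendering all elements of a list is a one-liner:

```idris
showAll : Show a => List a -> String
showAll = foldMapList show
```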
We could make this even shorter: If the elements in our list already are of a type with a monoid implementation, we don't even need a conversion function to collapse the list:
```idris
concatList : Monoid m => List m -> m
concatList = foldMapList id
```
And here we are, finally, looking at a large pile of utility functions, all dealing in some way with the concept of collapsing (or folding) a list of values into a single result. But all of these folding functions are just as useful when working with vectors, with non-empty lists, with rose trees, even with single-value containers like `Maybe`, `Either e`, or `Identity`. Heck, for the sake of completeness, they are even useful when working with zero-value containers like `Control.Applicative.Const e`!

And since there are so many of these functions, we'd better look out for an essential set of them in terms of which we can implement all the others, and wrap up the whole bunch in an interface. This interface is called `Foldable`, and is available from the `Prelude`. When you look at its definition in the REPL (`:doc Foldable`), you'll see that it consists of six essential functions:
- `foldr`, for folds from the right
- `foldl`, for folds from the left
- `null`, for testing if the container is empty or not
- `foldlM`, for effectful folds in a monad
- `toList`, for converting the container to a list of values
- `foldMap`, for folding over a monoid
For a minimal implementation of `Foldable`, it is sufficient to implement only `foldr`. However, consider implementing all six functions manually, because folds over container types are often performance critical operations, and each of them should be optimized accordingly. For instance, implementing `toList` in terms of `foldr` for `List` makes no sense, as this is a non-tail recursive function running in linear time complexity, while a hand-written implementation can just return its argument without any modifications.
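As a sketch of what such a hand-written implementation can look like, here is a hypothetical single-value wrapper (the type and all names below are mine, not from the Prelude), with all six functions implemented directly instead of relying on the defaults derived from `foldr`:

```idris
record Wrap a where
  constructor MkWrap
  unwrap : a

Foldable Wrap where
  foldr f st (MkWrap v)  = f v st
  foldl f st (MkWrap v)  = f st v
  null _                 = False
  foldlM f st (MkWrap v) = f st v
  toList (MkWrap v)      = [v]
  foldMap f (MkWrap v)   = f v
```

Each function runs in constant time here, while the default implementations would take a detour through `foldr`.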
In these exercises, you are going to implement `Foldable` for different data types. Make sure to try and implement all six functions of the interface manually.
1. Implement `Foldable` for `Crud i`:

   ```idris
   data Crud : (i : Type) -> (a : Type) -> Type where
     Create : (value : a) -> Crud i a
     Update : (id : i) -> (value : a) -> Crud i a
     Read   : (id : i) -> Crud i a
     Delete : (id : i) -> Crud i a
   ```
2. Implement `Foldable` for `Response e i`:

   ```idris
   data Response : (e, i, a : Type) -> Type where
     Created : (id : i) -> (value : a) -> Response e i a
     Updated : (id : i) -> (value : a) -> Response e i a
     Found   : (values : List a) -> Response e i a
     Deleted : (id : i) -> Response e i a
     Error   : (err : e) -> Response e i a
   ```
3. Implement `Foldable` for `List01`. Use tail recursion in the implementations of `toList`, `foldMap`, and `foldl`.

   ```idris
   data List01 : (nonEmpty : Bool) -> Type -> Type where
     Nil  : List01 False a
     (::) : a -> List01 False a -> List01 ne a
   ```
4. Implement `Foldable` for `Tree`. There is no need to use tail recursion in your implementations, but your functions must be accepted by the totality checker, and you are not allowed to cheat by using `assert_smaller` or `assert_total`.

   Hint: You can test the correct behavior of your implementations by running the same folds on the result of `treeToVect` and verifying that the outcome is the same.

5. Like `Functor` and `Applicative`, `Foldable` composes: the product and composition of two foldable container types are again foldable container types. Prove this by implementing `Foldable` for `Comp` and `Product`:

   ```idris
   record Comp (f,g : Type -> Type) (a : Type) where
     constructor MkComp
     unComp : f (g a)

   record Product (f,g : Type -> Type) (a : Type) where
     constructor MkProduct
     fst : f a
     snd : g a
   ```
We learned a lot about recursion, totality checking, and folds in this chapter, all of which are important concepts in pure functional programming in general. Wrapping one's head around recursion takes time and experience. Therefore - as usual - try to solve as many exercises as you can.
In the next chapter, we take the concept of iterating over container types one step further and look at effectful data traversals.