In this section I'm going to show how to organize, install, and depend on larger Idris projects. We will have a look at Idris packages, the module system, visibility of types and functions, writing comments and doc strings, and using pack for managing our libraries.
This section should be useful for all readers who have already
written a bit of Idris code. We will not do any fancy type level
wizardry in here, but I'll demonstrate several concepts using
failing
code blocks, which you might not have seen before.
This rather new addition to the language
allows us to write code that is expected to fail during elaboration
(type checking). For instance:
failing "Can't find an implementation for FromString Bits8."
ohno : Bits8
ohno = "Oh no!"
As part of a failing block, we can give a substring of the compiler's error message for documentation purposes and to make sure the block fails with the expected error.
Every Idris source file defines a module, typically starting with a module header like the one below:
module Appendices.Projects
A module's name consists of several upper case identifiers separated
by dots, which must reflect the path of the .idr
file where the
module is stored. For instance, this module is stored in file
Appendices/Projects.md
, so the module's name is Appendices.Projects
.
"But wait!", I hear you say, "What about the parent folder(s) of Appendices
?
Why aren't those part of the module's name?" In order to understand this,
we must talk about the concept of the source directory. The source directory
is where Idris is looking for source files. It defaults to the
directory, from which the Idris executable is run. For instance, when
in folder src
of this project, you can open this source file like so:
idris2 Appendices/Projects.md
This will not work, however, if you try the same thing from this project's root folder:
$ idris2 src/Appendices/Projects.md
...
Error: Module name Appendices.Projects does not match file name "src/Appendices/Projects.md"
...
So, which folder names to include in a module name depends on the
parent folder we consider to be our source directory. It is common
practice to name the source directory src
, although this is not
mandatory (as I said above, the default is actually the directory,
from which we run Idris). It is possible to change the source directory
with the --source-dir
command-line option. The following works from
within this project's root directory:
idris2 --source-dir src src/Appendices/Projects.md
And the following would work from a parent directory
(assuming this tutorial is stored in folder tutorial
):
idris2 --source-dir tutorial/src tutorial/src/Appendices/Projects.md
Most of the time, however, you will specify an .ipkg
file for
your project (see later in this section) and define the source
directory there. Afterwards, you can use pack (instead of the idris2
executable) to start REPL sessions and load your source files.
You often need to import functions and data types from other
modules when writing Idris code. This can be done with an
import
statement. Here are several examples showing
how these might look like:
import Data.String
import Data.List
import Text.CSV
import public Appendices.Neovim
import Data.Vect as V
import public Data.List1 as L
The first two lines import modules from another package (we will learn
about packages below): Data.List
from the base package, which
will be installed as part of your Idris installation.
The second line imports module Text.CSV
from within our own source
directory src
. It is always possible to import modules that are part
of the same source directory as the file we are working on.
The third line imports module Appendices.Neovim
, again from our
own source directory. Note, however, that this import
statement comes
with an additional public
keyword. This allows us to re-export
a module, so that it is available from within other modules in addition
to the current module: If another module imports Appendices.Projects
,
module Appendices.Neovim
will be imported as well without the need
of an additional import
statement. This is useful when
we split some complex functionality across different modules and
want to import the lot via a single catch-all module
See module Control.Monad.State
in base for an example. You
can look at the Idris sources on GitHub or locally after cloning
the Idris2 project.
The base library can be found in the libs/base
subfolder.
It often happens that in order to make use of functions from some module
A
we also require utilities from another module B
, so A
should
re-export B
. For instance, Data.Vect
in base re-exports Data.Fin
,
because the latter is often required when working with vectors.
The fourth line imports module Data.Vect
, giving it a new name V
, to
be used as a shorter prefix. If you often need to disambiguate identifiers
by prefixing them with a module's name, this can help making your code
more concise:
vectSum : Nat
vectSum = sum $ V.fromList [1..10]
Finally, on the fifth line we publicly import a module and give it
a new name. This name will then be the one seen when we transitively
import Data.List1
via Appendices.Projects
. To see this, start
a REPL session (after type checking the tutorial)
without loading a source file from this project's root folder:
pack typecheck tutorial
pack repl
Now load module Appendices.Projects
and checkout the type
of singleton
:
Main> :module Appendices.Projects
Imported module Appendices.Projects
Main> :t singleton
Data.String.singleton : Char -> String
Data.List.singleton : a -> List a
L.singleton : a -> List1 a
As you can see, the List1
version of singleton
is now prefixed
with L
instead of Data.List1
. It is still possible to use the
"official" prefix, though:
Main> List1.singleton 12
12 ::: []
Main> L.singleton 12
12 ::: []
At times, we want to define several functions or data types
with the same name in a single module. Idris does not allow this,
because every name must be unique in its namespace, and the
namespace of a module is just the fully qualified module name.
However, it is possible to define additional namespaces within
a module by using the namespace
keyword followed by the name
of the namespace. All functions which should belong to this
namespace must then be indented by the same amount of whitespace.
Here's an example:
data HList : List Type -> Type where
Nil : HList []
(::) : (v : t) -> (vs : HList ts) -> HList (t :: ts)
head : HList (t :: ts) -> t
head (v :: _) = v
tail : HList (t :: ts) -> HList ts
tail (_ :: vs) = vs
namespace HVect
public export
data HVect : Vect n Type -> Type where
Nil : HVect []
(::) : (v : t) -> (vs : HVect ts) -> HVect (t :: ts)
public export
head : HVect (t :: ts) -> t
head (v :: _) = v
public export
tail : HVect (t :: ts) -> HVect ts
tail (_ :: vs) = vs
Function names HVect.head
and HVect.tail
as well as constructors
HVect.Nil
and HVect.(::)
would clash with functions and constructors
of the same names from the outer namespace (Appendices.Projects
), so
we had to put them in their own namespace. In order to be able to use
them from outside their namespace, they need to be exported (see the
section on visibility below). In case we need to disambiguate between
these names, we can prefix them with part of their namespace. For instance,
the following fails with a disambiguation error, because there
are several functions called head
in scope and it is not clear
from head
's argument (some data type supporting list syntax,
of which again several are in scope), which version we want:
failing "Ambiguous elaboration."
whatHead : Nat
whatHead = head [12,"foo"]
By prefixing head
with part of its namespace, we can resolve both
ambiguities. It is now immediately clear, that [12,"foo"]
must be
an HVect
, because that's the type of HVect.head
's argument:
thisHead : Nat
thisHead = HVect.head [12,"foo"]
In the following subsection I'll make use of namespaces to demonstrate the principles of visibility.
In order to use functions and data types outside of the module
or namespace they were defined in, we need to change
their visibility. The default visibility is private
:
Such a function or data type is not visible from outside
its module or namespace:
namespace Foo
foo : Nat
foo = 12
failing "Name Appendices.Projects.Foo.foo is private."
bar : Nat
bar = 2 * foo
To make a function visible, annotate it with the export
keyword:
namespace Square
export
square : Num a => a -> a
square v = v * v
This will allow us to invoke function square
from within
other modules or namespaces (after importing Appendices.Projects
):
OneHundred : Bits8
OneHundred = square 10
However, the implementation of square
will not be exported,
so square
will not reduce during elaboration:
failing "Can't solve constraint between: 100 and square 10."
checkOneHundred : OneHundred === 100
checkOneHundred = Refl
For this to work, we need to publicly export square
:
namespace SquarePub
public export
squarePub : Num a => a -> a
squarePub v = v * v
OneHundredAgain : Bits8
OneHundredAgain = squarePub 10
checkOneHundredAgain : OneHundredAgain === 100
checkOneHundredAgain = Refl
Therefore, if you need a function to reduce during elaboration,
annotate it with public export
instead of export
.
This is especially important if you use a function to compute
a type. Such function's must reduce during elaboration, otherwise they
are completely useless:
namespace Stupid
export
0 NatOrString : Type
NatOrString = Either String Nat
failing "Can't solve constraint between: Either String ?b and NatOrString."
natOrString : NatOrString
natOrString = Left "foo"
If we publicly export our type alias, everything type checks fine:
namespace Better
public export
0 NatOrString : Type
NatOrString = Either String Nat
natOrString : Better.NatOrString
natOrString = Left "bar"
Visibility of data types behaves slightly differently. If set to
private
(the default), neither the type constructor nor
the data constructors are visible outside of the namespace
they where defined in. If annotated with export
,
the type constructor is exported but not the data constructors:
namespace Export
export
data Foo : Type where
Foo1 : String -> Foo
Foo2 : Nat -> Foo
export
mkFoo1 : String -> Export.Foo
mkFoo1 = Foo1
foo1 : Export.Foo
foo1 = mkFoo1 "foo"
As you can see, we can use the type Foo
as well as
function mkFoo1
outside of namespace Export
. However,
we cannot use the Foo1
constructor to create a value
of type Foo
directly:
failing "Export.Foo1 is private."
foo : Export.Foo
foo = Foo1 "foo"
This changes when we publicly export the data type:
namespace PublicExport
public export
data Foo : Type where
Foo1 : String -> PublicExport.Foo
Foo2 : Nat -> PublicExport.Foo
foo2 : PublicExport.Foo
foo2 = Foo2 12
The same goes for interfaces: If they are publicly exported, the interface (a type constructor) plus all its functions are exported and you can write implementations outside the namespace where they where defined:
namespace PEI
public export
interface Sized a where
size : a -> Nat
Sized Nat where size = id
sumSizes : Foldable t => Sized a => t a -> Nat
sumSizes = foldl (\n,e => n + size e) 0
If they are not publicly exported, you will not be able to write implementations outside the namespace they were defined in (but you can still use the type and its functions in your code):
namespace EI
export
interface Empty a where
empty : a -> Bool
export
Empty (List a) where
empty [] = True
empty _ = False
failing
Empty Nat where
empty Z = True
empty (S _) = False
nonEmpty : Empty a => a -> Bool
nonEmpty = not . empty
Sometimes, it is necessary to access a private function in another module or namespace. This is possible from within child namespaces (for want of a better name): Modules and namespaces sharing the parent module's or namespace's prefix. For instance:
namespace Inner
testEmpty : Bool
testEmpty = nonEmpty (the (List Nat) [12])
As you can see, we can access function nonEmpty
from
within namespace Appendices.Projects.Inner
, although it is a
private function of module Appendices.Projects
. This is
even possible for modules: If we were to write a module
Data.List.Magic
, we'd have access to private utility functions
defined in module Data.List
in base. Actually, I did just that
and added module Data.List.Magic
demonstrating this quirk
of the Idris module system (go have a look!).
In general, this is a rather hacky way to work around visibility
constraints, but it can be useful at times.
In this subsection, we are going to have a look at a language construct
called a parameters
block,
which enables us to share a set of common read-only arguments (parameters)
across several functions, thus allowing us to write more concise function
signatures. I'm going to demonstrate their usability with a small
example program.
The most basic way to make some piece of external information available to a function is by passing it as an additional argument. In object-orientied programming, this principle is sometimes called dependency injection, and a lot of fuss is being made about it, and whole libraries and frameworks have been built around it.
In functional programming, we can be perfectly relaxed about all of this:
Need access to some configuration data for your application? Pass it as an additional
argument to your functions. Want to use some local
mutable state? Pass the corresponding IORef
as an additional
argument to your functions. This is both highly efficient
and incredibly simple. The only drawback it has: It can blow
up our function signatures. There is even a monad for abstracting over this
concept, called the Reader
monad. It can be found in module Control.Monad.Reader
,
in the base library.
In Idris, however, there is an even simpler approach: We can use proof search with auto implicit arguments for dependency injection. Here's some example code:
data Error : Type where
NoNat : String -> Error
NoBool : String -> Error
record Console where
constructor MkConsole
read : IO String
put : String -> IO ()
record ErrorHandler where
constructor MkHandler
handle : Error -> IO ()
getCount' : (h : ErrorHandler) => (c : Console) => IO Nat
getCount' = do
str <- c.read
case parsePositive str of
Nothing => h.handle (NoNat str) $> 0
Just n => pure n
getText' : (h : ErrorHandler) => (c : Console) => (n : Nat) -> IO (Vect n String)
getText' n = sequence $ replicate n c.read
prog' : ErrorHandler => (c : Console) => IO ()
prog' = do
c.put "Please enter the number of lines to read."
n <- getCount'
c.put "Please enter \{show n} lines of text."
ls <- getText' n
c.put "Read \{show n} lines and \{show . sum $ map length ls} characters."
The example program reads input from and prints output to some
Console
type, the implementation of which is left to the caller of the
function. This is a typical example of dependency injection: Our
IO
actions know nothing about how to read and write lines of text
(they do, for instance, not invoke putStrLn
or getLine
directly),
but rely on an external object to handle these tasks for us. This allows
us to use a simple mock object during testing, while using - for instance -
two file handles or data base connections when running the application
for real. These are typical techniques often found in object-oriented
programming, and in fact, this example emulates typical object-oriented
patterns in a purely functional programming language: A type like
Console
can be viewed as a class providing pieces of functionality
(methods read
and put
), and a value of type Console
can be viewed as an object of this class, on which we can invoke
those methods.
The same goes for error handling: Our error handler could just silently
ignore any error that occurs, or it could print it to stderr
and write
it to a log file at the same time. Whatever it does, our functions need
not care.
Note, however, that even in this very simple example we already
introduced two additional function arguments, and we can easily see
how in a real-world application we might need many more of those
and how this would quickly blow up our function signatures.
Luckily, there is a very clean and simple solution to this in
Idris: parameter
blocks. These allow us to specify lists
of parameters (unchanging function arguments) shared by all
functions listed inside the block. These arguments need then no longer
be listed with each function, thus decluttering our function signatures.
Here's the example from above in a parameter block:
parameters {auto c : Console} {auto h : ErrorHandler}
getCount : IO Nat
getCount = do
str <- c.read
case parsePositive str of
Nothing => h.handle (NoNat str) $> 0
Just n => pure n
getText : (n : Nat) -> IO (Vect n String)
getText n = sequence $ replicate n c.read
prog : IO ()
prog = do
c.put "Please enter the number of lines to read."
n <- getCount
c.put "Please enter \{show n} lines of text."
ls <- getText n
c.put "Read \{show n} lines and \{show . sum $ map length ls} characters."
We are free to list arbitrary arguments (implicit, explicit, auto-implicit,
named and unnamed) of any quantity as the parameters in a parameters
block, but it works best with implicit and auto implicit arguments. Explicit
arguments will have to be passed explicitly to functions in a parameter
block, even when invoking them from other parameter blocks with the
same explicit argument. This can be rather confusing.
To complete this example, here is a main function for running
the program. Note, how we explicitly assemble the Console
and
ErrorHandler
to be used when invoking prog
.
main : IO ()
main =
let cons := MkConsole (trim <$> getLine) putStrLn
err := MkHandler (const $ putStrLn "It didn't work")
in prog
Dependency injection via auto-implicit arguments is only one possible application of parameter blocks. They are useful in general whenever we have repeating argument lists for several functions.
Documentation is key. Be it for other programmers using a library we wrote, or for people (including our future selves) trying to understand our code, it is important to annotate our code with comments explaining non-trivial implementation details and docstrings describing the intent and functionality of exported data types and functions.
Writing a comment in an Idris source file is as simple as adding some text after two hyphens:
-- this is a truly boring comment
boring : Bits8 -> Bits8
boring a = a -- probably I should just use `id` from the Prelude
Whenever a line contains two hyphens that are not part of a string literal, the remainder of the line will be interpreted as a comment by Idris.
It is also possible to write multiline comments using delimiters
{-
and -}
:
{-
This is a multiline comment. It can be used to comment
out whole blocks of code, for instance if we get several
type errors in a larger source file.
-}
While comments are targeted at programmers reading and trying to understand our source code, doc strings provide documentation for exported functions and data types, explaining their intent and behavior to others.
Here's and example of a documented function:
||| Tries to extract the first two elements from the beginning
||| of a list.
|||
||| Returns a pair of values wrapped in a `Just` if the list has
||| two elements or more. Returns `Nothing` if the list has fewer
||| than two elements.
export
firstTwo : List a -> Maybe (a,a)
firstTwo (x :: y :: _) = Just (x,y)
firstTwo _ = Nothing
We can view a doc string at the REPL:
Appendices.Projects> :doc firstTwo
Appendices.Projects.firstTwo : List a -> Maybe (a,a)
Tries to extract the first two elements from the beginning
of a list.
Returns a pair of values wrapped in a `Just` if the list has
two elements or more. Returns `Nothing` if the list has fewer
than two elements.
Visibility: export
We can document data types and their constructors in a similar manner:
||| A binary tree index by the number of values it holds.
|||
||| @param `n` : Number of values stored in the `Tree`
||| @param `a` : Type of values stored in the `Tree`
public export
data Tree : (n : Nat) -> (a : Type) -> Type where
||| A single value stored at the leaf of a binary tree.
Leaf : (v : a) -> Tree 1 a
||| A branch unifying two subtrees.
Branch : Tree m a -> Tree n a -> Tree (m + n) a
Go ahead and have a look at the doc strings this generates at the REPL.
Documenting our code is very important. You will realize this, once you try to understand other people's code, or when you come back to a non-trivial piece of source code you wrote yourself a couple of months a ago and since then haven't looked at. If it is not well documented, this can be an unpleasant experience. Idris provides us with the tools necessary to document and annotate our code, so should take our time and do so. It is time well spent.
Idris packages allow us to assemble several modules into a logical unit and make them available to other Idris projects by installing the packages. In this section, we are going to learn about the structure of an Idris package and how to depend on other packages in our projects.
At the heart of an Idris package lies its .ipkg
file,
which is usually but not necessarily stored at a project's root directory.
For instance, for this Idris tutorial, there is file
tutorial.ipkg
at the tutorial's root directory.
An .ipkg
file consists
of several key-value pairs (most of them optional), the
most important of which I'll describe here. By far the easiest
way to setup a new Idris project is by letting pack or Idris itself
do it for you. Just run
pack new lib pkgname
to create the skeleton of a new library or
pack new bin appname
to setup a new application. In addition to creating a new directory plus
a suitable .ipkg
file, these commands will also add a pack.toml
file,
which we will discuss further below.
One of the most important aspects of an .ipkg
file is
listing the packages the library depends on in
the depends
field. Here is an example from the
hedgehog package,
a framework for writing property tests in Idris:
depends = base >= 0.5.1
, contrib >= 0.5.1
, elab-util >= 0.5.0
, pretty-show >= 0.5.0
, sop >= 0.5.0
As you can see, hedgehog depends on base and contrib, both of which are part of every Idris installation, but also on elab-util, a library of utilities for writing elaborator scripts (a powerful technique for creating Idris declarations by writing Idris code; it comes with its own lengthy tutorial if you are interested), sop, a library for generically deriving interface implementations via a sum of products representation (this is a useful thing you might want to check out some day), and pretty-show, a library for pretty printing Idris values (hedgehog makes use of this in case a test fails).
So, before you actually can use hedgehog to write some property tests for your own project, you will need to install the packages it depends on before installing hedgehog itself. Since this can be tedious to do manually, it is best let a package manager like pack handle this task for you.
You might want to specify a certain version (or a range) Idris should use for your dependencies. This might be useful if you have several versions of the same package installed and not all of them are compatible with your project. Here are several examples:
depends = base == 0.5.1
, contrib == 0.5.1
, elab-util >= 0.5.0
, pretty-show
, sop >= 0.5.0 && < 0.6.0
This will look for packages base and contrib of
exactly the given version, package elab-util of a version
greater than or equal to 0.5.0
, package pretty-show of
any version, and package sop of a version in the given
range. In all cases, if several installed versions of a
package match the specified range, the latest version will
be used.
In order to make use of this for your own packages, every
.ipkg
file should give the package's name and current
version:
package tutorial
version = 0.1.0
As I'll show below, package versions play a much less crucial role when using pack and its curated package collection. But even then you might want to consider restricting the versions of packages you accept in order to make sure you catch any braking changes introduced upstream.
Many if not most Idris packages available on GitHub are programming libraries: They implement some piece of functionality and make it available to all projects depending on the given package. This is unlike Idris applications, which are supposed to be compiled to an executable that can then be run on your computer. The Idris project itself provides both: The Idris compiler application, which we use to type check and build other Idris libraries and applications, and several libraries like prelude, base, and contrib, which provide basic data types and functions useful in most Idris projects.
In order to type check and install the modules you wrote in a
library, you must list them in the .ipkg
file's modules
field.
Here is an excerpt from the sop package:
modules = Data.Lazy
, Data.SOP
, Data.SOP.Interfaces
, Data.SOP.NP
, Data.SOP.NS
, Data.SOP.POP
, Data.SOP.SOP
, Data.SOP.Utils
Modules missing from this list will not be installed and hence will not be available for other packages depending on the sop library.
When the dependency graph of your project is getting large and complex, that is, when your project depends on many libraries, which themselves depend on yet other libraries, it can happen that two packages depend both on different - and, possibly, incompatible - versions of a third package. This situation can be nigh to impossible to resolve, and can lead to a lot of frustration when working with conflicting libraries.
It is therefore the philosophy of the pack project to avoid such a situation from the very beginning by making use of curated package collections. A pack collection consists of a specific Git commit of the Idris compiler and a set of packages, again each at a specific Git commit, all of which have been tested to work well and without issues together. You can see a list of packages available to pack here.
Whenever a project you are working on depends on one of the libraries listed
in pack's package collection, pack will automatically install it and all of its
dependencies for you. However, you might also want to depend on a library that
is not yet part of pack's collection. In that case, you must specify the
library in question in one of your pack.toml
files - the global
one found at $HOME/.pack/user/pack.toml
, or one local to your
current project or one of its parent directories (if any).
There, you can either specify a dependency local
to your system or a Git project (local or remote). An example for each is
shown below:
[custom.all.foo]
type = "local"
path = "/path/to/foo"
ipkg = "foo.ipkg"
[custom.all.bar]
type = "github"
url = "https://github.com/me/bar"
commit = "latest:main"
ipkg = "bar.ipkg"
As you can see, in both cases you have to specify where the project can be
found as well as the name and location of its .ipkg
file. In case of
a Git project, you also need to tell pack the commit it should use.
In the example above, we want to use the latest commit from the main
branch. We can use pack fetch
to fetch and store the currently latest
commit hash.
Entries like the ones given above are all that is needed to add support to
custom libraries to pack. You can now list these libraries as dependencies
in your own project's .ipkg
file and pack will automatically install them
for you.
This concludes our section about structuring Idris projects. We have learned
about several types of code blocks - failing
blocks for showing that a
piece of code fails to elaborate, namespace
s for having overloaded names
in the same source file, and parameter blocks for sharing lists of
parameters between functions - and how to group several source files into
an Idris library or application. Finally, we learned how to include
external libraries in an Idris project and how to use pack to help us
keep track of these dependencies.