Haskell error function

These are the four types of error handling that are standard and widely used in the Haskell world, as of 2014.

These are the four types of error handling that are standard and
widely used in the Haskell world, as of 2014.

There exist some other libraries like attempt (like Either, but where you don’t care or know the type of the exception) and control-monad-exception which implements a checked exception monad, etc. but the following ones are the standard ones seen in the wild.

Contents

  • 1 Exception
  • 2 Error (pure code)
  • 3 Error using the Either type
  • 4 No value using the Maybe type

Exception

An unexpected code path, one that rarely but can happen and can be handled if needs be. Typically caused by IO going wrong in some way, like the machine running out of swap and your program terminating, a file not existing, etc. The most basic functions are:

  • throw :: Exception e => e -> a
  • try :: Exception e => IO a -> IO (Either e a)

from Control.Exception.

Say you were writing a library to do things on reddit, you would define an exception type in your API:

data RedditException
  = Couldn'tUpvote
  | CommentFailed
  | LoginFailed !Text
  | ConnectFailure !HttpError
  deriving (Show,Typeable)

instance Exception RedditException

login :: Details -> IO ()
login details = do
  code <- tryLogin details
  case code of
    (200,val) -> setLoginContext val
    (_,err)   -> throw (LoginFailed err)

Then later you might write try (login ) or catch (login ) ((e :: RedditException) -> ) to handle the exception, if needed. Another exception might be a connection failure.

See Control.Exception for more detail and related functions.

Error (pure code)

Some pure functions contain calls to error, causing the evaluation to stop and crash:

head :: [a] -> a
head (x:_) = x
head []    = error "empty list"

Now suppose someone writes head ages and unexpectedly, ages is an empty list. If you are trying to take the head of an empty list your program logic is simply broken.

A solution here is to avoid the head function and use listToMaybe from Data.Maybe.

case listToMaybe ages of
  Nothing -> defaultAge
  Just first -> first

Alternatively, these errors can be caught from IO monad by using evaluate and try from Control.Exception. Ideally you should avoid partial functions like head, but sometimes this is not an option (e.g. when using an external library)

Error using the Either type

An expected return value: Either SomeError a The type indicates that an error is common, but doesn’t mean your program is broke. Rather that some input value wasn’t right. Typically used by parsers, consumers, that are pure and often error out.

data ParseError = ParseError !Pos !Text

So this type describes exactly what is going on:

runParser :: Parser a -> Text -> Either ParseError a

Take a parser of a, some text to parse, and return either a parser error or a parsed a. Typical usage would be:

main = do
  line <- getLine
  case runParser emailParser line of
    Right (user,domain) -> print ("The email is OK.",user,domain)
    Left (pos,err) -> putStrLn ("Parse error on " <> pos <> ": " <> err)

Or depending on the code one might opt instead to use a deconstructing function:

main = do
  line <- getLine
  either (putStrLn . ("Parse error: " <>) . show)
         (print . ("The email is OK.",))
         (runParser emailParser line)

No value using the Maybe type

There is simply no value there. This isn’t a problem in the system. It means you don’t care why there isn’t a value, or you already know. Maybe a

Typical example:

lookup :: Eq a => a -> [(a,b)] -> Maybe b

That is, take some key a that can be compared for equality, and a list of pairs where the first is the same type of the key a and maybe return the b of the pair, or nothing.

So one might pattern match on this:

case lookup name person of
  Nothing -> "no name specified"
  Just name -> "Name: " <> name

Or use a deconstructing function:

maybe "no name specified"
      ("Name: " <>)
      (lookup name person)

Again, depends on the code and the person writing it whether an explicit case is used. Often monads like Maybe are composed to make a chain of possibly-nothing values:

lookup "height" profile >>=
parseInt >>=
flip lookup recommendedSizes

So lookup a height from a person’s profile (might not exist), parse it as integer (might not parse), then use that as a key to lookup from a mapping list of age to clothes size: [(Int,Text)]

*** Exception: Prelude.head: empty list

In Error versus Exception, Henning Thielemann makes a clear
distinction between errors and exceptions in Haskell. Even
though not all Haskellers make such distinction, it’s useful to do so
in order to talk about the most basic ways to handle computations that
can go wrong and to discuss unsafe functions such as head,
fromJust, and (!!).

On the one hand, an error is a programming mistake such as a division
by zero, the head of an empty list, or a negative index. If we
identify an error, we remove it. Thus, we don’t handle errors, we
simply fix them. In Haskell, we have error and undefined to cause
such errors and terminate execution of the program.

On the other hand, an exception is something that can go wrong or
that simply doesn’t follow a rule but can happen. For example, reading
a file that does not exist or that is not readable. If we identify a
possible exception, we handle it, and not doing so would be an
error. In Haskell, we have pure (Maybe and Either, for instance)
and impure ways to handle exceptions.

A basic example of an error in Haskell is trying to get the head of an
empty list using the head function as defined in GHC.List:

head :: [a] -> a
head []    = error "head: empty list"
head (x:_) = x

One way to distinguish an error from an exception is to think in terms
of contracts and preconditions. In this case, there’s a precondition
in the documentation of the head function: the list must be
nonempty. This means that the first equation of head is supposed to
be dead or unreachable code. This way, if we are sure that a list has
at least one element, we can extract its head:

ghci> head [104,97,115,107,101,108,108]
104

Of course, the type signature of the head function says nothing
about such contract, which means that there’s nothing stopping us from
applying it to an empty list and therefore breaking the rules:

ghci> head []
*** Exception: head: empty list

As a comment in the definition of the fromJust function in the
Data.Maybe module says, «yuck.»

Even if trying to get the head of an empty list using head is an
error, it’s unsafe to do so. We can certainly treat it as an exception
and handle it with the Maybe data type:

data Maybe a = Nothing | Just a

In terms of exceptions, the Maybe type represents a computation that
can fail (in the case of a Nothing).

Let’s define a safe version of the head function:

maybeHead :: [a] -> Maybe a
maybeHead []    = Nothing
maybeHead (x:_) = Just x

The safety of the maybeHead function relies on its type
signature. We know that applying the function to a list can succeed:

ghci> maybeHead [104,97,115,107,101,108,108]
Just 104

Or fail:

ghci> maybeHead []
Nothing

A similar example is the fromJust function, which
extracts the element out of a Just:

fromJust :: Maybe a -> a
fromJust Nothing  = error "Maybe.fromJust: Nothing" -- yuck
fromJust (Just x) = x

Again, it’s an error to apply fromJust to a Nothing, but there’s
nothing stopping us from doing it. It’s best if we use a safe function
such as fromMaybe, which takes a default value:

fromMaybe :: a -> Maybe a -> a
fromMaybe d mx =
  case mx of
    Nothing -> d
    Just x  -> x

Yet another example is the lookup function, which looks up a key in
an association list or dictionary:

lookup :: Eq a => a -> [(a,b)] -> Maybe b
lookup _   []          = Nothing
lookup key ((x,y):xys)
  | key == x           = Just y
  | otherwise          = lookup key xys

In this case, applying lookup to an empty list or to a list which
doesn’t contain the key we’re looking for is not an error but an
exception, and the type signature of the function clearly specifies
that it can go wrong:

ghci> lookup 1 (zip [1..] [104,97,115,107,101,108,108])
Just 104
ghci> lookup 1 []
Nothing
ghci> lookup (-1) (zip [1..] [104,97,115,107,101,108,108])
Nothing

Now, let’s consider the elemAt (or (!!)) function, which is a list
index operator:

elemAt :: [a] -> Int -> a
elemAt xs     n | n < 0 = error "elemAt: negative index"
elemAt []     _         = error "elemAt: index too large"
elemAt (x:_)  0         = x
elemAt (_:xs) n         = elemAt xs (n - 1)

This function has two preconditions: the index must be nonnegative and
less than the length of the list. For example:

ghci> elemAt [104,97,115,107,101,108,108] 0
104
ghci> elemAt [104,97,115,107,101,108,108] (-8)
*** Exception: elemAt: negative index
ghci> elemAt [104,97,115,107,101,108,108] 8
*** Exception: elemAt: index too large

The elemAt function is as unsafe as head and fromJust in that
its type signature tells us nothing about the possibility of
failure. We could define a safe version using Maybe, but now we have
two different errors and it would be nice to provide additional
information about what went wrong, which we can accomplish with the
Either data type:

data Either a b = Left a | Right b

In terms of exceptions, a Left represents failure and a Right
represents success.

Here’s a safe version of elemAt using strings for exceptions:

eitherElemAt :: [a] -> Int -> Either String a
eitherElemAt _      n | n < 0 = Left "elemAt: negative index"
eitherElemAt []     _         = Left "elemAt: index too large"
eitherElemAt (x:_)  0         = Right x
eitherElemAt (_:xs) n         = eitherElemAt xs (n - 1)

We can safely apply this version of elemAt to the lists and indexes
we used before:

ghci> eitherElemAt [104,97,115,107,101,108,108] 0
Right 104
ghci> eitherElemAt [104,97,115,107,101,108,108] 8
Left "elemAt: index too large"
ghci> eitherElemAt [104,97,115,107,101,108,108] (-8)
Left "elemAt: negative index"

We know that there are only two things that can go wrong with elemAt
and that means that a String is too general for representing failure
in this case. We can be more specific by defining our own data type
and moving the error strings to the Show instance:

data ElemAtError
  = IndexTooLarge
  | NegativeIndex

instance Show ElemAtError where
  show IndexTooLarge = "elemAt: index too large"
  show NegativeIndex = "elemAt: negative index"

And we can use this data type for exceptions in another safe version
of elemAt:

errorElemAt :: [a] -> Int -> Either ElemAtError a
errorElemAt _      n | n < 0 = Left NegativeIndex
errorElemAt []     _         = Left IndexTooLarge
errorElemAt (x:_)  0         = Right x
errorElemAt (_:xs) n         = errorElemAt xs (n - 1)

Which we can safely apply to the same lists and indexes as before:

ghci> errorElemAt [104,97,115,107,101,108,108] 0
Right 104
ghci> errorElemAt [104,97,115,107,101,108,108] 8
Left elemAt: index too large
ghci> errorElemAt [104,97,115,107,101,108,108] (-8)
Left elemAt: negative index

It might be confusing to call our custom data type ElemAtError
instead of ElemAtException, but perhaps it’s a better name for
reflecting that we’re treating errors as exceptions for the sake of
safety.

It’s even more confusing once we figure out that the implementation of
error is actually raising an exception, but a very general one. Even
then, we can be more specific about the exception that gets thrown by
making our ElemAtError type an instance of Exception, as follows:

instance Exception ElemAtError

Instead of calling error, we can now throw the constructors of our
ElemAtError data type if there’s a problem with the index:

exceptionElemAt :: [a] -> Int -> a
exceptionElemAt _      n | n < 0 = throw NegativeIndex
exceptionElemAt []     _         = throw IndexTooLarge
exceptionElemAt (x:_)  0         = x
exceptionElemAt (_:xs) n         = exceptionElemAt xs (n - 1)

Which is very similar to what we had with the original elemAt
function:

ghci> exceptionElemAt [104,97,115,107,101,108,108] 0
104
ghci> exceptionElemAt [104,97,115,107,101,108,108] 8
*** Exception: elemAt: index too large
ghci> exceptionElemAt [104,97,115,107,101,108,108] (-8)
*** Exception: elemAt: negative index

But this time we can use the try function from Control.Exception,
which takes an action and returns either the result of that action or
an exception:

try :: Exception e => IO a -> IO (Either e a)

Since we’re using a very specific type to represent things that can go
wrong with the elemAt function, we can also be very specific about
what to do in case that something actually goes wrong:

tryExceptionElemAt :: Show a => [a] -> Int -> IO ()
tryExceptionElemAt xs n = do
  eitherExceptionElemAt <- try (evaluate (exceptionElemAt xs n))
  case eitherExceptionElemAt of
    Left  IndexTooLarge -> print IndexTooLarge
    Left  NegativeIndex -> print NegativeIndex
    Right elemAt        -> print elemAt

Given a list xs and an index n, we try to get the element at that
position using exceptionElemAt, and then use a case expression to
pattern match against the Either returned by try. In this case,
we’re simply printing the error or the result, which is not very
useful.

For now, we can try our lists and indexes, and see that we succesfully
handled everything that could go wrong:

ghci> tryExceptionElemAt [104,97,115,107,101,108,108] 0
104
ghci> tryExceptionElemAt [104,97,115,107,101,108,108] 8
elemAt: index too large
ghci> tryExceptionElemAt [104,97,115,107,101,108,108] (-8)
elemAt: negative index

It’s obviously better to use a safe function such as eitherElemAt or
errorElemAt, but exceptionElemAt gives us a good idea of how to
raise and catch exceptions in Haskell.

Finally, let’s consider reading a file using the readFile function,
which could fail for two reasons: the file doesn’t exist or the user
doesn’t have enough permissions to read it. We’ll use the tryJust
function, which is like try but takes a handler that allows us to
select which exceptions are caught:

tryJust :: Exception e => (e -> Maybe b) -> IO a -> IO (Either b a)

Here’s a function that tries to read a given file:

tryJustReadFile :: FilePath -> IO ()
tryJustReadFile filePath = do
  eitherExceptionFile <- tryJust handleReadFile (readFile filePath)
  case eitherExceptionFile of
   Left  er   -> putStrLn er
   Right file -> putStrLn file
  where
    handleReadFile :: IOError -> Maybe String
    handleReadFile er
      | isDoesNotExistError er = Just "readFile: does not exist"
      | isPermissionError   er = Just "readFile: permission denied"
      | otherwise              = Nothing

Given a file name, we try to read it with readFile and choose the
exceptions we’re going to handle with the handleReadFile
function. If the result of trying to read the file is a Left, we
print the exception message. If it’s a Right, we print the contents
of the file. The handleReadFile function returns appropriate
messages for errors that satisfy ioDoesNotExistError or
isPermissionError (which are exceptions in System.IO.Error), and
ignores any other exception.

Let’s try to read the contents of a file called haskell before
creating it:

ghci> tryJustReadFile "haskell"
readFile: does not exist

We don’t get an *** Exception because we handled the exception and
decided to simply print the exception message.

If we create the file and add something to it, and then try to read
its contents, we get the expected result:

$ echo [104,97,115,107,101,108,108] > haskell
ghci> tryJustReadFile "haskell"
[104,97,115,107,101,108,108]

And if we don’t have permissions to read the file, we get the expected
exception message:

ghci> tryJustReadFile "haskell"
readFile: permission denied

For more information about errors and exceptions in Haskell, see the
Error Handling chapter in Real World Haskell or the
Control.Exception module in the base package.

Haskell is a marvellous language, but there are some things I don’t like about it. My least favorite: Haskell has no fewer than 8
different APIs for reporting errors.

To make a bad situation worse, the choice of API varies between
popular libraries. To give a particularly unfortunate example,
Network.URI.parseURI and Network.HTTP.simpleHTTP report
errors in entirely different ways, turning a “download this URL” program into a page of code,
nearly half of which is devoted to dealing with various kinds of errors. (The rest is boilerplate that could be refactored into a nice wrapper.)

Let’s begin with a toy function, the simplest possible program that could
actually fail:

As every algebra student knows, we can’t divide by zero. Using this
function as our example, let’s take a look at all the different ways we
can implement error-reporting in Haskell.

1. Use error

The most popular way to report errors in Haskell is error,
which works as follows:

myDiv1 :: Float -> Float -> Float
myDiv1 x 0 = error "Division by zero"
myDiv1 x y = x / y

(This is similar to the error-reporting that’s built into integer division, actually.)
We can catch the error using Control.Exception.catch:

import qualified Control.Exception as E

example1 :: Float -> Float -> IO ()
example1 x y =
  E.catch (putStrLn (show (myDiv1 x y)))
          (err -> putStrLn (show err))

There are two limitations here: Our error is a free-form string, and we can
only catch errors from within the IO monad. So this only
works for smaller, informal programs.

2. Use Maybe a

What if we don’t have access to the IO monad? Well, we can
always use Haskell’s Maybe type to represent a computation
that might fail:

myDiv2 :: Float -> Float -> Maybe Float
myDiv2 x 0 = Nothing
myDiv2 x y = Just (x / y)

example2 x y =
  case myDiv2 x y of
    Nothing -> putStrLn "Division by zero"
    Just q  -> putStrLn (show q)

And thanks to the magic of monads, we can actually string together calls to
myDiv2 quite nicely:

divSum2 :: Float -> Float -> Float ->
           Maybe Float
divSum2 x y z = do
  xdy <- myDiv2 x y
  xdz <- myDiv2 x z
  return (xdy + xdz)

This approach to error-reporting is used by Network.URI.parseURI,
which is included with most Haskell compilers.

3. Use Either String a

But what if we want to have different error messages for different errors?
In that case, we can use Either to represent computations
which might return either an error message or a value:

myDiv3 :: Float -> Float ->
          Either String Float
myDiv3 x 0 = Left "Divison by zero"
myDiv3 x y = Right (x / y)

example3 x y =
  case myDiv3 x y of
    Left msg -> putStrLn msg
    Right q  -> putStrLn (show q)

Once again, we can treat Either String as a monad, allowing us
to combine these computations with a minimum of fuss:

divSum3 :: Float -> Float -> Float ->
           Either String Float
divSum3 x y z = do
  xdy <- myDiv3 x y
  xdz <- myDiv3 x z
  return (xdy + xdz)

This approach is used by many small programs that need to recover from
multiple kinds of non-IO errors, but I don’t think it appears in the
standard Haskell libraries.

4. Use Monad and fail to generalize 1–3

But what if we don’t care what
monad our caller is using? In that case, we can rewrite our code to work
in any monad m, and use fail to report the error.

myDiv4 :: (Monad m) => Float -> Float ->
          m Float
myDiv4 x 0 = fail "Divison by zero"
myDiv4 x y = return (x / y)

This will do the right thing if our caller is expecting
Maybe or Either:

example4a x y =
  case myDiv4 x y of
    Nothing -> putStrLn "Division by zero"
    Just q  -> putStrLn (show q)

example4b x y =
  case myDiv4 x y of
    Left msg -> putStrLn msg
    Right q  -> putStrLn (show q)

You can even use it with the IO monad!

example4c x y =
  E.catch (do q <- myDiv4 x y
              putStrLn (show q))
          (err -> putStrLn (show err))

This style of error-reporting is used widely in the standard libraries,
because it’s so flexible. You can find several examples in
Data.Map.

If you’re writing new Haskell libraries for public consumption, and all
your errors are strings, please consider using this error-reporting
method.

5. Use MonadError and a custom error type

What if we want to keep track of specific types of errors? In that case,
we could use the error Error type class:

import Control.Monad.Error

data CustomError = DivByZero
                 | OutOfCheese
                 | MiscError String

instance Show CustomError where
  show DivByZero = "Division by zero"
  show OutOfCheese = "Out of cheese"
  show (MiscError str) = str

instance Error CustomError where
  noMsg = MiscError "Unknown error"
  strMsg str = MiscError str

This works like the fail example, but instead of using error
messages, we use error values:

myDiv5 :: (MonadError CustomError m) =>
          Float -> Float -> m Float
myDiv5 x 0 = throwError DivByZero
myDiv5 x y = return (x / y)

example5 :: Float -> Float ->
            Either CustomError String
example5 x y =
  catchError (do q <- myDiv5 x y
                 return (show q))
             (err -> return (show err))

Note that this approach will work in almost any monad except the IO
monad. This approach will also fail if we start mixing libraries,
because each library will define its own set of errors, and we’ll need to
write code which converts them all to our preferred error type.

This approach is used by many popular libraries, including parsec. An
unusual variant of this approach is used by Network.HTTP, which
returns values of type IO (Either ConnError a), but doesn’t
make ConnError an instance of Error.

6. Use throwDyn in the IO monad

We can also use our custom error type in the IO monad, thanks
to throwDyn and catchDyn from
Control.Exception.

import Data.Typeable

data CustomError = DivByZero
                 | OutOfCheese
                 | MiscError String
  deriving (Typeable)

myDiv6 :: Float -> Float -> IO Float
myDiv6 x 0 = E.throwDyn DivByZero
myDiv6 x y = return (x / y)

example6 x y =
  E.catchDyn (do q <- myDiv6 x y
                 putStrLn (show q))
             handler
  where handler :: CustomError -> IO ()
        handler err = putStrLn (show err)

This relies on the fact that Exception is extensible, thanks
to its DynException constructor. If you’re working in the IO
monad, this approach is almost ideal for production code: You get support
for custom exception types, it’s easy to make libraries compatible, and
it’s compatible will all the other IO-based examples we’ve
seen.

Note that this very flexible approach could generalized to
non-IO monads by making Exception an instance of
Error, and writing appropriate versions of
throwDyn and catchDyn for
MonadError. This would actually be very convenient for people
who have to work with many libraries at once. But I’ll refrain from
actually providing code, because there’s too many error-reporting
conventions already!

7. Use ioError and catch

This is a close cousin to the throwDyn example above. It also
relies on Exception.

myDiv7 :: Float -> Float -> IO Float
myDiv7 x 0 = ioError (userError "Division by zero")
myDiv7 x y = return (x / y)

example7 :: Float -> Float -> IO ()
example7 x y =
  catch (do q <- myDiv7 x y
            putStrLn (show q))
        (err -> putStrLn (show err))

This one is pretty rare, as far as I can tell.

8. Go nuts with monad transformers

Several of the error-reporting approaches we’ve seen are based on
non-IO monads. Most of these can can be generalized to the
equivalent monad transformers. For example, Either CustomError
a
becomes:

type ErrIO = ErrorT CustomError IO

myDiv8 :: Float -> Float -> ErrIO Float
myDiv8 x 0 = throwError DivByZero
myDiv8 x y = return (x / y)

example8 x y = do
  result <- runErrorT (myDiv8 x y)
  case result of
    Left err -> putStrLn (show err)
    Right q  -> putStrLn (show q)

In the IO monad, this is usually a bad idea (though, again, I’ve seen it).
Instead, consider using throwDyn. But if you’re working with
a base monad other than IO, this can occasionally be useful.

A plea for consistency

Several of these error-reporting approaches offer interesting insights into
Haskell. And most of them have legitimate uses.

But I’d be just as happy if we could standardize on two or three of the above whenever possible!

Update: Don Stewart has started a thread about Haskell error-handling conventions on the Haskell library list.

It is one of
the most useful functions to report errors, error function takes a string
message and display the message to the console.

Following is
the signature of error function.

error :: [Char]
-> a

As you observe,
it has return type of a, so we can call this from any function, it always has
right type. When you call error function from your code, it immediately aborts
the execution and prints the error message.

errorUtil.hs

divide x 0 = error "You can't divide a number by 0"
divide x y = x / y

When you call the divide method by
passing second argument 0, it throws an Exception immediately.

*Main> :load errorUtil.hs
[1 of 1] Compiling Main             ( errorUtil.hs, interpreted )
Ok, modules loaded: Main.
*Main> 
*Main> divide 10 0
*** Exception: You can't divide a number by 0
*Main> 

We can rewrite
above program using Maybe data type like below.

errorUtil.hs

divide x 0 = Nothing
divide x y = Just (x / y)

*Main> :load errorUtil.hs
[1 of 1] Compiling Main             ( errorUtil.hs, interpreted )
Ok, modules loaded: Main.
*Main> 
*Main> divide 10 0
Nothing
*Main> 
*Main> divide 10 123456
Just 8.100051840331778e-5
*Main>

Whenever you want to throw an error, try
to use Maybe.

Another
example

Suppose, you car going to implement
factorial function, usually factorial os not applicable for -ve numbers, so
when user call factorial function with –ve numbers, your function should throw
an error.

Sample.hs

factorial :: Int -> Int
factorial num
    | (num == 0) = 1
    | (num > 0) = num * factorial(num-1)
    | otherwise = error "factorial is only defined for +ve integers"

Prelude> :load Sample.hs
[1 of 1] Compiling Main             ( Sample.hs, interpreted )
Ok, modules loaded: Main.
*Main> 
*Main> factorial 0
1
*Main> factorial 5
120
*Main> factorial (-9)
*** Exception: factorial is only defined for +ve integers

Previous                                                

Next                                                
Home

Exceptions

Exceptions
Creative Commons image by gynti

Most languages make a distinction between values that represent failure
(errors) and the mechanism to abort computations and unwind the stack
(exceptions.) Haskell is unique in that the type system makes it safe
and easy to build failure into types instead of lumping everything into
something like NULL or -1.

It also stands out by supporting exceptions through a library of
functions and types instead of directly in the syntax of the language.
The fact that there’s no dedicated keywords for exceptions might seem
weird until you discover how flexible and expressive Haskell is.

This presentation aims to show how closely related errors and exceptions
are, and how to keep them separate.

Haskell Exceptions

  • No dedicated syntax

  • Very limited in Haskell 2010

  • Expanded by GHC

Type Inhabitants

In order to understand how exceptions work we first need to talk about
type inhabitants and bottom.

The Bool type is a very simple type that doesn’t use any type
variables and only has 2 data constructors. This means that there can
only be 2 unique values for this type. Or does it?

Bottom (⊥)

All types in Haskell support a value called bottom. This means that the
Bool type actually has 3 possible values. Exceptions and
non-termination are examples of bottom values.

The list below illustrates that bottom values aren’t a problem until
they’re evaluated.

bools :: [Bool]
bools = [False, True, undefined]

Creating ⊥

Haskell includes 2 functions for creating bottom values: undefined and
error. In GHC undefined is implemented using error and error
throws an exception.

You can create a bottom value directly by writing a non-terminating
function.

-- Raises exceptions in GHC:
undefined :: a
error :: String -> a

-- Non-termination:
badBoy :: a
badBoy = badBoy

Catching Exceptions (Inline)

Catching exceptions is straight forward as long as you remember that you
can only catch exceptions in the IO monad.

inline :: Int -> IO Int
inline x =
  catch (shortFuse x)
        ((_ex :: StupidException) -> return 0)

The second argument to catch is a function to handle a caught
exception. GHC uses the type of the function to determine if it can
handle the caught exception. If GHC can’t infer the type of the function
you’ll need to add a type annotation like in the example above. This
requires the ScopedTypeVariables extension.

If you want to handle more than one exception type you’ll need to use
something like the catches function. To catch all possible exceptions
you can catch the SomeException type since it’s at the top of the
exception type hierarchy. This isn’t generally wise and instead you
should use something like the bracket or finally functions.

One interesting thing to note is that GHC differs from Haskell 2010 with
regards to catch. Haskell 2010 states that catch should catch all
exceptions regardless of their type. Probably because those exceptions
would all be IOErrors.

Catching Exceptions (w/ a Helper)

Below is another example of catching exceptions. This time a helper
function with an explicit type signature is used to handle the
exception. This allows us to avoid inline type annotations and the
ScopedTypeVariables extension.

helper :: Int -> IO Int
helper x =
  catch (shortFuse x)
        handler
  where
    handler :: StupidException -> IO Int
    handler _ = return 0

Throwing Exceptions

Throwing exceptions is really easy, although you must be in the IO
monad to do so. Haskell 2010 provides a set of functions for creating
and raising exceptions.

Haskell 2010:

-- Create an exception.
userError :: String -> IOError

-- Raise an exception.
ioError :: IOError -> IO a

-- fail from the IO Monad is both.
fail = ioError . userError :: String -> IO a

Throwing Exceptions

GHC adds on to Haskell 2010 with functions like throwIO and throw.
The throw function allows you to raise an exception in pure code and
is considered to be a misfeature.

GHC:

shortFuse :: Int -> IO Int
shortFuse x =
  if x > 0
    then return (x - 1)
    else throwIO StupidException

Throwing from Pure Code

As mentioned above, GHC adds a throw function that allows you to raise
an exception from pure code. Unfortunately this makes it very difficult
to catch.

naughtyFunction :: Int -> Int
naughtyFunction x =
  if x > 0
    then x - 1
    else throw StupidException

Catching Exceptions From throw

You need to ensure that values are evaluated because they might contain
unevaluated exceptions.

In the example below you’ll notice the use of the «$!» operator. This
forces evaluation to WHNF so exceptions don’t sneak out of the catch
function as unevaluated thunks.

forced :: Int -> IO Int
forced x =
  catch (return $! naughtyFunction x)
        ((_ex :: StupidException) -> return 0)

Creating Custom Exceptions

Any type can be used as an exception as long as it’s an instance of the
Exception type class. Deriving from the Typeable class makes
creating the Exception instance trivial. However, using Typeable
means you need to enable the DeriveDataTypeable GHC extension.

You can also automatically derive the Show instance as with most other
types, but creating one manually allows you to write a more descriptive
message for the custom exception.

data StupidException = StupidException
  deriving (Typeable)

instance Show StupidException where
  show StupidException =
    "StupidException: you did something stupid"

instance Exception StupidException

Threads and Exceptions

Concurrency greatly complicates exception handling. The GHC runtime uses
exceptions to send various signals to threads. You also need to be very
careful with unevaluated thunks exiting from a thread when it
terminates.

Additional problems created by concurrency:

  • Exceptions are used to kill threads

  • Exceptions are asynchronous

  • Need to mask exceptions in critical code

  • Probably don’t want unevaluated exceptions leaking out

There’s a Package For That

Just use the async package.

Errors (Instead of Exceptions)

  • Explicit

  • Checked by the compiler

  • Way better than NULL or -1

Stupid

Haskell is great about forcing programmers to deal with problems at
compile time. That said, it’s still possible to write code which may not
work at runtime. Especially with partial functions.

The function below will throw an exception at runtime if it’s given an
empty list. This is because head is a partial function and only works
with non-empty lists.

stupid :: [Int] -> Int
stupid xs = head xs + 1

Better

Prefer errors to exceptions.

A better approach is to avoid the use of head and pattern match the
list directly. The function below is total since it can handle lists
of any length (including infinite lists).

Of course, if the list or its head is bottom (⊥) then this function will
throw an exception when the patterns are evaluated.

better :: [Int] -> Maybe Int
better []    = Nothing
better (x:_) = Just (x + 1)

Reusing Existing Functions

This is the version I like most because it reuses existing functions
that are well tested.

The listToMaybe function comes with the Haskell Platform. It takes a
list and returns its head in a Just. If the list is empty it returns
Nothing. Alternatively you can use the headMay function from the
Safe package.

reuse :: [Int] -> Maybe Int
reuse = fmap (+1) . listToMaybe

Providing Error Messages

Another popular type when dealing with failure is Either which allows
you to return a value with an error. It’s common to include an error
message using the Left constructor.

Beyond Maybe and Either it’s also common to define your own type
that indicates success or failure. We won’t discuss this further.

withError :: [Int] -> Either String Int
withError []    = Left "this is awkward"
withError (x:_) = Right (x + 1)

Maybe and Either

Maybe and Either are also monads!

If you have several functions that return one of these types you can use
do notation to sequence them and abort the entire block on the first
failure. This allows you to write short code that implicitly checks the
return value of every function.

Things tend to get a bit messy when you mix monads though…

Maybe and IO

The code below demonstrates mixing two monads, IO and Maybe. Clearly
we want to be able to perform I/O but we also want to use the Maybe
type to signal when a file doesn’t exist. This isn’t too complicated,
but what happens when we want to use the power of the Maybe monad to
short circuit a computation when we encounter a Nothing?

size :: FilePath -> IO (Maybe Integer)
size f = do
  exist <- fileExist f
  
  if exist
    then Just <$> fileSize f
    else return Nothing

Maybe and IO

Because IO is the outer monad and we can’t do without it, we sort of
lose the superpowers of the Maybe monad.

add :: FilePath -> FilePath -> IO (Maybe Integer)
add f1 f2 = do
  s1 <- size f1
  case s1 of
    Nothing -> return Nothing
    Just x  -> size f2 >>= s2 ->
      case s2 of
        Nothing -> return Nothing
        Just y  -> return . Just $ x + y

MaybeT

Using the MaybeT monad transformer we can make IO the inner monad
and restore the Maybe goodness. We don’t really see the benefit in the
sizeT function but note that its complexity remains about the same.

sizeT :: FilePath -> MaybeT IO Integer
sizeT f = do
  exist <- lift (fileExist f)
           
  if exist
    then lift (fileSize f)
    else mzero

MaybeT

The real payoff comes in the addT function. Compare with the add
function above.

addT :: FilePath -> FilePath -> IO (Maybe Integer)
addT f1 f2 = runMaybeT $ do
  s1 <- sizeT f1
  s2 <- sizeT f2
  return (s1 + s2)

Either and IO

This version using Either is nearly identical to the Maybe version
above. The only difference is that we can now report the name of the
file which doesn’t exist.

size :: FilePath -> IO (Either String Integer)
size f = do
  exist <- fileExist f

  if exist
    then Right <$> fileSize f
    else return . Left $ "no such file: " ++ f

Either and IO

To truly abort the add function when one of the files doesn’t exist
we’d need to replicate the nested case code from the Maybe example.
Here I’m cheating and using Either‘s applicative instance. However,
this doesn’t short circuit the second file test if the first fails.

add :: FilePath -> FilePath -> IO (Either String Integer)
add f1 f2 = do
  s1 <- size f1
  s2 <- size f2
  return ((+) <$> s1 <*> s2)

ErrorT

The ErrorT monad transformer is to Either what MaybeT is to
Maybe. Again, changing size to work with a transformer isn’t that
big of a deal.

sizeT :: FilePath -> ErrorT String IO Integer
sizeT f = do
  exist <- lift $ fileExist f

  if exist
    then lift $ fileSize f
    else fail $ "no such file: " ++ f

ErrorT

But it makes a big difference in the addT function.

addT :: FilePath -> FilePath -> IO (Either String Integer)
addT f1 f2 = runErrorT $ do
  s1 <- sizeT f1
  s2 <- sizeT f2
  return (s1 + s2)

Hidden/Internal ErrorT

The really interesting thing is that we didn’t actually have to change
size at all. We could have retained the non-transformer version and
used the ErrorT constructor to lift the size function into the
transformer. The MaybeT constructor can be used in a similar way.

addT' :: FilePath -> FilePath -> IO (Either String Integer)
addT' f1 f2 = runErrorT $ do
  s1 <- ErrorT $ size f1
  s2 <- ErrorT $ size f2
  return (s1 + s2)

Turning Exceptions into Errors

The try function allows us to turn exceptions into errors in the form
of IO and Either, or as you now know, ErrorT.

It’s not hard to see how flexible exception handling in Haskell is, in
no small part due to it not being part of the syntax. Non-strict
evaluation is the other major ingredient.

try :: Exception e => IO a -> IO (Either e a)

-- Which is equivalent to:
try :: Exception e => IO a -> ErrorT e IO a

Final Thought

  • Prefer Errors to Exceptions!

  • Don’t Write/Use Partial Functions!

Show me how you handle errors and I’ll tell you what programmer you are. Error handling is fundamental to all programming. Language support for error handling varies from none whatsoever (C) to special language extensions (exceptions in C++, Java, etc.). Haskell is unique in its approach because it’s expressive enough to let you build your own error handling frameworks. Haskell doesn’t need built-in exception support: it implements it in libraries.

We’ve seen one way of dealing with errors: calling the error function that terminates the program. This works fine for runtime assertions, which alert us to bugs in the program. But many «errors» are actually expected. We’ve seen one such example: Data.Map.lookup fails when called with a key that’s not present in the map. The possibility of failure is encoded in the Maybe return type of lookup. It’s interesting to compare this with similar functions in other languages. In C++, for instance, std::map defines multiple accessor functions varying only in their failure behavior:

  • at throws an exception
  • find returns an empty iterator
  • operator[] inserts a dummy value using a default constructor for it.

The last one is the most bizarre of the three. Since the array access operator must return a reference, even if the key is not found, it has to create a dummy value. The behavior of at is potentially dangerous if the client forgets to catch the exception. Of the three, find is the safest, since the return type suggests to the client iteration rather than straight dereference; and iteration normally starts with checking for termination.

In functional programming, failure is another way of saying that the computation is partial; that is, not defined for all values of arguments. In Haskell we always try to use total functions — functions defined for all values of their arguments. If the domain of a computation is known at compile time, we can often define a restricted data type to be used for its arguments; for instance, an enumeration instead of an integer. This is not always possible or feasible, so the other option is to turn a partial function into a total function by changing its return type. C++ method find does this trick by returning an iterator (it always returns an iterator for any value of its argument); Haskell lookup does it by returning a Maybe.

This trick of returning a different type in order to turn a non-functional computation into a pure function is used extensively in Haskell and finds its full expression in monads.

«Computation» or «notion of computation» often describes what we want to do in terms that may or may not have immediate pure function implementation: that is values in, values out. Transforming computations into functions is what functional programming is all about.

Either May Be Better than Maybe

The trick with Maybe is a bit limited. All we know about the failure is that it occurred. In practice, we’d like to know more. So the next step in error handling is to use the Either data structure defined in the Prelude:

data Either a b = Left a | Right b

Either is parameterized by two types, not one. A value of the Either type either contains a value of type a or of type b. We can discriminate between the two possibilities by pattern matching on either constructor. Either is mostly used as a generalization of Maybe in which Left not only encodes failure but is accompanied by an error message. Right encodes success and the accompanying value. So a is often fixed to be a String.

Here’s how we can use Either to encode the failure of a lookup without terminating the whole program:

lookUp :: String -> SymTab -> Either String (Double, SymTab)
lookUp str symTab = 
    case M.lookup str symTab of
      Just v  -> Right (v, symTab)
      Nothing -> Left ("Undefined variable " ++ str)

Now the burden is on the caller to pattern match the result of any such call and either continue with the successful result, or handle the failure. More often than not, the error is just passed unchanged to the caller, and so on; and somewhere at the top of the call tree, displayed to the user (remember, the very top is always an IO function).

It’s a know phenomenon that error propagation spreads like a disease throughout the code. You could have seen that in C and sometimes even in C++, when exceptions are not an option. And indeed, you can create the same mess in Haskell. Here’s a naive implementation of the evaluator of SumNode, which checks for errors that can happen in the evaluation of its children, and propagates them if necessary:

evaluate :: Tree -> SymTab -> Either String (Double, SymTab)

evaluate (SumNode op left right) symTab = 
    case evaluate left symTab of
      Left msg -> Left msg
      Right (lft, symTab') ->
        case evaluate right symTab' of
          Left msg -> Left msg
          Right (rgt, symTab'') ->
            case op of
              Plus  -> Right (lft + rgt, symTab'')
              Minus -> Right (lft - rgt, symTab'')

Notice the creeping indentation. I’m showing you this code so that you know what awaits you if you refuse to learn about monads. However ugly this code is, it works, so the principle is right. We just need some monadic sugar to make it palatable.

Errors are propagated up, until they are finally dealt with in the main IO loop:

loop symTab = do
   str <- getLine
   if null str
   then
      return ()
   else
      let toks = tokenize str
          tree = parse toks
      in
          case evaluate tree symTab of
          Left msg -> do
              putStrLn $ "Error: " ++ msg
              loop symTab -- use old symTab
          Right (v, symTab') -> do
              print v
              loop symTab'

The nice thing is that, since we keep around the old copy of the symbol table, we can start the next iteration after a failure as if nothing happened. It’s as if one transaction had been aborted and another started from the the same state.

Abstracting the Either Pattern

Looking at the code above, you can see a pattern arising. First of all, we have all these functions that return their results wrapped in the Either type. Every time we get an Either value, we pattern match it and fork the computation: When the result is Right we make the value in it available to the rest of the computation. If the result is Left, we skip the rest of the computation and propagate the error. We would like to capture this pattern. We’d like to isolate the boilerplate code, leaving «holes» for the client-provided variables. One such hole is to be filled by the initial Either value. The tricky part is the «skip the rest of the computation» part of the pattern. This can be done if the pattern has another hole for «the rest of the computation» so that it can either execute it or not.

Let’s first identify this pattern in the evaluation of a unary node:

The first pattern filler is the value returned by evaluate tree symTab. The «rest of the calculation» filler is the code:

case op of
   Plus  -> Right ( x, symTab')
   Minus -> Right (-x, symTab')

We can abstract this code into a lambda function. Notice that this code uses x and symTab', which were extracted from the first filler. So when we abstract it, the resulting lambda function should take this tuple as an argument:

(x, symTab') -> case op of
   Plus  -> Right ( x, symTab')
   Minus -> Right (-x, symTab')

The parameter op is captured by the lambda from the outer environment — remember, lambdas are closures: they can capture values from the environment in which they are defined.

Let’s call this new pattern bindE and implement it as a function of two arguments. Here’s how we would call it from the evaluator:

evaluate (UnaryNode op tree) symTab =
    bindE (evaluate tree symTab)
          ((x, symTab') ->
              case op of
                Plus  -> Right ( x, symTab')
                Minus -> Right (-x, symTab'))

This is just a straight call to bindE with two arguments — the second one being a multi-line lambda function.

The implementation of bindE is pretty straightforward. It just picks the common parts of the pattern and combines them in one function. It deals with the tedium of pattern matching the first argument and propagating the error. In case of success, it calls the continuation with the right arguments:

bindE :: Either String (Double, SymTab) 
      -> ((Double, SymTab) -> Either String (Double, SymTab))
      -> Either String (Double, SymTab)
bindE ev k =
    case ev of
      Left msg -> Left msg
      Right (x, symTab') -> k (x, symTab')

Have a good look at the signature of bindE. Again, the second argument to bindE is the function that encapsulates the rest of the computation. It takes a pair (Double, SymTab) and returns another pair encapsulated in Either. This function argument is often called a continuation, because it continues the computation. The corresponding argument is often named k (think kontinuation; c would be too generic a name).

This pattern can be further abstracted by parameterizing it on the type of the contents of Right. In fact we need this if we want to include addSymbol in our scheme (see the unit in the return type):

addSymbol :: String -> Double -> SymTab -> Either String ((), SymTab)

If we want to bind the result of, say, UnaryNode evaluator to addSymbol, the input will be of type Either String (Double, SymTab) and the result of type Either String ((), SymTab), so bindE really needs two type parameters, a and b:

bindE :: Either String a
      -> (a -> Either String b)
      -> Either String b
bindE ev k =
    case ev of
      Left msg -> Left msg
      Right v -> k v

Notice that as long as the client only uses bindE to deal with Either values, they don’t have to deal with them directly (e.g., pattern match), except when they have to create them; and in the final unpacking, when they want to display the error message. The creation of Either values can also be abstracted by providing two more functions, return and fail:

return :: a -> Either String a
return x = Right x

fail :: String -> Either a
fail msg = Left msg

We can use these functions in the implementation of lookUp

lookUp :: String -> SymTab -> Either String (Double, SymTab)
lookUpb str symTab = 
    case M.lookup str symTab of
      Just v  -> return (v, symTab)
      Nothing -> fail ("Undefined variable " ++ str)

We can also replace all Right constructors that are used to return values by calls to return. This way the client code can be written without any knowledge of the fact that values are encapsulated in the Either type.

Here’s, for instance, an improved version of the UnaryNode evaluator, this time with no mention of Either or any of its constructors:

evaluate (UnaryNode op tree) symTab =
    bindE (evaluate tree symTab)
          ((x, symTab') ->
              case op of
                Plus  -> return ( x, symTab')
                Minus -> return (-x, symTab'))

Maybe it doesn’t seem like these transformations buy us much in code size or readability, but they are a step in the direction of hiding the details of error handling.

But we can do even better, once we realize that we have just defined a monad.

The Either Monad

A monad in Haskell is defined by a type constructor (a type parameterized by another type) and two functions, bind and return (optionally, fail). In our case, the type constructor is based on the Either a b type, with the first type variable fixed to String (yes, it’s exactly like currying a type function). Let’s formalize it by defining a new (paremeterized) type:

newtype Evaluator a = Ev (Either String a)

The newtype declaration is a compromise between type alias and a full-blown data declaration. It can be used to define data types that have only one data constructor that takes only one argument — Ev in this case. (newtype is preferred over similar data declaration because of slightly better performance characteristics.)

The first change to our code will be to replace the type signature of evaluate from:

evaluate :: Tree -> SymTab -> Either String (Double, SymTab)

to:

evaluate :: Tree -> SymTab -> Evaluator (Double, SymTab)

We can now redefine bindE, return, and fail in terms of Evaluator. Then we have to tell Haskell that we are defining a monad. Why does Haskell need to know about our monad? Because with this knowledge it will allow us to use the do notation — that’s the sugar we’ve been craving.

Monad is a typeclass — I’ll talk more about typeclasses soon. For now, it’s enough to know that, in order to tell Haskell that we are defining a monad, we need to instantiate Monad with our Evaluator. When instantiating a Monad we have to provide the appropriate definitions of bind, return and, optionally, fail. The only tricky one is bind, since Monad defines it as an infix operator, >>= (pronounced bind). Operators are just functions with funny names composed of special characters. An infix operator is a function that takes two arguments, one on the left and one on the right. Without further ado, here’s the instance declaration for our first monad:

instance Monad Evaluator where
    (Ev ev) >>= k =
        case ev of
          Left msg -> Ev (Left msg)
          Right v -> k v
    return v = Ev (Right v)
    fail msg = Ev (Left msg)

These are exactly the definitions we’ve seen before, except for the additional layer of the Ev constructor. Notice that I used infix notation in defining operator >>=. Its left argument is the pattern (Ev ev), and its right argument is the continuation k.

Let’s see how this new monad works in the evaluator of SumNode — first without the do notation:

evaluate (SumNode op left right) symTab = 
    evaluate left symTab >>= (lft, symTab') ->
        evaluate right symTab' >>= (rgt, symTab'') ->
            case op of 
              Plus  -> return (lft + rgt, symTab'')
              Minus -> return (lft - rgt, symTab'')

Notice that the second argument to the first >>= is a lambda that continues up to the end of the function. Inside its body there is another >>= with its own lambda. Notice also that the innermost lambda has access not only to rgt — which is its argument — but also to the external lft and op, which it captures from its environment.

Here’s the same code in do notation:

evaluate (SumNode op left right) symTab = do
    (lft, symTab')  <- evaluate left symTab
    (rgt, symTab'') <- evaluate right symTab'
    case op of 
      Plus  -> return (lft + rgt, symTab'')
      Minus -> return (lft - rgt, symTab'')

As you can see, the do block hides the binding >>= between the lines, it automatically converts «the rest of the code» to continuations, and it lets you treat the arguments to lambdas as if they were local variables to be assigned to. It’s this syntactic sugar that makes do blocks look so convincingly imperative, even though they are purely functional in nature.

Compare this with our starting point:

evaluate (SumNode op left right) symTab = 
    case evaluate left symTab of
      Left msg -> Left msg
      Right (lft, symTab') ->
        case evaluate  right symTab' of
          Left msg -> Left msg
          Right (rgt, symTab'') ->
            case op of
              Plus  -> Right (lft + rgt, symTab'')
              Minus -> Right (lft - rgt, symTab'')

That’s definitely progress. If we could only get rid of this noise of threading symTab all over the place. Oh, wait, that’s the next tutorial and another monad.

Let’s bask in the monadic sunshine some more. Did you notice that we have essentially implemented exceptions? fail behaves very much like throw. It shortcuts the execution of the current function, propagates the error to its caller, who will in turn shortcut its execution and so on, until the «exception is caught.» Catching the exception means unpacking the Evaluator returned by a function, and either retrieving the result or doing something with the error.

Can you forget to «catch» the exception? Not really — it’s part of the return type. You can’t even access the value of your computation without unpacking it. And if your unpacking matches all possible Either patterns, as it should, you’ll be forced to deal with the error case anyway.

Do you remember exception specifications in Java or C++ (currently obsoleted)? In Haskell there’s no need for this ad hoc feature because «exception specification» is encoded in the return type of a function. You can’t ignore types in Haskell, so you essentially leave it to the compiler to enforce exception safety.

There is a full-blown exception library Control.Exception in Haskell, complete with throw, catch, and a long list of predefined exception types. The Prelude also defines the Monad instance for Either, so we could have used it directly. But I thought «inventing» the monad from scratch would give you a better learning experience.

Type Classes

I said that Monad is a typeclass, but I haven’t explained what a typeclass is. It’s not exactly like a class in OO languages, but there are some similarities.

If you’re familiar with Java or C++, you know that a class may define a set of virtual functions. When you have a polymorphic reference to one of the descendants of such a class, and call one of these virtual functions on it, the function that is called depends on the actual type of the object, which is determined at runtime. This kind of late binding is possible because a polymorphic object carries with it a vtable: an array of function pointers.

Similarly, in Haskell, a type class let’s you define the signatures of a set of functions. You may think of a typeclass as an interface. (Although it is possible for a typeclass to provide default implementations for some of the functions — the monadic fail, for instance, has such implementation: it calls error by default.) There’s even an analog of a vtable that is (invisibly) passed to polymorphic functions.

The actual implementation of typeclass functions is provided by the client every time they declare some type to be an instance of a typeclass. In Java or C++ this would happen when the client defines concrete classes that implement an interface. In Haskell you define instances instead. The interesting thing is that, in Haskell, the client is able to connect any typeclass with any type (including built-in types) «after the fact.» (In most OO languages you can’t make an int a subclass of IFoo — in Haskell you can.)

Let’s work through an example:

class Valuable a where
    evaluate :: a -> Double

data Expr = Const Double | Add Expr Expr

instance Valuable Expr where
    evaluate (Const x) = x
    evaluate (Add lft rgt) = evaluate lft + evaluate rgt

instance Valuable Bool where
    evaluate True  = 1
    evaluate False = 0

test :: Valuable a => a -> IO ()
test v = print $ evaluate v

expr :: Expr
expr = Add (Const 2) (Add (Const 1.5) (Const 2.5))

main = do
    test expr
    test True

I have defined a typeclass Valuable with one function evaluate, which takes an instance of Valuable, a, and returns a Double. Separately I have defined a data type Expr, which has no knowledge of the class Valuable. Then I realized that Expr is Valuable, so I wrote an instance declaration that makes this connection and provides the «witness» — the actual implementation of the function evaluate.

To make things even more exciting, I decided that the built-in type Bool is also Valuable and provided and instance declaration to prove it. Now I was able to write a polymorphic function test that calls evaluate on its argument. But unlike in previous examples of polymorphism, the generic argument to test is constrained: it has to be an instance of the class Valuable. The type expression before the double arrow => defines class constraints for what follows. In this case we have one constraint, Valuable a, meaning a must be an instance of Valuable. In general there may be several such constraints listed between a set of parentheses and separated by commas.

Hadn’t I provided the type signature for test, the compiler would have figured it out by analyzing the body of test and seeing that I called evaluate on its argument.

This is an example of a more general mechanism: If you define a function that adds its arguments, the compiler will deduce the Num constraint for them. If you compare arguments for (in-) equality, it will deduce Eq constraints. If you print them, it will deduce Show, etc.

The C++ Standards Committee almost voted in the concepts proposal in 2011, which would have added constrained polymorphism to C++ templates. But then they gave up because of the tremendous complexity imposed by the requirement of backwards compatibility.

Solution to the Expression Problem

In the previous tutorial I described the expression problem: How can you create a library that would be open to adding new data and new functions. We’ve seen that, in Haskell, extending a library by adding new functions was easy, but the addition of new varieties of data required modifications to the library. Clever use of typeclasses will let us have the cake and eat it too.

Let’s work the magic on the previous example. First, instead of having one type Expr with many constructors, let’s replace it with one typeclass Expr and many individual data types (by many I mean two in this case). We then glue these data types together by making them instances of Expr. Expr itself is empty — its only purpose is to unify all expression types.

class Expr a

data Const   = Const Double
data Add a b = Add a b

instance Expr Const
instance (Expr a, Expr b) => Expr (Add a b)

Notice that I used the names Const and Add both to name data types and their constructors. These two namespaces are separate, so if you’re running out of names, it’s a common practice to reuse them this way. Also, notice the use of two class constraints in the instance definition for Add. Both a and b must be Expr for Add a b to be Expr.

Now we would like to define the function evaluate for both Const and Add nodes. But these are no longer just two constructors — they are two different data types. The only way to define evaluate for both is to overload it. This is possible, but only if evaluate is part of a typeclass. The name Valuable for such a class seems natural. Let’s make it work for Const and Add:

class (Expr e) => Valuable e where
    evaluate :: e -> Double

instance Valuable Const where
    evaluate (Const x) = x
instance (Valuable a, Valuable b) => Valuable (Add a b) where
    evaluate (Add lft rgt) = evaluate lft + evaluate rgt

This time we made sure that only expressions can be evaluated: That’s the constraint Expr e in the definition of Valuable.

So that’s our library. Let’s now see how a client may extend it by, for instance, adding a new expression type: Mul.

data Mul a b = Mul a b

instance (Expr a, Expr b) => Expr (Mul a b)
instance (Valuable a, Valuable b) => Valuable (Mul a b) where
    evaluate (Mul lft rgt) = evaluate lft * evaluate rgt

Plugging Mul into the library required only making it the instance of Expr and Valuable. Perfect! Our library is now open to adding new data.

Let’s check if it’s still open to new functions. Let’s add a pretty printing function to expressions, including the newly defined Mul. The trick is to make pretty a member of a new class Pretty and then make all the expression types its instances:

class (Expr e) => Pretty e where
    pretty :: e -> String

instance Pretty Const where
    pretty (Const x) = show x
instance (Pretty a, Pretty b) => Pretty (Add a b) where
    pretty (Add x y) = "(" ++ pretty x ++ " + " ++ pretty y ++ ")"
instance (Pretty a, Pretty b) => Pretty (Mul a b) where
    pretty (Mul x y) = pretty x ++ " * " ++ pretty y

Granted, there is a bit of syntactic noise here, but we have accomplished quite a feat: We have overcome the expression problem!

The Monad Typeclass

Let’s go back to the Monad typeclass. This is how it’s defined in the Prelude:

class Monad m where
    (>>=)   :: m a -> (a -> m b) -> m b
    (>>)    :: m a -> m b -> m b
    return  :: a -> m a
    fail    :: String -> m a

    mv >> k =  mv >>= _ -> k
    fail s  = error s

There’s something different about this typeclass: it’s not a typeclass that unifies types; rather it unifies type constructors. The parameter m is a type constructor. It requires another type parameter to become a type. See how m always acts on either a or b, which are type variables. Going back to the Evaluator in our calculator, it was a type constructor too, with the type parameter a:

newtype Evaluator a = Ev (Either String a)

That’s why we were able to make Evaluator an instance of Monad:

instance Monad Evaluator where

Specifically, notice that we used Evaluator, not Evaluator a in this definition. You always supply a type constructor to the instance definition of Monad.

Monad also defines operator >>, which ignores the value from its first argument (see its default implementation). You can bind two monadic functions using >> if you’re only calling the first one for its «side effects.» This will make more sense in the next tutorial, when we talk about the state monad. Operator >> is often used with the IO monad to sequence output functions, as in:

main :: IO ()
main = putStrLn "Hello " >> putStrLn "World"

which is the same as:

main :: IO ()
main = do 
    putStrLn "Hello "
    putStrLn "World"

Exercises

Ex 1. Define the WhyNot monad:

data WhyNot a = Nah | Sure a
  deriving Show

instance Monad WhyNot where
   ...

safeRoot :: Double -> WhyNot Double
safeRoot x = 
    if x >= 0 then 
      return (sqrt x)
    else
      fail "Boo!"

test :: Double -> WhyNot Double
test x = do
   y <- safeRoot x
   z <- safeRoot (y - 4)
   w <- safeRoot z
   return w


main = do
    print $ test 9
    print $ test 400

Ex 2. Define a monad instance for Trace (no need to override fail). The idea is to create a trace of execution by sprinkling you code with calls to put. The result of executing this code should look something like this:

["fact 3","fact 2","fact 1","fact 0"]
6

Hint: List concatenation is done using ++ (we’ve seen it used for string concatenation, because String is just a list of Char).

newtype Trace a = Trace ([String], a)

instance Monad Trace where
    ...

put :: Show a => String -> a -> Trace ()
put msg v = Trace ([msg ++ " " ++ show v], ())

fact :: Integer -> Trace Integer
fact n = do
   put "fact" n
   if n == 0
       then return 1
       else do
           m <- fact (n - 1)
           return (n * m)

main = let Trace (lst, m) = fact 3
       in do
           print lst
           print m

Ex 3. Instead of deriving Show, define explicit instances of the Show typeclass for Operator and Tree such that expr is displayed as:

x = (13.0 - 1.0) / y

It’s enough that you provide the implementation of the show function in the instance declaration. This function should take an Operator (or a Tree) and return a string.

data Operator = Plus | Minus | Times | Div

data Tree = SumNode Operator Tree Tree
          | ProdNode Operator Tree Tree
          | AssignNode String Tree
          | UnaryNode Operator Tree
          | NumNode Double
          | VarNode String

instance Show Operator where
    show Plus  = " + "
    ...

instance Show Tree where
    show = undefined

expr = AssignNode "x" (ProdNode Div (SumNode Minus (NumNode 13) (NumNode 1)) (VarNode "y"))

main = print expr

Ex 4 This is an example that mimics elements of OO programming. Chess pieces are implemented as separate data types: here, for simplicity, just one, Pawn. The constructor of Pawn takes the Color of the piece and its position on the board (0-7 in both dimensions). Pieces are instances of the class Piece, which declares the following functions: color, pos, and moves. The moves function takes a piece and returns a list of possible future positions after one move (without regard to other pieces, but respecting the boundaries of the board). Define both the typeclass and the instance, so that the following program works:

data Color = White | Black
    deriving (Show, Eq)

data Pawn = Pawn Color (Int, Int)

class Piece a where
   ...

instance Piece Pawn where
   ...

pieces = [Pawn White (3, 1), Pawn Black (4, 1), Pawn White (0, 7), Pawn Black (5, 0)]

main = print $ map moves pieces

The Symbolic Calculator So Far

Below is the source code for the current state of the project. Notice how lookUp and addSymbol nicely fit into the monadic scheme. I haven’t made any changes to the lexer and parser files.

{-# START_FILE Main.hs #-}
module Main where

import qualified Data.Map as M
import Lexer (tokenize)
import Parser (parse)
import Evaluator

main = do
   loop (M.fromList [("pi", pi), ("e", exp 1.0)])

loop symTab = do
   str <- getLine
   if null str
   then
      return ()
   else
      let toks  = tokenize str
          tree  = parse toks
          Ev ev = evaluate tree symTab
      in
          case ev of
          Left msg -> do
              putStrLn $ "Error: " ++ msg
              loop symTab -- use old symTab
          Right (v, symTab') -> do
              print v
              loop symTab'
{-# START_FILE Lexer.hs #-}
-- show
module Lexer (Operator(..), Token(..), tokenize, lookAhead, accept) where
-- /show
import Data.Char

data Operator = Plus | Minus | Times | Div
    deriving (Show, Eq)

data Token = TokOp Operator
           | TokAssign
           | TokLParen
           | TokRParen
           | TokIdent String
           | TokNum Double
           | TokEnd
    deriving (Show, Eq)

lookAhead :: [Token] -> Token
lookAhead [] = TokEnd
lookAhead (t:ts) = t

accept :: [Token] -> [Token]
accept [] = error "Nothing to accept"
accept (t:ts) = ts

tokenize :: String -> [Token]
tokenize [] = []
tokenize (c : cs) 
    | elem c "+-*/" = TokOp (operator c) : tokenize cs
    | c == '='  = TokAssign : tokenize cs
    | c == '('  = TokLParen : tokenize cs
    | c == ')'  = TokRParen : tokenize cs
    | isDigit c = number c cs
    | isAlpha c = identifier c cs
    | isSpace c = tokenize cs
    | otherwise = error $ "Cannot tokenize " ++ [c]

identifier :: Char -> String -> [Token]
identifier c cs = let (name, cs') = span isAlphaNum cs in
                  TokIdent (c:name) : tokenize cs'

number :: Char -> String -> [Token]
number c cs = 
   let (digs, cs') = span isDigit cs in
   TokNum (read (c : digs)) : tokenize cs'

operator :: Char -> Operator
operator c | c == '+' = Plus
           | c == '-' = Minus
           | c == '*' = Times
           | c == '/' = Div
{-# START_FILE Parser.hs #-}
-- show
module Parser (Tree(..), parse) where
-- /show
import Lexer

data Tree = SumNode Operator Tree Tree
          | ProdNode Operator Tree Tree
          | AssignNode String Tree
          | UnaryNode Operator Tree
          | NumNode Double
          | VarNode String
    deriving Show

parse :: [Token] -> Tree
parse toks = let (tree, toks') = expression toks
             in
               if null toks' 
               then tree
               else error $ "Leftover tokens: " ++ show toks'

expression :: [Token] -> (Tree, [Token])
expression toks = 
   let (termTree, toks') = term toks
   in
      case lookAhead toks' of
         (TokOp op) | elem op [Plus, Minus] -> 
            let (exTree, toks'') = expression (accept toks') 
            in (SumNode op termTree exTree, toks'')
         TokAssign ->
            case termTree of
               VarNode str -> 
                  let (exTree, toks'') = expression (accept toks') 
                  in (AssignNode str exTree, toks'')
               _ -> error "Only variables can be assigned to"
         _ -> (termTree, toks')

term :: [Token] -> (Tree, [Token])
term toks = 
   let (facTree, toks') = factor toks
   in
      case lookAhead toks' of
         (TokOp op) | elem op [Times, Div] ->
            let (termTree, toks'') = term (accept toks') 
            in (ProdNode op facTree termTree, toks'')
         _ -> (facTree, toks')

factor :: [Token] -> (Tree, [Token])
factor toks = 
   case lookAhead toks of
      (TokNum x)     -> (NumNode x, accept toks)
      (TokIdent str) -> (VarNode str, accept toks)
      (TokOp op) | elem op [Plus, Minus] -> 
            let (facTree, toks') = factor (accept toks) 
            in (UnaryNode op facTree, toks')
      TokLParen      -> 
         let (expTree, toks') = expression (accept toks)
         in
            if lookAhead toks' /= TokRParen 
            then error "Missing right parenthesis"
            else (expTree, accept toks')
      _ -> error $ "Parse error on token: " ++ show toks

{-# START_FILE Evaluator.hs #-}
module Evaluator (evaluate, Evaluator(..)) where

import Lexer
import Parser
import qualified Data.Map as M

newtype Evaluator a = Ev (Either String a)

instance Monad Evaluator where
    (Ev ev) >>= k =
        case ev of
          Left msg -> Ev (Left msg)
          Right v -> k v
    return v = Ev (Right v)
    fail msg = Ev (Left msg)

type SymTab = M.Map String Double

evaluate :: Tree -> SymTab -> Evaluator (Double, SymTab)

evaluate (SumNode op left right) symTab = do
    (lft, symTab')  <- evaluate left symTab
    (rgt, symTab'') <- evaluate right symTab'
    case op of 
        Plus  -> return (lft + rgt, symTab'')
        Minus -> return (lft - rgt, symTab'')

evaluate (ProdNode op left right) symTab = do
    (lft, symTab')  <- evaluate left symTab
    (rgt, symTab'') <- evaluate right symTab'
    case op of
        Times -> return (lft * rgt, symTab)
        Div   -> return (lft / rgt, symTab)

evaluate (UnaryNode op tree) symTab = do
    (x, symTab') <- evaluate tree symTab
    case op of
        Plus  -> return ( x, symTab')
        Minus -> return (-x, symTab')

evaluate (NumNode x) symTab = return (x, symTab)

evaluate (VarNode str) symTab = lookUp str symTab

evaluate (AssignNode str tree) symTab = do
    (v, symTab')  <- evaluate tree symTab
    (_, symTab'') <- addSymbol str v symTab'
    return (v, symTab'')

lookUp :: String -> SymTab -> Evaluator (Double, SymTab)
lookUp str symTab = 
    case M.lookup str symTab of
      Just v  -> return (v, symTab)
      Nothing -> fail ("Undefined variable " ++ str)

addSymbol :: String -> Double -> SymTab -> Evaluator ((), SymTab)
addSymbol str val symTab = 
    let symTab' = M.insert str val symTab
    in return ((), symTab')

Понравилась статья? Поделить с друзьями:

Читайте также:

  • Hdmi твч как исправить
  • Hdmi вход на телевизоре не работает как исправить
  • Hdims03 im03 ошибка e37
  • Hdims02 bx01 ошибка e98
  • Hdims02 bx01 ошибка e01

  • 0 0 голоса
    Рейтинг статьи
    Подписаться
    Уведомить о
    guest

    0 комментариев
    Старые
    Новые Популярные
    Межтекстовые Отзывы
    Посмотреть все комментарии