[Haskell] Exceptions in types and exception-free programming

Thu Jun 24 22:03:44 EDT 2004

S. Alexander Jacobson wrote:
> 	Also, is there a way to get the typesystem to
> 	tell you which functions may fail i.e. which
> 	functions have failMsg as an implicit parameter?

Generally speaking, that is not that easy. If we have a functional
composition (foo . bar), we wish its type to reflect the _union_ of
exceptions that can be raised in 'foo' and in 'bar'. Union types are
hard to get. That's why in Ocaml, for example, exceptions aren't the
part of a function signature.

The good news is that we already have union types in Haskell -- to a
large extent. To an extent enough to put exceptions into the
types. The following are two approaches. They aren't perfect -- but
that's a start. A better way however is to statically assure that no
exception can occur. That is possible to achieve in Haskell today --
and improve efficiency of the code in the process due to elimination
of run-time checks. The end of the message gives an example.

The first approach applies only to polymorphic functions. Also we have
to watch for monomorphic restriction (but we all know that). And it
also works in GHC only, because GHC is lazy when it comes to
constraint resolution (which is a good thing, IMHO).

> {-# OPTIONS -fglasgow-exts #-}
> {-# OPTIONS -fallow-undecidable-instances #-}
>
> module TER where
>
> class CERROR a c where
>     darn:: a -> c-> String -> b
>
> instance Show a => CERROR a c where
>     darn label _ msg = error $ "Error: " ++ (show label) ++ " " ++ msg
>   
>
> data EHead = EHead deriving Show
> data ETail = ETail deriving Show
> data EJust = EJust deriving Show
>
> myhead x@([]) = darn EHead x "head of an empty list"
> myhead (x:_) = x
>
> mytail x@([]) = darn ETail x "tail of an empty list"
> mytail (_:xs) = xs

Now, if we ask GHCi for the type of myhead, we get

	*TER> :t myhead
	myhead :: forall a. (CERROR EHead [a]) => [a] -> a

As we can see, the exception is apparent right in the function type.
We can define

> ht x = myhead $ mytail $ mytail x

	*TER> :t ht
	ht :: forall a. (CERROR ETail [a], CERROR EHead [a]) => [a] -> a

We can see that we indeed have a union of exceptions that can be
raised in different sub-expressions of ht.

A more involved example:

> myjust (Just a) = a
> myjust x = darn EJust x "Just nothing"
>
> foo x = myjust $ myhead $ mytail $ myjust x

*TER> :t foo
foo :: forall b.
       (CERROR EJust (Maybe [Maybe b]),
	CERROR ETail [Maybe b],
	CERROR EHead [Maybe b],
	CERROR EJust (Maybe b)) =>
       Maybe [Maybe b] -> b
*TER> foo (Just [Nothing,Just 2])
2
*TER> foo (Just [Nothing])
*** Exception: Error: EHead head of an empty list

The second method involves the use of implicit parameters. It works
both in GHC and in Hugs, and can work with monomorphic functions
too. But is has some other drawback (although that depends on the
point of view)

> idarn _ msg = error msg
>
> myhead1 [] = idarn ?e_head "head of an empty list"
> myhead1 (x:_) = x
>
> mytail1 x@([]) = idarn ?e_tail "tail of an empty list"
> mytail1 (_:xs) = xs
>
> ht1 x = myhead1 $ mytail1 $ mytail1 x

If we check for the type of ht1 (e.g., in Hugs), we get

	TER> :t ht1
	ht1 :: (?e_tail :: a, ?e_head :: b) => [c] -> c

That is, ht1 can fail because it tries to get either the head or the
tail of an empty list. Again, we get the true _union_ of exceptions.

Alas, we can't invoke ht1 in a simple way any more. We've got to bind the
errors.

	TER> let ?e_head = undefined; ?e_tail = undefined in ht1 [1,2]
	Program error: head of an empty list
	TER> let ?e_head = undefined; ?e_tail = undefined in ht1 [1,2,3]
	3

It seems that the best way to cope with exceptions is to assure
statically that they cannot occur at all -- in large segments of code.
We don't need to change Haskell. Haskell already can "fake it" quite
believably. 

Indeed, let us consider a segment of the code

	assert x msg = if x then (not x) else error msg

	foo x | assert (not (null x)) "non-empty-list in ..." = undefined
	foo x = ... body_foo x ...

Obviously, when the second clause is about to executed, _we_ know that
x is a non-empty list. Alas, that fact is not reflected in the type of
x. And it cannot in the above case. Indeed, what if we change the
order of clauses by mistake:

	foo x = ... body_foo x ...
	foo x | assert (not (null x)) "non-empty-list in ..." = undefined

To the typechecker, the order of clauses makes no difference. Yet it
does to the executioneer. There is however a way to make the assurances
statically certain. We should re-write the code as

	with_non_empty_list x foo

where
	with_non_empty_list:: [a] -> NonemptyList [a]

does the run-time check -- and generates the run-time error. But, if
the check succeeded, we know that the list is not empty. That
information can be encoded statically, and be used within the body of
foo.  If NonemptyList is a newtype, no run-time overhead
occurs. Moreover, within the body of foo we can use unsafe versions of
head and tail when applied to 'x' because we are statically assured
'x' is non-empty. We cannot avoid run-time checks -- yet we can
(greatly) reduce their quantity and improve confidence in the code.
The paper about number-parameterized types (being considered for JFP,
I guess?) shows a more involved example: list reversal with an
accumulator.

Here's a simple code:

> newtype NonEmpty a = NonEmpty [a] -- the NonEmpty constructor should
> -- be hidden (not exported from its module)
>
> head' (NonEmpty a) = head a -- no error can occur! Can use unsafe version
> tail' (NonEmpty a) = tail a -- no error can occur! Can use unsafe version
>
> -- trusted function: the only one that can use NonEmpty constructor.
> fork_list_len f g x = if null x then f else g (NonEmpty x)
>
> revers x  = revers' x []
>  where
>    revers' x accum = fork_list_len accum (g accum) x
>    g accum x = revers' (tail' x) ((head' x):accum)

That is a complex example as the sizes of all the lists involved
change on every iteration. The function is recursive. And yet it is
possible to type the function in Haskell _and_ statically assure that
all applications of head and tail are safe! There are some run-time
checks, in fork_list_len. But first, we had to do that check anyway,
according to our algorithm. And second, each iteration does one list
size check but two applications of list deconstructors. If functions
head' and tail' do no run-time checks (as they should not), then we
save one size check per iteration. So, it is possible to achieve both
efficiency and safety at the same time!