Haskell Proposal: Syntactic Sugar for Arrows

Ross Paterson, 26th January 1999.

John Hughes has defined a new abstract view of computation, in his (currently draft) paper Generalising Monads to Arrows. That paper uses a point-free style. This proposal introduces a notation for arrows similar to the existing do notation, with a semantics also defined by translation into the core language. Some familiarity with the arrows paper is assumed. Comments welcome.

To illustrate the similarity with do notation, here is a reformulation of the interpreter from the arrows paper (think of `proc' as an extended lambda):

    eval (Var s) =
            proc env ->
                 returnA (lookup s env)
    eval (Add e1 e2) =
            proc env ->
                 ~(Num u) <- eval e1 env
                 ~(Num v) <- eval e2 env
                 returnA (u + v)
    eval (If e1 e2 e3) =
            proc env ->
                 ~(Bl b) <- eval e1 env
                 if b then eval e2 env else eval e3 env
    eval (Lam x e) =
            proc env ->
		 returnA (Fun (proc v -> eval e ((x,v):env)))
    eval (App e1 e2) =
            proc env ->
                 ~(Fun f) <- eval e1 env
                 v <- eval e2 env
                 f v

    returnA :: Arrow a => a b b
    returnA = arr id

Though the syntax resembles do notation, the translation to core Haskell is a bit more convoluted.

Syntax

proc is a new reserved word.

Add the productions

    exp   ->  ...
          |   proc pat -> { stmts [;] }

    stmts ->  fexp aexp
          |   pat <- fexp aexp ; stmts
          |   let decllist ; stmts
          |   if exp then stmts else stmts
          |   case exp of { salt1 ; ... ; saltn [;] }

    salt  ->  pat -> stmts

Unlike the do case, it is necessary to treat conditionals specially, as will become clear in the translation. (I have omitted guards from the case for simplicity. Adding them doesn't raise any extra issues.)

Note also that a juxtaposition fexp aexp does not denote application in the above contexts. (Perhaps that's too tricky, and some extra syntax would be more honest. Another glitch is that composition of arrows reads from left to right, so perhaps pseudo-application should too.)

Translation

The following equations define a naive translation into current Haskell. There is plenty of scope for improvement, for example by assuming that arr preserves composition, dead variable elimination, simplifying patterns and so on.

The first step is to eliminate head patterns that could fail. These are replaced by failure-free patterns (defined as in Haskell 1.4):

    proc p -> { s } =
        proc x ->
            case x of
            p -> { Flatten(p) <- arr (\ p -> Flatten(p)) x ; s }
            _ -> zeroArrow x

where Flatten(p) is a failure-free pattern with the same irrefutable subpatterns as p. One way to do this is by replacing all non-unique constructors with tuples, e.g. Flatten(y:ys) could be (y,ys).

(Note that this translation requires an arrow in both ArrowChoice and ArrowZero. A Haskell-98-style definition would only eliminate the ArrowZero, while changing the Arrow class.)

From here on, the head pattern is assumed to be failure-free.

As noted above, juxtaposition of expressions is not application. It translates to the arrow version of application:

    proc p -> { f e } =
        arr (\ p -> (f, e)) >>> app

This requires that the arrow belongs to ArrowApply (and is thus equivalent to a monad). However, if f does not use the variables of p, the following may be used instead:

    proc p -> { f e } =
        arr (\ p -> e) >>> f

In this case, any arrow is allowed.

The binding construct for arrows requires some plumbing:

    proc p -> { p' <- f e ; s } =
        arr (\ x -> (x,x)) >>>
        first (proc p -> f e) >>>
	(proc (p',p) -> { s })

Local definitions are also possible, but they are not as useful here as in do expressions:

    proc p -> { let decllist ; s } =
        arr (\ p -> let decllist in (p',p)) >>>
	(proc (p',p) -> { s })

where p' is a tuple of the variables defined in decllist. Note that in contrast to do expressions, these variables are monomorphic. (This can be fixed with a messy translation using first class polymorphism.)

As noted above, conditionals require special translations. As one might expect, they require that the arrow is an ArrowChoice:

    proc p -> { if e then s1 else s2 } =
        arr (\ p -> if e then Left p else Right p) >>>
        (proc p -> { s1 }) ||| (proc p -> { s2 })

The translation of case is more of the same (assuming that ||| is right-associative):

    proc p -> { case e of { p1 -> s1 ; ... ; pn -> sn } } =
        arr (\ p ->
             case e of
             p1 -> Left (p, Flatten(p1))
             ...
             pn-1 -> Rightn-2 (Left (p, Flatten(pn-1)))
             pn -> Rightn-1 (p, Flatten(pn))) >>>
        (proc (p, Flatten(p1)) -> { s1 }) ||| ... ||| (proc (p, Flatten(pn)) -> { sn })

Of course the if-then-else could be treated as a special case:

    proc p -> { if e then s1 else s2 } =
        proc p -> { case e of { True -> s1; False -> s2 } }

though this yields a worse naive translation.