GHC Weekly News - 2015/07/29

bgamari - 2015-07-29

Hi *,

Welcome to the latest entry in the GHC Weekly News. Today GHC HQ met to discuss plans post-7.10.2.

GHC 7.10.2 release

GHC 7.10.2 has been released!

Feel free to grab a tarball and enjoy! See the release notes for discussion of what has changed.

As always, if you suspect that you have found a regression, don’t hesitate to open a Trac ticket. We are especially interested in performance regressions with fairly minimal reproduction cases.

GHC 7.10.2 and the text package

A few days ago a report came in of long compilation times under 7.10.2 on a program with many Text literals (#10528). This ended up being due to a change in the simplifier which caused it to perform rule rewrites on the left-hand side of other rules. While this is questionable (read “buggy”) behavior, it doesn’t typically cause trouble so long as rules are properly annotated with phase control numbers to ensure they are performed in the correct order. Unfortunately, it turns out that the rules provided by the text package for efficiently handling string literals did not include phase control annotations. This resulted in a rule from base being performed on the literal rules, which rendered the literal rules ineffective. The simplifier would then expend a great deal of effort trying to simplify the rather complex terms that remained.

Thankfully, the fix is quite straightforward: ensure that the text literal rules fire in the first simplifier phase (phase 2). This avoids interference from the base rules, allowing the literal rules to fire as expected.

This fix is now present in text-1.2.1.2. Users of GHC 7.10.2 should use this release if at all possible. Thanks to text’s maintainer, Bryan O’Sullivan, for taking time out of his vacation to help me get this new release out.

While this mis-behaviour was triggered by a bug in GHC, a similar outcome could have arisen even without this bug. This highlights the importance of including phase control annotations on INLINE and RULES pragmas: without them the compiler may apply rewrites in an order that you did not anticipate. This has also drawn attention to a few shortcomings in the current rewrite rule mechanism, which lacks the expressiveness to encode complex ordering relationships between rules. This limitation pops up in a number of places, including when trying to write rules on class-overloaded functions. Simon Peyton Jones is currently pondering possible solutions to this on #10595.
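As a rough illustration of what phase control looks like in practice, here is a minimal sketch. The names myMap and pipeline are hypothetical and this is not the rule from the text package; it only shows the common pairing of a rule that is active before phase 1 with a function that is not inlined until phase 1, so the rule gets a chance to match before the function's definition is unfolded. Simplifier phases count down (2, 1, 0).

```haskell
-- Hypothetical example: a small map-like function with a fusion rule.

-- myMap is not recursive, so GHC would otherwise be free to inline it early.
myMap :: (a -> b) -> [a] -> [b]
myMap f xs = [f x | x <- xs]
-- Do not inline myMap until phase 1; from phase 1 onwards it may be inlined.
{-# NOINLINE [1] myMap #-}

-- [~1] makes the rule active only *before* phase 1, i.e. while calls to
-- myMap are still guaranteed to be visible to the rule matcher.
{-# RULES
"myMap/myMap" [~1] forall f g xs. myMap f (myMap g xs) = myMap (f . g) xs
  #-}

-- With -O the rule may rewrite this into a single traversal; the result
-- is the same whether or not the rule fires.
pipeline :: [Int] -> [Int]
pipeline xs = myMap (+ 1) (myMap (* 2) xs)
```

Rules only fire when optimisation is enabled; compiling with -ddump-rule-firings is one way to check whether a rule such as "myMap/myMap" was applied.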

StrictData

This week we merged the long-anticipated -XStrictData extension (Phab:D1033) by Adam Sandberg Ericsson. This implements a subset of the StrictPragma proposal initiated by Johan Tibell. In particular, StrictData allows a user to specify that datatype fields should be strict-by-default on a per-module basis, greatly reducing the syntactic noise introduced by this common pattern. In addition to implementing a useful feature, the patch ended up being a nice clean-up of GHC’s handling of strictness annotations.
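A small sketch of what this looks like from the user's side, assuming a GHC new enough to ship the extension (it is not part of 7.10.2). The types Point and Cached and the helper throwsWhenForced are made up for illustration. The ~ annotation, which opts a single field back into laziness, is taken from the extension as documented in released GHC versions.

```haskell
{-# LANGUAGE StrictData #-}

import Control.Exception (SomeException, evaluate, try)

-- Under StrictData both fields are strict, as if written !Double.
data Point = Point { px :: Double, py :: Double }

-- A leading ~ keeps an individual field lazy.
data Cached = Cached { key :: Int, payload :: ~String }

-- | Evaluate a value to weak head normal form and report whether doing so
-- raised an exception.
throwsWhenForced :: a -> IO Bool
throwsWhenForced x = do
  r <- try (evaluate x)
  return $ case r of
    Left e  -> const True (e :: SomeException)
    Right _ -> False
```

Here throwsWhenForced (Point 1 undefined) reports an exception, because building a Point forces both fields, while throwsWhenForced (Cached 1 undefined) does not, because payload stays lazy.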

What remains of this proposal is the stronger -XStrict extension, which essentially makes all bindings strict-by-default. Adam has indicated that he may take up this work later this summer.

AMP-related performance regression

In late May Herbert Valerio Riedel opened Phab:D924, which removed an explicit definition for mapM in the [] Traversable instance, as well as redefined mapM_ in terms of traverse_ to bring consistency with the post-AMP world. The patch remains unmerged, however, due to a failing GHCi test case. It turns out the regression is due to the redefinition of mapM_, which uses (*>) where (>>) was once used. This tickles poor behavior in GHCi’s ByteCodeAsm module. The problem can be resolved by defining (*>) = (>>) in the Applicative Assembler instance (e.g. Phab:1097). That being said, the fact that this change has already exposed performance regressions raises doubts as to whether it is prudent.
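The shape of that workaround can be sketched on a toy state monad. Counter below is a hypothetical stand-in, not GHC's actual Assembler type; the point is only that an explicit (*>) = (>>) makes (*>)-based combinators such as traverse_ go through the same monadic sequencing that mapM_ used before.

```haskell
import Data.Foldable (traverse_)

-- Hypothetical stand-in for an assembler-style state monad.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

instance Functor Counter where
  fmap f (Counter g) = Counter $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Counter where
  pure a = Counter $ \s -> (a, s)
  Counter f <*> Counter g = Counter $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')
  -- Route (*>) through the monadic (>>) rather than the default,
  -- which is built from (<*>).
  (*>) = (>>)

instance Monad Counter where
  Counter g >>= k = Counter $ \s -> let (a, s') = g s in runCounter (k a) s'
  -- (>>) is deliberately left at its default, which is defined via (>>=).
  -- Also defining (>>) = (*>) here would make the two definitions loop.

-- Increment the counter by one.
tick :: Counter ()
tick = Counter $ \s -> ((), s + 1)

-- Run n ticks via traverse_, which sequences with (*>).
countTicks :: Int -> Int
countTicks n = snd (runCounter (traverse_ (const tick) [1 .. n]) 0)
```

Whether this kind of override helps in a given monad depends on how its (<*>) and (>>=) are implemented, so it is best treated as something to measure rather than a general rule.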

GHC Performance work

Over the last month or so I have been working on nailing down a variety of performance issues in GHC and the code it produces. This has resulted in a number of patches which in some cases dramatically improve compilation time (namely Phab:1012 and Phab:D1041). Now that 7.10.2 is out I’ll again be spending most of my time on these issues. We have heard a number of reports that GHC 7.10 has regressed on real-world programs. If you have a reproducible performance regression that you would like to see addressed, please open a Trac ticket.

Merged patches

  • Phab:D1028: Fixity declarations are now allowed for infix data constructors in GHCi (thanks to Thomas Miedema)
  • Phab:D1061: Fix a long-standing correctness issue arising when pattern matching on floating point values
  • Phab:D1085: Allow programs to run in environments lacking iconv (thanks to Reid Barton)
  • Phab:D1094: Improve code generation in integer-gmp (thanks to Reid Barton)
  • Phab:D1068: Implement support for the MO_U_Mul2 MachOp in the LLVM backend (thanks to Michal Terepeta)
  • Phab:D524: Improve runtime system allocator performance with two-step allocation (thanks to Simon Marlow)

That’s all for this time. Enjoy your week!

Cheers,

- Ben