3.3. Locale-based character encoding

The Haskell 98 Report defines values of the Char type as the code points of Unicode (or equivalently ISO/IEC 10646). However, files and other I/O streams typically consist of bytes, with each character in a text file encoded as one or more bytes. On many systems, a similar encoding is also required for interactions with the operating system. Therefore, at these points Hugs converts characters to and from sequences of bytes in a manner determined by the LC_CTYPE category of the current locale.
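To make "encoded as one or more bytes" concrete, the following is a minimal sketch of one such encoding, UTF-8, written in Haskell. It is an illustration only: the function name encodeUtf8Char is hypothetical, the choice of UTF-8 is an assumption (the encoding actually used depends on the locale), and this is not a description of how Hugs performs the conversion internally.

```haskell
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- Encode one Unicode code point as a sequence of UTF-8 bytes.
-- Illustration only: assumes a UTF-8 locale and does not reject
-- surrogate code points.
encodeUtf8Char :: Char -> [Word8]
encodeUtf8Char c
  | n < 0x80    = [fromIntegral n]                           -- 1 byte (ASCII)
  | n < 0x800   = [lead 0xC0 6, cont 0]                      -- 2 bytes
  | n < 0x10000 = [lead 0xE0 12, cont 6, cont 0]             -- 3 bytes
  | otherwise   = [lead 0xF0 18, cont 12, cont 6, cont 0]    -- 4 bytes
  where
    n = ord c
    -- Leading byte: length tag combined with the highest payload bits.
    lead tag s = fromIntegral (tag .|. (n `shiftR` s))
    -- Continuation byte: 10xxxxxx carrying six payload bits.
    cont s = fromIntegral (0x80 .|. ((n `shiftR` s) .&. 0x3F))
```

Under this encoding, encodeUtf8Char 'A' yields the single byte [65], while encodeUtf8Char '\xE9' (LATIN SMALL LETTER E WITH ACUTE) yields the two bytes [195,169]. In a locale with a different encoding, such as ISO Latin-1, the same character would instead be represented by the single byte 233.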

This conversion is not applied to the contents of files opened in binary mode. It is applied to program text, so you can use any character representable in your locale within comments and string literals. However, only ISO Latin-1 characters are permitted in identifiers.
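The binary-mode case can be sketched as follows, assuming the openBinaryFile function from System.IO is available. The helper name roundTripBytes and the file name used in the usage note are hypothetical. The sketch writes byte values through a binary-mode handle and reads them back through another; because no locale conversion is applied, each Char read back corresponds to exactly one byte.

```haskell
import System.IO

-- Write the given byte values (0..255) to a file in binary mode,
-- then read the file back in binary mode and return the byte values.
-- No locale-based conversion takes place in either direction.
roundTripBytes :: FilePath -> [Int] -> IO [Int]
roundTripBytes path bytes = do
  out <- openBinaryFile path WriteMode
  hPutStr out (map toEnum bytes)      -- each Char is written as one byte
  hClose out
  inp <- openBinaryFile path ReadMode
  raw <- hGetContents inp             -- each byte is read as one Char
  let result = map fromEnum raw
  -- Force the lazily read contents before closing the handle.
  length result `seq` hClose inp
  return result
```

For example, roundTripBytes "sample.bin" [0xC3, 0xA9] returns [195,169] regardless of the locale. Reading the same file with openFile instead would decode those bytes according to LC_CTYPE, so the result would depend on the current locale.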

The form of the locale string, and how it is set, vary between systems.