Building experimental Unicode version of Hugs on Windows

Graham Klyne gk at ninebynine.org
Mon Jan 12 11:32:34 EST 2004


In my previous note about building Hugs under Windows using MS Visual 
Studio, I omitted a couple of details, so here are my updated notes.

Starting with:
   http://cvs.haskell.org/Hugs/downloads/snapshots/hugs98-20040109.tar.gz

Using Microsoft Visual Studio, version 6, and the supplied project file 
(hugs.dsw):

(1) Edit src/msc/options.h to include #define UNICODE_CHARS 1

(2) Run config.bat to copy options.h and config.h to the source directory

(3) Add the following files to the Visual Studio project:
     char.c
     edit.c
     errors.c
     evaluator.c
     goal.c
     machdep.c
     module.c
     observe.c
     opts.c
     script.c
     strutil.c

(4) Change the Visual Studio project to specify ../../icons for the 
resource include path.

(5) In version.c, change the definition of VERSION_STRING to some value 
that this build of Hugs will use as part of the registry key for 
saving its options (to avoid disturbing the settings of the standard Hugs 
installation).
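
As a rough illustration of step (1): the first definition below is the line 
added to options.h. Everything after it is a hypothetical sketch of how a 
compile-time flag of this kind typically takes effect. The names 
MAX_CHAR_CODE and char_code_in_range are invented for the example and are 
not taken from the Hugs source.

```c
/* Step (1): the line added to src/msc/options.h. */
#define UNICODE_CHARS 1

/* Hypothetical illustration only: MAX_CHAR_CODE and char_code_in_range
 * are invented names, not identifiers from the Hugs source. */
#if UNICODE_CHARS
#define MAX_CHAR_CODE 0x10FFFFUL   /* full Unicode code space */
#else
#define MAX_CHAR_CODE 0xFFUL       /* 8-bit characters only */
#endif

/* Returns nonzero if `code` is representable under the chosen setting. */
static int char_code_in_range(unsigned long code)
{
    return code <= MAX_CHAR_CODE;
}
```

Because the flag is tested by the preprocessor, changing it only takes 
effect after config.bat has copied the edited header and the project has 
been rebuilt.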

The build still throws up an obscure error:
[[
Install hugs binary
The system cannot find the file specified.
Error executing c:\winnt\system32\cmd.exe.
]]
but this doesn't affect the resulting executable, which ends up in a 
subdirectory of .../src/msc, depending on the option being built.

I copied the resulting executable into the Hugs installation directory, using 
a different filename (preferably related to the VERSION_STRING value used 
above).

Using a version of Hugs built as above, I've been able to run a complex 
test suite program.  My next step is to see if the Unicode support is 
sufficient to run the HXML toolbox software.

...

Attempting to load module HUnitExample.hs from the 3.01 HXML toolbox 
distribution, I'm getting:
[[
ERROR "..\hparser\Unicode.hs":116 - Hexadecimal character escape out of range
]]

The corresponding source code is this:
[[
-- |
-- test for a legal multi byte XML char

isMultiByteXmlChar      :: Unicode -> Bool
isMultiByteXmlChar i
     = ( i >= '\x00000080' && i <= '\x0000D7FF' )
       ||
       ( i >= '\x0000E000' && i <= '\x0000FFFD' )
       ||
       ( i >= '\x00010000' && i <= '\x0010FFFF' )
]]
(The final line is number 116)

Digging deeper, I see that Hugs defines MAX_UNI_CHAR as 0x10FFFD, 
but the HXML toolbox uses 0x10FFFF.  Which is correct?  For now, I'll 
edit the Hugs source, file unitable.c (not forgetting to increase the size 
of the final Char_Block entry by 2).  That seems to be accepted, but 
I'm now having some problems with the HXML source code, which I'll 
report separately.
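
On the question of which limit is correct: the Unicode code space runs from 
U+0000 to U+10FFFF.  U+10FFFD is the last private-use code point in plane 
16, and U+10FFFE and U+10FFFF are noncharacters, but both are still code 
points within the code space, and the XML 1.0 Char production includes the 
whole range #x10000-#x10FFFF.  The C sketch below shows how a range check of 
the kind that presumably produces the error behaves under each limit.  The 
function parse_hex_escape and its limit parameter are hypothetical and are 
not taken from the Hugs lexer.

```c
#include <stddef.h>

/* Parse the hex digits of a character escape (the text after "\x") and
 * check the value against `limit`.  Returns 1 and stores the code point
 * in *out on success; returns 0 if there are no digits, a non-hex
 * character appears, or the value exceeds `limit` (the "out of range"
 * case).  Hypothetical helper, not the actual Hugs lexer code. */
static int parse_hex_escape(const char *digits, unsigned long limit,
                            unsigned long *out)
{
    unsigned long value = 0;
    size_t i;

    if (digits[0] == '\0')
        return 0;
    for (i = 0; digits[i] != '\0'; i++) {
        char ch = digits[i];
        unsigned d;
        if (ch >= '0' && ch <= '9')      d = (unsigned)(ch - '0');
        else if (ch >= 'a' && ch <= 'f') d = (unsigned)(ch - 'a' + 10);
        else if (ch >= 'A' && ch <= 'F') d = (unsigned)(ch - 'A' + 10);
        else return 0;
        value = value * 16 + d;
        if (value > limit)
            return 0;   /* reject as soon as the limit is exceeded */
    }
    *out = value;
    return 1;
}
```

With a limit of 0x10FFFD the digits "0010FFFF" from line 116 are rejected, 
while with a limit of 0x10FFFF they are accepted, which matches the 
behaviour seen before and after the edit to unitable.c.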

#g


------------
Graham Klyne
For email:
http://www.ninebynine.org/#Contact


