This module provides both a native Haskell solution for parsing XML
documents into a stream of events, and a set of parser combinators for
dealing with a stream of events.
As a simple example:
>>> :set -XOverloadedStrings
>>> import Conduit (runConduit, (.|))
>>> import Data.Text (Text, unpack)
>>> import Data.XML.Types (Event)
>>> data Person = Person Int Text Text deriving Show
>>> :{
let parsePerson :: MonadThrow m => ConduitT Event o m (Maybe Person)
    parsePerson = tag' "person" parseAttributes $ \(age, goodAtHaskell) -> do
      name <- content
      return $ Person (read $ unpack age) name goodAtHaskell
      where parseAttributes = (,) <$> requireAttr "age" <*> requireAttr "goodAtHaskell" <* ignoreAttrs
    parsePeople :: MonadThrow m => ConduitT Event o m (Maybe [Person])
    parsePeople = tagNoAttr "people" $ many parsePerson
    inputXml = mconcat
      [ "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
      , "<people>"
      , " <person age=\"25\" goodAtHaskell=\"yes\">Michael</person>"
      , " <person age=\"2\" goodAtHaskell=\"might become\">Eliezer</person>"
      , "</people>"
      ]
:}
>>> runConduit $ parseLBS def inputXml .| force "people required" parsePeople
[Person 25 "Michael" "yes",Person 2 "Eliezer" "might become"]
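The example above converts the age attribute with read, which raises a runtime error when the attribute is not a number. A minimal sketch of one alternative follows, assuming the input may contain malformed ages. The names parsePersonSafe and runPeople, and the choice of userError as the exception, are hypothetical and only for illustration; they are not part of this module.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Conduit (runConduit, (.|))
import Control.Monad.Catch (MonadThrow, throwM)
import Data.Conduit (ConduitT)
import qualified Data.ByteString.Lazy as L
import Data.Text (Text, unpack)
import Data.XML.Types (Event)
import Text.Read (readMaybe)
import Text.XML.Stream.Parse

data Person = Person Int Text Text deriving (Show, Eq)

-- Hypothetical variant of parsePerson: the age is converted with readMaybe,
-- and a non-numeric value is reported through throwM instead of read's error.
parsePersonSafe :: MonadThrow m => ConduitT Event o m (Maybe Person)
parsePersonSafe = tag' "person" attrs $ \(ageText, goodAtHaskell) -> do
  name <- content
  case readMaybe (unpack ageText) of
    Just age -> return (Person age name goodAtHaskell)
    -- userError is an illustrative choice; any Exception instance would do.
    Nothing  -> throwM (userError ("invalid age: " ++ unpack ageText))
  where
    attrs = (,) <$> requireAttr "age" <*> requireAttr "goodAtHaskell" <* ignoreAttrs

-- Hypothetical helper: parse a whole <people> document in any MonadThrow monad.
runPeople :: MonadThrow m => L.ByteString -> m [Person]
runPeople xml = runConduit $
  parseLBS def xml .| force "people required" (tagNoAttr "people" $ many parsePersonSafe)

main :: IO ()
main = do
  people <- runPeople "<people><person age=\"25\" goodAtHaskell=\"yes\">Michael</person></people>"
  print people
```

Because runPeople only requires MonadThrow, it can also be run in Maybe or Either SomeException, so a malformed age can be handled as an ordinary value rather than as an IO exception.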
This module also supports streaming results using yield. This allows
parser results to be processed by downstream conduits while a
particular parser (e.g. many) is still running. Without streaming
results, you have to wait until the parser has finished before you can
process the result list. Large XML files may be easier to process with
streaming results. See http://stackoverflow.com/q/21367423/2597135 for
a related discussion.
>>> import qualified Data.Conduit.List as CL
>>> :{
let parsePeople' :: MonadThrow m => ConduitT Event Person m (Maybe ())
    parsePeople' = tagNoAttr "people" $ manyYield parsePerson
:}
>>> runConduit $ parseLBS def inputXml .| force "people required" parsePeople' .| CL.mapM_ print
Person 25 "Michael" "yes"
Person 2 "Eliezer" "might become"
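Streamed results do not have to be printed; they can also be reduced as they arrive, so the full result list is never built. The following is a minimal sketch under that assumption: it counts person elements with a fold. The names countPeople and sampleXml are hypothetical, and each person is simplified to its text content by using tagIgnoreAttrs.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Conduit (runConduit, (.|))
import Control.Monad.Catch (MonadThrow)
import qualified Data.ByteString.Lazy as L
import qualified Data.Conduit.List as CL
import Text.XML.Stream.Parse

-- Hypothetical helper: count <person> elements without collecting them.
-- manyYield emits each parsed name downstream as soon as it is available,
-- and CL.fold consumes the names one at a time.
countPeople :: MonadThrow m => L.ByteString -> m Int
countPeople xml = runConduit $
  parseLBS def xml
    .| force "people required"
         (tagNoAttr "people" $ manyYield (tagIgnoreAttrs "person" content))
    .| CL.fold (\n _ -> n + 1) 0

-- Hypothetical sample input, modelled on the document used above.
sampleXml :: L.ByteString
sampleXml = mconcat
  [ "<?xml version=\"1.0\" encoding=\"utf-8\"?>"
  , "<people>"
  , " <person age=\"25\" goodAtHaskell=\"yes\">Michael</person>"
  , " <person age=\"2\" goodAtHaskell=\"might become\">Eliezer</person>"
  , "</people>"
  ]

main :: IO ()
main = countPeople sampleXml >>= print
```

The same pipeline shape works with any other consumer in place of the fold, for example one that writes each result to a file or a database.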
Previous versions of this module contained a number of more
sophisticated functions written by Aristid Breitkreuz and Dmitry
Olshansky. To keep this package simpler, those functions are being
moved to a separate package. This note will be updated with the name
of the package(s) when available.