
Programming with Expat4D


Expat is an event-based parser: it reads through the document and calls handler routines whenever particular items of XML are found. Expat doesn't do anything to the XML at all. It just alerts the processing application to parsing events and gives it basic information about them.

You use Expat4D to write XML processors. It tells you what's in the document; you must arrange that information into a form you can work with. It may seem as though Expat4D isn't doing much for you, but because of the way it parses, you can process documents which are much larger than your available RAM by passing the data in smaller chunks until you reach the end of the document.
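The chunked approach can be sketched with Python's standard xml.parsers.expat module, which wraps the same Expat library. This is an analogy only, not Expat4D code: the function name count_elements and the 8-byte chunk size are made up for illustration, and the Expat4D equivalents of these calls may take different parameters.

```python
import xml.parsers.expat


def count_elements(chunks):
    """Feed a document to Expat piece by piece and count start tags."""
    counts = {}

    def on_start(name, attrs):
        # Called by Expat each time a start tag is found.
        counts[name] = counts.get(name, 0) + 1

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    for chunk in chunks:
        parser.Parse(chunk, False)  # False: more data is still to come
    parser.Parse(b"", True)         # True: this is the end of the document
    return counts


document = b"<list><item>a</item><item>b</item><item>c</item></list>"
# Split into small pieces to imitate reading a large file block by block.
pieces = [document[i:i + 8] for i in range(0, len(document), 8)]
result = count_elements(pieces)
```

Only one chunk is held in memory at a time, so the size of the document is limited by what the handlers choose to keep rather than by the parser.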

In the DOM approach to parsing XML, the entire document must be loaded into RAM before you can work with it, which in some cases isn't practical. DOM parsers often use an event-based or SAX (Simple API for XML) parser to build the object in the first place, so the demo database includes an example of implementing this type of model in 4D using ObjectTools.

When parsing documents using Expat4D, you need to create at least a few 4D methods. One is the method that creates the parser, sets up the handlers, then runs and frees the parser. The others are whatever handler routines you want to install.

You create the parser using xml_ParserCreate or xml_ParserCreateNS (for using a parser with a namespace processor). These routines return a parserID, which is your handle to the parser and a value you must not lose until you call xml_ParserFree to free the RAM afterwards. At the moment, if you lose a parser handle you will cause a memory leak, and Expat4D has no facility to track it down. Good programming practice should mean that this never happens, and I advise that you always create and free the parser in the same routine.

Once you have created the parser, you can set the handler methods for that parser. This must be done on a per-parser basis. If you pass the name of a method which does not exist, an Expat4D error is generated (see below). Once you have configured all your options for the parser, you can call xml_Parse or xml_ParseBLOB to parse your XML document. During parsing, the handler routines will be called by Expat4D whenever the parsing events occur and will receive different parameters depending on the type of handler.
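The create, configure, parse sequence described above looks roughly like this in Python's xml.parsers.expat binding. It is a sketch by analogy, not Expat4D code: parse_events is a hypothetical helper, and Python releases the parser automatically, whereas in Expat4D you must call xml_ParserFree yourself.

```python
import xml.parsers.expat


def parse_events(xml_text):
    """Create a parser, install handlers, run it, and return the events seen."""
    events = []

    # Step 1: create the parser (Expat4D: xml_ParserCreate).
    parser = xml.parsers.expat.ParserCreate()

    # Step 2: install handlers on this parser only.
    parser.StartElementHandler = lambda name, attrs: events.append(("start", name))
    parser.EndElementHandler = lambda name: events.append(("end", name))
    parser.CharacterDataHandler = lambda data: events.append(("text", data))

    # Step 3: parse; the handlers fire while this call runs.
    parser.Parse(xml_text, True)

    # Step 4: Python frees the parser when it goes out of scope.
    # In Expat4D this is where xml_ParserFree would be called.
    return events


events = parse_events("<greeting>hello</greeting>")
```

Keeping all four steps in one routine follows the advice above: the parser handle never leaves the method that created it, so it cannot be lost.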

Every handler receives the parser ID as its first parameter, so if you need to associate global variables with the parser, it can be used as an index. More importantly, in the startElementHandler you will need this parser ID if attributes were found and you want to receive them. To make attributes easy to handle, the command xml_GetAttributes can be called within a startElementHandler to return the attribute names and values in two text arrays, but in order to do so, you need to pass it the parser ID.

Other parameters passed to handler methods are specific to each type of handler. For example, the startElement handler receives the parser ID, the name of the element and the number of attributes found in the element, while the CDATA handler receives just the parser ID and the character data. For full details, there are example handlers in the demo database for each type of handler supported by Expat4D, and there are also examples in this documentation under the commands for setting the handlers.
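As a rough illustration of receiving attributes as two parallel arrays, here is a sketch using Python's xml.parsers.expat. It is not Expat4D code: Python hands the attributes straight to the start element handler, so there is no equivalent of passing the parser ID to xml_GetAttributes, and collect_attributes is a hypothetical helper name.

```python
import xml.parsers.expat


def collect_attributes(xml_text):
    """Return (element name, attribute names, attribute values) per start tag."""
    found = []

    def on_start(name, attrs):
        # With ordered_attributes set, attrs is a flat list:
        # [name1, value1, name2, value2, ...] in document order.
        names = attrs[0::2]
        values = attrs[1::2]
        found.append((name, names, values))

    parser = xml.parsers.expat.ParserCreate()
    parser.ordered_attributes = True
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return found


found = collect_attributes('<book id="42" lang="en"><title/></book>')
```

The two lists stay index-aligned, so names[i] always pairs with values[i], which is the same shape as the two text arrays described above.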

Using the information passed to you in the handlers you can easily work out the structure of an XML document and use the data as it is passed to your handlers in a variety of ways, but it is up to you to keep track of what's going on. This can be quite difficult to get your head around to start with, but hopefully the examples in the demonstration database will give you some good ideas. We will be continually adding new examples to the demo, so it's worth checking the website from time to time to see if we've added any major features. Otherwise, you can join the Expat4D-Discuss mailing list to be kept informed of all the latest releases.

Error Handling

If you pass an invalid parserID to any of the Expat4D commands, it will generate an Expat4D error. These are silent errors which you can trap by setting an error handler method using xml_SetErrorHandler. The error handler is global in scope, so you can simply install it in the startup method of your database if you want to have it on all the time. A useful tip gleaned from the ObjectTools manual is to put a breakpoint in your error handler routine, then when an error occurs, you can just step forward into where you called the plug-in command which generated the error.

Errors are also generated for invalid method names passed to handler configuration commands and in a few other cases. See the Error Codes section of the manual for more information on the error codes which can be returned.

Parsing errors are errors generated by Expat when parsing a document and are returned by xml_Parse and xml_ParseBLOB. They do not generate an Expat4D error and should be dealt with in your parser routine.
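Dealing with a parsing error inside the parser routine might look like the following sketch, again using Python's xml.parsers.expat as a stand-in. One difference to keep in mind: Python raises an ExpatError exception, whereas Expat4D returns the error from xml_Parse or xml_ParseBLOB. The helper name try_parse and the shape of its result are assumptions made for this example.

```python
import xml.parsers.expat


def try_parse(xml_text):
    """Parse a document and report any Expat parsing error to the caller."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
    except xml.parsers.expat.ExpatError as err:
        # err.code is Expat's numeric error code; ErrorString turns it
        # into Expat's own description of the problem.
        return {
            "ok": False,
            "code": err.code,
            "message": xml.parsers.expat.ErrorString(err.code),
            "line": err.lineno,
            "column": err.offset,
        }
    return {"ok": True}


good = try_parse("<a><b/></a>")
bad = try_parse("<a><b></a>")  # <b> is never closed
```

The error is handled where the parse was started, so the handlers themselves never need to know that the document was malformed.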

Using ObjectTools

ObjectTools is a great plug-in from Automated Solutions Group which every 4D developer should have in their toolbox. It has been used fairly extensively in the demonstration database, and so if you want to use the interpreted version and you don't have your own licence of ObjectTools, the demos will run extremely slowly as ObjectTools reminds you that you don't have a licence.

There are a number of reasons I like using ObjectTools in conjunction with Expat4D. Firstly, I don't like global variables. I like to compile my databases with the "All variables are typed" option in the compiler, and having global variables means that I have to constantly stick them into the COMPILER methods, which is a pain. When using Expat4D, because of the nature of the parser, you need to rely on global data of some kind so that the handlers have some context to go on as they are called during the parsing. Expat4D lets you pass a longint "UserData" value to each parser, which can be obtained in the handler by calling xml_GetUserData. ObjectTools is great because you reference an object using a longint handle, which can then easily be set as your "UserData" value in the parser. Each handler then has access to the object referenced by the UserData value, saving you the need for any global variables at all.

The xmlObj_ module works completely without global variables. Each time a start element tag is found, a new object is created and the last UserData value is stored in the new object. The handler determines what relationship the element is to the last object, sets up the new object, and then sets this as the "UserData" value.
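The idea of threading context through the handlers without globals can be sketched in Python's xml.parsers.expat as follows. This is not the xmlObj_ module: the TreeBuilder class, its dictionary layout, and the step of restoring the parent on each end tag are assumptions made for illustration. The attribute named current plays the role of the "UserData" value.

```python
import xml.parsers.expat


class TreeBuilder:
    """Builds a simple tree of elements using only per-parser state."""

    def __init__(self):
        self.root = None
        self.current = None  # stands in for the parser's UserData value

    def start(self, name, attrs):
        # The new object remembers the previous "current" object,
        # then becomes the current one itself.
        node = {"name": name, "attrs": dict(attrs), "text": "",
                "children": [], "parent": self.current}
        if self.current is None:
            self.root = node
        else:
            self.current["children"].append(node)
        self.current = node

    def text(self, data):
        if self.current is not None:
            self.current["text"] += data

    def end(self, name):
        # Assumed step: step back to the stored parent on each end tag.
        self.current = self.current["parent"]


def build_tree(xml_text):
    builder = TreeBuilder()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = builder.start
    parser.CharacterDataHandler = builder.text
    parser.EndElementHandler = builder.end
    parser.Parse(xml_text, True)
    return builder.root


tree = build_tree('<a><b kind="leaf">x</b><c/></a>')
```

Because every handler reaches its state through the builder object, two parsers can run side by side without interfering with each other, which is the benefit described above for the UserData approach.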


Developer Documentation

Creating Parsers, Destroying Parsers, Parsing Text

Configuring Handler Methods

Position and Error Reporting Functions

Miscellaneous Functions

Output Buffer Functions

Error Codes



Last Modified: 19th April 2001 at 11:11 PM