Customizing How the Caché SAX Parser Is Used

Whenever Caché reads an XML document, it uses the Caché SAX (Simple API for XML) Parser. This topic describes your options for controlling the Caché SAX Parser.

About the Caché SAX Parser

The Caché SAX Parser is used whenever Caché reads an XML document.

It is an event-driven XML parser that reads an XML file and issues callbacks when it finds items of interest, such as the start of an XML element, start of a DTD, and so on.

(More accurately, the parser works in conjunction with a content handler, and the content handler issues the callbacks. This distinction is important only if you are customizing the SAX interface, as described in “Creating a Custom Content Handler,” later in this topic.)

The parser uses the standard Xerces-C++ library, which complies with the XML 1.0 recommendation and many associated standards. For a list of these standards, see http://xml.apache.org/xerces-c/.

Available Parser Options

You can control the behavior of the SAX parser in the following ways:

  • You can set flags to specify the kinds of validation and processing to perform.

    Note that the parser always checks whether the document is a well-formed XML document.

  • You can specify the events in which you are interested (that is, the items you want the parser to find). To do this, you specify a mask that indicates the events of interest.

  • You can provide a schema specification against which to validate the document.

  • You can disable entity resolution by using a special-purpose entity resolver.

  • You can specify a timeout period for entity resolution.

  • You can specify a more general custom entity resolver, if you need to control how the parser finds the definitions for any entities in the document.

  • If the source document is accessed at a URL, you can specify the request sent to the web server, as an instance of %Net.HttpRequest.

    For details on %Net.HttpRequest, see the book Using Caché Internet Utilities. Or see the class documentation for %Net.HttpRequest.

  • You can specify a custom content handler.

  • You can use HTTPS.

The available options depend on how you are using the Caché SAX Parser, as summarized in the following table:

SAX Parser Options in %XML Classes
| Option | %XML.Reader | %XML.TextReader | %XML.XPATH.Document | %XML.SAX.Parser |
|---|---|---|---|---|
| Specifying parser flags | supported | supported | supported | supported |
| Specifying which parsing events are interesting (for example, start of element, end of element, comments) | not supported | supported | not supported | supported |
| Specifying a schema specification | supported | supported | supported | supported |
| Disabling entity resolution or otherwise customizing entity resolution | supported | supported | supported | supported |
| Specifying a custom HTTP request (if parsing a URL) | not supported | supported | not supported | supported |
| Specifying the content handler | not supported | not supported | not supported | supported |
| Parse documents at HTTPS locations | supported | not supported | not supported | supported |
| Resolve entities at HTTPS locations | not supported | not supported | not supported | supported |

Specifying the Parser Options

You specify the parser behavior differently depending on how you are using the Caché SAX Parser:

  • If you are using %XML.Reader, you can set the Timeout, SAXFlags, SAXSchemaSpec, and EntityResolver properties of the reader instance. For example:

       #include %occInclude
       #include %occSAX
       // set the parser options we want
       // an ObjectScript command must fit on a single line
       Set flags = $$$SAXVALIDATION + $$$SAXNAMESPACES + $$$SAXNAMESPACEPREFIXES + $$$SAXVALIDATIONSCHEMA
       
       Set reader=##class(%XML.Reader).%New()
       Set reader.SAXFlags=flags
    
    These macros are defined in the %occSAX.inc include file.

  • In other cases, you specify arguments of the method you are using. For example:

       #include %occInclude
       #include %occSAX
       
       //set the parser options we want
       Set flags = $$$SAXVALIDATION + $$$SAXNAMESPACES + $$$SAXNAMESPACEPREFIXES + $$$SAXVALIDATIONSCHEMA
       
       Set status=##class(%XML.TextReader).ParseFile(myfile,.doc,,flags)
    
    For details on the argument lists for the relevant methods, see “Argument Lists for the SAX Parsing Methods,” later in this topic, or see the class documentation.

Setting the Parser Flags

The %occSAX.inc include file lists the flags that you can use to control the validation performed by the Xerces parser. The basic flags are as follows:

  • $$$SAXVALIDATION — Specifies whether to perform schema validation. If this flag is on (the default), all validation errors are reported.

  • $$$SAXNAMESPACES — Specifies whether to recognize namespaces. If this flag is on (the default), the parser processes namespaces. If this flag is off, Caché causes the localname of the element to be an empty string on the startElement() callback of the %XML.SAX.ContentHandler.

  • $$$SAXNAMESPACEPREFIXES — Specifies whether to process namespace prefixes. If this flag is on, the parser reports the original prefixed names and attributes used for namespace declarations. By default, this flag is off.

  • $$$SAXVALIDATIONDYNAMIC — Specifies whether to perform validation dynamically. If this flag is on (the default), validation is performed only if a grammar is specified.

  • $$$SAXVALIDATIONSCHEMA — Specifies whether to perform validation against a schema. If this flag is on (the default), validation is performed against the given schema, if any.

  • $$$SAXVALIDATIONSCHEMAFULLCHECKING — Specifies whether to perform full schema constraint checking, including time-consuming or memory-intensive checking. If this flag is on, all constraint checking is performed. By default, this flag is off.

  • $$$SAXVALIDATIONREUSEGRAMMAR — Specifies whether to cache the grammar for reuse in later parses within the same Caché process. By default, this flag is off.

  • $$$SAXVALIDATIONPROHIBITDTDS — Special flag that causes the parser to throw an error if it encounters a DTD. Use this flag if you need to prevent processing of DTDs. To use this flag, you must explicitly add the value $$$SAXVALIDATIONPROHIBITDTDS to the parse flags passed to the various parsing methods of %XML.SAX.Parser.

The following additional flags provide useful combinations of the basic flags:

  • $$$SAXDEFAULTS — Equivalent to the SAX defaults.

  • $$$SAXFULLDEFAULT — Equivalent to the SAX defaults, plus the option to process namespace prefixes.

  • $$$SAXNOVALIDATION — Do not perform schema validation but do recognize namespaces and namespace prefixes. Note that the SAX parser always checks whether the document is a well-formed XML document.

For details, see %occSAX.inc, which also provides links to further details on these kinds of validation.

The following fragment shows how you can combine parser options:

...
#include %occInclude
#include %occSAX
...
 // set the parser options we want (the command must fit on a single line)
 set opt = $$$SAXVALIDATION + $$$SAXNAMESPACES + $$$SAXNAMESPACEPREFIXES + $$$SAXVALIDATIONSCHEMA
...
  set status=##class(%XML.TextReader).ParseFile(myfile,.doc,,opt)
  //check status
  if $$$ISERR(status) {do $System.Status.DisplayError(status) quit}
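
As a further sketch, the following fragment starts from the default flags and additionally rejects any document that contains a DTD. It assumes that myfile holds the path of the document and that the flags are forwarded to the parser unchanged by %XML.TextReader. The use of $ZBOOLEAN with operation 7 (bitwise OR) instead of + is a defensive choice made here, not something the include file requires; it prevents a flag that is already part of a combination from being counted twice.

```objectscript
#include %occInclude
#include %occSAX
 // default validation and processing, plus "prohibit DTDs"
 // $ZBOOLEAN(a,b,7) returns the bitwise OR of a and b
 set flags = $ZBOOLEAN($$$SAXDEFAULTS,$$$SAXVALIDATIONPROHIBITDTDS,7)
 set status = ##class(%XML.TextReader).ParseFile(myfile,.doc,,flags)
 // a document that contains a DTD is expected to produce an error status here
 if $$$ISERR(status) {do $System.Status.DisplayError(status) quit}
```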

Specifying the Event Mask

The %occSAX.inc include file also lists the flags that you use to specify which event callbacks to process. For performance reasons, it is desirable to process only the callbacks that you need. You may or may not need to specify the mask, depending on which class you use to call the Caché SAX Parser.

  • For %XML.TextReader, the default is $$$SAXCONTENTEVENTS. All event callbacks except for comments are processed.

  • For %XML.SAX.Parser, the default is 0, which means that the parser calls the Mask() method of the content handler. In turn, this method computes the mask by detecting all the event callbacks that you have customized. Only those events are processed. You would use %XML.SAX.Parser if you had created a custom content handler; see “Creating a Custom Content Handler,” later in this topic.

Basic Flags

The basic flags are as follows:

  • $$$SAXSTARTDOCUMENT — Instructs the parser to issue a callback when it starts the document.

  • $$$SAXENDDOCUMENT — Instructs the parser to issue a callback when it ends the document.

  • $$$SAXSTARTELEMENT — Instructs the parser to issue a callback when it finds the start of an element.

  • $$$SAXENDELEMENT — Instructs the parser to issue a callback when it finds the end of an element.

  • $$$SAXCHARACTERS — Instructs the parser to issue a callback when it finds characters.

  • $$$SAXPROCESSINGINSTRUCTION — Instructs the parser to issue a callback when it finds a processing instruction.

  • $$$SAXSTARTPREFIXMAPPING — Instructs the parser to issue a callback when it finds the start of a prefix mapping.

  • $$$SAXENDPREFIXMAPPING — Instructs the parser to issue a callback when it finds the end of a prefix mapping.

  • $$$SAXIGNORABLEWHITESPACE — Instructs the parser to issue a callback when it finds ignorable whitespace. This applies only if the document has a DTD and validation is enabled.

  • $$$SAXSKIPPEDENTITY — Instructs the parser to issue a callback when it finds a skipped entity.

  • $$$SAXCOMMENT — Instructs the parser to issue a callback when it finds a comment.

  • $$$SAXSTARTCDATA — Instructs the parser to issue a callback when it finds the start of a CDATA section.

  • $$$SAXENDCDATA — Instructs the parser to issue a callback when it finds the end of a CDATA section.

  • $$$SAXSTARTDTD — Instructs the parser to issue a callback when it finds the start of a DTD.

  • $$$SAXENDDTD — Instructs the parser to issue a callback when it finds the end of a DTD.

  • $$$SAXSTARTENTITY — Instructs the parser to issue a callback when it finds the start of an entity.

  • $$$SAXENDENTITY — Instructs the parser to issue a callback when it finds the end of an entity.

Convenient Combination Flags

The following additional flags provide useful combinations of the basic flags:

  • $$$SAXCONTENTEVENTS — Instructs the parser to issue a callback for any event that contains “content.”

  • $$$SAXLEXICALEVENT — Instructs the parser to issue a callback for any lexical event.

  • $$$SAXALLEVENTS — Instructs the parser to issue callbacks for all events.

Combining Flags into a Single Mask

The following fragment shows how you can combine multiple flags into a single mask:

...
#include %occInclude
#include %occSAX
...
 // set the mask options we want
 set mask = $$$SAXSTARTDOCUMENT + $$$SAXENDDOCUMENT + $$$SAXSTARTELEMENT + $$$SAXENDELEMENT + $$$SAXCHARACTERS
...
 // create a TextReader object (doc) by reference
 set status = ##class(%XML.TextReader).ParseFile(myfile,.doc,,,mask)

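
A combination flag can be extended in the same way. The following sketch adds comment events to the content events and then lists the comments that the reader reports. The variable myfile is an assumption, and the bitwise OR via $ZBOOLEAN is used here only so that the result stays correct even if the two values happened to share a bit.

```objectscript
#include %occInclude
#include %occSAX
 // content events plus comments
 set mask = $ZBOOLEAN($$$SAXCONTENTEVENTS,$$$SAXCOMMENT,7)
 set status = ##class(%XML.TextReader).ParseFile(myfile,.doc,,,mask)
 if $$$ISERR(status) {do $System.Status.DisplayError(status) quit}
 // walk the nodes and show only the comments
 while doc.Read() {
     if doc.NodeType="comment" {write !,"Comment: ",doc.Value}
 }
```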

Specifying a Schema Document

You can specify a schema specification against which to validate the document source. Specify a string that contains a comma-separated list of namespace/URL pairs:

"namespace URL,namespace URL,namespace URL,..."

Here namespace is the XML namespace (not a namespace prefix) and URL is a URL that gives the location of the schema document for that namespace. There is a single space character between the namespace and URL values. For example, the following shows a schema specification with a single namespace:

"http://www.myapp.org http://localhost/myschemas/myapp.xsd"

The following shows a schema specification with two namespaces:

"http://www.myapp.org http://localhost/myschemas/myapp.xsd,http://www.other.org http://localhost/myschemas/other.xsd"
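
The following sketch shows one way to build such a string from its parts and pass it to the parser. The namespaces and URLs are the illustrative ones used above, and myfile is assumed to hold the path of the document. With %XML.Reader, the string goes into the SAXSchemaSpec property; with %XML.TextReader, it is assumed to be the sixth argument of ParseFile().

```objectscript
#include %occInclude
 // each pair is "namespace URL", separated by a single space
 set pair1 = "http://www.myapp.org"_" "_"http://localhost/myschemas/myapp.xsd"
 set pair2 = "http://www.other.org"_" "_"http://localhost/myschemas/other.xsd"
 // pairs are separated by commas, with no extra whitespace
 set spec = pair1_","_pair2

 // option 1: %XML.Reader, set the property before opening the document
 set reader = ##class(%XML.Reader).%New()
 set reader.SAXSchemaSpec = spec
 set status = reader.OpenFile(myfile)
 if $$$ISERR(status) {do $System.Status.DisplayError(status) quit}

 // option 2: %XML.TextReader, pass the string as an argument
 set status = ##class(%XML.TextReader).ParseFile(myfile,.doc,,,,spec)
 if $$$ISERR(status) {do $System.Status.DisplayError(status) quit}
```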

Disabling Entity Resolution

Even when you set SAX flags to disable validation, the SAX parser still attempts to resolve external entities, which can be time-consuming, depending on their locations.

The class %XML.SAX.NullEntityResolver implements an entity resolver that always returns an empty stream. Use this class if you want to disable entity resolution. Specifically, when you read the XML document, use an instance of %XML.SAX.NullEntityResolver as the entity resolver. For example:

   Set resolver=##class(%XML.SAX.NullEntityResolver).%New()
   Set reader=##class(%XML.Reader).%New()
   Set reader.EntityResolver=resolver
   
   Set status=reader.OpenFile(myfile)
   ...

Important:

Because this change disables all resolution of external entities, this technique also disables all external DTD and schema references in your XML document.
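
The same resolver can be supplied when the parsing method takes its options as arguments. The following sketch assumes that myfile holds the path of the document and passes the resolver as the third argument of ParseFile() in %XML.TextReader, the position used for the resolver elsewhere in this topic.

```objectscript
#include %occInclude
 // a resolver that returns an empty stream for every external entity
 set resolver = ##class(%XML.SAX.NullEntityResolver).%New()
 set status = ##class(%XML.TextReader).ParseFile(myfile,.doc,resolver)
 if $$$ISERR(status) {do $System.Status.DisplayError(status) quit}
```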

Performing Custom Entity Resolution

Your XML document may contain references to external DTDs or other entities. By default, Caché attempts to find the source documents for these entities and resolve them. To control how Caché resolves external entities, use the following procedure:

  1. Define an entity resolver class.

    This class must extend the %XML.SAX.EntityResolver class and must implement the resolveEntity() method, which has the following signature:

    method resolveEntity(publicID As %Library.String, systemID As %Library.String) as %Library.Integer

    This method is invoked each time the XML processor finds a reference to an external entity (such as a DTD); here publicID and systemID are the Public and System identifier strings for that entity.

    The method should fetch the entity or document as a stream, wrap the stream in an instance of %XML.SAX.StreamAdapter, and return that instance. This class provides the necessary methods that are used to determine characteristics of the stream.

    If the entity cannot be resolved, the method should return $$$NULLOREF to indicate this to the SAX parser.

    Important:

    Despite the fact that the method signature indicates that the return value is %Library.Integer, the method should return an instance of %XML.SAX.StreamAdapter or a subclass of that class.

    Also, identifiers that reference external entities are always passed to the resolveEntity() method as specified in the document. Particularly, if such an identifier uses a relative URL, the identifier is passed as a relative URL, which means that the actual location of the referencing document is not passed to the resolveEntity() method, and the entity cannot be resolved. In such scenarios, use the default entity resolver rather than a custom one.

    For an example of an entity resolver class, see the source code for %XML.SAX.EntityResolver.

  2. When you read an XML document, do the following:

    1. Create an instance of your entity resolver class.

    2. Use that instance when you read the XML document, as described in “Specifying the Parser Options,” earlier in this topic.

Also see the previous section, “Disabling Entity Resolution”; note that %XML.SAX.NullEntityResolver (discussed in that section) is a subclass of %XML.SAX.EntityResolver.

Example 1

For example, consider the following XML document:

<?xml version="1.0" ?>
<!DOCTYPE html SYSTEM  "c://temp/html.dtd">
<html>
<head><title></title></head>
<body>
<p>Some &lt; xhtml-content &gt; with custom entities &entity1; and &entity2;.</p>
<p>Here is another paragraph with &entity1; again.</p>
</body></html>

This document uses the following DTD:

<!ENTITY entity1
         PUBLIC "-//WRC//TEXT entity1//EN"
         "http://www.intersystems.com/xml/entities/entity1">
<!ENTITY entity2
         PUBLIC "-//WRC//TEXT entity2//EN"
         "http://www.intersystems.com/xml/entities/entity2">
<!ELEMENT html (head, body)>
<!ELEMENT head (title)>
<!ELEMENT title (#PCDATA)>
<!ELEMENT body (p)>
<!ELEMENT p (#PCDATA)>

To read this document, you would need a custom entity resolver like the following:

Class CustomResolver.Resolver Extends %XML.SAX.EntityResolver
{

Method resolveEntity(publicID As %Library.String, systemID As %Library.String) As %Library.Integer
{
    Try {
        Set res=##class(%Stream.TmpBinary).%New()
        //check if we are here to resolve a custom entity
        If systemID="http://www.intersystems.com/xml/entities/entity1" 
        {
            Do res.Write("Value for entity1")
            Set return=##class(%XML.SAX.StreamAdapter).%New(res)
        }
        Elseif systemID="http://www.intersystems.com/xml/entities/entity2" 
        {
            Do res.Write("Value for entity2")
            Set return=##class(%XML.SAX.StreamAdapter).%New(res)
        }
        Else
        {
            //otherwise call the default resolver
            Set res=##class(%XML.SAX.EntityResolver).%New()
            Set return=res.resolveEntity(publicID,systemID)
        }
    }
    Catch 
    {
        Set return=$$$NULLOREF
    }
    Quit return
}

}

The following class contains a demo method that parses the file shown earlier and uses this custom resolver:

Include (%occInclude, %occSAX)

Class CustomResolver.ParseFileDemo
{

ClassMethod ParseFile() 
{
    Set res= ##class(CustomResolver.Resolver).%New()  
    Set file="c:/temp/html.xml"
    Set parsemask=$$$SAXALLEVENTS+$$$SAXERROR
    Set status=##class(%XML.TextReader).ParseFile(file,.textreader,res,,parsemask,,0)
    If $$$ISERR(status) {Do $system.OBJ.DisplayError(status) Quit }

    Write !,"Parsing the file ",file,! 
    Write "Custom entities in this file:"
    While textreader.Read()
    {
        If textreader.NodeType="entity"{
            Write !, "Node:", textreader.seq
            Write !,"    name: ", textreader.Name
            Write !,"    value: ", textreader.Value
        }
    }

}

}

The following shows the output of this method, in a Terminal session:

GXML>d ##class(CustomResolver.ParseFileDemo).ParseFile()
 
Parsing the file c:/temp/html.xml
Custom entities in this file:
Node:13
    name: entity1
    value: Value for entity1
Node:15
    name: entity2
    value: Value for entity2
Node:21
    name: entity1
    value: Value for entity1

Example 2

For example, suppose that you need to read an XML document that contains the following (assuming that c:\cachesys is your Caché installation directory; see “Default Caché Installation Directory” in the Caché Installation Guide for the actual location on your system):

<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.1.2//EN"
 "c:\cachesys\csp\docbook\doctypes\docbook\docbookx.dtd">

In this case, the resolveEntity() method would be invoked with publicID set to -//OASIS//DTD DocBook XML V4.1.2//EN and systemID set to c:\cachesys\csp\docbook\doctypes\docbook\docbookx.dtd.

The resolveEntity() method determines the correct source for the external entity, retrieves it as a stream, wraps the stream in an instance of %XML.SAX.StreamAdapter, and returns that instance. The XML parser reads the entity definition from this specialized stream.

For an example, refer to the %XML.Catalog and %XML.CatalogResolver classes included in the Caché library. The %XML.Catalog class defines a simple database that associates public and system identifiers with URLs. The %XML.CatalogResolver class is an entity resolver class that uses this database to find the URL for a given identifier. The %XML.Catalog class can load its database from an SGML-style catalog file; this file maps identifiers to URLs in a standard format.

Creating a Custom Content Handler

You can create a custom content handler for your own needs, if you call the Caché SAX Parser directly.

Overview of Creating Custom Content Handlers

To customize how the Caché SAX Parser imports and handles XML, create and use a custom SAX content handler. Specifically, create a subclass of %XML.SAX.ContentHandler. Then, in the new class, override any of the default methods to perform the actions that are required. Use the new content handler as an argument when you parse an XML document; to do this, you use the parsing methods of the %XML.SAX.Parser class.

This operation is illustrated in the following diagram:

(Diagram: sax.jpg)

The process for creating and using a custom import mechanism is as follows:

  1. Create a class that extends %XML.SAX.ContentHandler.

  2. In that class, include the methods that you wish to override and provide new definitions as needed.

  3. Write a class method that reads an XML document by using one of the parsing methods of the %XML.SAX.Parser class, namely ParseFile(), ParseStream(), ParseString(), or ParseURL().

    When you call the parsing method, specify your custom content handler as an argument.

Customizable Methods of the SAX Content Handler

The %XML.SAX.ContentHandler class automatically executes certain methods at specific times. By overriding them, you can customize the behavior of your content handler.

Responding to Events

The %XML.SAX.ContentHandler class parses an XML file and generates events when it reaches particular points in the XML file. Depending on the event, a different method is executed. These methods are as follows:

  • OnPostParse() — Triggered when XML parsing is complete.

  • characters() — Triggered by character data.

  • comment() — Triggered by comments.

  • endCData() — Triggered by the end of a CDATA section.

  • endDocument() — Triggered by the end of the document.

  • endDTD() — Triggered by the end of a DTD.

  • endElement() — Triggered by the end of an element.

  • endEntity() — Triggered by the end of an entity.

  • endPrefixMapping() — Triggered by the end of a namespace prefix mapping.

  • ignorableWhitespace() — Triggered by ignorable whitespace in element content.

  • processingInstruction() — Triggered by an XML processing instruction.

  • skippedEntity() — Triggered by a skipped entity.

  • startCData() — Triggered by the beginning of a CDATA section.

  • startDocument() — Triggered by the beginning of the document.

  • startDTD() — Triggered by the beginning of a DTD.

  • startElement() — Triggered by the start of an element.

  • startEntity() — Triggered by the start of an entity.

  • startPrefixMapping() — Triggered by the start of a namespace prefix mapping.

These methods are empty by default, and you can override them in your custom content handler. For information on their expected argument lists and return values, see the class documentation for %XML.SAX.ContentHandler.

Handling Errors

The %XML.SAX.ContentHandler class also executes methods when it encounters certain errors:

  • error() — Triggered by a recoverable parser error.

  • fatalError() — Triggered by a fatal XML parsing error.

  • warning() — Triggered by notification of a parser warning.

These methods are empty by default, and you can override them in your custom content handler. For information on their expected argument lists and return values, see the class documentation for %XML.SAX.ContentHandler.
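
As an illustration, the following sketch overrides the three error callbacks to log each message and to count recoverable errors. The class name MyApp.ErrorLoggingHandler and the ErrorCount property are hypothetical, and the argument names and types are assumptions; check them against the class documentation for %XML.SAX.ContentHandler. Each method calls ##super() first so that whatever the superclass does is preserved.

```objectscript
Class MyApp.ErrorLoggingHandler Extends %XML.SAX.ContentHandler
{

/// Number of recoverable errors reported so far (hypothetical property)
Property ErrorCount As %Integer [ InitialExpression = 0 ];

Method warning(warning As %Library.String)
{
    Do ##super(warning)
    Write !,"Warning: ",warning
}

Method error(error As %Library.String)
{
    Do ##super(error)
    Set ..ErrorCount = ..ErrorCount + 1
    Write !,"Error: ",error
}

Method fatalError(fatalerror As %Library.String)
{
    Do ##super(fatalerror)
    Write !,"Fatal error: ",fatalerror
}

}
```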

Computing the Event Mask

When you call the Caché SAX Parser (via the %XML.SAX.Parser class), you can specify a mask argument that indicates which callbacks are interesting. If you do not specify a mask argument, the parser calls the Mask() method of the content handler. This method returns an integer that specifies the composite mask that corresponds to your overridden methods of the content handler.

For example, suppose that you create a custom content handler that contains new versions of the startElement() and endElement() methods. In this case, the Mask() method returns a numeric value that is equivalent to the sum of $$$SAXSTARTELEMENT and $$$SAXENDELEMENT, the flags that correspond to these two events. If you do not specify a mask argument to the parsing method, the parser calls the Mask() method of your content handler and thus processes only those two events.
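
The scenario just described can be sketched as follows. The class name MyApp.TwoEventHandler and the ShowMask() helper are hypothetical, and the sketch assumes that Mask() can be called as an instance method with no arguments; verify this against the class documentation before relying on it.

```objectscript
Include %occSAX

Class MyApp.TwoEventHandler Extends %XML.SAX.ContentHandler
{

Method startElement(uri As %Library.String, localname As %Library.String, qname As %Library.String, attrs As %Library.List)
{
    Write !,"Start: ",localname
}

Method endElement(uri As %Library.String, localname As %Library.String, qname As %Library.String)
{
    Write !,"End: ",localname
}

/// Hypothetical helper: show the computed mask next to the expected sum
ClassMethod ShowMask()
{
    Set handler = ..%New()
    Set expected = $$$SAXSTARTELEMENT + $$$SAXENDELEMENT
    Write !,"Computed mask: ",handler.Mask()
    Write !,"Expected mask: ",expected
}

}
```

If the description above holds, the two values written by ShowMask() match.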

Other Useful Methods

The %XML.SAX.ContentHandler class provides other methods that are useful in special situations:

  • LocatePosition() — Returns, by reference, two arguments that indicate the current position in the parsed document. The first indicates the line number, and the second indicates the line offset.

  • PushHandler() — Pushes a new content handler on the stack. All subsequent callbacks from SAX go to this new content handler, until this handler is finished processing.

    You use this method if you are parsing a document of one type and you encounter a segment of XML that you want to parse in a different way. In this case, when you detect the segment that you want to handle differently, you call the PushHandler() method, which creates a new content handler instance. All callbacks go to this content handler until you call PopHandler() to return the previous content handler.

  • PopHandler() — Returns to the previous content handler on the stack.

These methods are final and cannot be overridden.
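
For instance, a handler can call LocatePosition() from within a callback to report where each element starts. The following is a minimal sketch; the class name MyApp.PositionHandler is hypothetical, and the two by-reference arguments are assumed to be the line number and the line offset, in that order, as described above.

```objectscript
Class MyApp.PositionHandler Extends %XML.SAX.ContentHandler
{

Method startElement(uri As %Library.String, localname As %Library.String, qname As %Library.String, attrs As %Library.List)
{
    // both arguments are passed by reference and filled in by the method
    Do ..LocatePosition(.line,.offset)
    Write !,"Element ",localname," at line ",line,", offset ",offset
}

}
```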

Argument Lists for the SAX Parsing Methods

To specify a document source, you use the ParseFile(), ParseStream(), ParseString(), or ParseURL() method of the %XML.SAX.Parser class. In any case, the source document must be a well-formed XML document; that is, it must obey the basic rules of XML syntax. The complete argument list is as follows, in order:

  1. pFilename, pStream, pString, or pURL — The document source.

  2. pHandler — A content handler, which is an instance of the %XML.SAX.ContentHandler class.

  3. pResolver — An entity resolver to use when parsing the source. See “Performing Custom Entity Resolution,” earlier in this topic.

  4. pFlags — Flags to control the validation and processing performed by the SAX parser. See “Setting the Parser Flags,” earlier in this topic.

  5. pMask — A mask to specify which items are of interest in the XML source. Usually you do not need to specify this argument, because for the parsing methods of %XML.SAX.Parser, the default mask is 0. This means that the parser calls the Mask() method of the content handler. That method computes the mask by detecting (during compilation) all the event callbacks that you customized in the content handler. Only those event callbacks are processed. However, if you want to specify the mask, see “Specifying the Event Mask,” earlier in this topic.

  6. pSchemaSpec — A schema specification, against which to validate the document source. This argument is a string that contains a comma-separated list of namespace/URL pairs:

    "namespace URL,namespace URL"

    Here namespace is the XML namespace used for the schema and URL is a URL that gives the location of the schema document. There is a single space character between the namespace and URL values.

  7. pHttpRequest (For the ParseURL() method only) — The request to the web server, as an instance of %Net.HttpRequest.

    For details on %Net.HttpRequest, see the book Using Caché Internet Utilities. Or see the class documentation for %Net.HttpRequest.

  8. pSSLConfiguration — Configuration name of a client SSL/TLS configuration.

    See “Using HTTPS,” later in this topic.

Note:

Notice that this argument list is slightly different from that of the parse methods of the %XML.TextReader class. For one difference, %XML.TextReader does not provide an option to specify a custom content handler.

A SAX Handler Example

Suppose you want a list of all the XML elements that appear in a file. To do this, you need simply to note every start element. Then the process is as follows:

  1. Create a class, here called MyApp.MyHandler, which extends %XML.SAX.ContentHandler:

    Class MyApp.MyHandler Extends %XML.SAX.ContentHandler
    {
    }

  2. Override the startElement() method with the following content:

    Class MyApp.MyHandler extends %XML.SAX.ContentHandler
    {
    // ...
    
    Method startElement(uri as %String, localname as %String, 
                 qname as %String, attrs as %List)
    {
        //we have found an element
        write !,"Element: ",localname
    }
    
    }
    
  3. Add a class method to the MyApp.MyHandler class that reads and parses an external file:

    Class MyApp.MyHandler extends %XML.SAX.ContentHandler
    {
    // ...
    ClassMethod ReadFile(file as %String) as %Status
    {
        //create an instance of this class
        set handler=..%New()
    
        //parse the given file using this instance
        set status=##class(%XML.SAX.Parser).ParseFile(file,handler)
    
        //quit with status
        quit status
    }
    }

    Note that this is a class method because it is invoked in an application to perform its processing. This method does the following:

    1. It creates an instance of a content handler object:

          set handler=..%New()

    2. It invokes the ParseFile() method of the %XML.SAX.Parser class. This validates and parses the document (specified by the file argument) and invokes the various event handling methods of the content handler object:

          set status=##class(%XML.SAX.Parser).ParseFile(file,handler)

      Each time an event occurs while the parser parses the document (such as a start or end element), the parser invokes the appropriate method in the content handler object. In this example, the only overridden method is startElement(), which then writes out element names. For other events, such as reaching end elements, nothing happens (the default behavior).

    3. When the ParseFile() method reaches the end of the file, it returns. The handler object goes out of scope and is automatically removed from memory.

  4. At the appropriate point in the application, invoke the ReadFile() method, passing it the file to parse:

     Do ##class(MyApp.MyHandler).ReadFile(filename)

    Where filename is the path of the file being read.

For instance, if the content of the file is as follows:

<?xml version="1.0" encoding="UTF-8"?>
<Root>
  <Person>
    <Name>Edwards,Angela U.</Name>
    <DOB>1980-04-19</DOB>
    <GroupID>K8134</GroupID>
    <HomeAddress>
      <City>Vail</City>
      <Zip>94059</Zip>
    </HomeAddress>
    <Doctors>
      <Doctor>
        <Name>Uberoth,Wilma I.</Name>
      </Doctor>
      <Doctor>
        <Name>Wells,George H.</Name>
      </Doctor>
    </Doctors>
  </Person>
</Root>

Then the output of this example is as follows:

Element: Root
Element: Person
Element: Name
Element: DOB
Element: GroupID
Element: HomeAddress
Element: City
Element: Zip
Element: Doctors
Element: Doctor
Element: Name
Element: Doctor
Element: Name
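
A small variation shows how overriding a second callback lets the handler keep state between events. The following sketch indents each element name by its nesting depth; the class name MyApp.TreeHandler and the Depth property are hypothetical, and the endElement() argument list is an assumption to check against the class documentation.

```objectscript
Class MyApp.TreeHandler Extends %XML.SAX.ContentHandler
{

/// Current nesting depth (hypothetical property)
Property Depth As %Integer [ InitialExpression = 0 ];

Method startElement(uri As %Library.String, localname As %Library.String, qname As %Library.String, attrs As %Library.List)
{
    // $Justify("",n) yields n spaces, so deeper elements are indented further
    Write !,$Justify("",..Depth*2),localname
    Set ..Depth = ..Depth + 1
}

Method endElement(uri As %Library.String, localname As %Library.String, qname As %Library.String)
{
    Set ..Depth = ..Depth - 1
}

}
```

Because both startElement() and endElement() are overridden, the computed event mask covers both events, and the parser is expected to invoke both callbacks when no mask argument is given.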

Using HTTPS

%XML.SAX.Parser supports HTTPS. That is, you can use this class to do the following:

  • (For ParseURL()) Parse XML documents served at HTTPS locations.

  • (For all parsing methods) Resolve entities at HTTPS locations.

In all cases, if any of these items are served at an HTTPS location, do the following:

  1. Use the Management Portal to create an SSL/TLS configuration that contains the details of the needed connection. For information, see the topic “Using SSL/TLS with Caché” in the Caché Security Administration Guide.

    This is a one-time step.

  2. When you invoke the applicable parsing method of %XML.SAX.Parser, specify the pSSLConfiguration argument.
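
As a sketch of the second step, the following fragment parses a document at an HTTPS location. The URL and the configuration name MySSLConfig are placeholders, and MyApp.MyHandler stands for any content handler class. The call follows the argument order given in “Argument Lists for the SAX Parsing Methods,” so for ParseURL() the configuration name is assumed to be the eighth argument.

```objectscript
#include %occInclude
 set handler = ##class(MyApp.MyHandler).%New()
 // placeholder URL; replace with the actual location
 set url = "https://www.example.com/data/sample.xml"
 // arguments 3 through 7 (resolver, flags, mask, schema specification,
 // HTTP request) are omitted and keep their defaults
 set status = ##class(%XML.SAX.Parser).ParseURL(url,handler,,,,,,"MySSLConfig")
 if $$$ISERR(status) {do $System.Status.DisplayError(status) quit}
```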

By default, Caché uses the Xerces entity resolution. %XML.SAX.Parser uses its own entity resolution only in the following cases: