
Overview of Property Paths in XML Virtual Documents

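This chapter provides an overview of property paths in XML virtual documents. It introduces the two kinds of virtual property paths and shows how to view the individual path units in the Management Portal.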

The next two chapters describe in detail how to create property paths.

Note:

The code examples in this chapter are fragments from data transformations, because data transformations generally use a richer set of property paths than do rule sets and search tables. Also, the emphasis is on DOM-style paths, because those are the paths that you must create manually. (In contrast, when you specify a schema to use, Ensemble displays the structure of the document and automatically generates schema-dependent paths when you drag and drop or when you use auto-completion.)

Orientation to Virtual Property Paths for XML Virtual Documents

This section briefly introduces virtual property paths for XML virtual documents.

As noted earlier, you can use schema-dependent paths only if you have loaded the corresponding XML schema. You can always use DOM-style paths, even when no schema is available.

Basic Syntax for Schema-dependent Paths

For XML virtual documents, a schema-dependent path consists of a set of path units separated by periods, as in the following example:

unit1.unit2.unit3

Where unit1 is the name of a child XML element in the document, unit2 is the name of a child element within unit1, and so on. The leaf unit is the name of either a child XML element or an XML attribute.

For example:

HomeAddress.City

For complete information, see the chapter “Specifying Schema-dependent Paths.”
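As a minimal illustration of the syntax above, the following Python sketch splits a schema-dependent path into its parent units and its leaf unit. The function name split_schema_path is hypothetical; this is not how Ensemble parses paths, only a way to make the period-separated structure concrete.

```python
def split_schema_path(path):
    """Split a schema-dependent path into (parent units, leaf unit)."""
    units = path.split(".")
    if any(not unit for unit in units):
        raise ValueError(f"empty path unit in {path!r}")
    # The last unit is the leaf: a child element name or an attribute name.
    return units[:-1], units[-1]


print(split_schema_path("HomeAddress.City"))  # (['HomeAddress'], 'City')
```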

Basic Syntax for DOM-style Paths

A DOM-style path always starts with a slash and has the basic structure shown in the following example:

/root_unit/unit1/unit2/unit3

Each path unit has the following form.

namespace_identifier:name

Where namespace_identifier represents the XML namespace; this is a token that Ensemble replaces with the actual namespace URI, as discussed in a later subsection. This token is needed only if the element or attribute is in a namespace, as you will see later in this chapter.

name is the name of an XML element or attribute.

For example:

/$2:Patient/$2:HomeAddress/$2:City

For complete information, see “Specifying DOM-style Paths.”
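The structure of a DOM-style path can be sketched in Python as follows. The function parse_dom_path is a hypothetical helper, not part of Ensemble; it only checks the leading slash and separates each path unit into its optional namespace identifier and its name.

```python
def parse_dom_path(path):
    """Return a list of (namespace_identifier, name) pairs for a DOM-style path."""
    if not path.startswith("/"):
        raise ValueError("a DOM-style path must start with a slash")
    units = []
    for raw in path[1:].split("/"):
        if not raw:
            raise ValueError(f"empty path unit in {path!r}")
        # The identifier is optional; it is absent when the item has no namespace.
        identifier, _, name = raw.rpartition(":")
        units.append((identifier or None, name))
    return units


print(parse_dom_path("/$2:Patient/$2:HomeAddress/$2:City"))
# [('$2', 'Patient'), ('$2', 'HomeAddress'), ('$2', 'City')]
```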

XML Namespace Tokens

When you load a schema into Ensemble, Ensemble establishes a set of tokens for the namespaces used in that schema, for use in any DOM-style paths.

The token $1 is used for the first namespace that is declared in the schema; this usually corresponds to the XML schema namespace (http://www.w3.org/2001/XMLSchema). The token $2 is used for the next namespace that is declared in the schema, $3 is used for the third, and so on.

Ensemble assigns namespace tokens for all namespaces declared in the schema, whether or not those namespaces are actually used. Therefore, if additional namespaces are declared in the schema, Ensemble might use $3 or a higher value rather than $2 for the items of interest to you. To be sure that you are using the correct token for a specific path unit, use the Management Portal to view the individual path units, as discussed in the next section.

You can use namespace tokens only if you have also loaded the corresponding schema (and have configured the applicable business host to use that schema). Otherwise, you must use the namespace prefixes exactly as given in the XML document.
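The numbering rule can be sketched in Python as shown here. This is an illustration only: the helper names assign_tokens and expand_unit are hypothetical, the list of namespaces is an assumed declaration order, and the {uri}name output form is the notation used by Python's xml.etree.ElementTree, not anything that Ensemble exposes.

```python
def assign_tokens(declared_namespaces):
    """Map $1, $2, ... to namespace URIs in the order they are declared."""
    return {f"${i}": uri for i, uri in enumerate(declared_namespaces, start=1)}


def expand_unit(unit, tokens):
    """Replace the token in one path unit with its namespace URI."""
    token, sep, name = unit.partition(":")
    if not sep:
        # No token: the element or attribute is not in any namespace.
        return unit
    return f"{{{tokens[token]}}}{name}"


# Assumed declaration order: the XML schema namespace first, then one other.
tokens = assign_tokens(["http://www.w3.org/2001/XMLSchema", "http://myapp.com"])
print(tokens["$2"])                    # http://myapp.com
print(expand_unit("$2:City", tokens))  # {http://myapp.com}City
```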

Viewing Path Units for XML Virtual Documents

Until you are familiar with property paths for XML virtual documents, it is useful to use the Management Portal to view the individual path units. You can do this if you have loaded the corresponding schema.

To view the path units for the elements and attributes in a schema:

  1. Load the schema as described in the previous chapter.

    For example, consider the following XML schema, shown here for reference, for the benefit of readers who are familiar with XML schemas:

    <?xml version="1.0" encoding="UTF-8"?>
    <schema xmlns="http://www.w3.org/2001/XMLSchema"
            xmlns:myapp="http://myapp.com"
            targetNamespace="http://myapp.com"
            elementFormDefault="qualified">
      <element name="Patient" type="myapp:Patient"/>
      <complexType name="Patient">
        <sequence>
          <element minOccurs="0" name="Name" type="string"/>
          <element minOccurs="0" name="FavoriteColors" 
                       type="myapp:ArrayOfFavoriteColorString" />
          <element minOccurs="0" name="Address" type="myapp:Address" />
          <element minOccurs="0" name="Doctor" type="myapp:Doctor" />
        </sequence>
        <attribute name="MRN" type="string"/>
        <attribute name="DL" type="string"/>
      </complexType>
      <complexType name="ArrayOfFavoriteColorString">
        <sequence>
          <element maxOccurs="unbounded" minOccurs="0" name="FavoriteColor" 
                       nillable="true" type="string"/>
        </sequence>
      </complexType>
      <complexType name="Address">
        <sequence>
          <element minOccurs="0" name="Street" type="string"/>
          <element minOccurs="0" name="City" type="string"/>
          <element minOccurs="0" name="State" type="string"/>
          <element minOccurs="0" name="ZIP" type="string"/>
        </sequence>
      </complexType>
      <complexType name="Doctor">
        <sequence>
          <element minOccurs="0" name="Name" type="string"/>
        </sequence>
      </complexType>
    </schema>

    The following shows an example XML document that obeys the schema shown in this section:

    <?xml version="1.0" ?>
    <Patient MRN='000111222' xmlns='http://myapp.com'>
        <Name>Georgina Hampton</Name>
        <FavoriteColors>
            <FavoriteColor>Red</FavoriteColor>
            <FavoriteColor>Green</FavoriteColor>
        </FavoriteColors>
        <Address>
            <Street>86 Bateson Way</Street>
            <City>Fall River</City>
        </Address>
        <Doctor>
            <Name>Dr. Randolph</Name>
        </Doctor>
    </Patient>
  2. Click Ensemble > Interoperate > XML > XML Schema Structures. This displays the XML Schemas page. The left column lists XML schemas loaded into this Ensemble namespace.

  3. Click the Category link in the row corresponding to the XML schema of interest.

    If we do this for the XML schema shown previously, Ensemble then displays this:

    images/exml_doc_types.png

  4. Click the link for the document type of interest.

    If we click Patient, Ensemble then displays this:

    images/exml_path_demo_patient.png

    On this page:

    • Above the table, the value in large font displays the DocType value for this XML element. In this case, DocType is MyApp:Patient.

    • The Name column shows path units in the format needed for schema-dependent paths.

      In this case, this page tells us that we can use Name, FavoriteColors, Address, Doctor, MRN, and DL as path units in schema-dependent paths.

    • The Element column shows path units in the format needed for DOM-style property paths.

      In this case, this page tells us that we can use $2:Name, $2:FavoriteColors/$2:FavoriteColor, $2:Address, $2:Doctor/$2:Name, @MRN, and @DL as path units in DOM-style paths. Notice that @MRN and @DL do not have a namespace prefix; these attributes are not in any namespace.

  5. Click additional sub-items as wanted.

    If we click Address in the Name column, Ensemble displays this:

    images/exml_path_demo_address.png

    This page displays any additional path units within Address.

    In this case, this page tells us that we can use these additional path units in combination with the path unit that we used to get to this page, for example:

    Schema-dependent path (partial): ...Address.Street
    DOM-style path (partial): /.../$2:Address/$2:Street
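To see how the DOM-style path units listed on these pages line up with the example document, the following Python sketch resolves a few paths against it by using the standard xml.etree.ElementTree module. This stands in for Ensemble's own path evaluation and is not a description of it. The TOKENS mapping and the get_value function are assumptions made for the illustration; the mapping mirrors the $2 token shown for this schema.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<Patient MRN='000111222' xmlns='http://myapp.com'>
    <Name>Georgina Hampton</Name>
    <Address>
        <Street>86 Bateson Way</Street>
        <City>Fall River</City>
    </Address>
    <Doctor>
        <Name>Dr. Randolph</Name>
    </Doctor>
</Patient>"""

# Assumed token assignments for the example schema.
TOKENS = {"$1": "http://www.w3.org/2001/XMLSchema", "$2": "http://myapp.com"}


def qualified(unit):
    """Turn a unit such as $2:City into ElementTree's {uri}name form."""
    token, _, name = unit.rpartition(":")
    return f"{{{TOKENS[token]}}}{name}" if token else name


def get_value(root, dom_path):
    """Return the text or attribute value at dom_path, or None if absent."""
    units = dom_path.lstrip("/").split("/")
    if qualified(units[0]) != root.tag:
        return None
    node = root
    for unit in units[1:]:
        if unit.startswith("@"):
            return node.get(unit[1:])  # attributes here are not in a namespace
        node = node.find(qualified(unit))
        if node is None:
            return None
    return node.text


root = ET.fromstring(SAMPLE)
print(get_value(root, "/$2:Patient/$2:Address/$2:City"))  # Fall River
print(get_value(root, "/$2:Patient/$2:Doctor/$2:Name"))   # Dr. Randolph
print(get_value(root, "/$2:Patient/@MRN"))                # 000111222
```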

The following sections note specific variations due to schema variations.

Redundant Inner Elements for Schema-dependent Paths

For schema-dependent paths, Ensemble collapses redundant inner elements. This is best explained by example:

  • The <FavoriteColors> element contains a sequence of multiple <FavoriteColor> elements. On the schema viewer page, <FavoriteColors> is shown simply as FavoriteColors() in the Name column (which shows the path unit for schema-dependent paths). This column is displayed in blue in the following figure.

    images/exml_path_demo_patient.png

    In contrast, the same element is shown as $2:FavoriteColors/$2:FavoriteColor in the Element column on the right. This column shows the path unit for DOM-style paths.

    For a sequence of multiple items of the same type, the schema-dependent path does not use the name of the inner element. (In contrast, the DOM-style path uses all the element names.) More generally, any redundant inner levels found in a schema are ignored in schema-dependent paths; the following item shows another example.

  • The <Doctor> element includes a single <Name> element. On the schema viewer page, the <Doctor> item is shown as Doctor in the Name column, as shown in the previous figure.

    Notice that the schema-dependent path to the data inside <Doctor> does not use the name of the inner element.

    In contrast, the same item is shown as $2:Doctor/$2:Name in the Element column on the right. This column shows the path unit for DOM-style paths.
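One way to spot collapsed levels is to compare the two columns of the schema viewer page. The following Python sketch does this for the Patient page described earlier; the table is copied from that description, and the function collapsed_units is a hypothetical helper, not an Ensemble feature.

```python
# Name column -> Element column, as described for the Patient page.
PATIENT_UNITS = {
    "Name": "$2:Name",
    "FavoriteColors()": "$2:FavoriteColors/$2:FavoriteColor",
    "Address": "$2:Address",
    "Doctor": "$2:Doctor/$2:Name",
    "MRN": "@MRN",
    "DL": "@DL",
}


def collapsed_units(table):
    """Return the names whose DOM-style form spans more than one element level."""
    return sorted(name for name, dom in table.items() if "/" in dom)


print(collapsed_units(PATIENT_UNITS))  # ['Doctor', 'FavoriteColors()']
```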

Repeating Fields

If a given element can occur multiple times, the Name column displays parentheses () at the end of the element name. For example, see the FavoriteColors() row in the preceding figure.

The Type column indicates the number of times the element can be repeated. In this case, the element can be repeated five times. If there is no number displayed in parentheses in the Type column, the element can be repeated any number of times.

Duplicate Names

If an XML schema has multiple elements at the same level that have the same name but different types, then Ensemble appends _2, _3, and so on, as needed to create unique names at that level. This procedure applies only to the schema-dependent paths. For example, consider a schema that defines the <Person> element to include two elements named <Contact>. One is of type <Phone> and the other is of type <Assistant>. Ensemble displays the schema for the <Person> element as follows:

images/exml_duplicate_names.png

Similarly, if the schema has multiple elements at the same level that have the same name but are in different namespaces, then Ensemble appends _2, _3, and so on, as needed to create unique names at that level. This procedure applies only to the schema-dependent paths.
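The renaming rule can be sketched in Python as follows. This assumes that the first occurrence keeps its original name and that suffixes are assigned in document order; the text above does not state either detail, and the function unique_names is hypothetical.

```python
def unique_names(names):
    """Append _2, _3, ... to repeated names so that each name is unique."""
    counts = {}
    result = []
    for name in names:
        counts[name] = counts.get(name, 0) + 1
        occurrence = counts[name]
        # Assumption: the first occurrence is left unchanged.
        result.append(name if occurrence == 1 else f"{name}_{occurrence}")
    return result


print(unique_names(["Name", "Contact", "Contact"]))
# ['Name', 'Contact', 'Contact_2']
```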

Choice Structures

Some schemas include <choice> structures, like the following example:

<xsd:choice>
  <xsd:element name="OptionA"  type="my:OptionType"/>
  <xsd:element name="OptionB"  type="my:OptionType"/>
  <xsd:element name="OptionC"  type="my:OptionType"/>
</xsd:choice>

Ensemble represents this structure differently for the two kinds of paths. The following shows an example:

images/exml_path_choice1.png

For schema-dependent paths, the Name column displays a generic name for the <choice> structure, and the Type column displays a numeric placeholder. The Element column does not display anything.

If we click choice, Ensemble then displays the following:

images/exml_path_choice2.png

In this case, these pages tell us that we can use the following paths to access OptionB:

Schema-dependent path (partial): ...Parent.choice.OptionB
DOM-style path (partial): /.../Parent/OptionB

Groups Included by Reference

A schema can include a <group> that is included via the ref attribute. For example:

<s01:complexType name="Patient">
   <s01:sequence>
      <s01:element name="Name" type="s01:string" minOccurs="0"/>
      <s01:element name="Gender" type="s01:string" minOccurs="0"/>
      <s01:element name="BirthDate" type="s01:date" minOccurs="0"/>
      <s01:element name="HomeAddress" type="s02:Address" minOccurs="0"/>
      <s01:element name="FavoriteColors"
        type="s02:ArrayOfFavoriteColorsItemString" minOccurs="0"/>
      <s01:element name="Container" type="s02:ContainerType" minOccurs="0"/>
      <s01:element name="LatestImmunization" type="s02:Immunization" minOccurs="0"/>
      <s01:element ref="s02:Insurance" minOccurs="0"/>
      <s01:group ref="s02:BoilerPlate" minOccurs="1" maxOccurs="1"/>
   </s01:sequence>
...
<s01:group name="BoilerPlate">
   <s01:sequence>
      <s01:element name="One" type="s01:string"/>
      <s01:element name="Two" type="s01:string"/>
      <s01:element name="Three" type="s01:string"/>
   </s01:sequence>
</s01:group>

Ensemble represents this structure differently for the two kinds of paths. The following shows an example:

images/exml_path_refgroup1.png

For schema-dependent paths, the Name column displays the name of the group, and the Type column displays a numeric placeholder. The Element column does not display anything, because the group does not correspond to an element in the document.

If we click BoilerPlate, Ensemble then displays the following:

images/exml_path_refgroup2.png

In this case, these pages tell us that we can use the following paths to access Two:

Schema-dependent path (partial): ...Patient.BoilerPlate.Two
DOM-style path (partial): /.../$2:Patient/$2:Two
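The choice and group examples follow the same pattern: the schema-dependent path contains a unit (choice or BoilerPlate) that has no counterpart in the DOM-style path. The following Python sketch makes that relationship explicit. The function to_dom_fragment and its parameters are hypothetical, and the caller must supply the set of structural units and the namespace token, because neither can be derived from the path alone.

```python
def to_dom_fragment(schema_path, structural_units, token=None):
    """Drop structural units and join the remaining units in DOM style."""
    kept = [unit for unit in schema_path.split(".") if unit not in structural_units]
    prefix = f"{token}:" if token else ""
    return "/".join(prefix + unit for unit in kept)


print(to_dom_fragment("Parent.choice.OptionB", {"choice"}))
# Parent/OptionB
print(to_dom_fragment("Patient.BoilerPlate.Two", {"BoilerPlate"}, token="$2"))
# $2:Patient/$2:Two
```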