The ResultXSLT™ Environment

G. Ken Holman

Abstract

"ResultXSLT™" is a methodology, an XSLT 1.0 stylesheet and a supplemental validation expression for the synthesis of importable XSLT 1.0 stylesheet fragments out of pro-forma transformation results that are seeded with the necessary information to build the desired transformation.

Keywords: XSLT; UBL

G. Ken Holman

Mr. G. Ken Holman is the Chief Technology Officer for Crane Softwrights Ltd., current international secretary of the ISO subcommittee responsible for the SGML family of standards, an invited expert to the W3C and member of the W3C Working Group that developed XML from SGML, the founding chair of the two OASIS XML and XSLT Conformance Technical Committees and current chair of the UBL FPSC subcommittee, the former chair of the Canadian committee to the ISO, the author of electronically-published and print-published books on XML-related technologies, and a frequent conference speaker.

G. Ken Holman [Crane Softwrights Ltd.]

Extreme Markup Languages 2005® (Montréal, Québec)

Copyright © 2005 G. Ken Holman. Reproduced with permission.

Evolution

The Universal Business Language (UBL) is a large XML vocabulary. Version 0p70 described seven document types, version 1.0 describes eight document types, and a future version could very well have 20 or more document types.

Based on lessons learned from the "LiterateXSLT™" environment developed in December 2002 for UBL 0p70, this environment is the second generation of XSLT 1.0-based stylesheet synthesis tools created to tackle the task for UBL 1.0. Both environments implement an annotation approach of decorating a prototypical transformation result with signals used to generate a stylesheet suitable for transforming actual data instances into corresponding outputs.

Whereas the LiterateXSLT set of annotations provided for the synthesis of a complete stylesheet tied to a single input vocabulary, the new "ResultXSLT™" is more modest and, in the end, turns out to be flexible enough to accommodate multiple input vocabularies. In this way not only are instances of the UBL vocabulary served, but other vocabularies can use the generated works as well. This approach reduces the traditional manual work of writing separate stylesheets for different vocabularies that share the same result.

One of the earliest observations regarding the first UBL 0p70 stylesheets created by LiterateXSLT was that synthesizing a stylesheet tied to a single vocabulary did not exploit the investment in developing prototype transformation results. Users of other vocabularies asked about the use of the UBL stylesheets and it was obvious the stylesheets were so hardwired to the one vocabulary as to have no utility in other contexts.

Considerable thought has gone into this ResultXSLT environment to ensure that a single investment in a prototypical transformation result can be exploited by many vocabularies. The result ended up being very simple, and sometimes the simplest ideas are the most powerful. The entire vocabulary has only two required action attributes and two required namespace declarations. Optionally, one may use two attribute namespaces, four housekeeping attributes and one action element that has two attributes.

Crane's UBL 1.0 stylesheet library is constructed using ResultXSLT and provides an illustrative example of the use of all of the features of this vocabulary. The stylesheet library, the LiterateXSLT environment and the ResultXSLT environment are all freely downloadable.

NOTE:

The use of "™" claims only a trademark on the use of the name for this process, such that others may not use the same name to represent any other kind of process. There are no restrictions on the downloading and use of these technologies, as described in the copyright in the stylesheet code.

Overview

It is often difficult for a stylesheet writer to conceptualize the result tree order when writing an XSLT stylesheet.

Yet, when writing an XSLT stylesheet, the author is obliged to ensure the transformation result is produced in result parse order. The author must orient their primary thinking around the result tree and treat the source tree as secondary. Having determined "what belongs next" in the result tree, the author goes to the source tree to find the information that is necessary to produce the result.

A common "good practice" when approaching writing an XSLT stylesheet is to mock up the result as a result instance. This helps the author of the stylesheet precisely determine the correct result structure needed in order to produce the result rendering. For example, XSL-FO processors accept hand-authored XML instances of the XSL-FO vocabulary as the input to the rendering process. This allows the stylesheet author to create a complete mockup of the result as a result instance, render the mockup with an XSL-FO processor, and visually confirm the desired result.

At this stage of stylesheet development, the stylesheet author must "deconstruct" the result instance into separate template rules, divine the XPath patterns to match each of these template rules, divine the XPath select expressions to apply the template rules to the source nodes of the input data, establish components of the stylesheet to be localized for maintenance purposes, and organize the stylesheet into an order that promotes successful long-term maintenance.

ResultXSLT does the deconstruction step of this process for the stylesheet writer based on seeded signals of required "branches" in the result tree. Once deconstructed into a set of fragments of the result tree, all that remains is to write stylesheet logic to engage and be engaged at various points of the building of the result. Through XSLT importation, an importing stylesheet can provide the vocabulary-specific logic and import a generic vocabulary-independent collection of result tree fragments.

This process is applicable when wishing in XSLT to decouple the result tree construction from the source tree matching. Such decoupling allows the result tree construction to be (1) maintained separately from the source tree matching and (2) exploited with different matching for different source vocabularies.

The old and new processes are shown in diagrams that follow, beginning in the bottom left of each diagram with the prototypical result instance.

The prototypical result instance is created as a sample visualization of the results of transforming the production source documents. Typically this is hand authored or could be the export from a tool in which the visualization is created. Placebo values for source constructs are included in the visualization.

At any time the prototypical result instance can be forwarded to the production process for visualization using the production tools, rather than any creation tools. Visualizing the result instance in the production environment will give assurances that the choice of markup in the prototypical result instance is acceptable.

Both the old and new processes recognize annotations in the prototypical result and synthesize XSLT 1.0 stylesheet fragments. The differences are in the nature of the fragments and their use with the production source instances.

The "old" process

The LiterateXSLT process can be overviewed as follows:

Figure 1
[Diagram: overview of the LiterateXSLT process]

This process synthesizes complete XSLT stylesheets geared specifically for the input vocabulary as described in the annotations. It is this hardwiring to the input vocabulary that is distinct from the new process. Various validation and preview processes are shown in the top left and are likewise available in the new process (described below).

The "new" process

The ResultXSLT process can be overviewed as follows:

Figure 2
[Diagram: overview of the ResultXSLT process]

This process synthesizes XSLT stylesheet fragments without any bias to any input vocabulary. The annotation vocabulary described in this environment is a set of signals that are added to the prototypical result to indicate where out/in branch pairs are to be created in the result tree. An out/in branch pair is a "hook" into the stylesheet fragment that can be (but need not be) taken advantage of by an importing stylesheet.

Other annotations indicate portions of the result that are not to exist in the synthesized stylesheet and yet other annotations perform housekeeping duties regarding the order and documentation of constructs in the output stylesheet.

The production source instances are all of a given vocabulary described by a document type of some kind. These are the source files that will be transformed to the production results by the authored stylesheet.

The authored stylesheet imports the synthesized importable stylesheet and typically (but not necessarily) overrides the "out" template rule of each of the synthesized out/in branch pairs, making calls where needed to each corresponding "in" template rule.

The production process can take the prototypical result instance and the transformed results of the source files to produce the final results.

There are three files in the ResultXSLT environment:

stripres.xsl stylesheet

When the presence of the ResultXSLT vocabulary in the prototypical result instance disturbs the production process, the stripres.xsl XSLT 1.0 stylesheet can remove every attribute and element in any of the ResultXSLT namespaces. The output instance can then be passed to the production process for visualization.

result.rnc grammar expression

The grammar of ResultXSLT annotations is normatively described by the RELAX NG (ISO/IEC 19757-2) compact syntax expression in result.rnc.

Using this expression one can validate the use of the annotations in the prototypical result instance.

result.xsl synthesis stylesheet

An importable stylesheet is synthesized by applying the result.xsl XSLT 1.0 stylesheet transformation against a prototypical result instance.
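
As a minimal sketch of the stripping idea described for stripres.xsl above, and not the file distributed with the environment, an identity transformation with two empty template rules would be enough to drop the annotations. The stem used in the starts-with() tests is the common beginning of the three ResultXSLT namespace URI strings; everything else here is an assumption made for illustration.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Hypothetical sketch only; this is not the distributed stripres.xsl -->
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">

  <!-- identity rule: copy every node that no other rule claims -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- empty rules: drop elements and attributes whose namespace URI
       begins with the common stem of the ResultXSLT namespaces -->
  <xsl:template match="*[starts-with(namespace-uri(),
                       'http://www.CraneSoftwrights.com/ns/result')]"/>
  <xsl:template match="@*[starts-with(namespace-uri(),
                        'http://www.CraneSoftwrights.com/ns/result')]"/>

</xsl:stylesheet>
```

Because xsl:copy also copies namespace nodes, the namespace declarations themselves remain on the copied elements in this sketch; only the annotation elements and attributes are removed. Any unrelated namespace whose URI happens to begin with the same stem would be stripped as well.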

Out/in branch pairs

The crux of ResultXSLT annotation is the identification of important branches in the input prototypical-result tree, thus creating out/in branch pairs in the output importable stylesheet fragment.

Consider this simple example where a prototypical HTML result has been created in order to ascertain all of the markup needed for the end result of a transformation.

Figure 3
<html>
  <head>
    <title>A test of ResultXSLT&#x2122;</title>
  </head>
  <body>
    <p>Here is a list of the inputs:</p>
    <ul>
      <li>
  ...

This prototypical result is annotated to indicate that a branch named "list" of the result tree generation begins at the <body> element:

Figure 4
<html xmlns:x="http://www.CraneSoftwrights.com/ns/result"
      x:out-prefix="out"
      x:in-prefix="in"
      xmlns:out="urn:ss-URI:out"
      xmlns:in="urn:ss-URI:in">
  <head>
    <title>A test of ResultXSLT&#x2122;</title>
  </head>
  <body x:branch="list">
    <p>Here is a list of the inputs:</p>
    <ul>
      <li x:branch="item">
  ...

The ResultXSLT environment synthesizes the following stylesheet fragment in a file called, say, test-frag.xsl:

Figure 5
<!--Root: default processing for the root node-->
   <xsl:template match="/">
      <html>
         <head>
            <title>A test of ResultXSLT&#x2122;</title>
         </head>
         <xsl:call-template name="out:list"/>
      </html>
   </xsl:template>

<!--Branch: list-->
   <xsl:template name="out:list">
      <xsl:call-template name="in:list"/>
   </xsl:template>
   <xsl:template name="in:list">
      <body>
         <p>Here is a list of the inputs:</p>
         <ul>
            <xsl:call-template name="out:item"/>
         </ul>
      </body>
   </xsl:template>

Note in the above how the result tree will be constructed by out:list calling in:list should the importing stylesheet have nothing to say about the processing of out:list.

Should the importing stylesheet have something to say, it can say it by overriding the out:list named template and then continue processing by calling the in:list named template, as in the following:

Figure 6
<xsl:import href="test-frag.xsl"/>

<!--set the context for the entire list to be "/tests"-->

<xsl:template name="out:list">
  <xsl:apply-templates select="/tests"/>
</xsl:template>
<xsl:template match="/tests">
  <xsl:call-template name="in:list"/>
</xsl:template>

Note that the above example importing stylesheet uses the "push" style of writing XSLT stylesheets, as this modularity allows further specialization by other stylesheets that import this one. The "pull" style of writing XSLT stylesheets would work equally well.
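
As an illustration of the pull style, the override in Figure 6 might be written with xsl:for-each in place of the apply-templates and match pair. This is only a sketch, assuming the same test-frag.xsl and the same /tests context as Figure 6; it is not part of the distributed environment.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:out="urn:ss-URI:out"
  xmlns:in="urn:ss-URI:in"
  exclude-result-prefixes="out in"
  version="1.0">

<xsl:import href="test-frag.xsl"/>

<!--pull style: change the current node to "/tests" in place, then
    continue building the branch without a separate match rule-->

<xsl:template name="out:list">
  <xsl:for-each select="/tests">
    <xsl:call-template name="in:list"/>
  </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

The trade-off is that a further importing stylesheet can no longer specialize the handling of /tests by overriding a match rule; it would have to override out:list as a whole.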

Illustrative complete test

The following is a transcription of a session creating an importable stylesheet fragment for the creation of an HTML result tree. Note how the prototypical result has four lines for illustrative purposes but the final result has all five tests from each of the input XML files.

To illustrate the independence of the one imported stylesheet fragment from different source vocabularies, a second data set using different element types is processed and illustrated.

Consider the input data that is to be processed by the stylesheet:

Figure 7
T:\resultXSLT>type test.xml 
<?xml version="1.0" encoding="iso-8859-1"?>
<tests>
  <test>First test source</test>
  <test>Second test source</test>
  <test>Third test source</test>
  <test>Fourth test source</test>
  <test>Fifth test source</test>
</tests>
T:\resultXSLT>

Annotate a prototypical result that illustrates the desired output:

Figure 8
T:\resultXSLT>type testout.xhtml 
<html xmlns:x="http://www.CraneSoftwrights.com/ns/result"
      x:out-prefix="out"
      x:in-prefix="in"
      xmlns:out="urn:ss-URI:out"
      xmlns:in="urn:ss-URI:in">
  <head>
    <title>A test of ResultXSLT&#x2122;</title>
  </head>
  <body x:branch="list">
    <p>Here is a list of the inputs:</p>
    <ul>
      <li x:branch="item">
        <i x:branch-children="item-content">
          test line 1
        </i>
      </li>
      <li x:ignore="item"><i>test line 2</i></li>
      <li x:ignore="item"><i>test line 3</i></li>
      <li x:ignore="item"><i>test line 4</i></li>
    </ul>
  </body>
</html>
T:\resultXSLT>

Validate the use of the annotations (using Jing 20020724):

Figure 9
T:\resultXSLT>jing -c result.rnc testout.xhtml 

T:\resultXSLT>

Synthesize the importable stylesheet fragment (using Saxon 6.5.3):

Figure 10
T:\resultXSLT>saxon -o test-frag.xsl testout.xhtml result.xsl 

T:\resultXSLT>

Examine the result fragment:

Figure 11
T:\resultXSLT>type test-frag.xsl 
<?xml version="1.0" encoding="utf-8"?>

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:xst="http://www.w3.org/1999/XSL/Transform"
 xmlns:out="urn:ss-URI:out" xmlns:in="urn:ss-URI:in"
 version="1.0" exclude-result-prefixes="out in">

<!--Root: default processing for the root node-->
   <xsl:template match="/">
      <html>
         <head>
            <title>A test of ResultXSLT&#x2122;</title>
         </head>
         <xsl:call-template name="out:list"/>
      </html>
   </xsl:template>

<!--Branch: list-->
   <xsl:template name="out:list">
      <xsl:call-template name="in:list"/>
   </xsl:template>
   <xsl:template name="in:list">
      <body>
         <p>Here is a list of the inputs:</p>
         <ul>
            <xsl:call-template name="out:item"/>
         </ul>
      </body>
   </xsl:template>

<!--Branch: item-->
   <xsl:template name="out:item">
      <xsl:call-template name="in:item"/>
   </xsl:template>
   <xsl:template name="in:item">
      <li>
         <i>
            <xsl:call-template name="out:item-content"/>
         </i>
      </li>
   </xsl:template>

<!--Branch: item-content-->
   <xsl:template name="out:item-content">
      <xsl:call-template name="in:item-content"/>
   </xsl:template>
   <xsl:template name="in:item-content">
          test line 1
        </xsl:template>
</xsl:stylesheet>
T:\resultXSLT>

Write a stylesheet for access to the source node tree:

Figure 12
T:\resultXSLT>type test.xsl 
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  xmlns:out="urn:ss-URI:out" 
  xmlns:in="urn:ss-URI:in" 
  exclude-result-prefixes="out in"
  version="1.0">

<xsl:import href="test-frag.xsl"/>

<!--set the context for the entire list to be "/tests"-->

<xsl:template name="out:list">
  <xsl:apply-templates select="/tests"/>
</xsl:template>
<xsl:template match="/tests">
  <xsl:call-template name="in:list"/>
</xsl:template>

<!--process each of the tests in the given set-->

<xsl:template name="out:item">
  <xsl:apply-templates select="test"/>
</xsl:template>
<xsl:template match="test">
  <xsl:call-template name="in:item"/>
</xsl:template>

<!--process the contents of a test-->
<xsl:template name="out:item-content">
  <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>
T:\resultXSLT>

Process the input with the authored stylesheet (using Saxon 6.5.3 in this example):

Figure 13
T:\resultXSLT>saxon -o test.html test.xml test.xsl 

T:\resultXSLT>

Get the desired result:

Figure 14
T:\resultXSLT>type test.html 
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   
      <title>A test of ResultXSLT&#x2122;</title>
   </head>
   <body>
      <p>Here is a list of the inputs:</p>
      <ul>
         <li><i>First test source</i></li>
         <li><i>Second test source</i></li>
         <li><i>Third test source</i></li>
         <li><i>Fourth test source</i></li>
         <li><i>Fifth test source</i></li>
      </ul>
   </body>
</html>
T:\resultXSLT>

Consider another set of input data:

Figure 15
T:\resultXSLT>type test2.xml 
<?xml version="1.0" encoding="iso-8859-1"?>
<others>
  <other>First other source</other>
  <other>Second other source</other>
  <other>Third other source</other>
  <other>Fourth other source</other>
  <other>Fifth other source</other>
</others>
T:\resultXSLT>

Write another stylesheet for access to the different vocabulary:

Figure 16
T:\resultXSLT>type test2.xsl 
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  xmlns:out="urn:ss-URI:out" 
  xmlns:in="urn:ss-URI:in" 
  exclude-result-prefixes="out in"
  version="1.0">

<xsl:import href="test-frag.xsl"/>

<!--set the context for the entire list to be "/others"-->

<xsl:template name="out:list">
  <xsl:apply-templates select="/others"/>
</xsl:template>
<xsl:template match="/others">
  <xsl:call-template name="in:list"/>
</xsl:template>

<!--process each of the others in the given set-->

<xsl:template name="out:item">
  <xsl:apply-templates select="other"/>
</xsl:template>
<xsl:template match="other">
  <xsl:call-template name="in:item"/>
</xsl:template>

<!--process the contents of an other-->
<xsl:template name="out:item-content">
  <xsl:apply-templates/>
</xsl:template>

</xsl:stylesheet>
T:\resultXSLT>

Process the input with the authored stylesheet (using Saxon 6.5.3 in this example):

Figure 17
T:\resultXSLT>saxon -o test2.html test2.xml test2.xsl 

T:\resultXSLT>

Get the desired result:

Figure 18
T:\resultXSLT>type test2.html 
<html>
   <head>
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
   
      <title>A test of ResultXSLT&#x2122;</title>
   </head>
   <body>
      <p>Here is a list of the inputs:</p>
      <ul>
         <li><i>First other source</i></li>
         <li><i>Second other source</i></li>
         <li><i>Third other source</i></li>
         <li><i>Fourth other source</i></li>
         <li><i>Fifth other source</i></li>
      </ul>
   </body>
</html>

Vocabulary

ResultXSLT is a vocabulary of attributes and an element used to decorate a prototypical result instance of an arbitrary XML vocabulary. This environment will validate the use of these annotations as expected on the input prototypical instance and then will accommodate the annotations in the synthesis of the output XSLT stylesheet fragment.

The following documentary conventions are used for namespace prefixes. Of course namespace prefixes are entirely arbitrary and you are not obliged to use the prefixes used in the following summary of namespace URI strings:

  • xmlns:x="http://www.CraneSoftwrights.com/ns/result"
  • xmlns:xa="http://www.CraneSoftwrights.com/ns/result/attr"
  • xmlns:xap="http://www.CraneSoftwrights.com/ns/result/attrp"

The vocabulary is normatively described by the supplied result.rnc schema, written in the compact syntax of RELAX NG (ISO/IEC 19757-2).

Action attributes

These attributes dictate the fragmentation of the synthesized stylesheet fragment. These are required attributes to make the result fragment useful to an importing stylesheet.

  • x:branch="template-name"
    • create an out/in branch pair of the given name in the output stylesheet fragment with the top element of the "in" template rule being the element on which this attribute is used
  • x:branch-children="template-name"
    • create an out/in branch pair of the given name in the output stylesheet fragment with the top constructs of the "in" template rule being the child constructs of the element on which this attribute is used (necessary on the parent of sibling nodes when the sibling nodes must all be in the same generated template rule)

Namespace usage

These required attributes on the document element of the input declare the namespaces to be used for the importing stylesheet to interact with the synthesized stylesheet fragment.

  • x:out-prefix="namespace-prefix"
    • declare the prefix of an in-scope namespace URI string to be used in the output stylesheet fragment for the prefix of the "out" template rule in an out/in branch pair
  • x:in-prefix="namespace-prefix"
    • declare the prefix of an in-scope namespace URI string to be used in the output stylesheet fragment for the prefix of the "in" template rule in an out/in branch pair

Attribute annotation namespace usage

These two namespaces are used to override any attributes used in the result that need to be replaced with XSLT expressions.

  • xa:attribute-name="expression"
    • this declares the expression to use in the output stylesheet fragment in place of the attribute of the same name; when not accompanied by xap:attribute-name= this adds or overrides an attribute in no namespace
  • xap:attribute-name="namespace-prefix"
    • this is used in combination with xa:attribute-name of the same local-name part so as to add or override an attribute of the full name namespace-prefix:attribute-name= in the output stylesheet

Note that this use of namespaces goes beyond simple vocabulary distinction. Namespaces are used here both to disambiguate a signal to the processing environment from a property of the result tree and to identify which signal is being communicated to the processing environment. Specific behaviors are triggered by the presence of these two namespaces, using the attribute name as a datum feeding the behavior, rather than identifying a particular component of a vocabulary. As with all namespaces, the prefix is irrelevant, as the processing environment uses the presence of any attribute in the given namespace as a signal for behavior and the attribute name as a parameter to that behavior.
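
The two attribute namespaces might be used as in the following annotated instance. This is a hedged sketch, not an excerpt from the UBL library: it assumes the annotation value is carried through as the value of the attribute on the literal result element, where XSLT would treat it as an attribute value template. The variable names target-uri and doc-lang are hypothetical and would have to be declared by the fragment or by the importing stylesheet.

```xml
<html xmlns:x="http://www.CraneSoftwrights.com/ns/result"
      xmlns:xa="http://www.CraneSoftwrights.com/ns/result/attr"
      xmlns:xap="http://www.CraneSoftwrights.com/ns/result/attrp"
      x:out-prefix="out"
      x:in-prefix="in"
      xmlns:out="urn:ss-URI:out"
      xmlns:in="urn:ss-URI:in">
  <head>
    <title>Attribute annotation sketch</title>
  </head>
  <body x:branch="doc">
    <!-- the placeholder href keeps the mockup viewable; xa:href carries
         the replacement value for the synthesized stylesheet
         (assumed to be used as an attribute value template) -->
    <a href="placeholder.html" xa:href="{$target-uri}">A link</a>
    <!-- xa:lang paired with xap:lang="xml" asks for an attribute named
         xml:lang in the output stylesheet -->
    <p xa:lang="{$doc-lang}" xap:lang="xml">Some text</p>
  </body>
</html>
```

Keeping the placeholder attribute alongside its xa: counterpart lets the same instance serve both the production preview and the synthesis step.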

Housekeeping attributes

These optional attributes do not affect the facilities of the synthesized fragment, but do serve an aesthetic purpose of organizing the fragmentation found in the synthesized fragment.

  • x:ignore="template-name"
    • this signals the element does not have a role in the output stylesheet and is to be ignored; the named template rule must exist (ostensibly to be the template rule that accommodates this result tree content) so as to assert this particular annotation was not inadvertently used
  • x:sort="numeric-value"
    • top-level stylesheet template rules and declarations are ordered in the output fragment in numerical order of this attribute; an absent annotation indicates a numeric value of zero; annotated constructs of equal sort value are sorted in document order of the construct found in the input
  • x:comment="string"
    • this adds a top-level comment to the output before the top-level constructs created by the annotated element
  • x:comment-children="string"
    • this adds a top-level comment to the output before the top-level constructs created for the children of the annotated element
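
A small annotated instance using the housekeeping attributes might look like the following. The sort values and comment strings are hypothetical, and the sketch assumes that "numerical order" means ascending order, so that the unannotated root rule (value zero) would come first, followed by the item branch and then the list branch.

```xml
<html xmlns:x="http://www.CraneSoftwrights.com/ns/result"
      x:out-prefix="out"
      x:in-prefix="in"
      xmlns:out="urn:ss-URI:out"
      xmlns:in="urn:ss-URI:in">
  <head>
    <title>A test of ResultXSLT&#x2122;</title>
  </head>
  <!-- sort value 2: assumed to place these rules after the item branch -->
  <body x:branch="list" x:sort="2"
        x:comment="Wrapper for the whole list">
    <p>Here is a list of the inputs:</p>
    <ul>
      <!-- sort value 1: assumed to be ordered ahead of the list branch -->
      <li x:branch="item" x:sort="1"
          x:comment="One entry per source item">test line 1</li>
      <!-- placeholder row; names the branch that already covers it -->
      <li x:ignore="item">test line 2</li>
    </ul>
  </body>
</html>
```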

Action element

This optional element is used to help in the organization of the synthesized fragment when the author wishes to expose global variables to be overridden or accessed from the importing stylesheet.

  • <x:decl name="qname" value="expression"/>
    • this adds a top-level variable declaration to the output using the given qualified name and XPath expression; note the attributes are not in any namespace so are not prefixed
    • this element may include x:comment= and x:sort= annotations provided those annotations are explicitly prefixed in the ResultXSLT namespace
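
As a hedged illustration of the action element, the following instance exposes one global variable. The variable name list-heading is hypothetical, and the placement of x:decl as a child of body is an assumption; result.rnc remains the authority on where the element is allowed.

```xml
<html xmlns:x="http://www.CraneSoftwrights.com/ns/result"
      x:out-prefix="out"
      x:in-prefix="in"
      xmlns:out="urn:ss-URI:out"
      xmlns:in="urn:ss-URI:in">
  <head>
    <title>A test of ResultXSLT&#x2122;</title>
  </head>
  <body x:branch="list">
    <!-- hypothetical declaration: name= and value= are unprefixed, the
         housekeeping annotations are prefixed; the value is an XPath
         string literal, hence the inner apostrophes -->
    <x:decl name="list-heading"
            value="'Here is a list of the inputs:'"
            x:sort="10"
            x:comment="Text that importing stylesheets may override"/>
    <p>Here is a list of the inputs:</p>
  </body>
</html>
```

If the synthesized fragment then contains a top-level variable named list-heading, an importing stylesheet could replace its value by declaring its own top-level xsl:variable of the same name, since the declaration with higher import precedence is the one used in XSLT 1.0.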

Limitations

When working across the eight UBL 1.0 document types, one limitation was the need to repeat similar constructs used in different reports. This might have been mitigated by extensive use of entities, with the common parts of the reports distilled into referenced entities. A single change in an entity would then be reflected across all of the document types.
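
The entity approach suggested above could be sketched as follows. The entity name and file name are hypothetical, and this was not done in the UBL library; the sketch only shows how one external parsed entity could be referenced from several prototypical results.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html [
  <!-- hypothetical shared fragment reused by several prototypical results -->
  <!ENTITY address-block SYSTEM "address-block.ent">
]>
<html xmlns:x="http://www.CraneSoftwrights.com/ns/result"
      x:out-prefix="out"
      x:in-prefix="in"
      xmlns:out="urn:ss-URI:out"
      xmlns:in="urn:ss-URI:in">
  <head>
    <title>Invoice</title>
  </head>
  <body x:branch="invoice">
    <h1>Invoice</h1>
    <!-- expands to the shared markup, annotations included -->
    &address-block;
  </body>
</html>
```

The markup inside the entity can use the x: prefix because namespace prefixes are resolved after entity expansion, against the declarations in scope at the point of reference. The approach does depend on the XML parser used by the XSLT processor reading external entities, which a non-validating parser is not obliged to do.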

Conclusions

The power of XML Namespaces to annotate a document and the flexibility of vocabularies to accommodate foreign namespaces together create an opportunity to do machine processing where manual tasks are traditionally used. Both XSL-FO and HTML are examples of vocabularies that accommodate foreign namespace attributes without interference.

ResultXSLT illustrates this annotation/generation technique and provides a mechanism for transforming a prototypical result instance into an importable stylesheet fragment, through which different importing stylesheets supporting different input vocabularies can all take advantage of the investment in the layout.

