A non-XSLT DTD for writing XSLT stylesheets

Lynne A. Price
lprice@txstruct.com

Abstract

This presentation describes editing XSLT stylesheets using a variation of the XSLT element structure. For example, the variation supports automatically numbered lists within comments. The use of cross-references for repeating parameter, variable, and template names throughout a stylesheet makes it trivial to change names within a single stylesheet or throughout a set of related stylesheets. An XML-aware editor such as FrameMaker keeps the stylesheet constantly pretty-printed and makes it easy to rearrange templates, to change one structure to another (alternating between <xsl:if> and <xsl:choose>, for instance), and to locate desired material. A hypertext-linked table of contents and index can also be prepared.

Keywords: XSLT; Editing/Authoring

Lynne A. Price

Lynne A. Price is president of Text Structure Consulting, Inc., a consulting company that specializes in structured FrameMaker and XML. Prior to founding Text Structure Consulting in 1996, Lynne was a software engineer at Frame Technology and then Adobe where she worked on FrameMaker's structure features. While Lynne has been active in the XML/SGML community since 1985, her interest in structured documentation began in graduate school. She completed a Ph.D. in Computer Sciences at the University of Wisconsin-Madison in 1978, writing a dissertation titled Representing Text Structure for Automatic Processing.

Lynne A. Price [Text Structure Consulting, Inc.]

Extreme Markup Languages 2006® (Montréal, Québec)

Copyright © 2006 Text Structure Consulting, Inc. Reproduced with permission.

This paper describes an editing environment that the author created to facilitate a major revision of a few thousand lines of XSLT 1.0 code distributed over multiple files. The challenges were to:

The solution is based on recognition that XSLT transformations share two significant properties with other XML documents:

The approach described here uses Adobe FrameMaker® to edit a structure that is formatted to look like XSLT but actually uses a slightly different element structure. Once editing is complete, XSLT is used to transform the editing structure to XSLT itself. Similar environments could be based on other XML editing/formatting software.

When the XSLT developer is creating a transform to deliver to someone else (or to another organization), the delivered version can use the editing structure if the recipient has the same environment. If the recipient, however, expects an XSLT document and does not care what internal tools the developer used to create it, the deliverable will be the transformed result.

Pretty-printing and well-formedness

The user must be aware of both the element structure used in FrameMaker to represent XSLT and the desired XSLT structure. As he edits, a WYSIWYG window that displays the formatted XSLT transform is always open. Except for wrapping of long lines and font changes such as use of bold and color, this window displays the final XSLT text that is eventually generated. The text is automatically indented to indicate the nesting structure of the element hierarchy, tags for result elements and the values of name attributes are bolded, and comments appear in red. As shown in figure 1, the nesting level is displayed in the left margin to aid the user in recognizing the hierarchical relationships among elements that may be separated by several lines or appear on different pages:

Figure 1: The formatted transform displayed during editing

Since the FrameMaker user inserts elements rather than start- or end-tags, the document must be well-formed; since the user creates elements by choosing them from a predefined list, there is no possibility of a typographical error in an element name. Furthermore, the FrameMaker user interface makes it unlikely that the document will not be valid, preventing meaningless constructs (such as insertion of an <xsl:when> element other than as a child of an <xsl:choose> element) unless the user requests the ability (presumably temporarily while rearranging existing code) to violate validation rules.
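The kind of rule that the editor enforces interactively can be sketched outside FrameMaker as a simple parent check. The following Python fragment is only an illustration, not part of the application described here: the table ALLOWED_PARENTS is a small hand-written assumption, whereas a real editor would derive such rules from its element definitions.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Partial, hand-written table (an assumption for this sketch): a real
# editor would take these rules from the DTD or element definitions.
ALLOWED_PARENTS = {
    XSL + "when": {XSL + "choose"},
    XSL + "otherwise": {XSL + "choose"},
}

def local(tag):
    """Strip the namespace part of an ElementTree tag."""
    return tag.split("}")[-1]

def misplaced(root):
    """List (child, parent) pairs that violate ALLOWED_PARENTS."""
    problems = []
    for parent in root.iter():
        for child in parent:
            allowed = ALLOWED_PARENTS.get(child.tag)
            if allowed is not None and parent.tag not in allowed:
                problems.append((local(child.tag), local(parent.tag)))
    return problems

# An <xsl:when> wrongly placed inside an <xsl:if>.
BAD = """
<xsl:template match="/" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:if test="$abbrev">
    <xsl:when test="1">x</xsl:when>
  </xsl:if>
</xsl:template>
"""

bad_report = misplaced(ET.fromstring(BAD))
print(bad_report)  # [('when', 'if')]
```

An editor that offers only the elements allowed at the insertion point prevents such a report from ever being needed; the check above merely makes the rule explicit.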

While the document window in figure 1 shows a representation of the eventual XSLT output, the user creates this view by manipulating the alternate element structure. This structure can always be shown in a second window, the Structure View, which the user can open and close as desired. Figure 2 shows how the same transform might appear in the Structure View:

Figure 2: The optional Structure View

As indicated here, each element in the structure is represented by a “bubble” in the Structure View. Bubbles are indented to indicate the element hierarchy and siblings are connected by a vertical line. The element’s content (or the first part of long content) appears to the right of its bubble. Elements can be collapsed (as are the two <xsl:when> elements here) to suppress display of their descendants or expanded to show all descendants. Judicious use of collapse and expand helps the user understand the logic of nested structures.

Screen shots of the Structure View are used throughout this paper to show the element hierarchy of particular constructs.

The FrameMaker user interface makes it easy to navigate through the element hierarchy. Elements can be moved by dragging their bubbles in the Structure View. It is also straightforward to wrap new elements (such as <xsl:if>) around existing content or to unwrap an element (that is, to discard the element itself but retain its content). As the user completes each editing change, the formatting in the document window updates to reflect the current element structure.

Attributes to elements

The most significant difference between the editing structure and actual XSLT is that the editing structure uses subelements where XSLT uses attributes. As a result, the elements that represent XSLT attributes themselves can contain subelements and have attributes. As discussed in the remainder of this section, the editing structure takes advantage of this ability in more than one way.
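As a rough illustration of the relationship between the two structures, the Python sketch below turns the children of an attribute-wrapper element into attributes of the corresponding XSLT element. The element names param, param_attributes, and select are assumptions chosen for this example (the paper names only wrappers such as template_attributes and the name element), and the actual conversion is done with XSLT rather than Python.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
ET.register_namespace("xsl", XSL_NS)

# Hypothetical editing-structure fragment: "param", "param_attributes",
# and "select" are assumed names used only for illustration.
EDITING = """
<param>
  <param_attributes>
    <name>abbrev</name>
    <select>'no'</select>
  </param_attributes>
</param>
"""

def to_xslt(edit_elem):
    """Turn each child of the *_attributes wrapper into an XSLT attribute."""
    xsl_elem = ET.Element(f"{{{XSL_NS}}}{edit_elem.tag}")
    wrapper = edit_elem.find(f"{edit_elem.tag}_attributes")
    if wrapper is not None:
        for child in wrapper:
            # itertext() flattens any subelements of the child to plain text.
            xsl_elem.set(child.tag, "".join(child.itertext()))
    return xsl_elem

param = to_xslt(ET.fromstring(EDITING))
print(ET.tostring(param, encoding="unicode"))
```

Because each attribute is an element in the editing structure, it has room for subelements of its own, which a plain attribute value cannot hold.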

Cross-references (links)

One use of subelements within elements that represent XSLT attributes is to eliminate the need to retype names of templates, parameters, variables, and modes. Instead, the user types a name once and links to it through an empty cross_reference element. The value of an ID attribute on the original name element is used as an IDREF attribute on the cross_reference; FrameMaker is configured to format a cross_reference element by repeating the contents of the name element with the matching ID. The result is illustrated in figure 3, where the underlined blue text consists of cross-references to the name attribute of the <xsl:param> element:

Figure 3: Linking repeated names

The fully expanded Structure View for the beginning of this template is shown in figure 4:

Figure 4: ID and IDREF attributes in the Structure View

To rename the parameter, perhaps replacing abbrev with a shorter form such as abb or spelling out the word abbreviation in full, the user need only edit the contents of the name element; FrameMaker updates display of the cross_reference elements automatically. Furthermore, FrameMaker automatically links each cross-reference to the referenced element. The user can click on any name entered as a cross-reference to jump to the point in the document where it is defined. For example, when the name attribute of an <xsl:call-template> element is specified as a cross-reference, the user can click on the name to see the corresponding <xsl:template>.

Here, the ID value BIHGFDHA is a pseudo-random string that FrameMaker automatically creates the first time the user cross-references the parameter name. The user can choose instead to assign meaningful strings to the ID attributes.

The target of a cross_reference can itself be a cross_reference. Suppose that several templates have a common parameter. The developer can type the parameter name in the counterpart of the <xsl:param> element in one template and use cross_references within the analogous declarations in other templates. In each case, uses of the parameter can cross-reference the <xsl:param> element in the same template. This approach is diagrammed in figure 5 where the green arrows represent cross-references to the parameter name in template1 and the red arrows represent cross-references to the parameter name in template2:

Figure 5: Linking to other links

Thus, the developer can change the parameter name in template1 and FrameMaker will update all uses of the name to match. Alternatively, if the developer decides that the parameters of the two templates are actually used for different purposes, he can replace the cross_reference in the name of the second <xsl:param> element with a different name and FrameMaker will update only references to the name in the second template.
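The behavior of linked names, including links whose target is itself a link, can be sketched in Python as follows. This is a simplified model and not FrameMaker's mechanism: the attribute names id and idref, the element names other than name and cross_reference, and the second ID value T2PARAM are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical editing-structure fragment. BIHGFDHA is the ID quoted in
# the text; "id", "idref", "value_of", "select", and T2PARAM are assumed.
DOC = """
<stylesheet>
  <template>
    <param><name id="BIHGFDHA">abbrev</name></param>
    <value_of><select>$<cross_reference idref="BIHGFDHA"/></select></value_of>
  </template>
  <template>
    <param><name id="T2PARAM"><cross_reference idref="BIHGFDHA"/></name></param>
    <value_of><select>$<cross_reference idref="T2PARAM"/></select></value_of>
  </template>
</stylesheet>
"""

def resolve(elem, by_id, seen=()):
    """Return elem's text with each cross_reference replaced by its target's text."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "cross_reference":
            ref = child.get("idref")
            if ref in seen:
                raise ValueError(f"circular cross-reference: {ref}")
            # The target may itself contain a cross_reference, so recurse.
            parts.append(resolve(by_id[ref], by_id, seen + (ref,)))
        else:
            parts.append(resolve(child, by_id, seen))
        parts.append(child.tail or "")
    return "".join(parts)

root = ET.fromstring(DOC)
by_id = {e.get("id"): e for e in root.iter() if e.get("id")}
selects = [resolve(s, by_id) for s in root.iter("select")]
print(selects)  # ['$abbrev', '$abbrev']

# Renaming the parameter once updates every use in both templates.
by_id["BIHGFDHA"].text = "abbreviation"
renamed = [resolve(s, by_id) for s in root.iter("select")]
print(renamed)  # ['$abbreviation', '$abbreviation']
```

Replacing the cross_reference inside the second name element with literal text would decouple the second template: its uses would then follow the new text while the first template's uses stayed unchanged.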

Balancing parentheses

The group element is another example of a subelement in the editing structure element that corresponds to an XSLT attribute. Although the user is free to type any parentheses needed in an expression, the nestable group element can be used instead. This element’s automatic formatting surrounds it with matching parentheses. Thus, if the developer uses group elements, unbalanced parentheses are not possible. Furthermore, the color assigned to group elements varies with the nesting level, making it easy to pair up matching parentheses visually. For example, the expression in figure 6:

Figure 6: Changing color to indicate matching parentheses

has the structure illustrated in figure 7:

Figure 7: The structure of nested groups

Note that the outermost group is displayed in black, the second-level group in red, and both third-level groups in green.
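The effect of the group element can be modeled with a short Python sketch that generates the parentheses from the nesting and records a color per level. The expression below is invented for illustration (it is not the one in the figures), the select wrapper is an assumed name, and cycling through the colors beyond the third level is an assumption.

```python
import xml.etree.ElementTree as ET

# Colors for the first three nesting levels follow the text; reusing
# them cyclically at deeper levels is an assumption of this sketch.
COLORS = ["black", "red", "green"]

# Hypothetical expression with one outer group, one second-level group,
# and two third-level groups.
EXPR = (
    "<select><group>$a + <group><group>$b</group> * "
    "<group>$c</group></group></group></select>"
)

def render(elem, depth=0, colors=None):
    """Return elem's text, surrounding each group element with parentheses."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "group":
            if colors is not None:
                colors.append(COLORS[depth % len(COLORS)])
            parts.append("(" + render(child, depth + 1, colors) + ")")
        else:
            parts.append(render(child, depth, colors))
        parts.append(child.tail or "")
    return "".join(parts)

group_colors = []
text = render(ET.fromstring(EXPR), colors=group_colors)
print(text)          # ($a + (($b) * ($c)))
print(group_colors)  # ['black', 'red', 'green', 'green']
```

Since every opening parenthesis is emitted together with its closing partner, the generated text is balanced by construction, whatever the user types inside the groups.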

Indexes

The index_term element is a third possible subelement of an editing structure element that corresponds to an XSLT attribute. This element does not appear in the document (although the user can see it in the Structure View), but is used when FrameMaker creates an index of an XSLT stylesheet or set of related stylesheets.

Result Elements

The alternate structure uses elements named result_element with a numeric suffix that indicates nesting level for result elements. An attribute of this element defines the generic identifier of the result element. This structure enables automatic nesting of result elements within the transform and ensures that the same element name is used in the matching start- and end-tags. For example, the template in figure 8:

Figure 8: A result element

has the structure in figure 9:

Figure 9: A result element in the Structure View

Note that the name and value of the gi attribute are listed below the result element bubble.
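A minimal Python sketch of how such elements might be expanded into literal result elements follows. The names result_element1, result_element2, and gi follow the description above; the surrounding template and apply_templates elements and the sample values html and body are assumptions for the example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical editing-structure fragment; only the result_elementN
# naming pattern and the gi attribute come from the description above.
EDITING = """
<template>
  <result_element1 gi="html">
    <result_element2 gi="body">
      <apply_templates/>
    </result_element2>
  </result_element1>
</template>
"""

RESULT_RE = re.compile(r"result_element\d+$")

def expand(elem):
    """Copy elem, renaming each result_elementN to the value of its gi attribute."""
    if RESULT_RE.match(elem.tag):
        out = ET.Element(elem.get("gi"))
    else:
        out = ET.Element(elem.tag, elem.attrib)
    out.text, out.tail = elem.text, elem.tail
    out.extend([expand(child) for child in elem])
    return out

expanded = expand(ET.fromstring(EDITING))
serialized = ET.tostring(expanded, encoding="unicode")
print(serialized)
```

The generic identifier is stated once, in the gi attribute, and the serializer writes both tags from it, so the start- and end-tags cannot disagree.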

Comments

The XML_comment element inserts a comment into a transform (and hence differs from <xsl:comment> which inserts a comment into the output of the transform). An XML_comment can contain a simple string, or can consist of any combination of paragraphs, automatically numbered lists, and XSLT elements. For example, the comment in figure 10:

Figure 10: Lists within comments

has the structure in figure 11:

Figure 11: A comment in the Structure View

Of course, when the editing structure is transformed to XSLT, the list numbering and indentation are preserved.

It may be useful to comment out portions of a transform for various reasons, including to:

  • Disable partially complete templates.
  • Disable sections that are not working in order to test surrounding sections.
  • Preserve templates that provide alternative output (using conditional sections would be a different approach).
  • Illustrate an alternative approach.
  • Retain a previous version of a section, possibly adding documentation of why the segment was changed.

Simply surrounding material to be commented out with the appropriate delimiters is insufficient. The result would not be well-formed if the commented-out material contains any comments because the outer comment would be terminated by the pair of hyphens at the start of a nested comment. This application solves the problem by inserting a space between the two hyphens in the comment delimiters of nested comments. As illustrated in figure 12, this subtle difference in formatting does not prevent the reader from recognizing the nested comments:

Figure 12: A nested comment

To restore commented-out elements, the user need only unwrap the surrounding XML_comment element. The spaces between the hyphen pairs will be removed. Of course, when the editing structure is transformed to XSLT, the spaces are retained in any nested comments.
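The delimiter adjustment can be sketched as plain string manipulation in Python. This is an illustration of the idea only, since the application works on elements rather than text; the function names and the sample snippet are hypothetical, and the sketch does not handle a pair of hyphens that occurs elsewhere in the commented-out material.

```python
import xml.etree.ElementTree as ET

def comment_out(text):
    """Wrap text in a comment, spacing out the hyphens of nested delimiters."""
    inner = text.replace("<!--", "<!- -").replace("-->", "- ->")
    return "<!--" + inner + "-->"

def restore(commented):
    """Unwrap a comment made by comment_out and close up the hyphen pairs."""
    inner = commented[len("<!--"):-len("-->")]
    return inner.replace("<!- -", "<!--").replace("- ->", "-->")

def is_well_formed(xml_text):
    """Report whether xml_text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

# Hypothetical template that already contains a comment.
SNIPPET = (
    '<xsl:template name="old"><!-- superseded -->'
    "<xsl:text>x</xsl:text></xsl:template>"
)

wrapped = comment_out(SNIPPET)
print(wrapped)
print(is_well_formed("<root><!--" + SNIPPET + "--></root>"))  # False
print(is_well_formed("<root>" + wrapped + "</root>"))         # True
print(restore(wrapped) == SNIPPET)                             # True
```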

Entities (variables)

The user can define entities (called variables in FrameMaker) for strings that are used repeatedly. Changing the definition of an entity automatically updates all references to it. In figure 13, the green text represents two references to a single entity:

Figure 13: Display of simple text entities

Collections of files

A project may involve several XSLT transforms, whether used for different purposes (such as the two transforms described here for converting in both directions between XSLT and the editing structure) or as modules assembled using <xsl:include> or <xsl:import>. FrameMaker supports a construct called a book, which is a collection of other files. It allows the user to search for a string or element throughout a book, and one file in a book can cross-reference another file. Thus, it is quite possible for an <xsl:call-template> element in one file to use a cross-reference to name a template defined in another file. Furthermore, cross-references can be used to identify the module referenced in the href attribute of an <xsl:include> or <xsl:import> element.

Generated files and indexing

As mentioned earlier, FrameMaker can generate an index from the index_term elements in a document. A topic index can be useful to the developer who is new to a project or who has put it aside long enough to forget the template names used for various purposes. FrameMaker can also generate lists of elements of selected types, optionally restricting entries to particular contexts. Thus, it’s possible to generate a list of all name attributes within <xsl:param> elements, or <xsl:variable> elements that immediately precede <xsl:choose> elements. Entries in a generated list can appear in document order (as is true of a table of contents) or can be alphabetized.

Entries in both an index and a generated list can be hyperlinked to the referenced content, so that the user can click on an entry in these files and jump to the referenced structure in the transform. Either type of generated file can create entries based on a single file or on all files in a book.

For the project that motivated this application, the author created two generated lists. One lists all the templates in the book of transforms. Typical entries are shown in figure 14:

Figure 14: A generated list of templates

Note the two-part page references, which identify both the file containing the template definition and the page within that file.

The second generated file is a list of called templates, illustrated in figure 15:

Figure 15: A generated list of <xsl:call-template> elements

These lists made it easy to review all calls to a template before modifying it. Confirming that each name in the list of called templates is formatted in the link style made it easy to confirm that the appropriate cross-references had been defined. Comparing entries in the two lists helped identify templates that were no longer used.

Similar lists of defined and referenced parameters and variables helped identify unused instances of those constructs as well.
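The comparison of the two lists amounts to a set difference. The Python sketch below performs it on an ordinary XSLT file rather than on FrameMaker's generated lists; the sample stylesheet and its template names are invented for illustration.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Hypothetical stylesheet: "footer" is defined but never called.
STYLESHEET = """
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/"><xsl:call-template name="header"/></xsl:template>
  <xsl:template name="header"><h1>Title</h1></xsl:template>
  <xsl:template name="footer"><p>End</p></xsl:template>
</xsl:stylesheet>
"""

root = ET.fromstring(STYLESHEET)

# Counterparts of the two generated lists: named templates and calls.
defined = {t.get("name") for t in root.iter(XSL + "template") if t.get("name")}
called = {c.get("name") for c in root.iter(XSL + "call-template")}

unused = sorted(defined - called)     # defined but never called
undefined = sorted(called - defined)  # called but never defined
print(unused)     # ['footer']
print(undefined)  # []
```

For a set of related stylesheets, the same comparison would be made after taking the union of the defined and called names across all files, since a template may be called from a different module than the one that defines it.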

Ways to locate a template

One concern at the beginning of the project was to establish conventions for organizing the templates within a transform. Should they be alphabetized by the value of their name or match attributes? Should called templates be placed as near as possible to one of the templates that call them? The purpose of such conventions is to make it easy for the developer to locate a template in order to review it. The FrameMaker environment made it so easy to locate a template that the order of templates within a file became moot. Techniques for jumping to different parts of a transform include:

  • Search throughout the book for the value of the name or match attribute. Heavy use of cross-references minimizes the number of times these strings appear as plain text.
  • Search for the value of an ID or IDREF attribute.
  • Click on a cross-reference.
  • Click on a link in a generated file.
  • Click on the Go to Source button in the Cross-Reference dialog box.

The last of these has not yet been described. FrameMaker’s Cross-Reference dialog box appears in figure 16:

Figure 16: The Cross-Reference dialog box

It is most often used to create or modify a cross-reference. On the left, FrameMaker lists the element types that have an ID attribute, possibly divided into subcategories. In this application, the name element corresponds to a name attribute and name elements have been categorized by the type of the containing element. The visible part of the scroll list shows some of the possibilities. Here, the user has highlighted name(template), corresponding to name attributes of <xsl:template> elements. The right side of the dialog box then lists all such elements in the document selected at the top of the dialog box. This file happens to contain two named templates, and they are listed in the scroll list on the right. The user can insert a cross-reference to one of them by highlighting the entry in the right and clicking the Insert button. However, the dialog box can also be used as a navigation aid. The user can highlight an entry and then click the Go to Source button without inserting a cross-reference. FrameMaker will jump to the indicated template without making any change to the open document.

Formatting control

The editing structure provides some control over formatting both within FrameMaker and in the final XSLT output. As some of the above Structure View samples illustrate, each element representing an XSLT element contains an element that in turn contains the individual elements representing the XSLT element’s attributes. For example, the attributes of an <xsl:template> element are represented by the children of a template_attributes element and those of an <xsl:variable> element by the children of a variable_attributes element. Each of these _attributes elements has an attribute named separate_lines. Its value determines whether each represented attribute appears on a separate line or the entire start-tag is placed on a single line. Thus, the user can choose, on an element-by-element basis, either the style illustrated in figure 17:

Figure 17: A one-line start-tag

or the style of figure 18:

Figure 18: Displaying each attribute specification on a separate line

Similarly, the user can choose whether to enclose attribute values in single or double quotation marks. The specification can be made for a single attribute value or as a default at different levels in the structure.
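The two formatting choices can be sketched as parameters of a small Python function. The function name, the sample attribute values, and the way continuation lines are aligned under the first attribute are assumptions of this sketch; the sketch also does not escape a quotation mark that occurs inside a value.

```python
def start_tag(name, attributes, separate_lines=False, quote='"'):
    """Format a start-tag on one line or with one attribute per line."""
    specs = [f"{key}={quote}{value}{quote}" for key, value in attributes]
    if not specs:
        return f"<{name}>"
    if separate_lines:
        # Align continuation lines under the first attribute (assumed style).
        indent = " " * (len(name) + 2)
        return f"<{name} " + ("\n" + indent).join(specs) + ">"
    return f"<{name} " + " ".join(specs) + ">"

attrs = [("name", "lookup"), ("match", "entry")]
one_line = start_tag("xsl:template", attrs)
multi_line = start_tag("xsl:template", attrs, separate_lines=True)
single_quoted = start_tag("xsl:param", [("name", "abbrev")], quote="'")

print(one_line)
print(multi_line)
print(single_quoted)
```

Making the quotation mark a parameter mirrors the choice described above: a value that itself contains one kind of quotation mark can be enclosed in the other kind.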

Bringing an existing transformation into the editing environment

While the editing environment just described is effective for creating new transforms and editing those that are already in this form, another tool is needed to bring existing transforms into it. An import XSLT transform can be used to convert any XSLT transform to the editing structure so that it can be maintained in this environment. The current version makes no attempt to create cross-references or list elements within comments. The user can, of course, create such structures manually.

Note that while there are transforms for converting the editing structure to XSLT and XSLT to the editing structure, there is no provision for a round-trip from the editing structure to XSLT and back without loss of information. The editing environment contains information (such as index terms and the cross-references and lists just mentioned) that is not preserved in XSLT. Therefore, once the editing structure for a particular transform has been created, the developer should treat that version as the primary one.
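The core of the import direction, turning attributes into subelements, can be sketched in Python. The actual import is an XSLT transform, so this is only a model of it: the convention of replacing hyphens with underscores and of naming the wrapper after the element (as in template_attributes) is partly assumed, and literal result elements are simply copied here rather than converted.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

def to_editing(elem):
    """Convert an XSLT element tree to a hypothetical editing structure."""
    if elem.tag.startswith(XSL):
        # Assumed naming: local name with hyphens replaced by underscores.
        name = elem.tag[len(XSL):].replace("-", "_")
        out = ET.Element(name)
    else:
        # Simplification: literal result elements are copied unchanged.
        name = None
        out = ET.Element(elem.tag, elem.attrib)
    if name and elem.attrib:
        wrapper = ET.SubElement(out, name + "_attributes")
        for key, value in elem.attrib.items():
            ET.SubElement(wrapper, key).text = value
        wrapper.tail = elem.text  # original content follows the wrapper
    else:
        out.text = elem.text
    for child in elem:
        new_child = to_editing(child)
        new_child.tail = child.tail
        out.append(new_child)
    return out

# Hypothetical input transform fragment.
XSLT = """
<xsl:template name="lookup" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:value-of select="$abbrev"/>
</xsl:template>
"""

editing = to_editing(ET.fromstring(XSLT))
print(ET.tostring(editing, encoding="unicode"))
```

As in the import transform described above, every name arrives as plain text: nothing in the XSLT input records which occurrences were once cross-references, which is why such links have to be created afterward.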

Conclusions and next steps

The major revision of existing transforms that motivated this project was much less intimidating once the editing application became available. The application has reduced development time on other projects as well. It addresses the goals listed at the beginning of this paper. The biggest time saver has come from the links in cross-references and generated lists.

To date the system has had only one user. The next step is to see if other developers can feel comfortable viewing the XML form of one structure while manipulating the element structure of a slightly different one.

As time permits, the application may be updated to XSLT 2.0. Another possible extension is to provide an element structure for XPath expressions and XSLT functions. Doing so would minimize keyboarding, prevent spelling and typographical errors in the names of the contained constructs, and allow software to remind the user of the order of a function’s parameters.

