XAG - making XML for everyone

Charles McCathieNevile


The growth of XML provides an opportunity to present meaningful, organised information. For people with disabilities, that meaning and organisation make it possible to present information in ways that are useful to them, whatever their particular needs. This paper discusses W3C's draft XML Accessibility Guidelines, which describe how to ensure that an XML vocabulary can be used to create information that is accessible to people with disabilities.

Keywords: Editing/Authoring

Charles McCathieNevile

Charles McCathieNevile has been working on Web accessibility since 1997, first with the Sunrise Research Laboratory in Melbourne, and since December 1998 as a staff member at W3C. Charles' major involvement in accessibility work is in the Protocols and Formats Working group of W3C, whose job is to review W3C specifications for accessibility issues. He is also working on the SWAD-Europe semantic web project.


Charles McCathieNevile [W3C]

Extreme Markup Languages 2002® (Montréal, Québec)

Copyright © 2002 Charles McCathieNevile. Reproduced with permission.


Markup languages provide opportunities to organise and structure information. The growth of the Web as a medium for almost any kind of commerce or transactional behaviour, but most particularly for information exchange, has opened up many new opportunities for people with disabilities of different kinds. The ability to read a daily newspaper at will is a novelty to most blind people, as is the ability to have a truly secret ballot. For a person with severe cerebral palsy, the ability to shop for groceries unaided can be a life-changing novelty. For people with much more common disabilities such as reduced vision, the ability to enlarge the print size in the newspaper is a godsend. And even for those who suffer from the disability that the information technology revolution has given to the world - overuse injuries affecting the ability to use the hands or, more recently, the voice - information that can be re-organised and presented in a manner appropriate to the user's needs can save a person's job, make their hobby less painful, or provide a new form of learning or entertainment activity they previously could not use.

Central to these gains is the ability to restructure and re-present information in a different way. The turn-of-the-century Web allowed a dyslexic person to be given an interactive, speech and image-based way of interacting with a shopping site, at the same time as a Deaf person could have sign-language explanations provided, and a blind person could rely on a braille device to present the information in a textual format. HTML 4 and its XHTML successors have provided a widely used, somewhat interoperable format that allows for a useful degree of repurposing. More strongly specified languages, such as ChemML or the extreme.dtd in which this paper is written, allow very well organised information to be presented in a way closely targeted to a user's needs. This is not such a surprise - the strong typing needed to support interoperable manipulation of data in many contexts (b2b, b2c, 4tf, etc) is also what supports the ability to manipulate the way a user interacts with it.

There are two approaches that have been taken to providing users with a "non-standard" or adapted interaction. One is characterised by using the methods of Artificial Intelligence - pattern recognition, adaptation to the user's preferences - to build smarter client interfaces. The Archimedes project at Stanford is an example of this approach. It has built specialised user interfaces which connect to the normal input and output terminals of a computer (keyboard, mouse, monitor, sound) and change the information flowing as required - reading out what is on the monitor, providing force feedback or visual highlighting as a pointer is moved, or scanning through a range of active items and allowing the user to blow through a straw when the one they want is highlighted. As the available technology improves, this approach has gone from the apparent absurdity of requiring an extremely powerful computer to provide input and output to a much smaller and less powerful one, to the point where the two computers are of roughly equivalent power and the interface works more effectively; this trend seems set to continue. This approach is important for the development of better adapted systems for users, and some of the lessons learned are widely applied in Web browsing systems designed for people with specific needs, both disability related and otherwise. One example is the predictive typing systems used by people with overuse injuries or very limited input ability, such as a morse-code sip/puff straw, and now also by more and more of the millions of people sending SMS messages each day. This work has a long history - the Kurzweil reading machine, essentially a scanner and Optical Character Recognition software, was a marvel in the 1970s, and an expensive one at that. It made use of the then reasonably new and expensive text to speech technology developed for blind computer users at the very beginning of the 1970s. These systems are now available either at very low cost (in the case of hardware), or free (for the software which now performs many of the functions using generic hardware).

This paper discusses the other approach - that of making the information itself easier to re-present. Since information is no longer constrained by a physical manifestation (as it was when it was present largely in hand-written manuscripts) it should be much cheaper to work on the information than it is to change the physical devices used to present the information, whether those are books or mobile phones. In particular, the growth of XML, and of languages to provide better-specified schemata, has allowed more processing of information by generic software, including the software used to re-present information for people with disabilities. Again, this is work with a long history, but the goal of this paper is to present the most recent work at the W3C in this area, which is encapsulated to some extent in the draft XAG [XML Accessibility Guidelines] [XAG]. This paper will provide an overview of the guidelines, and investigate some of the requirements in more detail.

Overview of XAG

Where does XAG come from?

Obviously there are many different "parents" for most work at W3C, although some of them have more prominence than others. The immediate parents of XAG include the three sets of accessibility guidelines [ATAG1] [UAAG1] [WCAG1] which have been on W3C Recommendation track since 1998 (two of them are version 1.0 recommendations with the working groups drafting new versions at the time of writing), and the work of the PFWG [Protocols and Formats working group] [PFWG], formerly known as the HTML and CSS working group. This group's role began with reviewing and proposing accessibility improvements to HTML 4 and CSS 2, and it has since reviewed many different W3C specifications. The working group, in turn, has drawn on a wide variety of sources of expertise and experience. The desire to capture the experience of the various contributors to the group in guidelines which can be used without needing to talk directly to the people who wrote them has resulted in the drafts of XAG.

What does it look like?

The guidelines are presented as a set of general principles, in this case four, which are expanded as a total of 29 checkpoints, or normative requirements. In addition, there are various examples and explanations presented. This is the structure that has been used in the other sets of Guidelines given recommendation status by W3C, and enables the creation of normative requirements to allow conformance testing, as well as the development of explanatory material, in particular allowing the working group to readily add examples based on the work it continues to do.

The four guidelines and the explanatory paragraphs that follow them are presented here.

Guideline 1. Ensure that authors can associate multiple media objects as alternatives

Web content providers must able [sic] to offer alternative versions of their content if they wish to do so (as the Web Content Accessibility Guidelines tell them to do so). Textual alternatives, like a caption for a movie, or a table summary, can be repurposed for many different output devices, whereas audio content for instance is confined to a certain set of devices (those that can play sound).


Guideline 2. Create semantically rich languages

End-user-oriented XML should contain precise methods of encoding the data for its particular scope. By increasing the semantics of your elements, and setting linking devices to outside presentations or further semantics, you allow your data to become "Webized" and hence to operate within many environments.


Guideline 3. Design an accessible user interface

Web content is rapidly shifting from static pages to dynamic pages, called Web applications. This is most often done using a scripting language based on event callback. The language designers must ensure that the model they chose allows for user control of presentation. Always ensure that nothing in the presentational aspect of the document attempts to restrict user control of how the document instance is accessed.


Guideline 4. Document and export semantics

Make sure that all people can understand your design and map to and from your elements, and easily make assertions about them. Furthermore, make sure that you provide your own first party assertions about your languages: for example, don't make users guess an element's purpose.

Guideline 1 covers the most commonly understood accessibility requirement, that an image has a textual variant available - for example, to enable something to be presented in speech or braille for a blind user. The potential benefit of this beyond accessibility is demonstrated by the popularity of image searching, which highlights the need to have explicit associations, rather than relying on the fact that there is text somewhere in the document that explains the image.

Making this explicit association allows for broader accessibility as well. There are many people who have reading difficulties, many people who have some vision impairment but who can see to some extent (colour-blindness affects many more people than total blindness), who can benefit from being able to identify an image that is related to a block of text, or vice versa. In addition, many types of association are required - between a video track, the audio soundtrack, sound effects, a transcript, etc. Again, this benefit can be carried over to the wider Web - for example, allowing mobile devices to select the most important parts of the content to adapt to varying and limited bandwidth.
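As a rough sketch of what such an association between tracks can look like in an existing W3C language, the SMIL 2.0 fragment below groups a video, its soundtrack and a caption stream in one par element, so that a player knows they belong together. The file names are hypothetical, and this is an illustration of the idea rather than a technique taken from the XAG draft.

```xml
<smil xmlns="http://www.w3.org/2001/SMIL20/Language">
  <body>
    <!-- par plays its children in parallel; grouping them here is what
         makes the association between the tracks explicit -->
    <par>
      <video src="interview.mpg"/>
      <audio src="interview-soundtrack.mp3"/>
      <!-- the caption stream is rendered only when the user has asked
           for captions in the player's preferences -->
      <textstream src="interview-captions.rt" systemCaptions="on"/>
    </par>
  </body>
</smil>
```

Because the grouping is in the markup, a player on a limited device can drop the video and keep the audio or the captions without guessing which resources belong together.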

Guideline 2 requires various features in a language to ensure that information is machine processable. Some of the checkpoints could be considered as good XML authoring style - providing for containment, separating content from presentation, and not overloading the semantics of an element. Others require specific features that help accessibility. For example, the requirement in checkpoint 2.7 to provide a mechanism for identifying summary information is important to enable faster navigation of a document.

1.1 Provide a mechanism to explicitly associate alternatives for content or content fragments.

Authors using the elements/attributes in your language must have the ability to provide alternatives for any content, be it images, movies, songs, running text, whatever.

Techniques for 1.1

For example, the summary and the caption elements in the XHTML table module can be used to provide a rich textual description of a non-textual media. cf. WCAG 1.0 checkpoint 1.1.
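A minimal illustration of this technique follows. Note that in XHTML, summary is an attribute of the table element while caption is a child element. The table content is taken from the reference list of this paper; the wording of the summary and caption is only an example.

```xml
<!-- summary gives a non-visual overview of how the table is organised;
     caption gives the title that every reader sees -->
<table summary="One row per W3C accessibility guideline cited in this paper. The first column names the guidelines, the second gives their status.">
  <caption>W3C accessibility guidelines and their status</caption>
  <tr><th>Guidelines</th><th>Status</th></tr>
  <tr><td>WCAG 1.0</td><td>Recommendation, 5 May 1999</td></tr>
  <tr><td>ATAG 1.0</td><td>Recommendation, 2 February 2000</td></tr>
  <tr><td>UAAG 1.0</td><td>Candidate Recommendation</td></tr>
</table>
```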

The Guidelines at work

Incorporating graphics - the need for an alternative

Consider the example quoted above. Essentially it requires that different forms of the same information can be present - one of the most standard accessibility requirements. Its most common application scenario is for providing text equivalents to media objects for which a user does not have, or cannot use, an appropriate player. In the extreme.dtd this paper is written in, the DTD includes the following:

<!--                      GRAPHIC                               -->
<!--                      GRAPHIC may be used by itself for an
                          unnumbered, untitled illustration, or
                          used in FIGURE when the illustration
                          should be numbered and, optionally,
                          The "figname" attribute contains an
                          entity reference that identifies the
                          The "scale" attribute contains a number
                          that indicates the percentage the graphic
                          should be scaled.  -->

<!ELEMENT graphic         EMPTY                                   >
<!ATTLIST graphic
                 figname  ENTITY                      #REQUIRED
                 scale    CDATA                       "100"       >

At first glance, this part of the document specification fails the requirement, since it does not allow for an explicit association of alternative information. However, the DTD does meet checkpoint 1.1 - the figure element in which the graphic can be contained allows for a title and a caption. If the graphic element could only be included in a figure element, then all graphic elements would have an alternative included. In cases where the appropriate alternative is nothing (such as a decorative graphic used for visual aesthetics, for which there is no nonvisual equivalent) the title and caption can of course be left blank.
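A document instance using the figure wrapper might look like the sketch below. Only the graphic declaration is quoted in this paper, so the order of the children, the id attribute name, and the use of a para element inside caption are assumptions about the rest of the DTD. The entity name is hypothetical.

```xml
<!-- figname is of type ENTITY, so it must name an unparsed entity
     declared in the document's DOCTYPE; "checkpointmap" is a
     hypothetical entity name -->
<figure id="fg0001">
  <title>Guidelines and checkpoints</title>
  <graphic figname="checkpointmap" scale="100"/>
  <caption>
    <para>Each of the four guidelines is expanded into checkpoints,
    29 in total.</para>
  </caption>
</figure>
```

Here the title and caption are siblings of the graphic inside one container, which is what makes the association explicit rather than a matter of proximity in the running text.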

But what is the impact on actual people - why does it matter whether the figure wrapper is used or not? It is still possible to provide, in the text of the paper, an explanation of what the graphic is. So it is possible to satisfy the associated requirement of WCAG [Web Content Accessibility Guidelines] [WCAG1], to ensure that all non-text elements have a text equivalent. This means that a person who reads the entire paper can get the information that is presented in an image included using a graphic element. But there is no way to check that except by reading the entire paper - a long-winded approach to ensuring that someone has understood what is being presented.

The lack of an explicit association, however, limits the utility of the document, and thereby the usability and accessibility. Where there is no explicit association, a reference to a graphic element in a page (whether verbal, such as "the third chart", or an explicit XPointer-type URI reference to the same thing) is useless to someone who cannot see the graphics. By contrast, the HTML 4 specification for the img element allows for both the required alt attribute - a short piece of text which can play the role of the image - and the optional longdesc attribute - a URI reference to a document or document fragment which contains a full description of the image in question. These two attributes allow the reader either to roughly identify the place in the paper which is being referred to, or to discuss this part in detail.
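For comparison, the two HTML attributes can be sketched as follows, written in XHTML syntax. The file names and the alternative text are hypothetical.

```xml
<!-- alt stands in for the image when it cannot be seen;
     longdesc points to a fuller description held elsewhere -->
<img src="checkpointmap.png"
     alt="Diagram of the four XAG guidelines and their checkpoints"
     longdesc="checkpointmap-description.html" />
```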

One of the important parts of the XML Accessibility Guidelines, and potentially one of the most difficult checkpoints to satisfy, is 4.8 - Document techniques for WCAG, ATAG [Authoring Tool Accessibility Guidelines] [ATAG1] and UAAG [User Agent Accessibility Guidelines] [UAAG1] with respect to the XML application. In this case, techniques for WCAG checkpoint 1.1 are fairly simple - use the figure element to include graphics, and suggest that references are made to figure elements rather than their graphic children. The fact that figure elements are targets for xref will reinforce this second part of the requirement. This checkpoint not only ensures that it is possible to conform to WCAG in a document using the DTD, by forcing the DTD provider to explain how it can be done, but also means that an authoring tool or user agent developer, as well as a content author working solely from the DTD, knows what they are expected to do to import, generate, and render content in an accessible manner.

This, as I said in the previous paragraph, can seem like a heavy requirement - there are literally hundreds of techniques that need to be used for a language like XHTML 1.0. On the other hand, XHTML 1.0 is a large language designed to meet a wide variety of needs and use cases. One of the values of XML is that it allows a fairly small, special-purpose language to be written for most applications - including the need for marking up papers written for a conference. In addition, it is possible to re-use existing techniques, either by minimal adaptation or directly. Checkpoint 1.3 - Reuse existing accessibility modules to indicate alternative-equivalent associations - pushes authors not to reinvent things that have been done already and meet their needs. In addition, anything copied from a XAG-conformant language will already have associated techniques, which should require very little if any modification at all.
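One way to picture the reuse that checkpoint 1.3 encourages is the sketch below, which borrows the alt and longdesc attributes of the HTML 4 img element for the graphic element. This is a hypothetical revision for illustration only; it is not part of extreme.dtd and is not proposed by the XAG draft.

```xml
<!-- Hypothetical revision, NOT part of extreme.dtd.
     alt and longdesc are borrowed from the HTML 4 img element, where
     alt is required and longdesc is an optional URI reference. -->
<!ELEMENT graphic         EMPTY                                   >
<!ATTLIST graphic
                 figname  ENTITY                      #REQUIRED
                 scale    CDATA                       "100"
                 alt      CDATA                       #REQUIRED
                 longdesc CDATA                       #IMPLIED    >
```

With a declaration along these lines, a graphic used outside a figure would still carry its own alternative, and the existing WCAG techniques written for img could be adapted with little change.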

The graphics were there for a reason...

For other people with disabilities, the problem is almost the exact opposite. Many disabilities affect people's ability to deal with large amounts of text, and the illustrations and figures that can be provided in a paper are extremely important to these people. Being able to identify the relevant image for a passage of text is helpful for systems such as talking browsers, which can highlight the text being read, and could also position it to ensure that the relevant image is onscreen. Most of this can be accomplished by applying CSS user stylesheets to pseudo-elements generated by the browser, or in combination with scripting or SMIL-animation.

This application can be supported by the extreme.dtd, as shown by the following DTD fragment:

<!--                      CROSS REFERENCE                       -->
<!--                      The attribute "refloc" is used to point to
                          the ID of the item being referenced.
                          The attribute "type" specifies the type
                          of the cross reference.  TYPE="number"
                          generates the number of the referenced
                          element (if it is numbered, and the
                          result is undefined for references to
                          elements that are not numbered).  For
                          example, 'See figure <xref type="number"
                          refloc="fg0001">' would be formatted as
                          'See figure 2'.
                          TYPE="title" generates the title of the
                          referenced element (if it does contain
                          a title element, and the result is
                          undefined for references to elements
                          that do not contain titles).  For
                          example, 'See the section "<xref
                          type="title" refloc="sec02">"' would be
                          formatted as 'See the section "Using
                          Cross References"'.                   -->
<!ELEMENT xref            EMPTY                                   >
<!ATTLIST xref       refloc IDREF                  #REQUIRED
                     type   (number | title)       "number"       >

Similarly, the xref element can be seen as meeting the requirements of checkpoint 1.2 - Define flexible associations, where a given kind of relationship can link to or from objects of varying types without constraint. There is more or less complete flexibility - similar to the a element of HTML, which is the most common expression of its hypertextuality.
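A short usage sketch, reusing the identifiers from the examples in the DTD comment, shows cross references that target a figure and a section. The para element and the assumption that fg0001 and sec02 are IDs declared on a figure and a section are illustrative.

```xml
<!-- both references target containers that carry a number or a title,
     so the generated text is meaningful without seeing the graphic -->
<para>The wrapper approach is shown in figure
<xref type="number" refloc="fg0001"/> and discussed in the section
"<xref type="title" refloc="sec02"/>".</para>
```

As far as the fragment quoted earlier shows, the graphic element declares no ID attribute, so an xref could not validly point at a bare graphic in any case; references naturally land on the figure that holds the title and caption.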

A full assessment?

The working group developing the guidelines has done several assessments of XML languages being developed by W3C, as implementation and implementability tests for the guidelines. These test results are available, and a complete (but personal) assessment of the DTD used for this paper, against the draft current in August 2002, will be made available [XAG assessments].

Differences to other W3C accessibility guidelines

There are some differences between the structure of the XML Accessibility Guidelines and other accessibility guidelines produced in W3C. The most obvious is the lack of a priority system - while the other guidelines specify different levels of accessibility, the current drafts of the XML Accessibility Guidelines do not. This is based on decisions of the working group, who have felt that there is very little difference in relative importance of features to enable accessibility.

Issues in the specification

Accessibility or more?

Many of the requirements in the guidelines are considered good design requirements, not just for accessibility. There are other requirements such as those for internationalisation which are considered important for XML languages. In particular requirements for device independence are closely related to those for accessibility. There has been discussion about whether to keep the guidelines focussed on accessibility, or expand them to cover general design.

Applicability of XAG

Some XML languages are designed for general representation of information, such as HTML. Some are designed particularly for rendering on a particular type of platform, such as XML Formatting Objects. On another axis, some languages are designed to cover end-user representation, such as SVG [Scalable Vector Graphics] while others are designed for machine processing with very little if any end-user representation provided, such as XSLT or RDF.

In early drafts (up to June 2002) the XML Accessibility Guidelines included introductory material that suggested they were meant to apply to languages designed for general representation of data for rendering to end users. Testing with formats such as RDF, XSLT, XSL Formatting Objects, VoiceXML and XML Schema, and preliminary experience with RDF vocabularies such as EARL, have suggested that applying the guidelines where possible to all kinds of information representation, whether or not they are intended for end-user representation, is helpful for accessibility - in particular in the area of authoring. It also seems that the guidelines may help us in determining whether a language is likely to be intended simply for one device-specific rendering, or is useful for representing information as a storage format that can be repurposed for users with different needs.

A related question is whether the guidelines can or should be applied to formats other than XML. Although the requirements are framed in terms of XML languages, and the techniques used refer to XML, it is possible to apply the requirements to other formats. In particular, some formats which result in inaccessible documents being produced may do so because they are unable to meet requirements expressed in the guidelines.

This topic is under active discussion within the working group. Feedback from developers about whether small atomic requirements or large compound requirements are easier to understand and implement is appreciated.


There are a number of checkpoints in the guidelines that identify different aspects of the same requirement. For example, checkpoint 1.1 requires that it is possible to create alternative representations, and checkpoint 1.2 requires that the form of these is flexible. People suggest that these are part of the same requirement, which is true. However, our experience of accessibility guidelines, especially the Web Content Accessibility Guidelines, suggests that making requirements as atomic and explicit as possible is helpful for implementors.

Similarly, there are many requirements under guidelines 2 and 3 for features in the markup that have matching requirements under guideline 4 that they are documented in a way that can be exposed to users or in machine-readable formats.

It is an open question for the current (June 2002) drafts which of these redundancies should be collapsed. Again, feedback is appreciated.

XML Schema, other schema languages and schema annotations

XAG requires that a schema provides documentation of elements, for example by using XML Schema in preference to XML DTDs. This is to enable authoring tools to help authors identify elements, and to enable authoring tools and user agents to explain what an element means.
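As an illustration of the kind of documentation meant here, the graphic element quoted earlier could be described in XML Schema roughly as follows. The mapping of the DTD declaration to schema types is a sketch, and the documentation text is paraphrased from the DTD comment.

```xml
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="graphic">
    <!-- machine-readable documentation that an authoring tool or
         user agent can show to explain what the element means -->
    <xs:annotation>
      <xs:documentation xml:lang="en">
        An illustration. May be used by itself for an unnumbered,
        untitled illustration, or inside figure when the illustration
        should be numbered.
      </xs:documentation>
    </xs:annotation>
    <!-- no child elements are declared, so the element is empty -->
    <xs:complexType>
      <xs:attribute name="figname" type="xs:ENTITY" use="required"/>
      <xs:attribute name="scale" type="xs:string" default="100"/>
    </xs:complexType>
  </xs:element>
</xs:schema>
```

A DTD comment can carry the same words, but it is not part of the parsed declaration, whereas the annotation element is tied to the element it describes and can be extracted by generic tools.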

Should the guidelines require XML Schema in particular, or just a schema language that meets the requirements? This question may depend on implementation experience - if the developer community settles on a different schema language then it may be important to adapt the guidelines to the external situation. Again, this perspective comes from experience with the Web Content Accessibility Guidelines, where it has been important to reflect the real-world needs of users in specified requirements.


[ATAG1] Authoring Tool Accessibility Guidelines. J Richards, J Treviranus, C McCathieNevile, I Jacobs eds. W3C Recommendation 2 February 2000, http://www.w3.org/TR/ATAG10

[PFWG] Protocols and Formats Working Group. W3C working group, home page at http://www.w3.org/WAI/PF

[UAAG1] User Agent Accessibility Guidelines. J Gunderson, I Jacobs eds. W3C Candidate Recommendation, http://www.w3.org/TR/UAAG10

[WCAG1] Web Content Accessibility Guidelines. W Chisholm, I Jacobs, G Vanderheiden eds. W3C Recommendation 5 May 1999, http://www.w3.org/TR/WCAG10

[XAG] XML Accessibility Guidelines. C McCathieNevile, S Palmer eds. W3C Working draft, latest editors' version http://www.w3.org/WAI/PF/XML (The version used for this paper is the 17 June 2002 Editors' Draft - http://www.w3.org/WAI/PF/XML/xag-20020617 )

[XAG assessments] Assessments against XAG. A collection of personal assessments. These are not endorsed by any W3C working group, nor by W3C nor any of its members. Available at http://www.w3.org/WAI/PF/XML/reviews
