XMLVS: Using Namespace Documents for XML Versioning

Harry Halpin
H.Halpin@ed.ac.uk

Abstract

We introduce namespaces in XML, focusing first on a definition of elementary terms and the reason for their introduction: the disambiguation of names in XML documents. Afterwards we explain the relationship of QNames and expanded names to namespace URIs, introducing informal standards like RDDL that were created to be standards for namespace documents.

We then approach the versioning problem for XML languages. The versioning problem for XML languages can naturally be solved using namespace documents, but it is a problem current namespace documents like RDDL do not handle. We suggest an abstract model for language versioning, and then provide a concrete colloquial XML language for managing namespace documents we call XMLVS, the XML Versioning System. This language can be transformed to both human-readable XHTML RDDL documents and machine-readable RDF.

We show, using Atom as an example, how XMLVS can be used to handle both backward and forward compatible names, as well as elements and attributes sharing a name and having multiple language versions use the same namespace document. XMLVS allows easy maintenance of XML languages in line with current W3C Recommendations and in conjunction with other software allows us to automate the boilerplate of namespace document management. Finally, we point out a number of problems with current standards and possible solutions.

Keywords: Content Management; Namespaces; RDF; Modeling

Harry Halpin

Harry Halpin is a postgraduate researcher, working primarily with Henry S. Thompson in the School of Informatics at the University of Edinburgh. He has written papers that range over topics as diverse as text mining, Web Services, and philosophy, and participates in too many list-servs discussing recent efforts in standardization. He is the co-chair of the GRDDL Working Group and on the Semantic Web Co-ordination Group.

Harry Halpin [School of Informatics, University of Edinburgh]

Extreme Markup Languages 2006® (Montréal, Québec)

Copyright © 2006 Harry Halpin. Reproduced with permission.

Introduction

As XML becomes ubiquitous and mature, the problem of versioning is increasingly significant. While XML itself has a clear versioning scheme through the version attribute in the prolog, XML-based languages such as Atom do not have a standardized versioning mechanism. We propose namespace documents as the logical solution.

It could be claimed that such an approach is not needed by current markup practice in XML. However, as shown by the recent confusion that the introduction of the xml:id name caused, it seems that at least some clarification of the gap between what W3C Recommendations actually define about namespaces and what many people think they define (as well as W3C good practice) is in order [Disposition of Names]. Once we understand namespaces, XMLVS shows that the namespace document is both an effective and practical solution for maintaining XML languages. The XML-based language XMLVS (XML Versioning System) makes the maintaining, documenting, and versioning of XML languages easier by automatically producing best-practice human and machine-readable namespace documents for XML languages.

Languages, Namespaces, and Documents

In common parlance, an element or attribute name is "in a namespace" so that the creation of a new name, like xml:id, is "adding a name" to a namespace [xml:id Version 1.0]. In these discussions, the terms "language," "namespace," and even "version" are separate terms with distinct meanings, although in a given instance they may be functionally the same. Languages are in general created to serve as an "application" of XML, from displaying web-site updates to creating technical documents. What gives the names in a language their semantics or "meaning" is their application. While various standards may give definitions to some names in XML such as xml:id, the vast majority of names in languages themselves have no semantics outside a given application. Sometimes names from one namespace can be imported into other languages, such as the import of many RDF constructs that are used in OWL DL with additional constraints [OWL Guide].

To illustrate the complexities of languages and versioning, XHTML is a language whose application is the display of documents on the Web for human consumption. The semantics of each name is given to some extent in the XHTML standard, and is concretely defined in particular by applications that embody the standard [XHTML]. Using namespaces, one may mix XHTML 1.0 with other languages like MathML. XHTML 1.0 is the first language in the XHTML family of languages, and other related languages such as XHTML-Print are already appearing. The XHTML 1.0 language has a single namespace URI: http://www.w3.org/1999/xhtml. However, three variants (Strict, Transitional, Frameset) all use the same namespace URI although they have distinct names, such that not all names valid in XHTML 1.0 Transitional are valid in XHTML 1.0 Strict. One could have easily imagined the case where each of the three variants had its own namespace URI as well. There is a new version of XHTML called XHTML 2.0 that has its own (perhaps temporary) separate namespace URI: http://www.w3.org/2002/06/xhtml2/.

To systematize common parlance, a language is a set of names for things. XML by itself just defines a notation and structure for data in terms of the Infoset, and does not give any semantics above and beyond the very basics [Infoset]. An application gives particular XML documents their application semantics by defining the preferred use of a language. A language may be given a namespace, which provides a syntactic mechanism used to disambiguate the names of things within a document. These names are often elements and attributes in XML, but do not have to be: namespaces can disambiguate names of classes and roles given in the non-XML Semantic Web N3 notation. A namespace is given a unique identifier by its namespace URI. A namespace document "is a place for the language publisher to keep definitive material about a namespace" that can be accessed by dereferencing the namespace URI, and the definitive material can consist of multiple resources [Berners-Lee, 1998]. In other words, a namespace document may just consist of links to other resources, each with its own URI. The sum total of all these resources, ranging from the namespace document to others such as standards and APIs, defines the application semantics. According to common use, a namespace URI should "not be a URN" (such as the namespace used by Microsoft Office) since you cannot retrieve a namespace document from a URN [Namespace Theses].

A final example should be informative. XSLT is a language whose application is to transform XML documents. Unlike XHTML, which gives different versions different namespaces, XSLT has two versions (1.0 and 2.0) that share the same namespace (http://www.w3.org/1999/XSL/Transform). This namespace URI can be dereferenced to produce a single XML document that says coyly "Someday a schema for XSL Transforms will live here." The non-normative W3C Schema for XSLT 2.0 exists somewhere completely different (http://www.w3.org/2005/11/schema-for-xslt20.xsd) and is not linked to the namespace URI. The application semantics for XSLT are given by the W3C Recommendation for XSLT 1.0 [XSLT 1.0] or XSLT 2.0 [XSLT 2.0], depending on the version. Often the application semantics are dependent on different languages. While XSLT 1.0 is rather self-contained in its use of namespaces, the application semantics of XSLT 2.0 allows the import of semantics from other applications, such as XML Schema. The same is true for XQuery, which can be given an XML notation (XQueryX) that also uses XML Schema [XQueryX]. In the case of XQuery, the application semantics are formally defined, while most applications have informally defined semantics.

In conclusion, an application should not be confused with an XML language, since an application may use more than one XML language. A given application defines the semantics of an XML language. An XML language may or may not be given a namespace URI, and its namespace URI may or may not be different depending on different versions or variants. In a hopelessly ideal world, one would try to keep things simple so that a single application uses a single language that has only one version with one namespace URI, but in the wild world of the Web things are not always that simple.

What Namespaces Do

There is also confusion, especially among people new to XML, as regards how namespaces in XML function. Namespaces exist so that names from multiple languages can be combined, even if they have the same name. In order to do this, every name must be qualified with a unique namespace so names from different languages can be disambiguated. For example, the name "class" is used for different purposes in RDF and HTML. This ambiguity results in "namespace collisions" that can be avoided through the use of namespaces as given by the "Namespaces in XML" specification [Namespaces]. This is done by adding to the front of the original name a namespace prefix followed by a colon, and the originally non-disambiguated name is now called the local name. The combination of the namespace prefix and the local name is called the Qualified Name (QName). Since they are separated by a colon, both the namespace prefix and the local name should be an "NCName" (a "No-Colon Name" is a string not containing any colons). The namespace prefix is associated with a namespace URI in the "namespace declaration" using an attribute with the namespace prefix xmlns. This attribute's local name is the namespace prefix associated with the namespace URI, which is given by the value of the xmlns attribute. For example, the namespace prefix xsl is associated with a namespace URI by the attribute xmlns:xsl="http://www.w3.org/1999/XSL/Transform". So, the QName xsl:template has xsl as its namespace prefix, which maps to the namespace URI http://www.w3.org/1999/XSL/Transform, with template being the local name. All QNames set out to do is to solve the name disambiguation problem.

Rather surprisingly and contrary to popular belief, QNames are not a shorthand notation for URIs. In other words, given a QName one gets two things, a namespace URI and a local name, not a single URI. By saying that xsl:template is in the xsl namespace, what we are saying is that its name is equal to the tuple (http://www.w3.org/1999/XSL/Transform, template). This expanded name is created when the namespace prefix is replaced by the namespace URI. There is no default construction rule for creating a single URI out of an expanded name, and furthermore there is not even a standardized mechanism to map the two parts of an expanded name to a single string for processing purposes. For example, a processor could simply concatenate the namespace URI and the local name together as http://www.w3.org/1999/XSL/Transformtemplate, but it could also create http://www.w3.org/1999/XSL/Transform/template or http://www.w3.org/1999/XSL/Transform#template out of the expanded name. The specification is simply silent on this matter. Indeed, this subtle point has often been missed, as in the case where the CURIE specification states that a CURIE is "a Compact URI, and QNames are a subset of this" [CURIE Syntax]. This is a problem, as CURIEs wish to use the same syntactic colon as QNames. Like many, its authors seem to have mistakenly assumed QNames are just URIs in disguise. In fact, since QNames and CURIEs are for different things (qualifying or scoping a name with a URI versus abbreviating a URI), it is logically impossible as the specifications now stand for QNames to be a subset of CURIEs. Also contrary to popular belief, there is no "empty" or "blank" namespace, as an element not given a namespace through one of the two methods simply does not have a namespace and so is not a QName or an expanded name.
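As a minimal sketch of this point, the following Python fragment uses the standard library's xml.etree.ElementTree, which reports each element name as a "{namespace URI}local name" string. The helper expanded_name and the variable names are illustrative assumptions, not part of any specification. The fragment recovers the (namespace URI, local name) pair for xsl:template and then builds the three candidate single-string forms mentioned above, none of which is sanctioned by the Namespaces specification.

```python
import xml.etree.ElementTree as ET

XSL_NS = "http://www.w3.org/1999/XSL/Transform"
doc = '<xsl:template xmlns:xsl="%s" match="/"/>' % XSL_NS

root = ET.fromstring(doc)


def expanded_name(tag):
    """Split ElementTree's '{uri}local' string into a (namespace URI, local name) pair."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return (uri, local)
    return (None, tag)  # a name with no namespace at all


name = expanded_name(root.tag)

# Three equally plausible ways to flatten the pair into one string.
# Each is a convention chosen by a processor, not a rule of the specification.
candidates = [name[0] + separator + name[1] for separator in ("", "/", "#")]

for candidate in candidates:
    print(candidate)
```

Note that the "{uri}local" string used by ElementTree is itself just one more processor-specific convention for writing down the pair.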

Still, the default behavior of some processors is simply to concatenate the namespace URI and local name. This has led many to use the "hash" convention (particularly in RDF and OWL), which is to append a hash to the end of the namespace URI in the namespace declaration, as in having xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" so that rdf:about resolves to http://www.w3.org/1999/02/22-rdf-syntax-ns#about. In this manner it is still possible to concatenate the namespace URI and local name together and obtain a valid URI that points into a namespace document through the use of fragment identifiers. Many XML applications like XML Schema follow the HTML convention that the value of an attribute serves automatically as a fragment identifier reference of a URI, so that xmlns:xs="http://www.w3.org/2001/XMLSchema" and xs:int resolves to http://www.w3.org/2001/XMLSchema#int instead of http://www.w3.org/2001/XMLSchemaint. This led Jonathan Borden to suggest that "When the namespace URI ends in an alphanumeric character treat the local name as a fragment identifier, i.e. insert a '#' between the URI and localname" (http://www.openhealth.org/RDF/QNameQuagmire.html). One might suppose that with Borden's rule, every "name in the namespace" maps to a distinct URI, but this can fail in two ways. First, as pointed out by Henry Thompson, "Not all namespaces guarantee uniqueness for their identifiers" as given by the following production: URI(identifier in context of a namespace) = URI(nsid) #? identifier [Thompson, www-tag]. This is exemplified by the case of attributes and elements sharing the same name, which we will consider later. Worse, the namespace prefix could lose its namespace URI, as we demonstrate below.
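Borden's rule is simple enough to state as code. The following is only a sketch of the rule as quoted above; the function name borden_uri is our own, and Python's str.isalnum is used as an approximation of "alphanumeric character."

```python
def borden_uri(namespace_uri, local_name):
    """Apply Borden's rule to an expanded name.

    If the namespace URI ends in an alphanumeric character, insert '#'
    so the local name becomes a fragment identifier. Otherwise
    (for example when the URI already ends in '#' or '/'), concatenate.
    """
    if namespace_uri and namespace_uri[-1].isalnum():
        return namespace_uri + "#" + local_name
    return namespace_uri + local_name


xs_int = borden_uri("http://www.w3.org/2001/XMLSchema", "int")
rdf_about = borden_uri("http://www.w3.org/1999/02/22-rdf-syntax-ns#", "about")

print(xs_int)
print(rdf_about)
```

The rule reproduces both conventions described above, but it takes only the expanded name as input, so it cannot by itself tell apart two different things that share one expanded name.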

As stated by the W3C Technical Architecture Group (TAG), since "there is no single, accepted way to convert a QName into a URI...the use of QNames to identify Web resources without providing a mapping to URIs is inconsistent with Web architecture," and as such QNames should not be used in attribute values in the place of a URI [WebArch]. Stated more directly by the W3C TAG, "Do not allow both QNames and URIs in attribute values or element content where they are indistinguishable" [WebArch]. Indeed, it is precisely in this context that some sort of compact URI notation like CURIE could be of use in order to avoid confusion with QNames [CURIE Syntax]. While this usage of QNames in attribute values may seem to be infrequent, it is in fact quite frequent in XSL transforms and XML Schemas. The reason to beware of this practice without explicit guidance from a standard is that XML processors only resolve QNames to expanded names when they are found as element and attribute names; QNames are not resolved when found in element content or attribute values. So, in <rdf:Description rdf:about="xsl:template" />, an XML processor would resolve rdf to its namespace URI but would not resolve xsl. If a namespace prefix isn't resolved and there's a document transformation where its declaration is lost, the namespace URI could be lost, such that the next XML processor may discover a prefix such as xsl and not be able to find its namespace URI.
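The behavior described above can be observed with an ordinary namespace-aware parser. The sketch below uses Python's xml.etree.ElementTree as one example processor; other processors may preserve unused declarations, so the second half illustrates the risk rather than a universal outcome. The variable names are our own.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XSL = "http://www.w3.org/1999/XSL/Transform"

doc = (
    '<rdf:Description xmlns:rdf="%s" xmlns:xsl="%s" '
    'rdf:about="xsl:template" />' % (RDF, XSL)
)
root = ET.fromstring(doc)

# The element name and the attribute name are resolved to expanded names...
element_name = root.tag
attribute_names = list(root.attrib)

# ...but the attribute value is an opaque string; 'xsl' is never resolved.
attribute_value = root.attrib["{%s}about" % RDF]

# Round-tripping through this processor drops the xsl declaration,
# because no element or attribute name uses that prefix.
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```

After the round trip the string xsl:template survives in the attribute value, but nothing in the serialized document says which namespace URI xsl once stood for.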

Even assuming that the namespace declaration is preserved, there is still the chance of attributes and elements sharing the same name. A default namespace can be specified by using the ubiquitous xmlns attribute without any local name, so that its value is the namespace of unprefixed elements in the document, but that default namespace does not apply to attributes. This includes attributes of any element with the default namespace attribute! This has led to a gradual evolution in XML use, so while specifications like XSLT do not declare their attributes in a namespace, the RDF XML syntax specification gives all attributes it uses a namespace [RDFXML]. Many find having to give every attribute a namespace explicitly to be counter-intuitive, since it would seem more natural that the namespace of an element by default qualifies its attributes. The following example is instructive in this regard:

Figure 1: Attribute Values
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ex="http://www.example.org/myexample#"
                xmlns="http://www.example.org/another#">
   <xsl:template match="ex:test">
      <document>
         <ex:about myattribute="found" ex:template="yes" ex:about="maybe" >
            <xsl:apply-templates />
         </ex:about>
      </document>
   </xsl:template>
</xsl:stylesheet>

With this intuition, each attribute would be given as its default namespace the namespace of its containing element. Under this mistaken reading, in the preceding example, match would automatically be given the xsl namespace. This is not the case: it would only be given that namespace if it was declared as xsl:match, regardless of its use in the XSLT application. Only attributes explicitly given namespace prefixes, like the attribute ex:template, are given namespaces. A QName in an attribute value, like ex:test, does not have a namespace because the prefix is used in an attribute's value, not in an attribute name. Likewise, in the example match simply does not have a namespace. Furthermore, the default namespace http://www.example.org/another# applies to the document element name but does not apply to myattribute, which also has no namespace and so does not have an expanded name. The reasoning behind this state of affairs, which does not give an attribute by default the namespace of its element, would be that such behavior would prevent the same attribute from being used on multiple elements.
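The reading given above can be checked mechanically. As a sketch, the following Python fragment parses the stylesheet of Figure 1 with the standard library's xml.etree.ElementTree and inspects the names it reports. The constants XSL, EX, and DEFAULT are merely shorthand introduced here for the three namespace URIs in the figure.

```python
import xml.etree.ElementTree as ET

FIGURE_1 = """<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:ex="http://www.example.org/myexample#"
                xmlns="http://www.example.org/another#">
   <xsl:template match="ex:test">
      <document>
         <ex:about myattribute="found" ex:template="yes" ex:about="maybe" >
            <xsl:apply-templates />
         </ex:about>
      </document>
   </xsl:template>
</xsl:stylesheet>"""

XSL = "http://www.w3.org/1999/XSL/Transform"
EX = "http://www.example.org/myexample#"
DEFAULT = "http://www.example.org/another#"

root = ET.fromstring(FIGURE_1)
template = root.find("{%s}template" % XSL)
document = template.find("{%s}document" % DEFAULT)
about = document.find("{%s}about" % EX)

# 'match' has no namespace, and its value 'ex:test' is left unresolved.
print(template.attrib)

# The unprefixed 'document' element picks up the default namespace.
print(document.tag)

# 'myattribute' has no namespace; only the prefixed attributes have one.
print(about.attrib)
```

In the parser's output the element ex:about and the attribute ex:about carry exactly the same "{namespace URI}local name" string, which is the collision discussed next.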

So, the insidious problem that truly dooms QName to URI mappings is that attributes and elements may have the same expanded name, and therefore they would have a URI collision despite being two different kinds of things. This happens with the ex:about element and attribute names in our previous example. One suggestion would be to consider each attribute to be naturally qualified by its element, especially if they share the same namespace. An attribute could then have a unique URI minted for it by concatenating it to its element's expanded name in a standardized manner, such that its URI would be namespace URI + element name + attribute name, so in our example the attribute ex:about could be thought to be the URI http://www.example.org/myexample#aboutabout. This thinking is in error since it prevents the use of the same attribute in multiple elements. We cannot assume each attribute belongs to a specific named element. The other option would be to preface attributes by some sort of constant in the creation of the URI, such as namespace URI + "attr" + attribute name, which would in our example produce: http://www.example.org/myexample#attrabout. This is a clunky solution at best. Although it flies in the face of the standards, one argument for creating URIs out of expanded names is that it allows resources to be associated with particular local names in the namespace in a principled manner. So one could do things like make an RDF statement about only a particular expanded name. This could be very useful in managing different versions of a language. Yet in the final analysis a QName is just not a shorthand for a URI, and treating it as such is problematic at best, so the practice is best avoided unless explicitly licensed by the standard or the namespace document. Is there a way to explicitly license this mapping from expanded names to URIs in a namespace document? Unfortunately there is no current best practice for namespace documents to license QName to URI mappings in either a human or machine-readable form.
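The collision, and the limits of the "attr" workaround, can be made concrete in a few lines. This is a sketch only: naive_uri and kinded_uri are hypothetical helper names, and the "attr" marker is the constant suggested in the text rather than any established convention.

```python
EX = "http://www.example.org/myexample#"


def naive_uri(expanded):
    """Concatenate namespace URI and local name, ignoring the kind of name."""
    namespace_uri, local_name = expanded
    return namespace_uri + local_name


def kinded_uri(kind, expanded):
    """Prefix attribute local names with the constant 'attr' before concatenating."""
    namespace_uri, local_name = expanded
    marker = "attr" if kind == "attribute" else ""
    return namespace_uri + marker + local_name


about = (EX, "about")

# The element ex:about and the attribute ex:about collide under plain concatenation.
element_uri = naive_uri(about)
attribute_uri = naive_uri(about)

# The 'attr' constant separates them...
kinded_attribute_uri = kinded_uri("attribute", about)

# ...but an element that happens to be named 'attrabout' collides again.
unlucky_element_uri = kinded_uri("element", (EX, "attrabout"))

print(element_uri, kinded_attribute_uri, unlucky_element_uri)
```

The last line of the example shows one reason the constant-prefix scheme is clunky: it only moves the collision to a different pair of names.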

Namespace Documents: RDDL and beyond

As exemplified by the coy XSLT namespace document, early thinking about namespaces held that schemas, as given by W3C XML Schema or DTDs, were the one and only namespace document. While schemas are appropriate to link from a namespace document, in general a namespace document should be richer than only a schema, especially as there are multiple schema languages that one can use, often for different purposes. Beyond this, there is little agreement on what to actually dereference from a namespace URI. First, a human-readable description of the language should be there, as well as something machine-readable. Second, it would seem the obvious place to put schemas, transforms, and other resources associated with the language. The informal standard RDDL 1.0 (Resource Directory Description Language) fulfills this role [RDDL]. RDDL is descended from the XML Catalog Format ([XNCF]) and the XML Namespace Related-resource Language ([XNRL]) proposals. RDDL 1.0 is an XHTML format that could be considered an early use of "microformats," since it combines both human and machine-readable data by embedding machine-readable semantics in XLink. As regards the semantics of XLink, Tim Bray notes that RDDL 1.0 was "arguably abusing them pretty severely" [Bray, 2004]. RDDL 1.0 introduces a single new identifier, rddl:resource, that identifies the related resources by the xlink:role and xlink:arcrole attributes. Other XLink attributes, like xlink:title, are allowed, although xlink:type is restricted to "simple." The rddl:resource element is the parent of an XHTML link (an a element with an href attribute) to the resource. The xlink:role attribute is used to describe the nature of a related resource, which is usually the URI of the standard that defines the type of the related resource, although many things from mailboxes to software have their natures given at http://www.rddl.org/natures/. W3C XML Schemas for a namespace should be linked from a RDDL 1.0 document with xlink:role="http://www.w3.org/2001/XMLSchema". As given by the value of the xlink:arcrole attribute, an optional purpose is "designed to convey the intended usage of the related resource" and a list of purposes is given by http://www.rddl.org/purposes/ [RDDL]. W3C XML Schema, RELAX NG, and Schematron are all used for the "purpose" of validation, as given by xlink:arcrole="http://www.rddl.org/purposes#schema-validation".

Calls for a more minimal syntax and less arbitrary use of role and arcrole, as well as a standard mapping to RDF, have led to the RDDL 2.0 standard, which is as yet incomplete [RDDL2]. Its main feature is an RDF serialization that features rddl:nature and rddl:purpose as RDF alternatives to XLink. It is still a work in progress and not adopted in practice. Current fashion for Semantic Web namespace URIs is to serve the RDF Schemas, often with no connection to human-readable documentation. Microformats, by taking advantage of XHTML attribute values (such as XFN's use of the rel attribute), interestingly enough seem to leave themselves out of the versioning and namespace document story, although it is conceivable that future versions will attempt to implement some use of versioning and namespaces [XFN]. Despite the need for a standard, no standards body has approved a namespace document policy.

Versioning Problems

Versioning is important: different versions of a language may specify different application semantics. In practice, there are two general ways to do versioning in XML languages in a given document. The first is to mimic XML itself and use a version attribute on root or arbitrary elements, and the other is to provide a richer mechanism with links to specify the previous versions. This rich approach is exemplified by the mechanism provided by OWL ontologies to specify prior versions (using the priorVersion predicate), which can specify backwards compatibility, incompatibility, and deprecated classes and properties as well. Versioning is serious for users of the language, for in OWL "if owl:backwardCompatibleWith is not declared, then compatibility should not be assumed" [OWL Guide]. While RDDL provides a prior-version purpose, it does not let one specify versions in detail. For example, the nature URI for XML Schema (http://www.w3.org/2001/XMLSchema) does not distinguish whether version 1.0 or 1.1 of XML Schema is being used. In fact, neither does the namespace document of XML Schema, as it has as related resources only 1.0 2nd Edition normative references.

The approach of using the value of the version attribute in the root element can become problematic. What about the case in which one wants to use names from two versions of a language that use the same namespace? Should one qualify both elements with differing version attributes? One could specify that every version has its own URI, but this is often not the case: minor revisions may want to use the same namespace, with new namespace URIs used only for major revisions [Van der Vlist, 2001]. In a case that has attracted attention in the Web Services community, applications may want to revert to a previous version of a language if they do not have a relevant schema or other resource to process the newest version of the language, even if the document specifies that the processor should use the latest version, in order to "scrape some information out" [Thompson, 2004].

One would hope you could just put the version number in the URI, perhaps by writing the year of the specification in the namespace URI. This is done by the W3C in the namespace of XHTML: http://www.w3.org/1999/xhtml. Regardless, the approach of trying to throw all the relevant versioning information in the URI does not solve the problem cleanly. This approach violates the rule of URI Opacity given by the W3C TAG: "Agents making use of URIs SHOULD NOT attempt to infer properties of the referenced resource" [WebArch]. The problem of figuring out more information about a language can be solved more easily by letting the URI dereference a namespace document that provides such information.

Namespace Documents: A Survey

Before any suggestions are made as regards the shape of what should be in a namespace document, there should be some observation of what people are doing on the ground. Since there are no (even decentralized) namespace directories, in the following table we survey a number of namespace documents used by well-known languages:

Table 1
Standard                     | Namespace URI                                   | Namespace Document
XHTML 1.0                    | http://www.w3.org/1999/xhtml                    | XHTML
XSLT (1.0 and 2.0)           | http://www.w3.org/1999/XSL/Transform            | Near Empty XML Document
W3C XML Schema (1.0 and 1.1) | http://www.w3.org/2001/XMLSchema                | RDDL 1.0
WS-Addressing 1.0            | http://www.w3.org/2005/08/addressing/           | RDDL 1.0
XQueryX                      | http://www.w3.org/2005/XQueryX/                 | RDDL 1.0
RSS 1.0                      | http://web.resource.org/rss/1.0/                | RDDL 1.0
Soap 1.2                     | http://www.w3.org/2003/05/soap-envelope         | W3C XML Schema
RDF Syntax                   | http://www.w3.org/1999/02/22-rdf-syntax-ns#     | XHTML
OWL (Lite, DL, and Full)     | http://www.w3.org/2002/07/owl#                  | RDFS
DocBook                      | http://www.docbook.org/ns/docbook               | XHTML
FOAF                         | http://xmlns.com/foaf/0.1/                      | XHTML
DOAP                         | http://usefulinc.com/ns/doap#                   | RDFS
DOLCE Lite                   | http://www.loa-cnr.it/ontologies/DOLCE-Lite.owl | OWL

RDDL 1.0 is used in new XML standards and many Web Service standards. Semantic Web standards routinely deliver RDF Schemas. Some namespaces just serve plain XHTML or absolutely nothing.

Namespaces: Three Interpretations

The minimalist reading of namespaces states that anyone can mint a new name by just adding a local name in a namespace in a document they produce. The power of defining the "names in a namespace" is in the hands of the user, not the owner of the namespace URI. Against the intuitions of many people unfamiliar with XML, a namespace sets absolutely no constraints on the number and kinds of names in a language. XML parsers do not attempt to retrieve anything at all from a namespace URI. The number of distinct local names that may be attached to a single namespace is infinite. This is the reading sanctioned by the XML Namespaces specification [Namespaces]. As noted by Henry Thompson, "The minimalist reading is the only one consistent with actual usage -- people mint new namespaces by simply using them in an expanded name or namespace declaration, without thereby incurring any obligation to define the boundaries of some set" [Thompson, 2005]. While there has been plenty of vigorous debate about namespaces, the minimalist interpretation is widespread precisely because there is no alternative. This interpretation does not manage the versioning of XML languages or take advantage of the use of namespace documents to check the correctness of the name use in a document.

A maximalist reading of namespaces would state that there is some finite number of names in a language and local names in a namespace with some standard usage. The number of names in a namespace is defined by the owner of the namespace URI as opposed to any other user. Furthermore, a true maximalist would prefer each expanded name in a namespace to expand to a unique URI that denotes a secondary resource using some construction rule. Since a URI has a distinct owner, the owner would be the final arbiter of the language, and any non-standard usage of the language, such as minting a new expanded name that wasn't given in the namespace document, would be wrong. While this is obviously very restrictive, it is more or less how names of core constructs work in programming languages. However, this is too stringent and incompatible with most existing work.

A balance can be struck between the maximalist and minimalist readings, creating a pragmatic reading of namespaces that gives the owner of XML languages a way of expressing more information about their language in the namespace document. This would give the user more options by allowing them to discover exactly how the owner of the language wants the language to be used. The owner should choose if they prefer a maximalist, minimalist, or some moderate reading of their namespace. The user is not compelled to follow the owner's guidelines, but can at least be aware of them if they so wish. So namespace documents should state whether or not the space of possible names is delimited and whether or not every name in the namespace has a unique expanded name that maps to a URI. A namespace document should state the version of a language, and keep track of version changes over time. For a particular name, it should state what version ranges it can be used in, and attach a human-readable description to each name. This would give the user some advantages in exchange for letting them restrict themselves. For example, instead of having to worry about documenting their usage of particular names, by sticking to the namespace document given by the owner of the namespace URI, a user could pass XML documents to other applications and know that if those applications were not sure how to process a given name, the application could get a namespace document that would tell them how.

LDVL: Language Definition and Versioning Language

Henry Thompson proposed a language, which he tentatively called LDVL (Language Definition and Versioning Language), for describing names in a language [Thompson, 2005]. While previous work has tried to map all sorts of information into the namespace URI, the work presented in this paper places versioning and more information in a namespace document. Putting the language description in a namespace document as opposed to the URI not only upholds the principle of URI opacity, but allows current namespaces to keep being used without any modification. It also allows the information stated about a namespace to be extended arbitrarily, without resulting in ridiculously long namespace URIs. It allows "approved" names to be both added to and removed from a namespace. Finally, we allow elements and attributes to be listed as finite sets without mistakenly forcing one to interpret an expanded name to be a URI.

In LDVL, each name can also be considered to have at least four properties: a language, a version, a kind, and a definition. Since there is more than one name per kind, it is only the combination of kind and name that makes sense when retrieving version ranges and definitions. Given a language version, a kind, and a name, one can get a definition, which is the human or machine-readable explanation of the name. In more detail, each name in a language can contain the following name metadata:

  • Language: The language that defines the meaning and use of every name in an XML language. In a namespace document, as long as the namespace document defines an application, this would be the default application of each name in the language. This application would usually be a URI that pointed to a relevant standard, like XSLT 2.0 or XQuery 1.0.
  • Version: The particular version(s) of the language used by the name. Variants such as strict versus transitional (or ad-hoc useful ones such as normal versus testing) are considered distinct versions. We use the words "variant" and "version" interchangeably. Versions do not have to be linear (1.0 traditional vs. 1.1) or numeric (strict versus transitional), and can be ranges (1.0 to 2.3). A name can have more than one variant, so you can have a "strict 2.0" as opposed to a "transitional 1.0," with both "strict" and "1.0" being versions. In summary, a name (combined with a kind and a definition) does not have versions; it has version ranges. These version ranges must be valid versions of the language.
  • Kind: What kind of thing the name denotes. This is often context-dependent, but by default many may wish to use some part of the XML Infoset to help them distinguish whether a given name should be used as an element or an attribute. However, they could also denote functions, classes, predicates, and so on.
  • Name: The name of the identifier, which can further be divided into the local name and the namespace URI (although this should default to the namespace URI of the language a name is part of), as well as an optional preferred method of creating an expanded name and a preferred prefix for the namespace URI.
  • Definition: The human-readable definition of the name. This may also contain machine-readable information. While it is the hardest part to formalize, it is the most crucial, since it is usually this information that a user wishes to retrieve.

One further option, in line with our pragmatic reading, should be:

  • Required: Whether this name is either required ("true") or optional ("false") in instance documents of a given language.

These can interact or be redundant in many ways. In some instances the language and its version can be considered the same if there is only one version of the language. Versions and the namespace URI can be considered the same if a new namespace URI is minted for each version of a language. Often the kind of a name is difficult to determine without some human-readable documentation or the context of its use in a particular instance document. As put by Henry Thompson, "Therefore, (versions of) languages tell you, for every kind they care about, what names are used for things of that kind, and, for those that have definitions, what their definition is" [Thompson, www-tag]. Every local name in a language can be annotated with these five attributes. One could imagine this being done by a modification of a Post-Schema Validation Infoset for a single document. However, it can also be done for an entire XML language by making this sort of information available in the namespace document in an interoperable manner by using RDF and XHTML.
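The lookup just described (a language version, a kind, and a name together yield a definition) can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the class name NameMetadata, its field names, and the two sample definitions are hypothetical and are not part of LDVL itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NameMetadata:
    """One LDVL name entry; field names are illustrative, not normative."""
    local_name: str
    kind: str                      # e.g. "element" or "attribute"
    versions: frozenset            # language versions the name is valid in
    definition: Optional[str] = None
    required: bool = False


def lookup_definition(entries, version, kind, local_name):
    """Return the definition for (version, kind, name), or None if absent."""
    for entry in entries:
        if (entry.local_name == local_name
                and entry.kind == kind
                and version in entry.versions):
            return entry.definition
    return None


# Hypothetical entries: the same local name used for two different kinds.
ENTRIES = [
    NameMetadata("title", "element", frozenset({"0.3", "1.0"}),
                 "Human-readable title of the feed.", required=True),
    NameMetadata("title", "attribute", frozenset({"1.0"}),
                 "Advisory title of a link."),
]

print(lookup_definition(ENTRIES, "1.0", "attribute", "title"))
print(lookup_definition(ENTRIES, "0.3", "attribute", "title"))
```

Because the kind is part of the key, the element and the attribute named title do not collide, and a query for a version in which the name is not valid simply returns nothing.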

A language itself would also have related language metadata. Obviously the human readable title would be included, as would the namespace URI and a preferred abbreviation for the namespace prefix for QNames, as well as a set of URIs that defined the application.

  • Version: URIs that point to the namespace documents of the versions (variants), or just strings that give the version name if no URI is available. If there is no version, it can be left out. An optional "CurrentVersion" can point to the most current version, while a "PreviousVersion" can point to or contain a previous version. A name may participate in more than one version.
  • Unique: Whether or not every QName converts to an expanded name that maps to a unique URI via concatenation.
  • Restricted: Whether or not the namespace allows users to expand it, i.e. to mint new names using its namespace URI.
  • Change Policy: A link to the URI of the change policy, or a text string describing it.
  • Owner: The body or bodies responsible for maintaining and updating the language. This could have a large number of optional metadata involved with it, such as name, organization, contact details, and so on.
  • Related Resource: A related resource to the language, such as a schema, a transform, normative or informative documentation, and so on. Each should be given a RDDL "nature" and an optional "purpose". A name that gives the resource a human-readable title and a URI with its location should also be provided, along with an option to tell whether it is normative or not.
  • NameMetadata: This points to the LDVL formulation for each name in the namespace.

LDVL's information can be expressed as a BNF. Every item is strictly optional, but items explicitly marked as optional are ones that we expect will actually be optional, though likely in common usage, while the others are recommended:

Figure 2: BNF Grammar for XMLVS LDVL
<Language> ::= <Title>
               {<NamespaceURI>}
               {<NamespacePrefix>}
               {<Application>}*
               [<CurrentVersion>]
               {<PreviousVersion>}*
               [<Version>]
               [<Unique>]
               [<Restricted>]
               [<Owner>]
               [<ChangePolicy>]
               {<RelatedResource>}*
               {<NameMetadata>}*
               {<Definition>}
<Title> ::= "xsd:string"
<NamespaceURI> ::= "xsd:anyURI" | none  
<NamespacePrefix> ::= "xsd:string"       
<CurrentVersion> ::=  <Version>
<PreviousVersion> ::= <Version>
<Version> ::= "xsd:decimal" |
              "xsd:anyURI" |
              "xsd:string"
<Unique> ::= "xsd:boolean" | "xsd:string"
<Restricted> ::= "xsd:boolean"
<Owner> ::= "xsd:string" | "xsd:anyURI"
<ChangePolicy> ::= "xsd:string" | "xsd:anyURI"
<RelatedResource> ::= <Nature>
                      {<Purpose>}
                      <Location>
                      {<Title>}
                      {<Normative>}
                      {<CurrentVersion>}
                      {<PreviousVersion>}
                      {<Version>}*
<Normative> ::= "xsd:boolean"
<Nature> ::= "xsd:anyURI"
<Purpose> ::= "xsd:anyURI" 
<Location> ::= "xsd:anyURI" 
<NameMetadata> ::= <Localname>
                   <Language>+
                   {<CurrentVersion>}
                   {<PreviousVersion>}
                   {<Version>}*
                   {<Kind>}
                   {<NamespaceURI>}
                   {<NamespacePrefix>}
                   {<Required>}
                   {<Title>}
                   {<ExpandedURI>}
                   {<Definition>}
<Localname> ::= "xsd:NCName"     
<Language> ::= "xsd:anyURI" | "xsd:string"
<Kind> ::= "xsd:anyURI" | "xsd:string"
<Required> ::= "xsd:boolean"
<ExpandedURI> ::= "xsd:anyURI" 
<Definition> ::= "xsd:string" | "xsd:anyURI"  

XMLVS: XML and RDF Language Examples

The ideas presented in LDVL can be serialized as an XML language, which we call XMLVS, the XML Versioning System language. W3C XML Schemas for this XML language are given at http://www.xmlvs.org/xmlvs/schema. A brief example of how we could handle language management for Atom is given in XML below, although this is only a subset of Atom chosen to show off some of the harder constructs that we can handle. First, note that we replicate in LDVL the functionality of RDDL 1.0, using constructs such as nature and purpose. We also provide versioning for related resources, which neither RDDL 1.0 nor, currently, RDDL 2.0 provides. To return to an earlier thorny point, if the unique attribute is set to true, then the namespace owner guarantees that a unique URI can be constructed from each expanded name, and can use the expandedUri element to give the exact URI for each name in the namespace.

There are two variants of the XMLVS language, the "strict" and the "lax." The division is simple: in the "strict" version, everything that one wants to make statements about (versions, kinds, even owners) must be given a URI, and the language must have a namespace (although the mapping from names to URIs does not have to be unique). This allows the "strict" version to be mapped to RDF and so be easily extensible. Also, although every language may be given a URI, and this URI is usually the same as the namespace URI, it does not have to be, so that we can use different URIs for the namespace and the versions of the language. This allows different versions of the same language to use the same namespace. For example, both XSLT 1.0 and XSLT 2.0 can be given URIs and have different names in their version, but both can also state that they use the same namespace URI.

The "lax" vocabulary takes up the options in the BNF that avoid URIs: a version may be given as only a string or a decimal, and the namespace URI may be left unspecified. The "lax" version exists because there are many popular XML vocabularies, such as some versions of RSS and OPML, that do not use namespaces at all. Yet they are going through version changes (RSS 0.93 to 2.0, and OPML 2.0 has been drafted), so it makes sense for any versioning system to be able to describe their versioning.

The heart of the language information is given in the children and attributes of the currentVersion element, which fixes the namespace URI, namespace prefix, date of change, and whether the version has a shortcut identifier as either a string ("HTML Transitional") or number ("1.0"). We allow previous versions to be referenced in the same manner. We also provide elements to keep track of rich information about the owner. For the sake of records, we also keep the dates of changes. Lastly, we provide a link to explicitly connect to whatever standard is being implemented through the application element, and also to any explicit policy for changing the namespace, as given by the changePolicy element. After the metadata describing the entire language is dealt with, each name is given a verbal name (title), as well as at least one kind (kind) and at least one version (version). An optional yet crucial definition gives human or machine-readable definitions of the intended use of the name. Here is a colloquial "lax" XMLVS document for part of Atom. Note that it is easy for humans to read, and is lax because it does not have to give versions or kinds URIs, but can denote them as strings or decimals.

Figure 3: Example Atom Namespace in lax colloquial XMLVS
<xvs:language xmlns:xvs="http://www.xmlvs.org/xmlvs#"
              xmlns:xvst="http://www.xmlvs.org/kinds#">
   <xvs:currentVersion xvs:versionId="1.0" xvs:restricted="true" 
                       xvs:uri="http://www.w3.org/2005/Atom#"
                       xvs:dateRelease="2005-08-17T12:15:09Z"
                       xvs:languageNamespace="http://www.w3.org/2005/Atom#">
      <xvs:languageName>Atom Syndication Format</xvs:languageName>        
      <xvs:namespacePrefix>atom</xvs:namespacePrefix>  
      <xvs:changePolicy>Still draft, not sure yet...</xvs:changePolicy>
      <xvs:owner xvs:uri="http://www.atomenabled.org"> Atom-Enabled Alliance</xvs:owner>
      <xvs:dateChange>2005-07-12T17:32:01Z</xvs:dateChange>
      <xvs:dateChange>2004-03-20T16:31:02Z</xvs:dateChange>
      <xvs:previousVersion xvs:versionId="0.3" xvs:uri="http://purl.org/atom/ns#" 
                           xvs:dateRelease="2003-12-02T09:30:10Z" 
                           xvs:languageNamespace="http://purl.org/atom/ns#" >
         <xvs:languageName>Atom Syndication Format (Draft)</xvs:languageName>
      </xvs:previousVersion>
   </xvs:currentVersion>
   <xvs:name xvs:title="id">
      <xvs:title>ID</xvs:title>
      <xvs:kind>http://www.xmlvs.org/kinds#element</xvs:kind>
      <xvs:version xvs:required="true">1.0</xvs:version>
      <xvs:version>0.3</xvs:version>  
      <xvs:definition> Identifies the feed using a universally unique and permanent 
      URI. If you have a long-term, renewable lease on your Internet domain name, then 
      you can feel free to use your website's address.</xvs:definition>
   </xvs:name>
   <xvs:name xvs:title="title">
      <xvs:kind>element</xvs:kind>
      <xvs:version xvs:required="true">1.0</xvs:version>
      <xvs:version>0.3</xvs:version>   
   </xvs:name>  
   <xvs:name xvs:title="author">
      <xvs:kind>element</xvs:kind>
      <xvs:version xvs:required="true">1.0</xvs:version>
      <xvs:version>0.3</xvs:version> 
   </xvs:name>
   ...
   <xvs:name xvs:title="info">
      <xvs:kind>element</xvs:kind>
      <xvs:version>0.3</xvs:version>
   </xvs:name>  
   <xvs:name xvs:title="updated">
      <xvs:kind>element</xvs:kind>
      <xvs:version>1.0</xvs:version>
      <xvs:previousVersion>modified</xvs:previousVersion>
   </xvs:name> 
   <xvs:name xvs:title="modified">
      <xvs:kind>element</xvs:kind>
      <xvs:version>0.3</xvs:version>
      <xvs:newVersion>updated</xvs:newVersion>
   </xvs:name> 
   <xvs:name xvs:title="href">
      <xvs:kind>attribute</xvs:kind>
      <xvs:version>0.3</xvs:version>
      <xvs:version>1.0</xvs:version>
   </xvs:name> 
   ...
   <xvs:relatedResource xvs:uri="http://www.atomenabled.org/feedvalidator/"     
                        xvs:resourceTitle = "Feed Validator"
                        xvs:nature = "http://www.atomenabled.org/feedvalidator/"
                        xvs:purpose= "http://www.rddl.org/purposes#validation">
      <xvs:version>1.0</xvs:version>
   </xvs:relatedResource>
   <xvs:relatedResource xvs:uri="http://www.osjava.org/atom4j/" 
                        xvs:resourceTitle = "Atom4J Atom Java API"
                        xvs:nature = "http://www.rddl.org/natures/software#java"
                        xvs:purpose= "http://www.rddl.org/purposes#JAR">
      <xvs:version>1.0</xvs:version>
   </xvs:relatedResource> 
   <xvs:relatedResource xvs:normative="true"
                        xvs:resourceTitle="IETF RFC 4287"
                        xvs:uri="http://www.ietf.org/rfc/rfc4287.txt"    
                        xvs:nature = "http://www.ietf.org/rfc/rfc2026.txt"
                        xvs:purpose= "http://www.rddl.org/purposes#normative-reference">
      <xvs:version>1.0</xvs:version>
   </xvs:relatedResource> 
</xvs:language>
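
To suggest how an application might consume such a "lax" document, the following Python sketch lists the names that are valid in a given version, together with whether each is required. It is a minimal illustration rather than a reference implementation: the function name names_in_version is our own, and the embedded document is a shortened copy of a few name entries from the figure above.

```python
import xml.etree.ElementTree as ET

XVS = "http://www.xmlvs.org/xmlvs#"

# Shortened copy of three name entries from the lax Atom example.
LAX_DOC = """\
<xvs:language xmlns:xvs="http://www.xmlvs.org/xmlvs#">
   <xvs:name xvs:title="id">
      <xvs:kind>element</xvs:kind>
      <xvs:version xvs:required="true">1.0</xvs:version>
      <xvs:version>0.3</xvs:version>
   </xvs:name>
   <xvs:name xvs:title="info">
      <xvs:kind>element</xvs:kind>
      <xvs:version>0.3</xvs:version>
   </xvs:name>
   <xvs:name xvs:title="updated">
      <xvs:kind>element</xvs:kind>
      <xvs:version>1.0</xvs:version>
   </xvs:name>
</xvs:language>
"""


def names_in_version(xml_text, version):
    """Return {(kind, name): required} for names valid in the given version."""
    root = ET.fromstring(xml_text)
    result = {}
    for name in root.findall(f"{{{XVS}}}name"):
        title = name.get(f"{{{XVS}}}title")
        kind = name.findtext(f"{{{XVS}}}kind", default="").strip()
        for ver in name.findall(f"{{{XVS}}}version"):
            if (ver.text or "").strip() == version:
                # The required flag is attached to a specific version.
                required = ver.get(f"{{{XVS}}}required") == "true"
                result[(kind, title)] = required
    return result


print(names_in_version(LAX_DOC, "1.0"))
print(names_in_version(LAX_DOC, "0.3"))
```

For version "1.0" this reports id (required) and updated; for version "0.3" it reports id and info, neither marked as required.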

The upgrade path from "lax" to "strict" colloquial XMLVS is easy: just add URIs! However, there is one crucial difference. We can posit the xvs:uri attribute of names to be fundamentally arbitrary, and then use the expandedUri element to contain the preferred URI constructed from the expanded name. However, most people who follow the pragmatic reading of namespaces would prefer that the URI used to talk about a name in a namespace be the one produced by the default concatenation of the namespace URI and the local name. If this is the case, they can use xvs:uri as the constructed URI (marking this with a unique attribute set to true). We follow this second convention in the example below (although we show via the atom:id name the use of expandedUri to implement the first convention). Also, since in the "strict" vocabulary every kind must have a URI, and in Atom only elements are given namespaces, each element name in the example is given the kind URI xvst:element. Ideally, one could define kinds by pointing to a URI that defines URIs for every type of information item in the Infoset. The example of this strict colloquial XMLVS language for a fragment of Atom is below:

Figure 4: Example Atom Namespace in strict colloquial XMLVS
<xvs:language xmlns:xvs="http://www.xmlvs.org/xmlvs#"
              xmlns:xvst="http://www.xmlvs.org/kinds#">
   <xvs:currentVersion xvs:versionId="1.0" xvs:restricted="true" 
                       xvs:uri="http://www.w3.org/2005/Atom#"
                       xvs:dateRelease="2005-08-17T12:15:09Z"
                       xvs:unique="true"
                       xvs:languageNamespace="http://www.w3.org/2005/Atom#" >
      <xvs:languageName>Atom Syndication Format</xvs:languageName>
      <xvs:application xvs:uri="http://www.ietf.org/rfc/rfc4287.txt" /> 
      <xvs:namespacePrefix>atom</xvs:namespacePrefix>
      <xvs:changePolicy xvs:uri="http://www.ietf.org/rfc/rfc4287.txt"/>
      <xvs:owner xvs:uri="http://www.atomenabled.org"> 
         <xvs:ownerName>Tim Bray</xvs:ownerName>
         <xvs:ownerRole>Atompub Co-Chair</xvs:ownerRole>
         <xvs:ownerEmail>tbray@textuality.com</xvs:ownerEmail>
         <xvs:ownerOrganization>Atom-Enabled Alliance</xvs:ownerOrganization> 
      </xvs:owner>
      <xvs:dateChange>2005-07-12T17:32:01Z</xvs:dateChange>
      <xvs:dateChange>2004-03-20T16:31:02Z</xvs:dateChange>    
      <xvs:previousVersion xvs:versionId="0.3" 
                           xvs:uri="http://purl.org/atom/ns#" 
                           xvs:dateRelease="2003-12-02T09:30:10Z" 
                           xvs:languageNamespace="http://purl.org/atom/ns#" >
         <xvs:languageName>Atom Syndication Format (Draft)</xvs:languageName>
         <xvs:application xvs:uri="http://www.mnot.net/drafts/draft-nottingham-atom-format-02.html" />
      </xvs:previousVersion>
   </xvs:currentVersion>
   <xvs:name xvs:uri="http://www.w3.org/2005/Atom#id" xvs:title="ID">
      <xvs:localname>id</xvs:localname>
      <xvs:namespace xvs:uri="http://www.w3.org/2005/Atom#" />
      <xvs:namespacePrefix>atom</xvs:namespacePrefix>
      <xvs:expandedUri xvs:uri="http://www.w3.org/2005/Atom#id" />
      <xvs:kind xvs:uri="http://www.xmlvs.org/kinds#element" />
      <xvs:version xvs:required="true" xvs:uri="http://www.w3.org/2005/Atom#"/>
      <xvs:version xvs:uri="http://purl.org/atom/ns#" />    
      <xvs:definition> Identifies the feed using a universally unique and permanent 
      URI. If you have a long-term, renewable lease on your Internet domain name, 
      then you can feel free to use your website's address.</xvs:definition>
   </xvs:name>
   <xvs:name xvs:uri="http://www.w3.org/2005/Atom#title">
      <xvs:localname>title</xvs:localname>
      <xvs:kind xvs:uri="http://www.xmlvs.org/kinds#element"/>
      <xvs:version xvs:required="true" xvs:uri="http://www.w3.org/2005/Atom#"/>
      <xvs:version xvs:uri="http://purl.org/atom/ns#"/>   
   </xvs:name>  
   <xvs:name xvs:uri="http://www.w3.org/2005/Atom#author">
      <xvs:localname>author</xvs:localname>
      <xvs:kind xvs:uri="http://www.xmlvs.org/kinds#element"/>
      <xvs:version xvs:required="true" xvs:uri="http://www.w3.org/2005/Atom#"/>
      <xvs:version xvs:uri="http://purl.org/atom/ns#"/> 
   </xvs:name>
   ...
   <xvs:name xvs:uri="http://www.w3.org/2005/Atom#info">
      <xvs:localname>info</xvs:localname>
      <xvs:kind xvs:uri="http://www.xmlvs.org/kinds#element"/>
      <xvs:version xvs:uri="http://purl.org/atom/ns#"/>
   </xvs:name>  
   <xvs:name xvs:uri="http://www.w3.org/2005/Atom#updated">
      <xvs:localname>updated</xvs:localname>
      <xvs:kind xvs:uri="http://www.xmlvs.org/kinds#element" />
      <xvs:version xvs:uri="http://www.w3.org/2005/Atom#"  />
      <xvs:previousVersion xvs:uri="http://purl.org/atom/ns#">modified</xvs:previousVersion>
   </xvs:name> 
   <xvs:name xvs:uri="http://purl.org/atom/ns#modified">
      <xvs:localname>modified</xvs:localname>
      <xvs:kind xvs:uri="http://www.xmlvs.org/kinds#element"/>
      <xvs:version xvs:uri="http://purl.org/atom/ns#"/>
      <xvs:newVersion xvs:uri="http://www.w3.org/2005/Atom#">updated</xvs:newVersion>
   </xvs:name> 
   <xvs:name xvs:uri="http://purl.org/atom/ns#href">
      <xvs:localname>href</xvs:localname>
      <xvs:namespace>none</xvs:namespace>
      <xvs:kind xvs:uri="http://www.xmlvs.org/kinds#attribute" />
      <xvs:version xvs:uri="http://purl.org/atom/ns#" />
      <xvs:version xvs:uri="http://www.w3.org/2005/Atom#"  />
   </xvs:name> 
   ...
   <xvs:name xvs:uri="http://www.w3.org/2005/Atom#version">
      <xvs:kind xvs:uri="http://www.xmlvs.org/kinds#attribute" />
      <xvs:version xvs:uri="http://purl.org/atom/ns#" />
      <xvs:version xvs:uri="http://www.w3.org/2005/Atom#"  />
   </xvs:name> 
   <xvs:relatedResource xvs:uri="http://www.atomenabled.org/feedvalidator/"     
                        xvs:resourceTitle="Feed Validator"
                        xvs:nature="http://www.atomenabled.org/feedvalidator/"
                        xvs:purpose="http://www.rddl.org/purposes#validation">
      <xvs:version xvs:uri="http://www.w3.org/2005/Atom#"/>
      <xvs:definition>This Web Service validates Atom Feeds</xvs:definition>
   </xvs:relatedResource>
   <xvs:relatedResource xvs:uri="http://www.osjava.org/atom4j/" 
                        xvs:resourceTitle = "Atom4J Atom Java API"
                        xvs:nature="http://www.rddl.org/natures/software#java"
                        xvs:purpose="http://www.rddl.org/purposes#JAR">
      <xvs:version xvs:uri="http://www.w3.org/2005/Atom#"/>
   </xvs:relatedResource> 
   <xvs:relatedResource xvs:normative="true"
                        xvs:resourceTitle="IETF RFC 4287"
                        xvs:uri="http://www.ietf.org/rfc/rfc4287.txt"    
                        xvs:nature="http://www.ietf.org/rfc/rfc2026.txt"
                        xvs:purpose="http://www.rddl.org/purposes#normative-reference">
      <xvs:version xvs:uri="http://www.w3.org/2005/Atom#" />
      <xvs:definition>The IETF RFC 4287 document is the normative definition of 
      Atom.</xvs:definition>
   </xvs:relatedResource> 
</xvs:language>

There are a few changes in this strict version. An explicit application pointing to the Atom standard is given. The title attribute no longer contains each name, as the localname element now denotes the name. A more human-readable name can still be given via the title attribute, as shown by the id name in the XMLVS example. The possibility of giving a name a unique URI (as one would get by constructing one from the namespace URI and the local name) is given by the expandedUri element. As shown by id in the example, the local name and namespace URI are explicitly separated (as given by the localname and namespace elements). Lastly, notice how the naming conflict between elements and attributes can be easily resolved in XMLVS. First, attributes and elements with the same name are distinguished by being given unique URIs via the uri attribute, and this URI does not have to follow a simple mapping rule that combines their local name and namespace URI. Secondly, as shown by the example href attribute in Atom, attributes are given a different kind than elements. Finally, as also shown by href, if an attribute does not have a namespace but is used by a language, the namespace element of that name can be set to none. Names from other namespaces can be imported by including their language and names in the XMLVS file, although this is not shown in this relatively straightforward example.

XMLVS easily handles the upgrading of Atom from 0.3 to 1.0, which includes several substantial modifications, because it can describe multiple versions of a language within a single namespace document. Names are changed: the name modified is renamed to updated, and the info name is deprecated as of version 1.0. Our XMLVS file keeps track of multiple versions for a name in a single file by associating the version with the name and letting names have multiple versions, as given by their version elements. Through the newVersion and previousVersion elements we can also keep track of when a name changes. However, if the expandedUri, namespacePrefix, and namespace elements refer only to the current version of the language, then it makes sense to separate out names from different versions under differing name elements, linking the URIs of the names with a previousVersion element.
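
The bookkeeping described above can be put to work when upgrading documents. The Python sketch below follows the newVersion links to find the counterpart of a name in a target version. The dictionary NAMES is a hypothetical in-memory form of four name entries from the Atom example, and the function migrate_name is our own illustration, not part of XMLVS.

```python
# Hypothetical in-memory form of name entries from the Atom example:
# local name -> versions it is valid in, plus an optional successor name
# (mirroring the newVersion element).
NAMES = {
    "id":       {"versions": {"0.3", "1.0"}, "new_version": None},
    "info":     {"versions": {"0.3"},        "new_version": None},
    "modified": {"versions": {"0.3"},        "new_version": "updated"},
    "updated":  {"versions": {"1.0"},        "new_version": None},
}


def migrate_name(name, target_version, names=NAMES):
    """Map a name to its equivalent in target_version.

    Returns the (possibly renamed) name, or None if the name has no
    counterpart in the target version.
    """
    seen = set()  # guards against cycles in the newVersion links
    while name is not None and name not in seen:
        seen.add(name)
        entry = names.get(name)
        if entry is None:
            return None
        if target_version in entry["versions"]:
            return name
        name = entry["new_version"]
    return None


print(migrate_name("modified", "1.0"))
print(migrate_name("info", "1.0"))
```

Here modified is mapped to updated, while info has no counterpart in version 1.0 and yields None, so an upgrading tool would know to drop or flag it.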

As mentioned earlier, the problem of constructing URIs from expanded names can be solved by explicitly adding a constructed URI to each name in XMLVS via the expandedUri element. If the unique attribute of the language is set to true, and both the expandedUri and namespace elements are missing from a name, then the URI for the expanded name is given by the name's uri attribute. This applies to every name in the Atom example except href (since its namespace is set to none), which makes the expandedUri element redundant for this particular example. This example also shows how related resources, everything from an online feed validator to the upcoming normative IETF Atom RFC, can be linked to from the language and even associated with particular versions of the language in XMLVS.
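
One possible reading of this precedence is sketched below in Python. The function resolve_name_uri and the dictionary keys it expects are hypothetical stand-ins for the uri attribute and the expandedUri and namespace elements; we also assume that an explicit expandedUri takes priority whenever it is present.

```python
from typing import Optional


def resolve_name_uri(name: dict, language_unique: bool) -> Optional[str]:
    """Pick the URI for a name following the precedence described above.

    `name` is a hypothetical dict with optional keys "uri", "expanded_uri",
    and "namespace", standing in for the uri attribute and the expandedUri
    and namespace elements of an XMLVS name.
    """
    # 1. An explicit expandedUri wins (our assumption).
    if name.get("expanded_uri"):
        return name["expanded_uri"]
    # 2. With unique="true" and no namespace element, fall back to uri.
    if language_unique and "namespace" not in name:
        return name.get("uri")
    # 3. Otherwise (for example namespace "none") nothing is guaranteed.
    return None


title = {"uri": "http://www.w3.org/2005/Atom#title"}
href = {"uri": "http://purl.org/atom/ns#href", "namespace": "none"}

print(resolve_name_uri(title, language_unique=True))
print(resolve_name_uri(href, language_unique=True))
```

With the entries from the strict example, title resolves to its uri attribute, while href, whose namespace is set to none, resolves to nothing.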

The mapping from the "strict" XMLVS language to RDF is fairly straightforward. An RDF Schema is given at http://www.xmlvs.org/xmlvsrdf. Our previous example is translated into RDF/XML below:

Figure 5: Example Atom Namespace Document in RDF
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" 
         xmlns:xvs="http://www.xmlvs.org/xmlvs#" 
         xmlns:xvsr="http://www.xmlvs.org/xmlvsrdf#" 
         xmlns:xvst="http://www.xmlvs.org/kinds#" 
         xmlns:rddl="http://www.rddl.org/" 
         xmlns:xlink="http://www.w3.org/1999/xlink#">
   <xvsr:Language rdf:about="http://www.w3.org/2005/Atom#">
      <xvsr:languageName>Atom Syndication Format</xvsr:languageName>
      <xvsr:languageNamespace rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:changePolicy rdf:resource="http://www.ietf.org/rfc/rfc4287.txt"/>
      <xvsr:owner>
         <rdf:Description rdf:about="http://www.atomenabled.org">
            <xvsr:ownerName>Tim Bray</xvsr:ownerName>
            <xvsr:ownerRole>Atompub Co-Chair</xvsr:ownerRole>
            <xvsr:ownerEmail>tbray@textuality.com</xvsr:ownerEmail>
            <xvsr:ownerOrganization>Atom-Enabled Alliance</xvsr:ownerOrganization>
         </rdf:Description>
      </xvsr:owner>
      <xvsr:previousVersion>
         <xvsr:Language rdf:about="http://purl.org/atom/ns#">
            <xvsr:languageName>Atom Syndication Format (Draft)</xvsr:languageName>
            <xvsr:languageNamespace rdf:resource="http://purl.org/atom/ns#"/>
            <xvsr:dateRelease>2003-12-02T09:30:10Z</xvsr:dateRelease>
            <xvsr:versionId>0.3</xvsr:versionId>
            <xvsr:application rdf:resource="http://www.mnot.net/drafts/draft-nottingham-atom-format-02.html"/>
         </xvsr:Language>
      </xvsr:previousVersion>
      <xvsr:dateRelease>2005-08-17T12:15:09Z</xvsr:dateRelease>
      <xvsr:dateChange>2005-07-12T17:32:01Z</xvsr:dateChange>
      <xvsr:dateChange>2004-03-20T16:31:02Z</xvsr:dateChange>
      <xvsr:versionId>1.0</xvsr:versionId>
      <xvsr:application rdf:resource="http://www.ietf.org/rfc/rfc4287.txt"/>
      <xvsr:restricted>true</xvsr:restricted>
      <xvsr:unique>true</xvsr:unique>
   </xvsr:Language>
   <xvsr:Name rdf:about="http://www.w3.org/2005/Atom#id">
      <xvsr:title/>
      <xvsr:version rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:version rdf:resource="http://purl.org/atom/ns#"/>
      <xvsr:localname>id</xvsr:localname>
      <xvsr:namespace rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:namespacePrefix>atom</xvsr:namespacePrefix>
      <xvsr:expandedUri rdf:resource="http://www.w3.org/2005/Atom#id"/>
      <xvsr:kind rdf:resource="http://www.xmlvs.org/kinds#element"/>
      <xvsr:definition> Identifies the feed using a universally unique and 
      permanent URI. If you have a long-term, renewable lease on your Internet 
      domain name, then you can feel free to use your website's address.</xvsr:definition>
   </xvsr:Name>
   <xvsr:Name rdf:about="http://www.w3.org/2005/Atom#title">
      <xvsr:version rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:version rdf:resource="http://purl.org/atom/ns#"/>
      <xvsr:localname>title</xvsr:localname>
      <xvsr:kind rdf:resource="http://www.xmlvs.org/kinds#element"/>
   </xvsr:Name>
   <xvsr:Name rdf:about="http://www.w3.org/2005/Atom#author">
      <xvsr:version rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:version rdf:resource="http://purl.org/atom/ns#"/>
      <xvsr:localname>author</xvsr:localname>
      <xvsr:kind rdf:resource="http://www.xmlvs.org/kinds#element"/>
   </xvsr:Name>
   ...
   <xvsr:Name rdf:about="http://www.w3.org/2005/Atom#info">
      <xvsr:version rdf:resource="http://purl.org/atom/ns#"/>
      <xvsr:localname>info</xvsr:localname>
      <xvsr:kind rdf:resource="http://www.xmlvs.org/kinds#element"/>
   </xvsr:Name>
   <xvsr:Name rdf:about="http://www.w3.org/2005/Atom#updated">
      <xvsr:version rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:localname>updated</xvsr:localname>
      <xvsr:kind rdf:resource="http://www.xmlvs.org/kinds#element"/>
   </xvsr:Name>
   <xvsr:Name rdf:about="http://purl.org/atom/ns#modified">
      <xvsr:version rdf:resource="http://purl.org/atom/ns#"/>
      <xvsr:localname>modified</xvsr:localname>
      <xvsr:kind rdf:resource="http://www.xmlvs.org/kinds#element"/>
   </xvsr:Name>
   <xvsr:Name rdf:about="http://purl.org/atom/ns#href">
      <xvsr:version rdf:resource="http://purl.org/atom/ns#"/>
      <xvsr:version rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:localname>href</xvsr:localname>
      <xvsr:kind rdf:resource="http://www.xmlvs.org/kinds#attribute"/>
   </xvsr:Name>
   ...
   <rdf:Description rdf:about="http://www.atomenabled.org/feedvalidator/">
      <xlink:title>Feed Validator</xlink:title>
      <rddl:nature rdf:resource="http://www.atomenabled.org/feedvalidator/"/>
      <rddl:purpose rdf:resource="http://www.rddl.org/purposes#validation"/>
      <xvsr:version rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:definition>This Web Service validates Atom Feeds</xvsr:definition>
   </rdf:Description>
   <rdf:Description rdf:about="http://www.osjava.org/atom4j/">
      <xlink:title>Atom4J Atom Java API</xlink:title>
      <rddl:nature rdf:resource="http://www.rddl.org/natures/software#java"/>
      <rddl:purpose rdf:resource="http://www.rddl.org/purposes#JAR"/>
      <xvsr:version rdf:resource="http://www.w3.org/2005/Atom#"/>
   </rdf:Description>
   <rdf:Description rdf:about="http://www.ietf.org/rfc/rfc4287.txt">
      <xlink:title>IETF RFC 4287</xlink:title>
      <rddl:nature rdf:resource="http://www.ietf.org/rfc/rfc2026.txt"/>
      <rddl:purpose rdf:resource="http://www.rddl.org/purposes#normative-reference"/>
      <xvsr:version rdf:resource="http://www.w3.org/2005/Atom#"/>
      <xvsr:definition>The IETF RFC 4287 document is the normative definition of Atom.</xvsr:definition>
      <xvsr:normative>true</xvsr:normative>
   </rdf:Description>
</rdf:RDF>

The RDF version of the XMLVS namespace document is extremely similar by design to the "strict" colloquial XML language provided, with a few subtle changes. The primary difference is that, since RDF is URI-based, it absolutely needs URIs to talk about names in the language and different versions of the language. An alternative would have been to use numbers and strings as our primary way to denote versions instead of URIs. However, if two XMLVS RDF graphs that did not use URIs for versions were merged, statements about version 1.0 of one language would be conflated with statements about version 1.0 of a different language, which is almost always incorrect. There are numerous advantages conferred by the use of RDF. Among them, RDF allows Semantic Web-enabled processors to use a namespace document to automatically locate related resources like XML Schemas, and so paves the way for automated schema location, validation, and language transformation. Even on the level of just names, it allows machines to achieve partial "understanding" of documents by determining whether a document is using valid language names and by tracing the evolution of names. The RDF Schema of XMLVS also maps XMLVS to other commonly understood vocabularies such as Dublin Core, and so it maps xvsr:dateChange as an rdfs:subPropertyOf dc:modified and xvsr:version as an rdfs:subPropertyOf dc:hasVersion.
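
The merging problem can be made concrete with a small Python sketch. It models RDF graphs as plain sets of (subject, predicate, object) tuples rather than using an RDF library, and the second language (other:para, http://example.org/other/1.0#) is entirely hypothetical.

```python
def names_in(graph, version):
    """Subjects that have an xvsr:version triple with the given object."""
    return {s for (s, p, o) in graph if p == "xvsr:version" and o == version}


# Two languages whose versions are denoted by plain literals.
atom_literal = {("atom:updated", "xvsr:version", "1.0")}
other_literal = {("other:para", "xvsr:version", "1.0")}

# The same statements with a distinct URI per language version.
atom_uri = {("atom:updated", "xvsr:version", "http://www.w3.org/2005/Atom#")}
other_uri = {("other:para", "xvsr:version", "http://example.org/other/1.0#")}

# Merging RDF graphs amounts to a set union of their triples.
merged_literal = atom_literal | other_literal
merged_uri = atom_uri | other_uri

print(sorted(names_in(merged_literal, "1.0")))
print(sorted(names_in(merged_uri, "http://www.w3.org/2005/Atom#")))
```

In the merged literal graph, a query for version "1.0" returns names from both languages, whereas in the URI-based graph a query for the Atom version returns only the Atom name.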

In a nutshell, the "lax" colloquial XMLVS language handles versioning for existing XML languages, including those without namespaces, while the RDF version requires namespaces, a unique URI for each name, and preferably different URIs for different versions. We imagine that users can map existing languages to the "lax" XML format, later upgrade to the "strict" format, and then easily map that to the RDF version. This allows namespaces to be brought gradually into line with W3C best practices.

XMLVS goes beyond what the minimal standards require and is currently unstandardized, but it offers advantages to early adopters. First, the W3C's "Architecture of the World Wide Web" states that "An XML format specification SHOULD include information about change policies for XML namespaces" [WebArch], and where better to put information about change policy than in a namespace document? The W3C Technical Architecture Group further states that "The owner of an XML namespace name SHOULD make available material intended for people to read and material optimized for software agents in order to meet the needs of those who will use the namespace language," and XMLVS provides both a machine-readable and a human-readable namespace document [WebArch]. So while the minimalist reading of namespaces is standards-compliant, a pragmatic interpretation of namespaces is encouraged by the W3C.

Moreover, using namespace documents consistently brings a host of advantages. When an application encounters a language with a known mapping to another language, and it needs the information in the latter language, a RDDL link to an XSLT stylesheet can be used to translate automatically. If an application that uses an older version of a language is given a document in a newer version, it can use the namespace document to discover mappings from the newer language to the older one, allowing the document to be processed gracefully (a use scenario that has received considerable attention from the Web Services community [Thompson, 2004]). One could use XMLVS in RDF to retrieve schemas and automatically validate an instance document. It could even allow legacy XML languages to be upgraded automatically to their newest version, and support checks of "namespace validity" to make sure one is not minting new names in a namespace when that practice is discouraged for the language in question. It would also prevent controversies, since it allows a namespace owner to declare whether or not the set of names in a namespace is restricted; as the TAG finding on the subject states as good practice, "specifications that define namespaces SHOULD explicitly state their policy with respect to changes in the names defined in that namespace" [Disposition of Names]. Furthermore, the exact additions and deprecations of "names from a namespace" can be recorded in detail.
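A "namespace validity" check of the kind mentioned above might look roughly like the following sketch. It is not part of the XMLVS package: the policy dictionary is a hypothetical summary of what a processor could extract from a namespace document, the list of Atom names is deliberately partial, and treating the Atom namespace as closed is an assumption made only for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

# Hypothetical summary of a namespace document: the names it declares and
# whether the owner has stated that no further names may be added ("closed").
NAMESPACE_POLICY = {
    ATOM_NS: {"names": {"feed", "title", "entry", "id", "updated"}, "closed": True},
}


def undeclared_names(xml_text, policy=NAMESPACE_POLICY):
    """List (namespace, local name) pairs for elements that use a closed
    namespace without being declared in it."""
    problems = []
    for elem in ET.fromstring(xml_text).iter():
        if not elem.tag.startswith("{"):
            continue  # element in no namespace: nothing to check
        namespace, local = elem.tag[1:].split("}", 1)
        rule = policy.get(namespace)
        if rule and rule["closed"] and local not in rule["names"]:
            problems.append((namespace, local))
    return problems


doc = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <mood>cheerful</mood>
</feed>"""

result = undeclared_names(doc)
```

Here `result` reports the invented `mood` element, since it uses the namespace but is not among the declared names. A namespace whose policy is not closed would simply produce no reports.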

XMLVS: Practical Namespace Document Management

In order to make the management of XML languages as easy as possible, we have created a number of modular programs that allow one to manage XMLVS files. The entire package can be found at http://www.xmlvs.org/download, although it is still experimental.

We first allow people to author XMLVS documents in colloquial XML, or even in RDF, by hand or through the automated interface of their choice. Given any XMLVS colloquial XML file, we provide an XMLVS-to-RDDL 1.0 XSLT transformation. Given a "strict" colloquial XML XMLVS file, we provide an XSLT transformation that turns it into a valid XMLVS RDF file as defined by the XMLVS RDF Schema. This allows both human-readable and machine-readable namespace documents to be created using best-practice standards.

Finally, we are working on using a version control system for XMLVS documents themselves, based on Simon Yuill's Social Versioning System [SVS]. This is provided for two reasons. First, because the Social Versioning System allows arbitrary (Python) code to be executed whenever a new version is checked in, a user can "check in" a new XMLVS document and have SVS automatically invoke XSLT to create the RDF and RDDL versions of the document. It can also upload those files to the server where the namespace document(s) are hosted, provided the user supplies SVS with the correct parameters, such as username and password. This simple functionality is provided, and users competent in Python can add other arbitrary functionality to SVS by passing it Python objects. Second, version control allows all changes to a namespace to be tracked by date and version number, even if they have been removed from the most current XMLVS file for a language. So if you decide to truncate all older versions in your colloquial XMLVS XML document, those older versions are still maintained in the SVS archives of the XMLVS file. SVS also allows us to check in files other than colloquial XML XMLVS or RDF files, and can include (even as binaries) any related resources, such as JAR files and XSLT transformations, that are referenced by the RDDL. These too can be uploaded to a server automatically. In summary, SVS provides a CVS-like capability to check in, check out, and branch XML languages as embodied by both XMLVS files and whatever related resources exist, and it also allows customizable execution of code, such as XSLT transforms to RDDL and RDF, whenever a major change is checked in.
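The check-in workflow described above can be sketched as follows. This is a toy stand-in and not the SVS API, which is not reproduced here: the `VersionedStore` class, the hook signature, the output file names, and the `<xmlvs>` placeholder documents are all hypothetical, and the two hooks only mark where the XSLT transformations to RDDL and RDF would run.

```python
class VersionedStore:
    """Toy version store with check-in hooks (illustrative only, NOT the SVS API)."""

    def __init__(self):
        self.history = []   # every checked-in revision, oldest first
        self.hooks = []     # callables run on each check-in

    def add_hook(self, hook):
        self.hooks.append(hook)

    def check_in(self, text):
        """Archive `text` as a new revision and run every hook on it.

        Each hook receives (revision number, document text) and returns
        (output name, output content), e.g. a derived RDDL or RDF file.
        """
        revision = len(self.history) + 1
        self.history.append(text)
        return dict(hook(revision, text) for hook in self.hooks)


def to_rddl(revision, text):
    # Placeholder for the XMLVS-to-RDDL XSLT step; no real transformation happens.
    return (f"namespace-r{revision}.html", f"<!-- RDDL derived from {len(text)} characters -->")


def to_rdf(revision, text):
    # Placeholder for the XMLVS-to-RDF XSLT step; no real transformation happens.
    return (f"namespace-r{revision}.rdf", f"<!-- RDF derived from {len(text)} characters -->")


store = VersionedStore()
store.add_hook(to_rddl)
store.add_hook(to_rdf)

first = store.check_in("<xmlvs>old and new versions</xmlvs>")
second = store.check_in("<xmlvs>new version only</xmlvs>")  # older versions truncated
```

Even though the second document drops the older versions, `store.history[0]` still holds them, which mirrors the archival role the text assigns to version control. An upload step could be modeled as one more hook.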

With XMLVS, the namespace itself would be maintained as a collection of XHTML and RDF files, knitted together and kept under version control by a system like SVS. Although there may be other ways to maintain and version XML namespaces, this method allows users to scrap their boilerplate coding and get to the hard work of creating XML applications without having to worry about maintaining their namespaces any more than good practice requires. Furthermore, easy-to-use vocabularies and tools that automate the creation of namespace documents make it much more likely that namespace documents will actually become widespread and useful.

Conclusion

There are a few "good practice" lessons that people managing XML languages might want to take home. Although these are common sense among many in the markup community, the use of namespace documents on the Web still varies widely.

  • Use namespaces! Yet do not confuse QNames with URIs!
  • Always explicitly give attributes namespaces; default namespaces do not cover attributes!
  • Use the hash convention when defining a namespace with xmlns unless the specification states otherwise.
  • If possible, do not give an attribute the same name as an element if they share the same namespace.
  • Use some versioning control system for your XML languages and namespace documents, such as XMLVS!
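The second point can be checked with any namespace-aware parser. The following minimal sketch uses Python's standard `xml.etree.ElementTree`; the namespace URI and the element and attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET

# The default namespace and the "ex" prefix are bound to the same URI.
doc = """<entry xmlns="http://example.org/ns#"
                xmlns:ex="http://example.org/ns#"
                kind="plain" ex:kind="qualified"/>"""

root = ET.fromstring(doc)

# ElementTree writes expanded names as "{namespace}local".
element_name = root.tag

# The unprefixed attribute stays in no namespace, even though a default
# namespace is in scope; only the prefixed attribute is qualified.
attribute_names = sorted(root.attrib)
```

The element picks up the default namespace, but the unprefixed `kind` attribute does not, so the document legally carries two distinct attributes whose local name is `kind`.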

There are also a few questions for various standards:

  • Should an expanded name map to a valid URI construction?
  • If so, how can this work, given the problems pointed out earlier with namespace collisions between elements and attributes and the varying ways this is already implemented in diverse standards?
  • Should we let attributes automatically inherit the namespace of their enclosing elements if they are not explicitly given a namespace?
  • Why not allow the default namespace to include not just elements, but attributes? What standards would this break?
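The collision behind the first two questions can be made concrete with a small sketch. It assumes the simplest possible mapping, namely concatenating a namespace URI that follows the hash convention with the local name; the namespace URI and the name `title` are hypothetical, and the tuple representation at the end is only an illustration, not a proposal drawn from any standard.

```python
def expanded_name_to_uri(namespace, local_name):
    """Map an expanded name to a URI by plain concatenation.

    Assumes the hash convention, i.e. the namespace URI already ends in '#'.
    """
    return namespace + local_name


NS = "http://example.org/lang#"   # hypothetical namespace using the hash convention

element_uri = expanded_name_to_uri(NS, "title")     # a <title> element
attribute_uri = expanded_name_to_uri(NS, "title")   # a namespaced title="" attribute

# Both names map to the same URI, so statements about one apply to the other.
collides = element_uri == attribute_uri

# Keeping the kind of name alongside the expanded name keeps the two apart,
# but the result is a tuple and no longer a single URI.
element_key = ("element", NS, "title")
attribute_key = ("attribute", NS, "title")
distinct = element_key != attribute_key
```

The sketch shows why the question is not trivial: a pure expanded-name-to-URI mapping loses the element/attribute distinction unless that distinction is encoded somewhere.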

The list of good practices and questions is doubtless ambitious, controversial, and in places conflicting. For example, if either the behavior of namespace defaulting were changed or attributes inherited the namespace of their element, one would not have to explicitly use namespaces with attributes. Regardless of the particulars of the LDVL proposal and the XMLVS application, there are advantages to consistently using namespace documents and namespace URIs that the Web markup community should investigate. As it stands, only namespace documents can save namespaces and solve the versioning problem.


Bibliography

[Berners-Lee, 1998] Berners-Lee, Tim. Web Architecture from 50,000 feet. http://www.w3.org/DesignIssues/Architecture.html

[Bray, 2004] Bray, Tim. RDDL2 Background. http://lists.w3.org/Archives/Public/www-tag/2004Jan/0045.html

[CURIE Syntax] Birbeck, Mark. CURIE Syntax 1.0. http://www.w3.org/2001/sw/BestPractices/HTML/2005-10-27-CURIE

[Disposition of Names] Walsh, Norm. The Disposition of Names in an XML Namespace. TAG Finding 9 January 2006. http://www.w3.org/2001/tag/doc/namespaceState-2006-01-09.html.

[Infoset] World Wide Web Consortium (W3C). XML Information Set (Second Edition). 2004. Editors J. Cowan and R. Tobin. http://www.w3.org/TR/xml-infoset/

[Namespace Theses] Bray, Tim. Architectural Theses on Namespaces and Namespace Documents. http://www.textuality.com/tag/Issue8.html

[Namespaces] World Wide Web Consortium (W3C). Namespaces in XML 1.1. 1999. Editors T. Bray, D. Hollander, A. Layman, and R. Tobin. http://www.w3.org/TR/REC-xml-names/.

[OWL Guide] World Wide Web Consortium (W3C). OWL Web Ontology Language Guide. 2004. Editors M. Smith, C. Welty, and D. McGuinness. http://www.w3.org/TR/owl-guide/

[RDDL] Borden, J. and Bray, T. Resource Directory Description Language (RDDL). http://www.rddl.org/

[RDDL2] Borden, J. and Bray, T. Resource Directory Description Language (RDDL) Version 2.0. http://www.rddl.org/RDDL2

[RDFXML] World Wide Web Consortium (W3C). RDF/XML Syntax Specification (Revised). 2004. Editor D. Beckett. http://www.w3.org/TR/rdf-syntax-grammar/

[SVS] Yuill, Simon. The Social Versioning System. savannah.nongnu.org/projects/socversys/

[Thompson, 2004] Thompson, Henry. Versioning Made Easy with W3C XML Schema and Pipelines. XML Europe 2004, Amsterdam, The Netherlands. http://www.idealliance.org/papers/dx_xmle04/papers/03-04-04/03-04-04.html

[Thompson, 2005] Thompson, Henry. Names, Namespaces, XML Languages and XML Definition Languages. XML 2005, Atlanta, USA. http://www.idealliance.org/proceedings/xml05/abstracts/paper82.HTML

[Thompson, www-tag] Thompson, Henry. What is a namespace, anyways? http://lists.w3.org/Archives/Public/www-tag/2005Dec/0120

[Van der Vlist, 2001] van der Vlist, Eric. Best practices: Namespaces, versions and RDDL. http://lists.xml.org/archives/xml-dev/200103/msg00995.html

[WebArch] World Wide Web Consortium (W3C). Architecture of the World Wide Web, Volume One. 2004. Editors I. Jacobs and N. Walsh. http://www.w3.org/TR/webarch/

[XFN] Celik, T., Meyer, E. and Mullenweg, M. XHTML Friends Network (XFN). http://gmpg.org/xfn.

[XHTML] World Wide Web Consortium (W3C). XHTML 1.0 The Extensible HyperText Markup Language (Second Edition). 2002. http://www.w3.org/TR/xhtml1/

[xml:id Version 1.0] World Wide Web Consortium (W3C). xml:id Version 1.0. 2005. Editors J. Marsh, D. Veillard, and N. Walsh. http://www.w3.org/TR/xml-id/.

[XNCF] Borden, Jonathan. XML Namespace Catalog Format (XNCF). http://www.openhealth.org/XMLCatalog/

[XNRL] Bray, Tim. XML Namespace Related-resource Language (XNRL). http://www.textuality.com/xml/xnrl.html

[XQueryX] World Wide Web Consortium (W3C). XML Syntax for XQuery 1.0 (XQueryX). 2005. Editors J. Melton and S. Muralidhar. http://www.w3.org/TR/xqueryx/

[XSLT 1.0] World Wide Web Consortium (W3C). XSL Transformations (XSLT) 1.0. 1999. Editor J. Clark. http://www.w3.org/TR/xslt

[XSLT 2.0] World Wide Web Consortium (W3C). XSL Transformations (XSLT) 2.0. 2005. Editor M. Kay. http://www.w3.org/TR/xslt20/


