Semantics of Well Formed XML as a Human and Machine Readable Language: Why is some XML so difficult to read?

Ann Wrightson


Why is some XML easy to read, and some so difficult? This is an interesting theoretical and practical question that could be addressed in many ways; for me it is part of a long standing interest in how machines and humans communicate effectively with each other.

In this paper, XML fragments are considered as utterances uttered to people, by either machines or people. The meaning conveyed by the XML is investigated using a simple toolkit based on situation semantics, with XML communication considered as part of a broad class of communications involving humans and non-human agents such as animals and machines.

This paper also contains an introduction to basic notions of situation semantics, discussion of how it applies to communications between humans and non-human agents, and an explanation of the background and context of the approach taken in this paper.

Keywords: Semantics; Natural Language Processing

Ann Wrightson [CSW Group Ltd]

Extreme Markup Languages 2005® (Montréal, Québec)

Copyright © 2005 Ann Wrightson. Reproduced with permission.

Background and Context

The very diverse review comments that I received on my submission draft brought home to me the extent to which the underlying theory of this paper forms part of an extensive body of knowledge and speculation that is impossible to summarize adequately in an introduction, and will not be familiar to many "Extreme" folks. This section can be skipped safely by anyone who does not want to know; for the more curious and pedantic, I hope it serves to indicate where this work sits within the (very) broad church of philosophy of language and natural language semantics, and also to provide entry points at various levels for readers who want to explore these matters further.

Regarding philosophy, I was educated in Cambridge (UK) in the 1970s, where my main influences were Timothy Smiley in logic, and Casimir Lewy in philosophy of language. Lewy's arguments from translation are a direct ancestor of the arguments I use regarding the quasi-natural-language semantics of XML in this paper. It was probably from Lewy that I also learned it was better to ignore the point-scoring analytical/critical games that often dominate philosophical debate, in favor of just getting on with solving interesting problems, in as simple terms as a problem will admit.

I was introduced to situation theory by Keith Devlin's 1991 book "Logic and Information" [Devlin91]; it provides a step by step and very accessible account of how situation semantics accounts for the communicative capability of natural language utterances. Note that I do not say "the meaning of natural language". One of the fundamental (and to me, intuitively correct) features of situation semantics is that instead of looking for mappings from utterances to something that either "is" or models the meaning of that utterance, the focus is more on solving the mystery of how it is that natural language utterances (that on the face of it appear totally inadequate for the job) are able to serve as an effective, if imperfect, means of communication in everyday life. Devlin does duck some important issues for the sake of simplicity of exposition; Jon Barwise published a concise and insightful discussion of key issues, from about the same period in the development of situation theory, in his 1989 anthology "The Situation in Logic" [Barwise89].

As a philosophy of language, situation theory is definitely much more Austinian than Russellian; it also avoids some of the Austinian pitfalls. For a detailed discussion of this and other aspects of logical theory and philosophy of language underpinning situation semantics, see Seligman & Moss's excellent chapter in North-Holland's 1997 compendium on "Logic and Language" [SelMos97].

Situation theory as a logical system was one of the many motivating examples for Dov Gabbay's elegant synthesis of logical theory, Labeled Deductive Systems [GabbayLDS]. Unfortunately the situation semantics content was cut right down in the final published book; however, an earlier, more discursive exposition can be found in the proceedings of the 1993 conference Situation Theory and its Applications [SitThApps93].

The logic of information flow in distributed systems was modeled from a more mathematical logical perspective by Barwise & Seligman [BarSel97]. Their approach uses category-theoretic abstractions to express the kinds of similarity required between different abstract structures to support justified flow of information. My 2001 Extreme paper [Wrightson01] was a first shot at applying that approach to XML, but served only to indicate some interesting directions. I now believe that a better informal yet rigorous understanding of XML as a medium of meaningful communication is a prerequisite for further formal/mathematical modeling of XML as a constituent of information flow in a distributed system. This paper is hopefully a useful step in that direction.


This paper explores the middle ground between natural language communication amongst humans, and abstract information flow between arbitrary agents, using a simple toolkit of concepts drawn from situation semantics. The toolkit used in this paper is introduced in the next section. Readers who are interested in the broader background and theoretical context of this work within situation theory and natural language semantics should read the preceding section on Background and Context.

After laying out the situation semantics toolkit, the next step is to apply it to a broad range of communicative situations, where communication is considered to be the effective transfer of meaningful information (evidenced eg by action in response). To lay the foundations for considering communication via XML, these situations illustrate flow of information not only between humans, but also between humans and non-human agents such as animals and machines.

Finally (at last?) well-formed XML is considered, as a medium of communication from machines to humans. A strong motivation behind this part of the paper is to explore, using situation semantics, why some fragments of XML are so much harder to read and understand than others.

Situations and Meaningful Utterances

The key idea behind the notion of a situation, and the theory of meaning which goes with it, is the way we succeed in talking about our complex and incompletely known world using relatively simple signals. If Jack arrives late at the office and says irritably "The bus was late again this morning", then your recognition of a familiar situation allows that simple sentence to explain his late arrival and short temper.

Notice that this only works if you and Jack have sufficient common experience and knowledge for you to understand what he is talking about. You may even be able to say some more about the situation, for example by replying "Yes, there was an accident on the ring road". This brings out another characteristic of situations encountered in everyday life - they are not completely known, and two people can talk about the same situation, each contributing observations new to the other. Situations can be very fuzzy like the late bus example above. On the other hand, they may be clearly delimited, either by organizers or participants (eg a football game, a circus performance) or by others (eg an accident under investigation). Let's look at a few examples in more detail.

After the XSLT tutorial, Pat and Kay met with several others who were there, including one who is new to the subject, to compare notes and discuss practical issues.

A conference tutorial; a group of students working together comparing notes; someone being new to a subject; a group of colleagues discussing issues: the new situation is described in terms of these familiar (kinds of) situations. Note here that situations can be grouped together and characterized, as being of various kinds or types. This is central to situation semantics.

While I'm taking a break at work, two of my colleagues are discussing a rugby match they watched at the week-end.

I don't play rugby, and have only a very basic understanding of the rules and patterns of play. I learn about the match, but very imperfectly; much of what they say leaves me with pretty vague and probably pretty faulty impressions of what happened. Yet my colleagues are responding to these same utterances quite differently; to them they are vivid and precise. The meaning conveyed by what they say to each other clearly depends very much on effective connexions between what is said ("That second try!") and the actual described situation, the match they both watched on Saturday. Also involved is the rugby game situation in general, with all its intricacies of rules and patterns of play.

In the actual situation I have in mind, my poor understanding was because I knew nothing about that match and little about rugby. I was not completely lost, however, since I have played other team sports with complex rules, and am familiar with the general situation of watching a match and discussing it afterwards.

An experienced requirements analyst listens to two domain experts discussing the intended role and scope of a proposed new system, a type of system that is new to the analyst.

Although this is superficially similar to my listening to the conversation about the rugby game, it is nevertheless a very different situation. The role of the analyst is quite different, and this would come out in a fuller description of the analyst's behavior. For example, the analyst would pay close attention to the speakers, take notes, and ask for clarification of unfamiliar terms - whereas I was happy to relax and drink my coffee.

Basic Concepts of Situation Semantics

Agents and Individuation

The world according to situation theory contains many agents that make sense of each other and of the rest of the world in terms of schemes of individuation. Each agent has its own scheme(s) of individuation; what this means is that the agents involved (eg people) not only perceive objects such as doors, pens, dogs, people, but also recognize them as doors, pens, dogs, and people. They also recognize situations, such as a coffee break, consulting an architect, taking a dog for a walk, being on holiday. Different agents can, and generally will, have different schemes of individuation, which reflect their different viewpoints on the world, and their capacities for distinguishing and identifying situations, objects and so on.

Another key aspect of schemes of individuation is that the situations and objects individuated are recognized as having (or not having) properties, eg a dog may be black, a coffee break may be short; and also as having relationships, eg that the black dog and I are related by it being my dog; that the black dog likes the situation of my taking it for a walk. A weaker form of individuation is discrimination. Discrimination is evidenced just by changed behavior, for example a computer system can discriminate between two user login identities, and manifests this discrimination for example by allowing different kinds of access to stored information when each logs on.

The rationale behind the foundational place of agents and schemes of individuation in situation semantics is discussed thoroughly in [Devlin91]. Situation theorists tend to avoid long discussions about psychology etc. - they accept that these things happen, whether or not we understand how, and use the theory on that basis. You can find a comprehensive discussion touching on psychology and epistemology in [Devlin91], with some more philosophical discussion in [Barwise89].

Situations, individuals, properties and relations

This is a sketch of these concepts as used in situation theory. There is much more, of course, but hopefully this gives a sufficient outline to understand what follows. (I have drawn on the account of basic concepts in Devlin [Devlin91], except that I follow one of Barwise's other "choices" [Barwise89] in allowing situations to be aggregated and to have more internal structure.)


(In this paragraph only, the abstract notion of a situation is distinguished from the everyday notion of a situation thus: `situation'.) A `situation' is an abstract model of our everyday experience of being in a situation, for example the situation I am in at present of sitting writing these words. Also taken from everyday experience, into the idea of a `situation-type', is our ability to recognize various kinds of situations, for example all the occasions I have sat preparing a document. Notice that a situation (and hence a `situation') naturally carries with it a time and place, and that talking about kinds of situations (and hence a `situation-type') naturally carries with it the relevant common features of situations at different times and places. These times and places can be short or long, large or small.


An individual is something individuated by a scheme of individuation. An individual can be an object, agent, situation etc. Individuals are considered as given, i.e. individuated by a theorist, or by an agent (as seen by a theorist!). Individuals are not atomic - eg a table may be an individual, as may its components such as a table leg.


A property is something which holds or fails to hold (or some intermediate/other if you want your logic to do that) of an individual. For example, a mug may be red; a child may be six years old.


Relations are like properties, but they involve several individuals. For example, as I sit writing this, I, my chair, my workstation and my desk are in a relation. Properties and relations are distinct from individuals; each property or relation has argument roles which need to be filled by appropriate individuals. For example, a taking-a-drink-from relation involving me, my mug, and my drink of coffee, might have its roles filled by you, your glass, and your drink of water ... but not by my office, my chair, and a pad of paper.

Situation semantics is usually developed on the basis of properties and relations simply holding or failing to hold, but there is no reason in principle to prevent a situation having a more refined logic, or to prevent logics varying between situations. See Barwise & Seligman [BarSel97] for discussion of some related technical modeling issues.


An infon is a formalization of a single piece of information. (The concept of `infon' is also a formalization of the concept of a single piece of information, though that's not quite the same thing; for detailed discussion see Devlin and Barwise ([Devlin91], [Barwise89]).) Infon notation is usually as follows: to represent the information that my dog Ben is running at time t: << running, Ben, t, 1 >> and to represent the information that my dog Ben has no collar on at time t: << wearing, Ben, collar, t, 0 >>. The `1' or `0' at the end is the polarity of the infon, where 1 is positive, 0 is negative. Since there is a subtle but significant difference between an infon and an assertion, polarity is a distinct concept from `true' and `false'. (Negation is a surprisingly subtle thing to pin down. Polarity is used in situation theory to represent facts about the world which are represented by true/false statements in other theories. See Gabbay [GabbayLDS], chapter 7, for a thorough technical discussion of the role of negation in a logical theory.)

So, in situation theory information is itemized, and arranged in items of form: objects a1, ..., an do/do not stand in relation R, where R is some property/relation in the ontology, and a1, ..., an are objects appropriate to their roles in R. The identification of the objects is not part of the infon's content, rather the infon (as individuated) pertains to certain given objects. Infons are semantic objects, not syntactic representations. That is, there is more going on here than the symbols you see on the page. Most modern formal logic, with some notable exceptions, has been formalist at heart, i.e. dealing in principle with meaningless symbols that only acquire a relationship with the real world when a logical theory is applied to some domain, for example when a logical theory is used to characterize fault-tolerance. Situation theory is different - more akin to the mediæval notion that a name used in a statement brings what it names in the real world into the domain of the (logical) discourse. Situations may be constituents of infons; or more precisely, the objects to which an infon pertains may include situations.
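As a rough illustration of the itemized form just described, an infon can be sketched as a small record holding a relation, the objects filling its roles, and a polarity. This is only a syntactic stand-in: the class name `Infon`, its fields, and the `flipped` helper are assumptions made for this sketch, and the strings merely label the objects that a real infon would pertain to directly.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Infon:
    """One item of information: a relation, the objects in its roles, a polarity."""
    relation: str
    arguments: Tuple[str, ...]  # labels standing in for the objects themselves
    polarity: int               # 1 = positive, 0 = negative

    def __post_init__(self):
        if self.polarity not in (0, 1):
            raise ValueError("polarity must be 0 or 1")

    def flipped(self) -> "Infon":
        """Same relation and objects, opposite polarity (hypothetical helper)."""
        return Infon(self.relation, self.arguments, 1 - self.polarity)

    def __str__(self) -> str:
        parts = [self.relation, *self.arguments, str(self.polarity)]
        return "<< " + ", ".join(parts) + " >>"


# The two examples from the text; "t" is a placeholder for a time.
running = Infon("running", ("Ben", "t"), 1)
no_collar = Infon("wearing", ("Ben", "collar", "t"), 0)

print(running)    # << running, Ben, t, 1 >>
print(no_collar)  # << wearing, Ben, collar, t, 0 >>
```

Note that polarity is stored as data rather than as a Python boolean, which loosely mirrors the point that polarity is a distinct concept from truth and falsity.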

Situation Semantics

Situations don't necessarily have anything to do with language, but those that do are of particular interest; in fact the abstract notion of a situation was originally developed to help in investigating how natural languages support meaningful conversation. The account of the meaning of natural language utterances coming out of this line of investigation is known as situation semantics.

I have two reasons for using situation semantics in preference to other approaches to semantics. The first could be considered aesthetic, in that I like the account it provides of the meaning of, considered as information conveyed by, natural language. The second is more pragmatic, in that situation semantics provides a convincing account of meaningful communication between humans and machines, and also between humans through the medium of stylized, restricted forms of communication such as computer system specifications. Let's look first at how situation semantics applies to natural language, using one of the situations described earlier on: I listen to two of my colleagues discussing a rugby match. The most interesting semantic issue here is that much of what they say leaves me with pretty vague impressions of what happened, yet my colleagues find the same utterances vivid and precise. The same bits of language are conveying very different meanings to different people. How can that be?

Part of the answer is that the conversation between my colleagues draws on two situations individuated by both the speakers: the match they both watched on Saturday; and a generalized rugby game situation. The match is the described situation; the generalized rugby game situation is a resource situation (you can think of this as shared knowledge used as a reference). There are three other situations that figure from a situation semantics point of view: the utterance situation is the situation of speaking the utterance whose semantics is being considered, eg “Now the second one - that was neatly done if you like”, which is part of the discourse situation: the whole conversation about the rugby game. There is also an embedding situation, in this case the coffee break when it happened. The embedding situation can have a strong influence on the meaning-as-understood of utterances, witness the popularity with comic writers of putting a character into a situation with the wrong idea about what's going on, so the speaker and listener assign different situation types to the embedding situation of their utterances, leading to misunderstandings and comic effect.

Going back to the rugby game conversation, I hear the same words as my colleagues, yet I don't have access to the situations that give these utterances meaning to them. How can I get any meaning at all? Well, I do have access to some similar situations I can use instead. I substitute some general team ball game situation for the more precise rugby game situation, and find some connexion between what they say about their particular game and other team games that I know. So I get some information, but not much, and not precise. This illustrates the relational nature of situation semantics: the meaning is in the whole relationship between the utterance situation, the discourse situation, the embedding situation, the resource situation and the described situation. With the meaning relatively hard to put a finger on, especially where many situations are involved, it is better to focus on the information conveyed by the utterance.

Information Flow

Some more terminology: Different agents will get different information from the same utterance because they will apply (or be attuned to) different constraints linking the utterance situation to other situations. Sometimes this is a matter of minor difference, different constraints between substantially the same situations. Sometimes different people link the same utterance situation to radically different situations they believe the utterance to be about. It's true that giving one piece of information can immediately yield others. A classic example here is that in gaining the information that an unseen figure is a square, I also gain the information that it is a rectangle, that it has four right-angled corners, that it is a parallelogram ... but wait a minute! Surely my also gaining the information that it is a parallelogram depends on my relating it to my `geometry-domain'; I wouldn't casually think ``Oh yes, it's a parallelogram'' - in fact, I would very rarely, if ever, need to make that step explicitly at all.
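The square example can be sketched as two agents following different sets of constraints from the same item of information. The constraint tables and the function name `information_gained` are hypothetical, chosen only to make the point concrete; they are not part of situation theory's formal apparatus.

```python
from typing import Dict, Set

# Hypothetical constraints: gaining the item on the left yields the items on the right.
GEOMETRY_CONSTRAINTS: Dict[str, Set[str]] = {
    "square": {"rectangle"},
    "rectangle": {"four right angles", "parallelogram"},
}
# An agent not relating the figure to a `geometry-domain' lacks the last link.
EVERYDAY_CONSTRAINTS: Dict[str, Set[str]] = {
    "square": {"rectangle"},
    "rectangle": {"four right angles"},
}


def information_gained(item: str, constraints: Dict[str, Set[str]]) -> Set[str]:
    """Follow the agent's constraints transitively from one item of information."""
    gained: Set[str] = set()
    frontier = [item]
    while frontier:
        current = frontier.pop()
        for consequence in constraints.get(current, set()):
            if consequence not in gained:
                gained.add(consequence)
                frontier.append(consequence)
    return gained


geometer = information_gained("square", GEOMETRY_CONSTRAINTS)
casual = information_gained("square", EVERYDAY_CONSTRAINTS)

print(sorted(geometer))  # ['four right angles', 'parallelogram', 'rectangle']
print(sorted(casual))    # ['four right angles', 'rectangle']
```

The same input yields different information for the two agents, purely because of the constraints each is attuned to.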

Just Natural Language?

Situation semantics is intimately concerned with semantics as import, i.e. as what an utterance means-to an agent, rather than identifying some abstraction to stand alone as `the meaning of' an utterance. While originally concerning humans only, this makes situation semantics well suited to communications from other agents. For example, it seems right to account for a dog understanding the meaning of a signal, including a vocal signal (eg if I put my coat on, or stand up and say “come on Ben”), through a simple relationship between two kinds of situation individuated or discriminated by the dog. Conversely, when my dog communicates to me that he needs a drink of water by playing noisily with his empty water dish, that succeeds through a connexion between that kind of situation, and his dish being empty when he wants a drink.

Since machines are often built to discriminate various conditions, this approach carries over nicely to utterances from machines.

Alarm Clock

Consider an alarm clock that has a face displaying the time, and can be set to ring a bell when the hands show some particular time, say 6-30am. When I wake and look at the clock, and see that it is 6am, then the clock has successfully conveyed to me the information that it is 6am. When the bell rings at 6-30, then the clock has successfully conveyed to me the information that it is time to get up, or more precisely that the time which I decided last night was the time I should get up this morning, has arrived. The information regarding the time shown on the clock face is conveyed through a link (a constraint) between two types of situations: S0, where the clock shows a time of 6am, and S1, it being actually 6am. Because I am aware of that constraint (I have learned to tell the time) the clock can tell me it is 6am. (However, the clock face cannot communicate successfully to a child who has not yet learned to tell the time.) Similarly, the meaning for the alarm is there because of a link between a situation of type S2 where the alarm goes off, and a situation of type S3, of its being time to get up. (Interestingly, although the meaning of the alarm could be seen as more variable, since it will vary according to whether I need to get up early that day, it has a clear and uniform connotation that I am about to get up, that in my experience is more useful as communication to a young child, or even to a dog.)

Consider the constraint which links S1 and S0. It has some connexion with how the clock works, since if the clock is not working correctly, eg is running fast, then the information that it is 6am will be misinformation - perhaps it is actually 5-45am. Yet the constraint is not dependent on the detailed working of that actual clock, for I could replace this clock with another clock with a different kind of mechanism (eg electric instead of clockwork), and the communication I am talking about would not be significantly different. Indeed, any of a number of quite different kinds of clock would do, with the form of the communication varying: a dial; a numeric display; a chime; a synthesized voice saying it is 6am. All these could be part of situations of type S0, and all can have the required link with S1. Perhaps more can be understood about this constraint by seeing how the communication can fail. One example has already been mentioned - the clock may be wrong. It is still telling the time, but not doing it right. For example, the clock may be running ten minutes fast, or it may show Tokyo time. Or I may be interpreting as a clock something which is not a clock at all, eg reading a dial as a clock when it is a barometer, or reading a numeric display as time of day when it is showing day of month. In the case of a clock being set wrong, or showing Tokyo time, I can still use it to provide me with the same information if I change the constraint so that eg a clock showing 6am, which I know to be 15 minutes fast, conveys to me that it is 5-45am. Similarly, if I pass a clock showing 6-30am Tokyo time, and it has a sign by it saying `Tokyo', then it can at least convey that it is half-past some hour, and if I can remember how many hours difference there is, I can work out the full local time. The other cases are hopeless. (A stopped clock belongs to this last group - a clock stopped at 6am is not actually telling me the time at all, even if it happens to be 6am when I look at it.)
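The adjusted constraint for a clock known to be fast can be sketched as a mapping from what the clock shows (a situation of type S0) to the time it conveys (type S1). The function name `time_conveyed` and the `known_offset` parameter are assumptions for illustration; the sketch covers only the recoverable cases, where the reader knows how far the clock is out.

```python
from datetime import datetime, timedelta


def time_conveyed(shown: str, known_offset: timedelta = timedelta(0)) -> str:
    """Map a clock reading (type S0) to the time it conveys to the reader (type S1).

    known_offset is how far ahead of the actual time the reader knows the clock to be.
    """
    reading = datetime.strptime(shown, "%H:%M")
    return (reading - known_offset).strftime("%H:%M")


# A correct clock: the plain constraint between S0 and S1.
print(time_conveyed("06:00"))                         # 06:00

# A clock known to be 15 minutes fast: the changed constraint.
print(time_conveyed("06:00", timedelta(minutes=15)))  # 05:45
```

A clock showing another time zone fits the same shape, given whatever hour difference the reader remembers. A barometer or a stopped clock does not, since no offset restores the link between S0 and S1.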

Other Machines Communicating to Humans

When machines communicate to humans, they often use words and sentences, i.e. pieces of natural language. This use of language feels very different from ordinary conversations between humans (this difference is part of what is intended to be captured by the well-known Turing test for human-like cognition in a machine). Yet communications between machines and humans also have a lot in common with human-to-human communication, especially in the way they are understood by the humans involved. This is particularly clear if you think of knowledge-based systems, since many of these depend on being able to generate, algorithmically, phrases and sentences that mimic a human expert sufficiently closely to be acceptable and effective, in a limited context, to people who might otherwise have referred to an expert in person. So it could well make sense to apply a technique developed for understanding the workings of natural language discourse, to understanding meaningful communication from machine to human. In fact, situation semantics works well, and throws an interesting light on human-computer interaction.

A modern aircraft's flight control and navigation systems are designed to emit audible warnings in certain situations. Some of these use fragments of natural language, eg a computer-synthesized voice saying “Stall” or “Traffic”. (These alarms are activated by conditions computed from sensor data.) Some audible alarms have in addition a connexion between the meaning of the alarm and the meaning of a visual display; for example the audible alarm “Traffic” comes after a series of indicators on the (TCAS) display have indicated, with increasing (visual) urgency, that another aircraft is getting close. How to account for these synthesized words meaning virtually the same as a spoken warning, say from a flying instructor?

For these alarms to work as a meaningful communication, there needs to be the right kind of connexion between the alarm sounding, and the type of situation the alarm is drawing to the flight crew's attention. An important aspect of this is to have the alarm system functioning so that the alarm goes off only when it is justified. If it goes off unpredictably or for the wrong reason, then it stops giving the right message, simply because it is no longer linked to the intended situation. When the “Traffic” alarm is working as intended, then we have:
Utterance situation:

An airplane being flown.

Described situation:

Another aircraft will pass too close if both continue on present paths. (This is of course a simplification, but will serve here.)

Resource situation:

The situation of encountering a traffic alarm in flight, usually learned in simulator training.


The computation in the TCAS system, from sensor data, which detects the situation.

Applying Situation Semantics to utterances in XML

So how does this approach work when applied to XML? Here is a very simple XML document, an example from a beginners' course in XML:

  <title>All About XML</title>
     <title>What's in a Name?</title>
       The Extensible Mark-up language (XML)  
       should really have been called EML.

What is interesting here is that it is not only the example's text content that contains natural language. The quasi-natural language meanings of the element names are the key to the human readability of the XML structure. To demonstrate this, consider what happens when the element names are replaced by arbitrary tokens:

  <y>All About XML</y>
     <y>What's in a Name?</y>
       The Extensible Mark-up language (XML)  
       should really have been called EML.

This example is less readable because it is not as clear what the elements are intended to represent. What happens is that you lose the signal concerning the significance of the element content that is provided by the natural language meaning of the element name. However, you might say that in this instance, the tree structure and the natural language content of the XML elements provide lots of clues, so perhaps the loss is not too great. So, consider the following example:

Here there is just not enough information to enable me to understand what normal means. Is this normal operation of a rail service, without delays or cancellations? Within normal range for a temperature sensor? If the element names are restored, then I can grasp the meaning of this fragment much more clearly:

Situation semantics provides a neat account of what is going on when I try to understand (gain information from) these examples.

My experience of XML, together with my (overlapping but not identical) experience of information exchange using character string data, and my experience of IT applications in general, provides a good repertoire of situations that can be used as resource situations. To gain a useful understanding, the XML fragment needs to convey enough information to enable me to bring the right resource situation to bear.

In the first example, I am familiar with books expressed in marked-up text, so although it is a strangely simple example, the name `book' given to an outermost XML element is individuated by me as conveying the information that this fragment represents a book - and this becomes more comfortable when I remember that this is a teaching example, and teaching examples are after all notorious for being unrealistically simplified. This evokes an appropriate resource situation (the usual structure and content of books) for me to interpret the rest of the fragment correctly.

In the second example, the element names give no helpful clues. However, the rhetorical form of the text and the hierarchical structure of the XML fragment evoke a less specific resource situation of reading a document that has a title and a heading, so (like my partial understanding of the rugby game conversation) I gain a less precise understanding, that this fragment is intended to represent some kind of structured text document.

In the third example, there is much less information available. I have broad experience of data that states various things are normal, and I can recognize from my experience of using XML that this XML fragment is probably intended to convey that something specific is normal, but I have no idea what. In the fourth example, this uncertainty is largely removed; the names of the elements enable me to use my experience of interoperability messaging to understand this fragment as conveying the information that (in some context given by the transaction context of the message) the result of some glucose test is the defined term value normal.
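The substitution used between the first two examples, replacing element names by arbitrary tokens while leaving structure and content alone, can be sketched with Python's standard `xml.etree.ElementTree` module. The input document and the function name `anonymize_names` are hypothetical, made up for this sketch; they are not the paper's original examples.

```python
import xml.etree.ElementTree as ET

# Hypothetical input, used only to demonstrate the transformation.
SOURCE = "<report><heading>Status</heading><reading>normal</reading></report>"


def anonymize_names(xml_text: str) -> str:
    """Replace every distinct element name with an arbitrary token (e1, e2, ...)."""
    root = ET.fromstring(xml_text)
    tokens = {}
    for element in root.iter():  # document order, starting at the root
        if element.tag not in tokens:
            tokens[element.tag] = f"e{len(tokens) + 1}"
        element.tag = tokens[element.tag]
    return ET.tostring(root, encoding="unicode")


print(anonymize_names(SOURCE))
# <e1><e2>Status</e2><e3>normal</e3></e1>
```

For a machine the two documents are structurally identical. For a human reader, the second no longer indicates which resource situation to bring to bear, which is the loss described above.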

The discussion so far has used element names as the key to identifying the appropriate resource situation for a human to understand a fragment of XML. Other simple approaches are nearly as readable, for example:

<r15 displayName="Result">
only carries the small additional burden of understanding that in this message format, r15 is the machine-readable identifier, and Result is the explanatory note that enables humans to interpret r15 more easily.
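The convention can be sketched in Python as follows. This is a minimal illustration rather than part of any message format: the helper name readable_label is hypothetical, and the fragment is closed as an empty element only so that it parses.

```python
import xml.etree.ElementTree as ET


def readable_label(element):
    """Return the name a human reader would use for an element.

    Assumption: a displayName attribute, when present, is the explanatory
    note for a machine-oriented element name; otherwise the element name
    itself is all the reader has to go on.
    """
    return element.get("displayName", element.tag)


# The fragment from the text, closed here so that it is well formed.
coded = ET.fromstring('<r15 displayName="Result"/>')
# The same element without the explanatory note (hypothetical).
bare = ET.fromstring('<r15/>')

print(readable_label(coded))
print(readable_label(bare))
```

With the note present, the reader is given Result; without it, only the opaque identifier r15 remains.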

If the third example was stated to me, then the utterance situation (for example, if I were seated opposite someone who was performing a glucose test) would give me enough clues to understand exactly whose glucose test was normal, and when the test occurred. In healthcare, there is a need to identify both individual people and terminology more precisely, by using coded identifiers. In situation semantics terms, these coded identifiers enable dependable access to the resource situations that enable precise understanding of the XML - though it is also true that humans usually need some machine assistance to handle this level of precision. Adding some coding and patient identification to meet these additional requirements, our example becomes:

 <SleepPattern code="401161007">difficulty getting to sleep</SleepPattern>
 <GlucoseTest code="102659003">normal</GlucoseTest>
If element names are taken as also naming a resource situation in situation semantics terms, it becomes clear why this example is straightforward to understand as quasi natural language. TeleMed evokes the situation of telemedicine, where a patient uses a device in their home, and the device transmits data into a clinical record system, and this situation contains the concepts of patient and result. Patient evokes the general clinical situation of recording results pertaining to patients, which helps the reader understand the relationship between the person named and the test results, and the nature and role of the patient identifier. Result evokes the situation of reporting the result of a test at a particular time, which helps the reader understand why sleep pattern should be relevant in this context, and so on.

All this does depend on the reader being familiar with the context in which the XML is going to be used, and often also its specialized vocabulary (or specialized uses of ordinary words), and more generally on the reader's familiarity with the English language. I have seen XML markup in Japanese, and I did not understand anything except the overall tree structure! Conversely, I have seen XML documents where the markup was in English and the content in Japanese; in this case, although I did not understand the Japanese element content, I did understand what it was about from the English language based markup.

Why is some XML so difficult to read?


All the examples in this section (and the last example above) are real in that they were not constructed specially for this paper. None of them refer to actual individuals, and they are intended to illustrate the discussion from actual practice rather than serve as typical examples of clinical messaging. Also please note that although some readers may recognize the standard(s) on which the examples are based, the conclusions of this paper relate only to semantics and human readability of utterances in well-formed XML. I do believe that the issues discussed here have implications for the design of XML based interoperability standards; however, that would need to be discussed in the context of other issues concerning efficient processing, human computer interfaces, schema design etc., that are outside the scope of this paper.

One of the frequently cited virtues of XML is that it is human-readable as well as machine processable. As you might expect, it is possible to degrade either or both of these in the design of an XML document. The last example in the previous section was designed to be human-readable by keeping the structure simple and concise, and ensuring that element names, considered as natural language, name business-level concepts that correspond to the element content. The examples that follow in this section illustrate how changes in the XML design affect readability, and provide an account of why these changes have such an effect, using situation semantics. All these examples have similar subject matter, in that they are intended to convey the result of a clinical observation in a healthcare context.

Elements named relative to a specific context of use

This fragment, which is a portion of the last example in the previous section, is a typical example of a relatively small-scope design where the element names are developed with reference to the business-level analysis of the solution being developed. Broader interoperability has been taken into account in the use of codes that conform to a recognized standard (in this case, SNOMED CT); however, the XML design is specific to this particular application (which was a proof of concept regarding home-based monitoring of diabetes). A situation semantics based account of the meaning of this fragment would be very similar to that given in the previous section.

 <SleepPattern code="401161007">difficulty getting to sleep</SleepPattern>
 <GlucoseTest code="102659003">normal</GlucoseTest>

Elements named more generically

When broader (e.g. industry-wide) considerations of interoperability are stronger than the requirements of a particular information exchange scenario, element names naturally become more generic, as in this example. The additional information required to make the generic concept more specific is given as additional data, either in attributes or child elements. This in itself does not necessarily degrade readability - though it does usually increase the number of different bits of the XML that I need to look at before I know what is going on (and I also need to know which these are).

	<moodCode code="EVN"/>
	<id root="D6C6D716-6444-11D7-91A8-00C04F2ACB4F"/>
	<code code="SG0z.00" displayName="Foreign body on external eye NOS" 
		<originalText>FOREIGN BODY EYE</originalText>
		<center value="19930907"/>
	<availabilityTime value="20020214122839"/>
	<value xsi:type="CV" codeSystem="2.16.840.1.113883.6.6" code="Xa48Q" 
	displayName="Undefined (default)"/>
In situation semantics terms, the situation-type of the resource situation evoked by the element name ObservationStatement is pretty unspecific. To narrow this down, I would look at another part of the XML, in this case the code attribute on the moodCode element within ObservationStatement; this enables me to use a more specific resource situation - recording a healthcare event. The majority of the rest of the example is similar, in that generically named elements provide additional information in their content or attributes to enable a reader who understands the format to read this fragment effectively.

However, there is additional reading difficulty regarding the element id and its equally generically named attribute root, which has a value, designed for machine use, that is opaque to a human reader. Having a component that cannot be understood directly means that the human reader has two alternatives: take the time to look up what it is, or ignore it and take the risk that it is actually important to understanding some other element. (The perceived difficulty in ignoring the machine-readable data arises from the general style of this fragment: given that moodCode was an essential qualifier of ObservationStatement, I may feel uneasy in case id is too.)

Finally, consider the element value. The displayName provides a human-readable version of the code value - so far so good. However, because there is no human-readable version of the code system name, I don't know which terminology is being used without looking up the code system identifier. And this final point is only a problem because this example uses a generic format - if it were a more specific format, I would probably have the code system in use available to me as part of the resource situation corresponding to my understanding of the solution being developed.

Overall, the additional difficulty of reading this fragment derives from:

  • Needing to assemble several pieces from the XML in order to identify the most specific resource situation that provides me with my best available understanding of each element
  • The presence of data that is irrelevant or unusable for a human reader, without human readable signals that enable the reader to ignore it comfortably.
  • Inconsistent usage of human readable equivalents
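The second and third difficulties in this list can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the function name opaque_items is hypothetical, the choice of root, codeSystem and code as machine-oriented attribute names is mine, and the sample is a cut-down, closed version of the fragment above with the xsi:type attribute omitted.

```python
import xml.etree.ElementTree as ET

# Assumption for this sketch: these attribute names carry values designed
# for machine use, and displayName is the only human readable equivalent.
MACHINE_ATTRIBUTES = ("root", "codeSystem", "code")


def opaque_items(root):
    """List (element name, attribute name) pairs with no readable equivalent."""
    found = []
    for element in root.iter():
        for name in MACHINE_ATTRIBUTES:
            if name not in element.attrib:
                continue
            # A displayName explains the code attribute only, not the others.
            if name == "code" and "displayName" in element.attrib:
                continue
            found.append((element.tag, name))
    return found


# A cut-down, hypothetical fragment in the style of the example above.
SAMPLE = """
<ObservationStatement>
  <moodCode code="EVN"/>
  <id root="D6C6D716-6444-11D7-91A8-00C04F2ACB4F"/>
  <value codeSystem="2.16.840.1.113883.6.6" code="Xa48Q"
         displayName="Undefined (default)"/>
</ObservationStatement>
"""

for tag, attribute in opaque_items(ET.fromstring(SAMPLE)):
    print(tag, attribute)
```

The items reported are those a reader who does not already know the format must either look up or decide to ignore; a reader who knows the format may of course find a value such as EVN perfectly readable.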

Elements named by modeling role

The main sources of difficulty for human readability of utterances in well formed XML have already emerged in the previous example. This next example has outer elements that are named even more generically from a subject matter perspective, because they have been named with reference to an information modeling methodology rather than the subject matter. In this case, the information that was conveyed relatively directly by the name ObservationStatement for the outermost element of the preceding fragment must be found by looking up the value of an attribute on the pertinentResult element. The level of difficulty for a human reader is increased because of this, but does not differ in kind, and can be accounted for as above.

<pertinentInformation typeCode="PERT">
	<pertinentResult classCode="OBS" moodCode="EVN">
		<code codeSystem="" code="11111" displayName="glucose test"/>
		<effectiveTime value="20050224"/>
		<value xsi:type="ST">normal</value>
As a final step, consider what happens if all the element names in this last example are replaced by arbitrary tokens:
<x typeCode="PERT">
	<y classCode="OBS" moodCode="EVN">
		<z codeSystem="" code="11111" displayName="glucose test"/>
		<xx value="20050224"/>
		<yy xsi:type="ST">normal</yy>
To my eyes, the only element that this final change makes significantly more difficult to understand is effectiveTime, where the indication that this is a time helps in interpreting 20050224 as 24 Feb 2005 (it evokes a resource situation to do with representing dates and times). In fact, a published standard that is closely related to this example has as a stated design goal that the element names are irrelevant (and I would be tempted to infer that human readability is not considered important).
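The effect of this renaming on the machine-readable content can be checked with a short Python sketch. It rests on several assumptions of mine: the fragment is closed so that it is well formed, the xsi:type attribute is omitted so that no namespace declaration is needed, and the names data_view and rename_with_tokens, like the tokens t0, t1 and so on, are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# A closed, cut-down version of the fragment above (see assumptions in the
# accompanying text).
SOURCE = """
<pertinentInformation typeCode="PERT">
  <pertinentResult classCode="OBS" moodCode="EVN">
    <code codeSystem="" code="11111" displayName="glucose test"/>
    <effectiveTime value="20050224"/>
    <value>normal</value>
  </pertinentResult>
</pertinentInformation>
"""


def data_view(root):
    """Everything except element names: attributes and text, in document order."""
    return [(dict(e.attrib), (e.text or "").strip()) for e in root.iter()]


def rename_with_tokens(root):
    """Replace every element name with an arbitrary token, in place."""
    for index, element in enumerate(root.iter()):
        element.tag = f"t{index}"
    return root


original = ET.fromstring(SOURCE)
renamed = rename_with_tokens(ET.fromstring(SOURCE))

# Attribute values and element content survive the renaming untouched.
print(data_view(original) == data_view(renamed))

# Without the name effectiveTime, the reader has to guess that this value
# is a date; the program has to be told the format explicitly as well.
stamp = renamed.find(".//*[@value]").get("value")
print(datetime.strptime(stamp, "%Y%m%d").date().isoformat())
```

Everything a program consumes is preserved by the renaming; what is lost is only the cue that lets a human reader pick the right resource situation, here the one concerned with dates and times.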


This investigation has been informal; nevertheless, by treating well formed XML as analogous to natural language, it does show that situation semantics provides a reasonable account of why examples that seem, on first impression, more difficult for a human to read should be so.

Several directions come to mind for further development of this work:

  • Moving on from the informal discussion above, in the direction suggested in my 2001 Extreme paper [Wrightson01], towards a more formal-mathematical model of information flow through the medium of XML.
  • Looking at the role of schemas, DTDs, and underlying models (such as the HL7 RIM), in situation semantics terms.
  • Investigating further the usability in practice of different kinds of XML structures, taking into account ease of design of processing, efficiency of processing, the role of underlying models, etc.


[BarSel97] J Barwise and J Seligman, Information Flow: The Logic of Distributed Systems, Cambridge UP 1997

[Barwise89] J Barwise, Notes on Branch Points in Situation Theory, Chapter 11 of The Situation in Logic, CSLI Lecture Notes no. 17, 1989

[Devlin91] K Devlin, Logic and Information, Cambridge UP 1991

[GabbayLDS] D M Gabbay, Labelled Deductive Systems, Volume 1: Foundations, OUP 1996

[SelMos97] J Seligman and L S Moss, Situation Theory, in ed J van Benthem and A ter Meulen, Logic and Language, North-Holland 1997

[SitThApps93] P Aczel et al, eds, Situation Theory and its Applications, CSLI Lecture Notes no. 37, 1993

[Wrightson01] A Wrightson, Some Semantics for Structured Documents, Topic Maps and Topic Map Queries, Extreme Markup Languages 2001
