Runways, product differentiation, snap-together joints, airplane glue, and switches that really switch

C. M. Sperberg-McQueen


The closing keynote address of the 2004 Extreme Markup Languages conference.

Keywords: Markup Languages

C.M. Sperberg-McQueen is a member of the technical staff at the World Wide Web Consortium; he chairs the W3C XML Schema Working Group and XML Coordination Group.

C. M. Sperberg-McQueen [World Wide Web Consortium, MIT Computer Science and AI Laboratory]

Extreme Markup Languages 2004® (Montréal, Québec)

Copyright © 2004 C. M. Sperberg-McQueen. Reproduced with permission.


This version of this talk is based on a transcription made by Kate Hamilton, to whom my thanks. The notes of where the audience laughed are hers; I have tried hard to resist the temptation to add a few more. In preparing it for further distribution in written form I have recast some sentences and added notes and pointers to further reading where it seemed useful.

Introduction by Syd Bauman

I had the privilege of introducing the closing keynote speaker at Extreme Markup Languages 2002. Some of you may remember. In 2003 I asked the conference chairs if I could again enjoy this privilege. After a quick huddle Deb Lapeyre came back to me and said [flatly]: “No. We’re worried that if you do it two years in a row it’ll become a tradition.”

Doesn’t sound like a bad tradition to me!

But in truth, coming up with a series of compliments sufficient in scope to describe Michael year after year could indeed be a daunting task. [Chuckles]

I usually sum up my admiration for Michael — and make no mistake about it: I put Michael up on a pedestal where he belongs — with a single word: demigod. [Laughter] For I can come up with almost nothing bad to say about him. I’m sure — moreover I hope, I pray! — that his physician has a few choice words that remind Michael that he is merely mortal; but to us in our field he is a demigod.

Tangent: how many of you have done vanity searches — you know, you type your name into a search engine. Come on, raise your hands. How many of you have typed your name into Google? It doesn’t make any difference whether you used quotes around it or not.

Sometime around October 2001, I, at the urging of Allen Renear, conducted a vanity search. I actually think it was the first time I’d actually done so. And you know what came up? The first half-dozen hits for “Syd Bauman” on the Web in October 2001 were various people’s descriptions — not transcriptions, descriptions — of the closing keynote address at Extreme 2001 [shouts of laughter], during which Michael had used members of the audience — of which I was one — to be props in his explanation of the differences between Topic Maps and RDF. I think I held a loaf of bread. And a ribbon, or something like that.

So. All of my own writings — all of the free-as-in-free-speech software that I’ve made available — all of my photographs, of which there are a couple of hundred on the Web — all this paled in comparison to my being a prop in one of Michael’s talks! [Shouts of laughter]

While this may be insulting, it is fitting — because Michael is a demigod.

But in preparing this introduction, I couldn’t (at least not on such short notice, about thirteen months) come up with a clever poem or a song that used the word “demigod”.

But note that Michael’s talk asks the question, “Does XML have a model? Does XML have a supermodel? Does it matter?” I took this theme a little overly to heart. With apologies to Jerry Siegel, Joe Shuster, Kirk Alyn, George Reeves, Christopher Reeve, John Haymes Newton, Gerard Christopher, Dean Cain, and Tom Welling (although the last gentleman doesn’t know it yet), I give you the following introduction:

Faster than a cable modem!

More powerful than XSLT!

Able to grok ISO standards at a single glance!

Look! Approaching the podium ...

... It’s a listener!1

... It’s a session chair!

... It’s — Superspeaker!

[Laughter and applause]

Yes, it’s Superspeaker, strange visitor from a Middle High German department [shouts of laughter] who came to Extreme with thoughts and insights far beyond those of mortal geeks. Superspeaker, who can change the course of mighty committees, bend your brain with mere words, and who, disguised as C. Michael Sperberg-McQueen, mild-mannered editor for an evil consortium of Web developers [shouts of laughter], fights a never-ending battle for text, data, and the XML Way.



Before I start, I need to make sure you all understand at least vaguely what a glottal stop is.

A stop in phonology is a phoneme created by stopping (sometimes transiently, sometimes without release) a flow of air from your lungs through your mouth and/or nose. A glottal stop is a stop made with your glottis. For comparison: “t” is a dental stop because it’s made with your tongue against your teeth; “p” is a labial stop because it’s made with your lips; “k” is a velar stop because it’s made with the back of your tongue against your soft palate. Your glottis is the opening between the vocal folds in your larynx. We have no letter for glottal stops (apostrophe is often used in informal transcriptions) because we don’t actually have phonemic glottal stops in English, and it is very difficult for native speakers of English to emit a glottal stop at the beginning of a syllable: they’re not phonemic. We do have glottal stops; you can hear them if you listen. They’re much easier to say (and to hear) between vowels: you will recognize the sound if I say, instead of “glottal stop,” “glo’al stop.”

I go into this because otherwise it would be difficult for you to understand the story that I have been thinking I need to tell you.

When I was a graduate student in that department of Middle High German Syd mentioned, I studied Old Norse in addition to Middle High German. A lot of the secondary literature on Old Norse is in the modern Scandinavian languages, so it’s highly desirable for students of Old Norse to be able to read a modern Scandinavian language. It doesn’t much matter which one, because of course if you can read one, you can pretty much read academic prose in the other two, also. So I took Danish.

Sometime in April Claus Lund, our teacher, came in and said something, and we replied, and he said, “No, no, you’re saying it wrong: you’re saying ‘this’ and not ‘this.’” We said [dubiously] “Say that again?” And he repeated. And we still couldn’t tell what he was getting at. After a while, it became clear that he was trying to make us hear (or rather reproduce) a phonemic difference, and we weren’t hearing it. Knowing that it was phonology he was concerned with, and not some grammatical question, helped us, unfortunately, not at all.

He gave us phonetic exercise after phonetic exercise, getting redder and redder in the face, until finally, towards the end of the hour, he exploded in indignation: “I can’t believe you’ve been taking Danish for an entire year and you don’t hear a glottal stop! Glottal stops are phonemic in Danish! You should know this!” Of course, we did know it; we just hadn’t been thinking to listen for a glottal stop, because we were native speakers of English, and we are trained by our native phonology to hear right past them. (They’re not phonemic in Norwegian or Swedish either, so the Danes can use glottal stops to out Norwegians who are trying to pass as Danes. [Pause] My apologies to any Norwegians here who are offended by the idea that they or any of their compatriots might wish to pass as Danes.)

[At this point, the voice of Lars Marius Garshol can be heard on the tape, saying loudly “Thank you!”]

The reason I have been thinking I need to tell you this story is that I’m committed to talking about models here. And I have come to you this morning to confess a dirty secret: I feel the same way when people use the word “model” as I felt that morning. It’s as if they’re suddenly dropping into a foreign language that I just don’t quite get. There’s something odd. I don’t know what they’re talking about.

Now, this isn’t always the case. “Model” is an interesting word because it illustrates something Allen Renear was saying the other day: We can know things without being entirely certain of them. I may not be certain that what’s in this bottle is actually water, because I haven’t had it chemically analysed, but you know it’s water, I know it’s water, and if I’m thirsty I will drink it. We know it, without being certain of it. And yet, surely we would have found it plausible to say that knowledge is certain belief. We can apparently use words like “know” without necessarily being able to give a clear account of where their boundaries are. And the same is true of “model.” You will have heard me say the word “model” in sentences as if I knew what it meant and as if I expected you to know what it meant. You have certainly said “model” in my hearing without my saying, “I don’t know what you’re talking about.”

Sometimes it works.

Sometimes, however, it doesn’t. In some contexts the word “model” goes right past me, as if it were a word in a foreign language or had some meaning that other people can hear and understand and I can’t. You’re in the middle of an intense discussion, you have been explaining how you see the world, and your interlocutor grabs you by the lapels and says, “Yes, yes but: What is the model?” Or you’re discussing notations for capturing information with other people and they say, “Yes, but the problem with SGML is, it has no model.”

“It has no model” — what on earth do they mean?

Do SGML and XML have a model?

For SGML we have groves. We have the syntax defined by the spec. For XML we have the syntax of the spec, we have the Information Set, we have the Document Object Model, we have the XPath 1.0 model, we have the XPath 2.0 model — we have lots of models.2 We have more models than we wish we had. What do you mean there is no model?

“Well, Q.E.D.,” they say. “You don’t have a model. You have these stopgaps because you don’t have a model. You haven’t ever decided whether entity boundaries are significant or not. You haven’t decided when whitespace is significant and when it’s not, or whether the stuff that occurs is a sequence of characters or character information items or a sequence of strings or always a single unbroken string. You haven’t defined any operations on XML data. XML has no data model, is not a data model, has no model, is not a model.”

At this point I say: “What exactly do you mean by model? Do you mean I haven’t shown you a UML diagram? Or that I haven’t written it out in the way that books on information modeling say?”

But they say, “No, no, no. I’m not talking about some mechanical gap, I’m talking about the underlying truth that there is no model.” [Laughter]

What is a model?

It’s remarkably difficult to get anybody to define the term “model”. I spent a lot of time this summer looking at books about information modeling using Express, information modeling using XML, modeling business cases using XML, modeling XML using UML, modeling UML using XML [laughter], ....

None of them defined “model”. None!

Some of them said: “Well, an information model is: You list this, you list this, you make a list of that, and you spell them this way and you put a colon here” — that is, they effectively said, “A model is the thing I’m about to tell you how to write down.”

[Politely tenacious] I persist in the notion that somewhere, somebody, some native speaker of this putative language that I don’t fully understand has an intuitive grasp of some notion of model that is more specific than just “You don’t have it” and somewhat more general than the specifics of a particular notation for writing down models in.

But it has been hard for me to figure out what they’re talking about. So, following the advice of Wittgenstein, who said, “The meaning of a term is precisely its usage,” I have spent a lot of time listening to how people use the term model.

In the usages that I don’t understand, it turns out to be a passcode. A glottal stop. It’s something that I understand and that you don’t. It marks you — me — as the outsider because I don’t understand it.

If I had to paraphrase it, I would be tempted to say, “A model in any context is most often a thing I have and you don’t.” The speaker has it, the addressee lacks it. The complaint is: “You don’t have a model and that’s why I’m not interested in your notation.” [Audience is dead silent] An attempt to arouse model envy. [Tentative laughter]

To be specific, I am conscious of hearing this most often, partly because of where I work, in phrases like “RDF has a model, and XML doesn’t.” This is why some of my colleagues firmly expect, I think even now, that XML will eventually wither and die, and everyone will use RDF. As far as I can understand, the logic runs something like this: “If I want to represent information in XML, I have to decide what things are going to be elements and what things are going to be attributes, and there’s no general rule to tell me; but everything is either an element or an attribute, and the XML spec doesn’t tell me what it means or how to process it. There’s no model. Whereas in RDF, you represent information and there is a model, because everything is either a node or an arc, or a circle or an arrow. Circles are resources, er, sorry, resources or literal values. Arrows are properties, excuse me, properties or relations between literal values or resources. You see how clear this is, [laughter] the advantage of having a model. Everything is circles and arrows.” Can we go home now? [Laughter]
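The contrast can be made concrete with a small Python sketch. Everything in it is an illustrative assumption: the `talk` and `speaker` names are made up, and the `to_triples` reading is one an application would have to supply for itself; it is not defined by the XML spec or by RDF. The sketch shows one fact written two ways in XML (attribute or child element) and the single circle-arrow-circle triple that both come down to under that reading.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML encodings of one fact. XML itself does not say
# which of the two is the "right" one.
AS_ATTRIBUTE = '<talk id="keynote" speaker="Sperberg-McQueen"/>'
AS_ELEMENT = '<talk id="keynote"><speaker>Sperberg-McQueen</speaker></talk>'


def to_triples(xml_text):
    """Apply an application-defined reading (an assumption, not a standard):
    every attribute other than id, and every child element, becomes one
    (subject, property, value) arc."""
    talk = ET.fromstring(xml_text)
    subject = talk.get("id")
    triples = set()
    for name, value in talk.attrib.items():
        if name != "id":
            triples.add((subject, name, value))
    for child in talk:
        triples.add((subject, child.tag, child.text))
    return triples


triples_a = to_triples(AS_ATTRIBUTE)
triples_b = to_triples(AS_ELEMENT)
print(triples_a == triples_b)  # True: both encodings yield the same arc
```

Note where the agreement comes from: the rule in `to_triples`, which somebody had to write down. The element-or-attribute decision does not go away; it moves into the mapping.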

It may be that I was just spoiled as a child in the same way that being a native English speaker trains you not to hear glottal stops. It may be that there’s interference from the time that I studied formal logic, where “model” has a very simple and in some sense uninteresting meaning.

Logical models

A theory, in logic, is a set of sentences. And a set of sentences in a logical notation is a theory. There’s nothing more to it. This is somewhat broader than our normal usage of the term “theory”; we would not be inclined to say that the set of sentences “Tommie is sitting on that chair” and “The earth revolves around the sun” constitutes a theory in a particularly useful sense of the word.

But describing the difference between that set of sentences and the kinds of sets of sentences about which we would feel comfortable using the term “theory” is very difficult and logicians have washed their hands of the matter, and said “Let’s just call any set of sentences a theory.” So a theory is a set of sentences in a logical notation.

An interpretation is a mapping from the terms of that notation to objects of the real world and from the predicate symbols of that notation onto relations in the real world.

A model is an interpretation under which all of the sentences of the theory are true.

If we interpret the term “Tommie” as referring to Tommie Usdin, and the term “that chair” as referring to the chair she is actually sitting on, and if we interpret “sun” and “earth” and “revolves” and so on in the conventional way, then the real world constitutes a model — an interpretation under which all the sentences are true — of the theory that I just emitted.
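These three definitions are small enough to write out as a Python sketch. The notation is made up for the purpose: each sentence is a (predicate symbol, term, term) triple, and the "objects of the real world" are stand-in strings. The function name `is_model` and the chair labels are likewise assumptions of the sketch, not anything from a logic textbook.

```python
# A theory is just a set of sentences. Each sentence here is a
# (predicate symbol, term, term) triple in a made-up notation.
THEORY = {
    ("sits_on", "Tommie", "that_chair"),
    ("revolves_around", "earth", "sun"),
}


def is_model(theory, terms, relations):
    """Return True if the interpretation makes every sentence true.

    terms:     maps each term of the notation to an object in the world
    relations: maps each predicate symbol to a set of (object, object) pairs
    """
    return all(
        (terms[a], terms[b]) in relations[pred]
        for pred, a, b in theory
    )


# The conventional interpretation; strings stand in for the real things.
terms = {
    "Tommie": "Tommie Usdin",
    "that_chair": "chair #1",
    "earth": "planet Earth",
    "sun": "the Sun",
}
relations = {
    "sits_on": {("Tommie Usdin", "chair #1")},
    "revolves_around": {("planet Earth", "the Sun")},
}

# Same world, but "that_chair" is now taken to refer to an empty chair.
other_terms = dict(terms, that_chair="chair #2")

print(is_model(THEORY, terms, relations))        # True
print(is_model(THEORY, other_terms, relations))  # False
```

The second interpretation is still an interpretation; it is just not a model, because one sentence of the theory comes out false under it.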

It’s in this sense of the term “model” that we say Gödel proved that Cantor’s continuum hypothesis is consistent with set theory. He did so by discovering a model of set theory in which the continuum hypothesis is true. Paul Cohen proved that the continuum hypothesis is undecidable in a system containing just the axioms of standard set theory, by finding another model of set theory in which it is false.3

Similarly, mathematicians in the 19th century showed that non-Euclidean geometry is consistent if Euclidean geometry is consistent, by finding a model for non-Euclidean geometry in Euclidean geometry. Within that model, Euclid’s first four postulates hold and the fifth postulate — the one about parallel lines — doesn’t hold. In that light, Bryan Thompson’s work [Thompson 2004] on fitting the query and update of semantic stores into the worldview of HTTP is model-theoretic because he has produced a mapping from the theory (the abstract formalisms of HTTP’s primitive terms) onto things and operations in the real world, namely semantic stores, which is both useful and not the customary mapping.

Notice that different models of the same theory may be in widely divergent or in similar areas. Even when in similar areas, they may be twisted in a very odd way. The model of non-Euclidean geometry in Euclidean geometry doesn’t, of course, map lines to lines and planes to planes; it’s a lot more complicated than that. (Speaking of models that have unexpected relations to each other: Look carefully as you check out because many of the hotel staff are wearing “Overlap Happens” pins.4 Because they said, “Well, we have overlap too!” Hotel work of course is full of overlap. Overlap is of overriding importance and, in particular, avoiding overlap.) [Laughter]

In this understanding of models, consider a model of the solar system. Let’s let Tommie [pointing] be the sun; Syd [laughter] be Mercury; Tonya [Gaylord], you get to be Venus. Tom Passin can be earth. Now if you guys rotate around Tommie in ellipses, then you will constitute a model of the solar system. The logical view of this is: We have a theory of sentences about the solar system — that the sun is in the middle, Mercury rotates around it at a certain distance, Earth at another distance further out, and so forth. Now you have a physical model, in this case constituted by people who may or may not be induced to walk slowly in circles,5 or by a mechanical contraption with little metal balls that rotate. The fact that it is a logical model is given by the fact that the sentences are all true: there is a thing in the middle that’s called the sun, and it’s in the middle, and the others rotate around it. Another model of that theory is given by the actual solar system.

This usage troubles some people. They say: “The model is supposed to be modeling the solar system. I’d be happy to say that the theory is modeling the solar system; but it’s troublesome to say that the solar system is modeling the theory.” It seems relatively clear that this logical sense of the term “model” is not the one that people are talking about when they say, “SGML and XML don’t have a model.”

Other senses of ‘model’

So what is a model?

There are a lot of senses of the word “model.” Sometimes we use the term “model” to refer to a replica or simulacrum: we have model trains, model railroads, model airplanes. There are things that are worthy of emulation: a model student, the very model of a modern major-general. There are samples or examples that may be intended more for inspection than for emulation: a model apartment, or a model house in a housing development. Or a clothes mannequin. For those of you who came hoping that I would be displaying pictures of Claudia Schiffer and Cheryl Tiegs all through the talk, I’m sorry to disappoint you: that’s not the sense of model that we can focus on here.

We also use the term “model” to denote a particular kind or type or form of thing; usually a particular kind of goods offered for sale. It’s interesting to note in this case that the manufacturer only distinguishes models if they make more than one model. The first kind of car that Henry Ford made was not called a model anything. The “Model T” was only called the Model T because there were other kinds of cars that you could get (although not from Ford at that time). The term “model” in that sense applies only if it’s distinct from something else; if it has competition.

In science we frequently refer to models when we mean a discredited theory: “the phlogiston model of fire,” or “the fluid model of heat.” Sometimes we appeal to those models or physicists appeal to those models because they simplify calculation, even though we know them to be false. So mechanical engineers will frequently act as if they thought heat were a fluid, because if you think of it as flowing you understand what kind of things you need to avoid in your mechanical designs.6

Similarly, a model may be a simplification that is made in the interests of simpler calculation. Ideal gases are a simplification. Nobody really thinks they’re billiard balls bouncing around — in particular, the molecules of an ideal gas don’t have colored stripes and numbers on them — but the billiard-ball analogy simplifies our thinking about gas molecules and their behaviors; it provides heuristics. Astronomers calculate using the notion that stars are point masses. But stars aren’t point masses. They’re huge things! It’s just that at the scale you’re working on, you might as well ignore the fact that they have any extent at all. There was a significant change in atomic physics when scientists realized that to understand the neutron experiments that Fermi was doing in the thirties, or the fission experiments that Strassmann and Hahn did in the late thirties, they could no longer think of atomic nuclei as point masses; they had to start thinking about how big they were and how they would behave when hit by particles.

In that context it’s interesting to note that other people use the term “model” for something they don’t want to call a theory, maybe because it’s not as fully developed as they would like or maybe because it has competition. People still refer to Rutherford’s “solar-system model of the atom,” in part because it wasn’t universally accepted — there were other models to choose from, similar to the commercial analogy.

Do these things have anything at all in common?

Models, similarity, analogy, difference

I submit to you that they do. First, note that in almost all of these cases the model has something in common with the thing modeled. You have three things to consider: a model airplane, a real airplane, and a set of propositions that describe similarities. There are two wings, they’re swept back this way, their geometry is kind of like this [gestures]. Or, I have a “model train set,” I have a set of real trains and there’s a set of propositions that describe the similarities. A “train,” whatever that is, consists of a chain of linked cars; is pulled by an engine; runs on tracks; the tracks can have switches that determine whether a train goes onto track A or onto track B; and so on.

Notice in the case of model trains there are material similarities, and part of the fun is to make those material similarities as extensive as possible. So model-train hobbyists go to great lengths to make the switches really work and to make the signal lamps light up at appropriate moments when the train is coming by. They’ll landscape some little buildings, and so forth. Plastic model airplanes of the kind that I made when I was growing up were much less absorbing because they have snap-on parts or you put them together with airplane glue: there was no material similarity between that and rivets in aluminum, which is what you see when you look at a real plane.

It’s tempting to conclude — and I think a lot of people do conclude — that the more similarities there are, the better. A perfect model — the model for which we are striving — is a model about which every statement that we can make that is true about the model is true of the thing being modeled. But this is not so. If I build a model railroad of real wood and real metal, and make switches that really work, that’s cool and absorbing and interesting and fun, if I build it at a scale of 1:87 or 1:48 or whatever scale I use for my models.7 But if you build it at a scale of 1:1, and you have an engine that really pulls, a signal lamp that really lights, and so forth, then I think it’s stretching the word a bit to say that you have a model train. I would say that you have a real train. The only difference between it and other trains is that you built it.

Easy-bake ovens are a challenge. For those of you who don’t remember them, or grew up in a place where they weren’t sold, they’re little toy ovens. You put an electric lightbulb in it; the lightbulb emits the heat; there is a tiny little tray, and there are recipes, and you can bake in it — they really bake. I’m tempted to say it’s not a model oven but a toy. It’s not a model because it’s an oven, it bakes things. It may or may not be one that a chef wants to use, although when the brand turned forty last year, someone did invite a lot of chefs to develop recipes, and published the result as a book [Hoffman 2003].

I infer that in a useful usage of the term “model,” it is not enough for there to be positive analogies; there must be negative analogies as well. Something has to be different.

In practice, it’s not enough that there be something different: useful models are models where the difference is that the model is simpler, or more familiar, or easier to calculate with. One of the reasons I love the name which Anne Brüggemann-Klein and Derick Wood have given to their “caterpillar automaton” [Brüggemann-Klein and Wood 2004] is that it is so suggestive: it provides instantly a model of what the automaton can do: it can inch up and down and over along the tree, and it can sort of lean over and look and say “Oh, I’m at the first child” then it can go back and inch its way down again. So the name already suggests a model of the automaton in a way that I’m afraid the term “XPath” doesn’t quite match.8

In practice, what we seek from a model is that it be simpler or more explanatory, and part of that lies in its heuristic value. There are an infinite number of statements you can make about the thing being modeled or about the model. There is some region of known positive analogy — if something is true of the model, then it’s true of the object being modeled. There is a region of differences, and that’s where the simplifying assumptions come in, which allow us to simplify our calculations — using point masses as idealizations, and so on. Then there’s an indeterminate region of statements about which we don’t know whether they are true in both the model and the thing modeled, or not. Statements in this third region may suggest possible experiments or tests. In natural science, for example, you may have a billiard-ball model of the ideal gas, and you can ask: what happens when molecules collide? We may be able to think of ways to find out what happens, and that’s the way experimentalists work. Models have important heuristic value.9
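The three regions can be sketched as a simple sorting exercise in Python. The statements and the truth values assigned to them below are purely illustrative assumptions (they are there to exercise the code, not to make claims about physics), and `classify` is a hypothetical helper name.

```python
def classify(statements, true_of_model, true_of_target):
    """Sort statements into the three regions described above.

    true_of_model and true_of_target map a statement to True, False,
    or None, where None means "we do not know yet".
    """
    regions = {"positive": [], "negative": [], "undetermined": []}
    for s in statements:
        m, t = true_of_model.get(s), true_of_target.get(s)
        if m is None or t is None:
            regions["undetermined"].append(s)  # a candidate for an experiment
        elif m == t:
            regions["positive"].append(s)      # model and target agree
        else:
            regions["negative"].append(s)      # a known difference
    return regions


# Billiard balls as the model, gas molecules as the thing modeled.
# These truth values are assumptions chosen for illustration only.
statements = ["bounces off walls", "has colored stripes", "deforms on impact"]
billiard_balls = {
    "bounces off walls": True,
    "has colored stripes": True,
    "deforms on impact": False,
}
gas_molecules = {
    "bounces off walls": True,
    "has colored stripes": False,
    "deforms on impact": None,
}

regions = classify(statements, billiard_balls, gas_molecules)
print(regions["undetermined"])  # the statements worth testing
```

The undetermined list is where the heuristic value sits: each entry is a question the model suggests asking of the thing modeled.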

It’s a matter of debate among philosophers of science whether that heuristic value is essential or somehow inessential, so that once you’ve created the theory, the model can be ignored and fall away. In practice it never falls away because you keep using it, if only to explain to students how people came upon the idea of running this or that experiment.

For our purposes, trying to model information, of course we’re not so much interested in models with material similarity; we’re interested in models that have a certain abstraction, in part because what we’re trying to model is abstract. So for us I think the major function of a model is to simplify things. It allows us to focus on the essential; crucially, it allows us — it requires us — to identify the essential. This is perhaps why the notion of having a model and the notion of having a semantics and an understood meaning are so closely allied that they may well be indistinguishable.

The danger of models

Notice that this makes models very dangerous, because they filter. I have hard-wired into my head from my childhood a model of phonology that leads me to filter out the glottal stop so that I don’t hear it unless I’m consciously listening for it. We have models of the way the world works that make us see certain things and not see others.

A story may illustrate how thoroughly our expectations can condition what we see, or think we see. When my uncle Lester was an adolescent he came home one afternoon and found that my grandmother had made several loaves of bread. Now, Uncle Les was an adolescent, which means that he was hungry: he ate several slices of bread, and since it was fresh it was very good, so he ate several more slices. Before he left he’d eaten half a loaf of bread. When my grandmother came home she was very unhappy, and so she scolded him.

Before I proceed with the story, in the interests of full disclosure I should point out that Les firmly denies what I am about to tell you. But the family story as relayed by my father (Les’s brother) says that the next time Les came home and found freshly baked bread cooling in the kitchen, he didn’t eat half a loaf of it. Instead he just took an entire loaf up into his room and put the little tray away so that when his mother came home she didn’t see half a loaf there. It was easier for her to reconcile what she saw with the idea that she had forgotten how many loaves she had actually baked than with the idea that Les could have taken a whole loaf of bread, so she didn’t notice, then or ever again. As I say, my uncle denies it.

Models filter for us; they’re dangerous. Notice that for most pragmatic purposes, what counts as essential varies a lot with our point of view and with our purpose. So I’m suspicious of having “a model” because I doubt that I want one model all the time. I want different models at different times. In any modeling, you don’t get to record everything — this is something all of us who teach SGML and XML have to teach our students: give up on the idea of being exhaustive. Many of our students find the lesson unwelcome. But if it’s going to be a model there will be — there should be — some things it doesn’t capture. Those are the things you don’t care about for these purposes. You have to choose; you get to choose. So among other things that is why it’s essential to bear in mind that, when we’re thinking about processing actual documents and not just processing the markup, you may frequently have to go beyond what’s marked explicitly in the document, as David Dubin and David Birnbaum said the other day [Dubin and Birnbaum 2004] or as Sam Wilmott said this morning [Wilmott 2004]. The markup is not necessarily going to exhaust everything you’re interested in.

Models in SGML and XML

So does SGML have a model? Of course it has a model. But to understand the nature of the model you have to remember the distinction between a document type definition and a document type declaration. A document type definition is the set of rules that govern the application of markup to a particular kind of document. A document type declaration is a particular syntactic form. It contains element declarations, entity declarations, and attribute declarations that capture part, but not all, of the rules that govern the application of markup to that type of document.

When XML made the DTD optional, a lot of people thought it made semantic form optional, but actually it made only the syntactic form of the DTD optional. The semantic form of the DTD it didn’t bother about: it doesn’t make it optional, it doesn’t make it required: nothing XML can do can affect the fact that it will be there. It will be there because if we are using XML in a purposeful activity, we have some set of rules that we’re following. Even if we set out to work without rules — suppose, say, that we decide to write a random-number generator and use it to decide how to tag things in a document — I’m sorry, but using a random-number generator is a form of rule following. In 1996, during the development of the first draft of XML, when it was proposed to make the (syntactic) DTD optional, Charles Goldfarb made the rather apodictic statement that every document has a DTD. A lot of people in the XML Working Group found this very surprising and amusing; there were a lot of jokes about the DTD in the mind of God. Whether there is a DTD in the mind of God or not, there are some rules that are being followed by the people doing the work. Whether everyone using a particular vocabulary is playing by the same set of rules is a different question. One of the reasons practitioners try to write down as much as we can of the document type definition, both the syntactic form and the semantic form, is to reduce unnecessary and pointless variation.

The way SGML models things is pretty much the way you would expect any formal system to model things: you identify things; they have explicit relations; and you have some mapping that tells you that, if this kind of thing occurs in the XML document, it should be interpreted as reflecting this state of affairs in the real world. There is an interpretation that maps from the elements and attributes of the vocabulary to the thing being modeled. “Aha!” say my colleagues in the Semantic Web Activity, “You admit that such an interpretation is necessary. But the XML spec and ISO 8879 don’t give any such interpretation. Q.E.D.: XML and SGML have no model. They have no semantics.”

This is, of course, encouraged by the predilection of some XML enthusiasts to say XML is just syntax, there is no semantics. That’s not a necessary state of affairs for markup languages: troff has a semantics, TeX has a semantics; Scribe has a semantics. They define a model, they have primitive semantic operations, there’s a mapping from the markup to a model. They take a stand, they have a stake in the ground. You can know what any TeX document means because you can reduce it down to primitives. You could write a formal semantics for TeX. (Well, that’s not quite true, no one has ever written a full semantics for TeX because as Donald Knuth says it would be too complicated: the macro system makes it virtually ... maybe not impossible, but so complicated as not to be worthwhile. So for all practical purposes you can’t write a formal semantics for TeX.) But if you could, what does the model of TeX get us? Basically what TeX tells us is that documents are sets of black marks on white paper, or, generalizing somewhat, sets of marks in a color to be determined at print-time on a print substrate of a color to be determined at print-time. That’s the model of document that underlies TeX, and that’s why we don’t use TeX to model topics, or to model the physical organization of books, or to model the logical organization of books. Ditto for Scribe. Ditto for troff.

In 1989 or so I heard Ted Nelson give a talk in which he remarked that Xerox had recently launched an ad campaign in which they said that Xerox would be “the document company.” They had marked the beginning of this campaign by bringing out a line of printers about as big as one of these tables that would do everything: they would accept electronic input, they would print, they would fold, they would staple, eventually they would shrink-wrap, they would probably put the postage on too. Nelson’s summary was quite apt I think: “So: Xerox has decided that documents are marks on paper? [Gestures washing hands] So much for the document company!”10

We don’t have to adopt a set of semantic primitives that restrict our attention to marks on paper. We could strive for a universal semantic substrate. Centuries of visionaries and philosophers have said: “Yes, we can have a universal semantics.” Ramon Llull, John Wilkins, and, perhaps the best known in a long series, Leibniz. A perfect language, as Leibniz envisaged it, would allow us to say true things and only true things. If you say something false it will be because you have committed a syntactic error, in exactly the same way that algebra effectively allows us only to calculate correct results: if you get the wrong answer, it’s because you made a mechanical error. It’s a great vision.11

Centuries of their contemporaries and skeptics have said: “No! There is no universal semantics.” The apocryphal account of Ramon Llull’s death says that Llull died when he perfected his system and set off for Africa to convert the Arabs, who were, however, not persuaded and who understandably treated him as an impertinent infidel. Other sources report that this did not happen, that although he did not succeed in converting the Arabs, he was not executed for his trouble but instead went back to Mallorca and died there.

We are inclined to believe, after Gödel and others, that there is no perfect universal vocabulary. Which is why, in SGML and in XML, the model is not given by the substrate. The model is given by the vocabulary designer. That responsibility is one that we can discharge well or poorly, but it is a fundamental part of the game, and every SGML or XML application has a model whether it is written down or not.

The reason that SGML/XML are structured this way is that there is a model of the world implicit in them — implicit, not explicit in the specs. SGML came out of years of effort by people, many of them involved in publishing, printing, and technical documentation, to improve the way they were able to do their work. In a lot of the rhetoric surrounding the early adoption of SGML you will see discussions of logical markup as opposed to presentational markup; the notion that what you want is a late binding to a rendering style; and then the generalization of those ideas to say: the markup itself is not to be interpreted as imperative markup but as purely declarative markup. The goal is to enable reuse, multiple use of the data, multiple views on the data, and with it, decentralization.

Of course, the specs don’t enforce this model of doing things. They don’t require that we use logical markup. There is no violation in the spec if Terry Catapano, as an analytical bibliographer, says, “I’m not interested in having the logical structure of the document be my primary structure. The objects I want to operate on are pages and gatherings and catchwords.” He can do that (and does so: [Bauman and Catapano 1999]). And everyone who was involved in the early design of SGML is perfectly happy that he can do that. But it’s a slight deviation from their original model: the specs don’t require logical markup, they don’t forbid markup about presentation; they don’t even forbid markup about overlap. They just say: “You define it. You describe it.”

SGML works, has a model, because it encourages us to identify the things we care about and their relations, and to define a mapping from those markup constructs to the world. When vocabularies don’t do that or don’t do that well, we have a more or less explicit sense that they’re not quite right. One reason that many people have said they’re unhappy with the existing RDF XML syntax is that they feel it’s ugly. Why is it ugly? It’s ugly because it has the avowed aim of allowing us to write down what is described as a graph, with a very simple regular structure, and the XML markup proposed does not have such a simple regular structure. The structure of the XML has very little relation to the RDF graph you’re going to get out of it. This is why I’m so very happy to have Jeremy Carroll [Carroll and Stickler 2004] suggesting an XML syntax that’s beautiful by comparison, because: you can see the graph in it! The only thing that was ever beautiful about RDF is the simplicity of the graph model. Why the original working groups wanted to disguise that beauty by giving it an ugly XML syntax I have never fully understood — except, perhaps, that they believed that XML would eventually wither away because, as I said earlier, I think, they don’t believe that it has a model.

Notice, also, that the RDF XML illustrates an interesting point. Although the relationship between the model and the modeled is frequently described as one of isomorphism, I think actually that’s not quite right. I think the relationship is more likely to be a homomorphism, because the structure of the model is not always perfectly reflective of the structure of the thing being modeled. We do tend to believe that those are ugly, but we do tend to produce them from time to time.

And of course, unlike most people who are trying to define data models, we don’t define operations. We don’t define operations, I believe, because the whole point, the initial model for SGML, was to enable reuse. How am I going to enable reuse if I prescribe for you the operations you are allowed to perform? Now, before I leave this topic I should admit: this is in itself a model, an idealization. I am ignoring, for purposes of this discussion (and in order to get this talk to end before four o’clock), the existence of unreliable markup, the existence of idiosyncratic markup, the existence of different schools of usage of the same vocabulary, and I am ignoring tag abuse, all of which complicate the whole idea that the vocabulary designer says what the model is.

Models of SGML and XML

But of course if Eric Miller were here he’d stand up and say, “That’s not what I mean! I mean — what about whitespace? Surely you have to decide which whitespace is significant and not. And what’s the canonical form for an XML document — how do I know whether two XML documents are identical or equal or equivalent for processing purposes? Is SGML a stream of characters or a stream of octets or a tree or a serialization of a tree? Is the whitespace around attributes significant or insignificant?” Pointing Eric or anybody else to the infoset or to any of the other data models is not really an answer because there’s too much variation in the answer. The answer is: “You decide. It depends on what you’re doing.”
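The “You decide” answer can be made concrete with a small sketch. It assumes Python 3.8 or later, whose standard library offers `xml.etree.ElementTree.canonicalize` (an implementation of Canonical XML 2.0); the sample documents and the helper names `same_octets` and `same_canonical` are invented for illustration. Whether two documents count as “the same” depends on the level at which they are compared and on whether whitespace-only text is treated as significant.

```python
from xml.etree.ElementTree import canonicalize

# Three made-up serializations of what a reader would probably call "one" entry.
doc_a = '<entry id="e1" lang="en"><hw>model</hw><sense/></entry>'
doc_b = "<entry  lang='en'   id='e1' ><hw>model</hw><sense></sense></entry>"
doc_c = '<entry id="e1" lang="en">\n  <hw>model</hw>\n  <sense/>\n</entry>'


def same_octets(x, y):
    """Equality at the octet level: any difference at all counts."""
    return x.encode("utf-8") == y.encode("utf-8")


def same_canonical(x, y, strip_text=False):
    """Equality after canonicalization.

    Attribute order, quote style, and whitespace inside tags stop mattering.
    strip_text=True additionally declares leading and trailing whitespace
    in text content insignificant; that choice is the caller's, not XML's.
    """
    return canonicalize(x, strip_text=strip_text) == canonicalize(
        y, strip_text=strip_text
    )


print(same_octets(doc_a, doc_b))                      # differ as octets
print(same_canonical(doc_a, doc_b))                   # same canonical form
print(same_canonical(doc_a, doc_c))                   # indentation is text
print(same_canonical(doc_a, doc_c, strip_text=True))  # unless we say it is not
```

The `strip_text` switch is where the application, rather than the XML spec, says which whitespace matters.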

This isn’t necessarily the best possible answer; and in fact quite frequently, let’s admit it, it’s a royal pain. It would almost certainly have been better to nail down an answer to some of those questions in the SGML spec, even if we said that this is just a reference model: you can do things differently if you have a good reason to.

For some purposes the variations among the Document Object Model and the Infoset and XPath 1.0 and 2.0 make a difference and they make one or the other more convenient for what you’re doing. At those times I’m happy to have the variation. Sometimes I want entity boundaries to be significant and visible and sometimes I don’t. But in many cases it’s just a variation that’s unmotivated, just noise, just a pain.

On the other hand, consider the alternative. SQL doesn’t have this kind of variation; at least I don’t perceive it to, from my vantage point as a user of SQL systems. SQL says very explicitly what’s significant and what’s not; the order of columns is significant in one particular limited way (namely that if you type a star in your select statement you’ll get them back in the order you defined them in), but the order of rows is never significant and you can never bank on it. But that’s it. They tell you exactly what’s significant and what’s not, and you can only get at SQL data through an API.
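As a rough illustration of the two statements about SQL, here is a sketch using Python's built-in `sqlite3` module as a stand-in for “SQL systems” in general. The table name, column names, and rows are made up. It shows that `SELECT *` returns columns in the order in which they were defined, and that a query which cares about row order has to ask for it with `ORDER BY`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (headword TEXT, pos TEXT)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?)",
    [("runway", "n"), ("airstrip", "n"), ("glue", "v")],
)

# Column order is significant in one limited way:
# '*' expands to the columns in definition order.
cur = conn.execute("SELECT * FROM entries")
columns = [d[0] for d in cur.description]

# Row order carries no meaning unless the query requests one,
# so an unordered result is best compared as a set.
unordered = set(cur.fetchall())

# When order matters, the query has to say so.
ordered = conn.execute(
    "SELECT headword FROM entries ORDER BY headword"
).fetchall()

print(columns)
print(ordered)
```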

But I notice an interesting thing: the SQL vendors are scrambling to add XML support to their SQL systems much faster than the XML vendors are scrambling to add SQL support to their editors. I think that’s because users of SQL miss XML, but users of XML don’t particularly miss SQL. SQL provides one level at which you can access things, and SGML and XML provide many.

Frank Tompa used to say, “This is the point: there is no single model. There is no single level; there are several. Always use the weakest form of automaton you can to get done the work that you need.”12 For some purposes an SGML document is just an octet stream: anything that works correctly on an octet stream will do what you want. You do not need a copy program with a push-down automaton to copy SGML or XML documents: any copy that successfully handles octets will do. You can address it alternatively as a character stream; most of us prefer to do that. You can address it as a regular language in which everything is either a tag or content and you don’t care about the relations between tags. You can view it as a bracketed language [Ginsburg/Harrison 1967]. You can view it as denoting the tree structure that is induced over the bracketed language. You can view it as defining a constrained tree structure — if you put a validator into your processor. You can pay attention to pointers — IDs and IDREFS — and say it defines a directed graph structure. I can validate the directed graph structure. I can build an application data structure on top of it. You can address XML at any of those levels, and that’s important.
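The ladder of levels described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample document is invented, the attribute names `id` and `ref` are hypothetical stand-ins for declared ID and IDREF attributes, and the regular-expression tokenizer is a deliberate simplification that ignores comments, CDATA sections, and `>` inside attribute values.

```python
import re
import xml.etree.ElementTree as ET

doc = (
    '<dict><entry id="a"><hw>runway</hw><xr ref="b"/></entry>'
    '<entry id="b"><hw>airstrip</hw><xr ref="a"/></entry></dict>'
)

# Level 1: an octet stream. Copying needs no knowledge of markup at all.
octets = doc.encode("utf-8")
copy = bytes(octets)

# Level 2: a character stream.
chars = octets.decode("utf-8")

# Level 3: a regular language. Every token is a tag or content;
# relations between tags are ignored, so no stack is needed.
tokens = re.findall(r"<[^>]+>|[^<]+", chars)


# Level 4: a bracketed language. Checking that tags pair up needs a stack.
def balanced(tokens):
    stack = []
    for t in tokens:
        if not t.startswith("<") or t.endswith("/>"):
            continue  # content and empty-element tags do not affect nesting
        if t.startswith("</"):
            if not stack or stack.pop() != t[2:-1].strip():
                return False
        else:
            stack.append(t[1:-1].split()[0])
    return not stack


# Level 5: the tree induced over the bracketed language.
root = ET.fromstring(chars)

# Level 6: a directed graph, following the pointer-like attributes.
edges = [
    (entry.get("id"), xr.get("ref"))
    for entry in root.iter("entry")
    for xr in entry.iter("xr")
]

print(len(tokens), balanced(tokens), root.tag, edges)
```

Each level uses only as much machinery as it needs: the copy never tokenizes, the tokenizer never nests, and only the last two steps build a structure.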

One of the programs that came out of the Centre for the New Oxford English Dictionary that Frank Tompa directed was a browser called lector [Raymond 1990]. Many of you will know it. One of the things about lector that used to drive me crazy was that it paid no attention to the nesting of elements. It dealt with tags, not with elements. This troubled me. I argued about it with Frank Tompa, and he said, “Why do you want to deal with elements instead of tags?” I said “Because there’s nesting in it; that’s the nature of SGML!” He said, “Yeah, there is. On the other hand, you can write a browser that doesn’t use it, and because we do not have to have a push-down stack it runs a lot faster.” That’s true: lector ran like a bat out of hell, whereas the contemporary SGML document processors, em, didn’t run like bats out of hell. [Laughter] And lector could start anywhere. As Liam Quin said the other day, “Sometimes it’s nice to be able to start in the middle instead of having to process a huge document starting at the beginning.” One of the key applications of lector was to put it together with a pre-existing search system that searched blocks of files; when it found what you were looking for, it would just hand that block to lector, and lector would start scanning. And it didn’t care about context: it had a macro for every start tag and every end tag, and when it hit an “</etymology>” it said “go to roman.” Since etymology always occurred in a roman context, that was the right thing to do. If you had an element that occurred in more than one context, sometimes you would get implausible formatting. But those who used lector quickly learned to write styles that didn’t have that characteristic, over time.
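The tag-driven approach can be sketched as follows. This is not lector's actual implementation or style language, only a minimal illustration of the idea as described here: one action per start tag and per end tag, no stack, and no knowledge of context. The tag names other than `etymology`, the bracketed font codes, and the function name `format_tags` are all hypothetical.

```python
import re

# Hypothetical style table: one "macro" per start tag and per end tag.
MACROS = {
    "<hw>": "[bold]",
    "</hw>": "[roman]",
    "<etymology>": "[italic]",
    "</etymology>": "[roman]",  # "go to roman", whatever the context was
    "<quote>": "[italic]",
    "</quote>": "[roman]",
}


def format_tags(text):
    """Format a block of tagged text without tracking element nesting."""
    out = []
    for token in re.findall(r"<[^>]+>|[^<]+", text):
        if token.startswith("<"):
            out.append(MACROS.get(token, ""))  # unknown tags are dropped
        else:
            out.append(token)
    return "".join(out)


# Starting in the middle of an element is no problem: no stack to rebuild.
print(format_tags("from Latin</etymology> a strip"))

# An element used in a non-roman context shows the weakness: the word
# "above" comes out roman although it is still inside the quotation.
print(format_tags("<quote>see <etymology>Lat.</etymology> above</quote>"))
```

The second call shows the kind of implausible formatting mentioned above, which a style table avoids only if each element's end-tag action is right for every context the element occurs in.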

“Never use a more powerful mechanism than you have to” — surely that’s one of the most important lessons we can learn from our colleagues in computer science.

So what are the answers to the question, “Does SGML have a model?” The first answer is: Yes. It has many models. It has too many models; sorry; deal with it. The second answer, also true, is: No, SGML has no model at all. That is precisely as it should be, because it is not SGML or XML which has the model, it is the applications that have models. That is not only as it should be but as it must be. The model is not defined by ISO or W3C; it is not controlled by members of some working group; it is controlled by you. The data are owned not by software vendors, and not by ISO or W3C, but by you.

You have the responsibility; you have the authority. Go forth and become models of clarity, exemplars of the task of information modeling.

[Wild applause]



The reference is to the “Listener” ribbons surreptitiously distributed by my co-chairs Tommie Usdin and Debbie Lapeyre; the ribbons were inspired by my colleague Liam Quin’s remark “They have ribbons for speaking and for chairing sessions, but they don’t seem to have any ribbon for the most important thing I expect to do at this conference!”


The “grove” is an elaborate graph model for SGML documents (and other data) defined in ISO/IEC 10744 (HyTime) [ISO/IEC 1992], or more precisely in a draft revision of HyTime which may or may not ever have become an international standard [ISO/IEC 1996]. For SGML, “the spec” is of course ISO 8879 [ISO 1986], for XML [W3C 2004a]. The “information set” is defined by [W3C 2004b], the “document object model” by [W3C 2000] and its successors, and of course the two XPath models are made explicit in [W3C 1999] and [W3C 2004c].


The continuum hypothesis is discussed in daunting detail by a number of sites on the World Wide Web; a search for the keywords “Cantor’s continuum hypothesis Cohen’s theorem” will find many of them. One of the more accessible brief treatments may be that of [Jech 2002].


The reference is to the pins distributed at the conference by Patrick Durusau and Steve DeRose, which read “Overlhappens.”


They were not induced to walk slowly in circles, so the model had a very restricted similarity to the solar system.


The discussion in this and many of the following paragraphs follows in essential details points raised in [Hesse 1967].


My colleague Alan Kotok tells me HO scale (1:87) is the most common scale for modelers, but others (N, or 1:160; O, or 1:48; S, or 1:64) are also used.


Another reason that I love the term “caterpillar automaton”, completely unrelated to its suggestiveness as a model, is that it elicited from Debbie Lapeyre the quotation from Ogden Nash: “I find among the works of Schiller / No mention of the caterpillar.”


In a way, the heuristic use of models may be partly responsible for the discovery of nuclear fission. When seeking to understand Strassmann and Hahn’s results, Lise Meitner and Otto Frisch recalled Niels Bohr’s insistence that nuclei aren’t solid, but are more like drops of liquid. They visualized the process of fission by thinking of a large nucleus as resembling a large and rather wobbly droplet of liquid in which the forces holding it together (the strong nuclear force on the one hand, surface tension on the other) are just barely stronger than the forces seeking to tear it apart. It is true, however, that the droplet model does not seem to have provided any help in simplifying the necessary calculations. There are many treatments of the physics of the period accessible to non-physicists; among those which I have enjoyed the most are [Segrè 1970], [Frisch 1979], [Rhodes 1986], [Gamow 1966], and [Bernstein 2004].


About a month after this talk, The New York Times ran a story explaining that Xerox is now dropping “The Document Company” as its identifying tagline in advertising because, well, they want to be seen as being about something more than marks on paper. You can read the story at


A good source of information on attempts to create sets of universal semantic primitives is [Eco 1993].


This point of view was conveyed most strongly in personal communications, but it is echoed to some extent in [Tompa 1989], particularly section 4 “Models of tagged text”. The article is also interesting for its clear discussion of the comparative advantages and disadvantages of inline and standoff markup.


[Bauman and Catapano 1999] Bauman, Syd, and Terry Catapano. “TEI and the encoding of the physical structure of books.” Computers and the Humanities 33 (1999): 113-127.

[Bernstein 2004] Bernstein, Jeremy. Oppenheimer: Portrait of an enigma. Chicago: Ivan R. Dee, 2004.

[Brüggemann-Klein and Wood 2004] Brüggemann-Klein, Anne, and Derick Wood. “Balanced context-free grammars, hedge grammars and pushdown caterpillar automata.” Paper at Extreme Markup Languages 2004. Montréal, August 2004. ../Bruggemann-Klein01/EML2004Bruggemann-Klein01.xml

[Carroll and Stickler 2004] Carroll, Jeremy J., and Patrick Stickler. “RDF triples in XML.” Paper at Extreme Markup Languages 2004. Montréal, August 2004. ../Stickler01/EML2004Stickler01.html

[Dubin and Birnbaum 2004] Dubin, David, and David Birnbaum. “Interpretation beyond markup.” Paper at Extreme Markup Languages 2004. Montréal, August 2004. ../Dubin01/EML2004Dubin01.html

[Eco 1993] Eco, Umberto. Ricerca della lingua perfetta nella cultura europea. Translated by James Fentress as The search for the perfect language. London: HarperCollins, 1995.

[Frisch 1979] Frisch, Otto R. What little I remember. Cambridge: Cambridge University Press, 1979; rpt. 1991.

[Gamow 1966] Gamow, George. Thirty years that shook physics: The story of quantum theory. New York: Doubleday, 1966; rpt. New York: Dover, 1985.

[Ginsburg/Harrison 1967] Ginsburg, S., and M. A. Harrison. “Bracketed context-free languages.” Journal of computer and system sciences 1.1 (1967): 1-23.

[Hesse 1967] Hesse, Mary. “Models and analogy in science.” In The encyclopedia of philosophy, ed. Paul Edwards. New York: Macmillan, Free Press; London: Collier, 1967. 5: 354-359.

[Hoffman 2003] David Hoffman. The Easy-Bake Oven Gourmet. Philadelphia: Running Press, 2003. ISBN: 0762414405.

[ISO 1986] International Organization for Standardization (ISO). 1986. ISO 8879-1986 (E). Information processing — Text and Office Systems — Standard Generalized Markup Language (SGML). International Organization for Standardization, Geneva, 1986.

[ISO/IEC 1992] International Organization for Standardization (ISO); International Electrotechnical Commission (IEC). 1992. ISO/IEC 10744:1992 (E). Information technology — Hypermedia / Time-based Structuring Language (HyTime). International Organization for Standardization, Geneva, 1992.

[ISO/IEC 1996] International Organization for Standardization (ISO); International Electrotechnical Commission (IEC). 1996. [Draft] Corrected HyTime Standard ISO/IEC 10744:1992 (E). [n.p.]: Prepared by W. Eliot Kimber for Charles F. Goldfarb, Editor, 13 November 1996.

[Jech 2002] Jech, Thomas. 2002. “Set Theory.” In The Stanford Encyclopedia of Philosophy (Fall 2002 Edition), ed. Edward N. Zalta.

[Raymond 1990] Raymond, Darrell R. lector — An interactive formatter for tagged text. Research report CS-90-34. Waterloo, Ont.: University of Waterloo Department of Computer Science, September 1990.

[Rhodes 1986] Rhodes, Richard. The making of the atomic bomb. New York: Simon and Schuster, 1986.

[Segrè 1970] Segrè, Emilio. Enrico Fermi: Physicist. Chicago, London: University of Chicago Press, 1970.

[Thompson 2004] Thompson, Bryan, Graham Moore, Bijan Parsia, and Bradley R. Bebee. “Scalable document-centric addressing of semantic stores using the XPointer Framework and the REST architectural style.” Paper at Extreme Markup Languages 2004. Montréal, August 2004. ../Thompson01/EML2004Thompson01.html

[Tompa 1989] Tompa, Frank Wm. “What is (tagged) text?” In Dictionaries in the Electronic Age. Proceedings of the fifth annual conference of the UW Centre for the New Oxford English Dictionary, 18-19 September 1989, Oxford, England, pp. 81-93.

[W3C 1999] World Wide Web Consortium (W3C). XML Path Language (XPath) Version 1.0, ed. James Clark and Steve DeRose. W3C Recommendation 16 November 1999. Published by the World Wide Web Consortium, November 1999.

[W3C 2000] World Wide Web Consortium (W3C). Document Object Model (DOM) level 1 specification. W3C Recommendation. Published by the World Wide Web Consortium, September 2000.

[W3C 2004a] World Wide Web Consortium (W3C). Extensible Markup Language (XML) 1.0 (Third Edition), ed. Tim Bray, Jean Paoli, C. M. Sperberg-McQueen, Eve Maler (Second Edition), François Yergeau (Third Edition). W3C Recommendation 4 February 2004. Published by the World Wide Web Consortium.

[W3C 2004b] World Wide Web Consortium (W3C). XML Information Set (Second Edition), ed. John Cowan and Richard Tobin. W3C Recommendation 4 February 2004. Published by the World Wide Web Consortium.

[W3C 2004c] World Wide Web Consortium (W3C). XQuery 1.0 and XPath 2.0 Data Model, ed. Mary Fernández et al. W3C Working Draft 23 July 2004. Published by the World Wide Web Consortium, 2004.

[Wilmott 2004] Wilmott, Sam. “All about pattern matching.” Paper at Extreme Markup Languages 2004. Montréal, August 2004. ../Wilmott01/EML2004Wilmott01.html
