The FRBR [Functional Requirements for Bibliographic Records], released by the International Federation of Library Associations and Institutions in 1998, generalizes and refines current practices and theory in library cataloging, presenting a compelling natural ontology of entities, attributes, and relationships for representing the “bibliographic universe”. The FRBR framework is extremely influential and increasingly accepted as a conceptual foundation for cataloging practice and technology in libraries and elsewhere. XML documents, as defined in the W3C XML 1.0 specification, are now an important part of this bibliographic universe, and it is natural to ask to which of FRBR’s “Group 1” entities the XML document corresponds. Curiously, there seem to be conflicting arguments for assigning the XML document to either of the two plausible entity categories: manifestation and expression. We believe these difficulties illuminate both the nature of the FRBR entities and the nature of markup. We explore a conjecture that an XML document has a double aspect and that whether it is a FRBR manifestation or a FRBR expression depends upon context and intention. Such a double-aspected nature would not only be consistent with previous arguments that the meaning of XML markup varies in “illocutionary force” according to context of use, but might also help resolve an old puzzle in the humanities computing community as to whether markup is “part of” the text [buzzetti02]. However, there are alternative resolutions to explore as well, and we still seem to be some distance from a full understanding of the issues.
In 1998, the IFLA [International Federation of Library Associations and Institutions] released FRBR [Functional Requirements for Bibliographic Records] [frbr98] [leboeuf03]. Using an informal entity-relationship approach, FRBR elaborated a framework of entities, attributes, and relationships for representing what library catalogers have called the “bibliographic universe”. This framework refines and extends important cataloging concepts and codifies an emerging theoretical consensus within the cataloging community.
The FRBR framework has been found natural and compelling and is increasingly reflected in cataloging practices and technology in libraries and elsewhere: international bibliographic databases (such as WorldCat) and software systems (such as Endeavor) are being “FRBRized”, and the bible of library cataloging, the AACR [Anglo-American Cataloguing Rules], will be revised to reflect the FRBR framework.
XML documents (as defined in the W3C XML 1.0 specification) are now an important part of the bibliographic domain that FRBR is describing, and so it is natural to ask: to which FRBR entity does the XML document correspond?
In what follows, we argue that identifying the XML document with either of the two plausible candidates — the FRBR entities expression and manifestation — is problematic: there are plausible arguments on each side. We think that these difficulties are illuminating, not only revealing difficulties in applying FRBR, but improving our understanding of what XML documents actually are, semantically speaking.
We explore a possible resolution to the conflict that represents XML documents as having a “double aspect”. On this account, whether an XML document is a FRBR manifestation or a FRBR expression depends upon the specific context or use and the intentions of relevant persons (such as writers, readers, and editors) and cannot be determined by any intrinsic properties of the XML document itself. Such a double-aspected nature would be nicely consistent with previous arguments that XML markup varies in “illocutionary force” according to context of use [renear01]. It might also help resolve an old (and recently renewed) puzzle in the humanities computing community as to whether markup is “part of” the text. Finally, this resolution appears largely consistent with the general approach of the BECHAMEL XML Semantics Project, which locates the text as a conceptual entity independent of the syntactic XML document [renear02].
We are not entirely satisfied with this approach, however. There are a number of subtleties and complexities to explore in the application of FRBR to XML documents and we seem to still be some distance from a full understanding of the issues.
We emphasize that the discussions that follow are preliminary and tentative. We particularly note that considerable discussion of the FRBR expression and manifestation entities is underway within the cataloging community, and that we have not yet been able to take it fully into account.
FRBR recognizes three groups of entities and identifies attributes that characterize entities and binary relationships that hold between entities. We are concerned here with the entities in “Group 1”: work, expression, manifestation, and item.
The term “work” is already in common use in more or less the sense suggested above; “expression” seems roughly equivalent to the colloquial use of the word “text” (or “version”); manifestation corresponds closely (and intentionally) to “edition”; and “item” is roughly synonymous with “copy”. (Obviously many common English words, such as “book”, are ambiguous between two or more FRBR categories.)
As suggested by the definitions, there are relations that hold between the three adjacent pairs of Group 1 entities. A work is “realized through” an expression; an expression is “embodied in” a manifestation; and a manifestation is “exemplified by” an item.
Some relevant clarification of each entity may be obtained by considering the attributes and relationships it properly has, and the sort of variation it may undergo without loss of identity. Works have as attributes such things as authors, titles, and genres (play, biography, sonata, etc.), but not, for instance, language: translations are considered different expressions of the same work. Expressions do have their notational form as an essential characteristic — French and German translations, as noted, are different expressions, even when they realize the same work. However, expressions do not have as attributes such things as typeface and type size, or collation (using printed books as an example); differences in typeface, type size, or collation would imply differences in manifestation, and different manifestations may embody the same expression. An item may have an exhibition history, or a mark, such as a handwritten inscription, but although it is an exemplar of a manifestation it does not, strictly speaking, have a typeface. An item may of course be said to have a typeface in a derivative sense, in virtue of exemplifying a manifestation which has a typeface.
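As an illustration only, the placement of attributes described above can be sketched as a small data model. The class and field names below are our own hypothetical choices, not part of FRBR, and the sample book is invented; the sketch simply records that language belongs to the expression, typeface to the manifestation, and that an item has a typeface only derivatively.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Work:
    # Works carry authors, titles, and genres, but no language.
    title: str
    author: str
    genre: str


@dataclass
class Expression:
    # An expression "realizes" a work; notational form is essential here.
    realizes: Work
    language: str


@dataclass
class Manifestation:
    # A manifestation "embodies" an expression; typeface lives at this level.
    embodies: Expression
    typeface: str
    type_size_pt: int


@dataclass
class Item:
    # An item "exemplifies" a manifestation and may carry copy-specific marks.
    exemplifies: Manifestation
    inscription: Optional[str] = None

    @property
    def typeface(self) -> str:
        # Derivative sense only: read through to the manifestation.
        return self.exemplifies.typeface


# Hypothetical sample data, invented for this sketch.
whales = Work(title="A Book About Whales", author="A. Author", genre="treatise")
french = Expression(realizes=whales, language="fr")
german = Expression(realizes=whales, language="de")
edition = Manifestation(embodies=french, typeface="Garamond", type_size_pt=10)
copy = Item(exemplifies=edition, inscription="Handwritten dedication")
```

In this sketch the two translations are distinct expressions that share one work, and asking the copy for its typeface succeeds only by way of the manifestation it exemplifies.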
Where should the “XML Document” be placed in this framework?
Informally an XML document is generally understood to be a combination of text (or other data content) and XML markup. Formally, the XML 1.0 specification characterizes an XML document with these definitions: [xml99]
A data object is an XML document if it is well-formed as defined in this specification …
[Definition : A textual object is a well-formed XML document if:]
1. Taken as a whole, it matches the production labeled document.
2. It meets all the well-formedness constraints given in this specification.
3. Each of the parsed entities which is referenced directly or indirectly within the document is well-formed.
Here document is a production of the grammar that the specification gives in an extended Backus-Naur Form notation.
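The definition is operational enough to be tested mechanically. The following minimal sketch uses the non-validating parser in Python's standard library as a stand-in for the grammar and its well-formedness constraints; the function name is hypothetical, and the parser's handling of external entities is limited, so this only approximates the third clause of the definition.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts `text` as a well-formed XML document."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        # Any violation the parser detects is reported as a ParseError.
        return False
    return True


samples = [
    "<p>Whales are mammals.</p>",           # a single, properly nested root
    "<p>Whales are <hi>mammals.</p></hi>",  # improperly nested tags
    "<p>one</p><p>two</p>",                 # two root elements
]
results = [is_well_formed(s) for s in samples]
```

Only the first sample is accepted. Note that nothing in this test consults typeface, carrier, or any other feature of a physical rendering.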
It is easy to see that an XML document is not a FRBR item, which is a concrete physical object, such as a physical book. And it also seems unlikely that an XML document is, itself, a FRBR work, an intellectual or artistic creation that exists independently of any particular symbolic realization. That leaves two possibilities: expression and manifestation.
FRBR characterizes an expression as “the intellectual or artistic realization of a work in the form of alphanumeric, musical, or choreographic notation …”, noting that two translations (say, French and German) are different expressions of the same work. As something defined by a formal grammar, an XML document would immediately appear to be a straightforward, even exemplary, notational entity, similar to a natural language in fundamental nature, and therefore a FRBR expression.
To confirm that XML documents are not manifestations we note that manifestation-specific features such as typeface and carrier material are not properties of any XML document per se. An XML document may be rendered in different typefaces, but it does not, itself, have any particular typeface.
On the other hand, the markup of an XML document can also be understood as functioning exactly like rendering events (font shifts, type size changes, vertical and horizontal whitespace, etc.) to effect, expedite, and disambiguate the recognition (whether by humans or computer) of the underlying textual objects. This would make the XML document seem more like a manifestation that embodies an expression; an expression that might in fact be embodied differently — for instance, with orthographically different, but semantically equivalent XML markup, or with traditional layout devices.
What about the observation that an XML document does not itself have a number of attributes that are associated with manifestations, such as typeface, collation and so on? This can be given an alternative explanation: the XML document need not be a completely determinate manifestation, but rather a class of manifestations, or a part of a manifestation [doerr03].
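One narrow reading of “orthographically different, but semantically equivalent XML markup” can be made concrete with XML canonicalization. The sketch below is ours and rests on an assumption: that sharing a canonical form is a reasonable proxy for equivalence. The argument above may well intend something broader, such as different element vocabularies. The sample strings are invented, and `canonicalize` requires Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

# Two character strings that differ in quoting, in whitespace inside tags,
# and in empty-element syntax.
doc_a = '<p rend="indent">Whales are mammals.<lb/></p>'
doc_b = "<p   rend='indent' >Whales are mammals.<lb></lb></p >"

# A third string that uses a different element name.
doc_c = '<para rend="indent">Whales are mammals.<lb/></para>'


def same_canonical_form(x: str, y: str) -> bool:
    """Compare two XML documents after canonicalization (C14N 2.0)."""
    return ET.canonicalize(xml_data=x) == ET.canonicalize(xml_data=y)
```

Under this proxy, `doc_a` and `doc_b` are different strings with the same canonical form, while `doc_c` is not equivalent to either, even though a reader might take all three to cue the same paragraph.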
How can we explain these conflicting arguments? Here are the possibilities.
In what follows we explore a particular solution of the fourth kind: the concept of an XML Document does not univocally identify something which is an appropriate candidate for identification with a FRBR entity. More specifically, we argue that a particular XML document can only be assigned to one of the two candidate FRBR entity classes (manifestation or expression) with respect to certain features of its context or intended use. Apart from such a context, XML documents have a “dual aspect”: understood from one perspective an XML document is a manifestation of an expression; understood from another it is itself an expression, and specifically a “second-order” expression, that is, an expression that realizes a work that is itself about an expression, namely the very expression the XML document would embody if interpreted as a manifestation.
First we describe a context in which an XML document is a manifestation.
Oversimplifying a bit and wherever possible remaining neutral on philosophical topics, one may say that a work about, say, whales, consists of a body of abstract intentional acts: assertions about whales, questions about whales, warnings about whales, and so on, arranged in a rhetorical structure. When an author writes a book about whales, she creates an expression the meaning of which is this particular structure of intentional acts. She accomplishes this by creating an exemplar of a manifestation (whether we mean the authorial manuscript or the printed book doesn’t matter at this point) that in a reader will effect the apprehension of the expression that in turn has as its meaning the body of intentional objects which is the work about whales. Generally, such a manifestation will have, as well as the marks indicating linguistic characters, specific features that make the apprehension of the embodied expression reliable and efficient. In familiar printed books, these features are, most importantly, graphic devices such as changes in horizontal and vertical spacing, font shifts, color, and so on.
XML markup may also perform the same function as these graphic devices, making the apprehension and cognitive navigation of an expression efficient and unambiguous. Consider for instance an authorial manuscript about whales, prepared in an XML element set, such as TEI Lite. The author uses the markup to indicate the identities and boundaries of textual components, anticipating that subsequent processing by XML/TEI software will be able to follow these cues and process the (digital) manuscript appropriately. Here it would be a computational agent that would recognize the expression being cued by the software, but if either the author or her collaborators read the unformatted text they will also be using the markup similarly as cues to the presence and identity of textual objects. In this case, XML markup for, say, a paragraph, seems to be functioning exactly like extra vertical leading or an initial em space.
What is important to note in this scenario is the obvious fact that the author is creating a book about whales, and not about paragraphs, titles, lists, and such. That is, the author is not using markup to describe an expression, and in particular not using markup to describe the expression that realizes a work about whales. If she were, she would be creating another work, on another subject: a work not about whales, but about an expression. But that is not what is happening. The author is not using markup as part of a secondary expression that asserts things about the expression that realizes the work about whales, but rather as a paralinguistic device to help effect the apprehension of the expression which is the text about whales, ensuring that that expression is easily and correctly grasped, whether by human or computational agents.
Now, to be sure, what the whaling specialist accomplishes in this act of writing a book about whales does indeed license inferences about chapters, paragraphs, and lists. But that is not the same thing as making assertions about chapters, paragraphs, and lists. The distinction is subtle, but important, and we will return to it later.
Now we describe a context in which an XML document is an expression.
Consider the situation where a scholarly textual editor is preparing, using the TEI XML vocabulary, a new critical text of an important cultural work. To elicit the right intuitions let this be an intellectually ambitious and controversial edition, prepared using the full variety of markup that TEI P5 provides for critical editions. The text of the culturally important work may well be about whales, but our scholarly editor is probably not a whaling specialist, and, in any case, is not herself creating or modifying a work about whales. The editor is rather creating a work about an expression that realizes a work about whales. To do this, the editor creates an expression that realizes the work that is about the expression that realizes the work about whales. The editor’s own creative achievement, we note, is not zoological; it is philological.
Looking at the phenomenology of the situation more closely, we imagine that the editor or transcriber puzzles over the pages of a physical document and, drawing on her erudition, her knowledge of the relevant history, graphic vocabulary, the content and context of the specific textual item, and so on, comes to a conclusion: “<p>”. Here “<p>” itself is a bit of notation that in context is clearly being used to express an assertion about an expression. The XML document in this context is functioning as an expression, but an expression that realizes a work about an expression.
Note that in the situation just described, a scholarly editor wishing to proof-read her work might create a convenient “pretty-printed” manifestation, with type style changes, color, whitespace, and the like foregrounding the XML/TEI markup so that the expression (the XML document) can be easily grasped. This practice is in fact common in literary encoding projects, where the XML document itself must be carefully proof-read. It is natural when the XML document itself is taken as an expression, but it contrasts with the proof-reading practice common in authorial contexts. There the proofing copy would be rendered not with visible XML markup, but with the usual graphic devices, which are more familiar and more cognitively efficient for recognizing, not the XML document (which in the authorial situation is a manifestation), but rather the expression which the XML document embodies.
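The contrast between the two proof-reading practices can be sketched in a few lines. Everything here is hypothetical and simplified for illustration: the element names only resemble TEI, the function names are ours, and the "graphic devices" are reduced to capitalization, a blank line, and an indent.

```python
import xml.etree.ElementTree as ET

# A toy source document; not a valid TEI Lite instance.
SOURCE = (
    "<text><head>Whales</head>"
    "<p>Whales are mammals.</p><p>They breathe air.</p></text>"
)


def render_authorial_proof(xml_text: str) -> str:
    """Authorial context: markup is replaced by familiar graphic devices."""
    root = ET.fromstring(xml_text)
    lines = []
    for child in root:
        content = "".join(child.itertext())
        if child.tag == "head":
            lines.append(content.upper())  # capitals stand in for a type shift
            lines.append("")               # blank line stands in for leading
        elif child.tag == "p":
            lines.append("    " + content)  # indent stands in for an em space
    return "\n".join(lines)


def render_editorial_proof(xml_text: str) -> str:
    """Editorial context: the markup itself is foregrounded for checking."""
    root = ET.fromstring(xml_text)
    lines = ["<" + root.tag + ">"]
    for child in root:
        lines.append("  " + ET.tostring(child, encoding="unicode"))
    lines.append("</" + root.tag + ">")
    return "\n".join(lines)


authorial = render_authorial_proof(SOURCE)
editorial = render_editorial_proof(SOURCE)
```

Both renderings start from the same character string. In the first the tags disappear into layout, as one would expect if the XML document were a manifestation; in the second the tags are what the reader is meant to check, as one would expect if the XML document were itself the expression.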
The above account is consistent with a criticism, presented at Extreme Markup 2000 by one of the authors of this article, of the traditional classification of markup as either descriptive or procedural [renear01]. There it was argued that the traditional classification conflates two dimensions, domain and mood. While the use of markup (say “<title>”) by a transcriber/editor does indeed assert that some bit of text is a title, the use of the same markup by an author does not assert that something is a title, but rather is a performative act that creates a title. The situation is similar to the difference, identified by the ordinary language philosopher John Austin, between, on the one hand, describing someone as having promised and, on the other, saying in the first person present tense, “I promise …”. The former case is a simple assertion that a promise was made; it purports to describe the world and it may be true or it may be false. But the latter case, the statement “I promise …”, is not an assertion at all. It does not purport to describe the world, and it is not appropriately characterized as “true” or “false”. Most importantly, unlike a simple assertion that a promise was made, the statement “I promise …” actually effects a promise. Authorial markup, according to this analysis, is performative; it actually creates the text rather than simply describing one.
Similarly, just as we cannot correctly classify “<title>” as to markup category without knowing the context of its use, so we cannot correctly classify an XML Document as an expression or manifestation without knowing its context as well. And the aspects of context that are relevant are the same in each case.
These differences between the two sorts of markup, and the two aspects of an XML document, have gone unnoticed up until now for several reasons.
Now we’ll borrow another thought from Austin. He says, somewhere, “… there’s the part where you say it … and the part where you take it back.”
Consistent with the Extreme “Breaking News” genre, these are, as we have said earlier, only the most preliminary and tentative thoughts on this topic. Although we find the analysis above compelling and illuminating, we also find it weak and awkward in places, as you no doubt do as well, and rather suspect that in the end it must be substantially modified, or even relinquished entirely. But rather than pursue those problems here, we’d like to take a look at some considerations that suggest alternative approaches to the problem.
We first observe that the FRBR Group 1 entities conflate various levels of abstraction, even in the (ostensibly prototypical) case of paper books. Since translations from one language to another are deemed expressions of the same work, it would appear reasonable to identify the author’s ideas at the “work” level, her words at the “expression” level, symbols representing those words at the “manifestation” level, and physical tokens (i.e., matter or energy) patterned to encode those symbols at the “item” level. Since a manifestation physically “embodies” the expression, every physical detail of the pattern or arrangement would seem to be part of the manifestation. This interpretation is consistent with the placement of typeface and line width attributes at the manifestation level.
Transforming the mapping of symbolic to physical properties (as, for example, by transmission over a network or copying from magnetic to optical media) creates a new manifestation. In addition, attributes assigned or selected to accommodate a particular physical medium are evidently part of the manifestation; these include line widths in the case of text and display resolution in the case of images or letterforms. So far, so good, but many levels of abstraction remain to be accounted for, and finding a clear distinction between the expression and manifestation levels will prove challenging when symbolic to physical mappings are themselves symbolically encoded. The significance of this challenge speaks to the earlier observation that XML markup and formatting instructions play closely associated roles in identifying textual objects.
A work may be realized as an expression in any of a variety of forms: notation, movement, image, sound, or object. These examples suggest a tangibility similar to the manifestation level, except that the expression level relates to the content of a work, not the physical properties through which the content is symbolically represented. A system of symbols or a language may be mapped to physical properties in any number of ways, each creating a different manifestation of the expression, but any variation in how the work itself is symbolized creates a different expression.
Now suppose the first step of re-expressing a book into digital form is to translate it into Azeri. This is only the first step in arriving at a new expression:
And so on. The distinctions that separate structural markup from other approaches to documentation (e.g., specification of structure vs. specification of formatting) are not the same distinctions that motivated the classification of FRBR group 1 entities (e.g., symbol systems vs. the patterns that encode them) and so finding a comfortable place for the former among the latter proves challenging.
Further complicating the abstraction level issues (e.g., the manifestation vs. its symbolic specification) is the fact that XML markup can represent content objects that are themselves at varying levels of abstraction. Some content objects are so much a part of a work’s rhetorical structure that one cannot imagine altering or removing them without creating a new derivative work. Examples include the chapters of a novel or the verses of a song. But other content objects seem less essential, or play an editorial or analytical role (e.g., the chapters and verses of books in the Bible). Still other markup communicates document structure only indirectly via formatting instructions (boldface, linebreak, etc.).
The more we think about these things, the more complicated they seem to get.
The authors would like to thank Michael Sperberg-McQueen, Claus Huitfeldt, Kevin Hawkins, and other members of the GSLIS Electronic Publishing Research Group for insights of various kinds, although errors and confusions are of course ours alone. We will also note here that this paper is a submission in the “Breaking News” category of Extreme Markup Languages 2003 and, consistent with the intention of that category, it presents thoughts that are very recent, preliminary, and tentative, intended more to convene a discussion than to advance one. We look forward with pleasure to having our confusions analyzed, errors corrected, and conjectures refuted.
[azeri00] Azerbaijan International. About the Azeri Language and Alphabet 1998-2000. Published on the Worldwide Web at http://www.azeri.org/Azeri/az_english/aboutaz_index.html.
[buzzetti02] Buzzetti, D. “Digital Representation and the Text Model”. New Literary History 33 (2002), 61-88.
[coulmas03] Coulmas, F. C. Writing Systems: An Introduction to their Linguistic Analysis. Cambridge University Press, Cambridge, UK, 2003.
[doerr03] Doerr, M., Hunter, J., and Lagoze, C. “Towards a core ontology for information integration”. Journal of Digital Information 4, 1 (2003). Published on the Worldwide Web at http://jodi.ecs.soton.ac.uk/Articles/v04/i01/Doerr/doerr-final.pdf.
[frbr98] IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. UBCIM Publications-New Series. Vol. 19, München: K.G.Saur, 1998. Published on the Worldwide Web at http://www.ifla.org/VII/s13/frbr/frbr.pdf.
[renear01] Renear, A. “The Descriptive/Procedural Distinction is Flawed”. Markup Languages: Theory and Practice 2, 4 (2001), 411-420. Earlier version presented at Extreme Markup Languages 2000, Montréal, August 2000.
[renear02] Renear, A., Dubin, D., Sperberg-McQueen, C. M., and Huitfeldt, C. Towards a semantics for XML markup. In Proceedings of the 2002 ACM Symposium on Document Engineering (McLean, VA, November 2002). R. Furuta, J. I. Maletic, and E. Munson, eds. Association for Computing Machinery, pp. 119-126.