At Extreme 2003 a “late breaking” presentation by members of the UIUC/GSLIS Electronic Publishing Research group argued that XML Documents could not be assigned to either of the two plausible FRBR entity classes, manifestation or expression, since XML Documents appear to both have and lack some characteristics of each class. A “double aspect” theory was proposed where a single XML Document can function as either a manifestation or an expression depending on the circumstances. We now use the Guarino/Welty ontology evaluation techniques [guarinowelty00:lncs] to generalize and confirm our results, demonstrating that FRBR expression and manifestation entity types are not true types, but rather roles.
Note: In the spirit of Extreme Markup Languages “late breaking” presentations, the following is a short summary of preliminary results presented at EML 2006. Readers unfamiliar with FRBR [Functional Requirements for Bibliographic Records] or with our previous work on assigning XML documents to a FRBR entity class should refer to [renear03a:eml] for the necessary background.
The Functional Requirements for Bibliographic Records (FRBR), an entity relationship model of works, texts, editions, documents and the like, characterizes itself as “a conceptual model of the bibliographic universe” [ifla98-frbr]. Originally intended primarily to guide the development of systems for creating and managing bibliographic records, FRBR has been extremely influential and is now increasingly used more generally to support the design of systems for content management and publishing.
At Extreme Markup Languages 2003 a presentation by members of the UIUC/GSLIS [University of Illinois at Urbana-Champaign, Graduate School of Library and Information Science] Electronic Publishing Research group argued that XML Documents, in the sense explicitly and formally defined by the W3C XML standard [xml1999:recommendation], could not be assigned to either of the two plausible FRBR entity classes, manifestation or expression, since XML Documents appear to combine the characteristics that presumably distinguish these FRBR entity classes [renear03a:eml]. To account for this puzzling phenomenon a “double aspect” theory was proposed: a single XML Document can function as either a manifestation or an expression depending on the context. This resolution was not only simple and compelling, but seemed to generalize nicely, providing a broader explanatory framework for other problems in markup theory such as difficulties with the descriptive/procedural classification [renear00a:eml] [renear03:clip].
In the spring of 2006 EPRG held joint meetings with a GSLIS seminar, Topics in Knowledge Representation. The general agenda was to apply conceptual modeling strategies (and particularly logic-based ontological analysis) to document representation, exploring how serialization grammars like XML markup languages might be supplemented by higher-level abstractions. More particularly, we hoped to develop some general architecture for design decisions in the construction of an object-oriented Prolog environment for exploring XML markup semantics [dubin03:llc]. This workbench is part of the BECHAMEL project [sperberg00:mltp] [sperberg02:eml] [renear02:doceng].
Given these goals it was natural that we take into consideration the FRBR conceptual model of “the bibliographic universe”. In addition, Ann Wrightson suggested, in a conversation at Extreme Markup Languages 2005, that we also explore the influential ontology evaluation rules proposed by Nicola Guarino and Chris Welty [guarinowelty00:lncs].
In the course of our joint seminar we were startled to realize that the FRBR entity types manifestation and expression actually appeared to fail the Guarino/Welty tests for true entity types. They seemed to be roles rather than types. That is, strictly speaking manifestation and expression are not fundamental types of things, but rather roles that types of things may have in particular circumstances. Moreover, we realized that if this were true it provided a larger framework for understanding why an XML Document cannot be, apart from social context, assigned to one or the other of the two FRBR types.
Guarino and Welty identify a number of “meta-properties” which they believe can be used to evaluate conceptual modeling decisions and ontology development. We are concerned with just one of these, rigidity.
Guarino and Welty define a rigid property as one that is “essential to all of its instances”, or in the notation of modal logic: ∀x (Φx → □Φx), where the quantifier ranges over individuals in all possible worlds. Colloquially, this means that if a property is rigid, then it is impossible for anything that has that property to have lacked it or to come to lose it, impossibility being meant in a strong “logical” sense. Or, using the model-theoretic terminology of possible worlds, we would say that a property P is rigid if and only if for every x in any possible world, if x has P in that possible world then x has P in every possible world in which x exists.
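Since the argument that follows turns on showing that a property fails to be rigid, it may help to set the definition beside its negation. The second line below is simply the logical negation of the first, written with ◇ for possibility; it is our restatement rather than a formula quoted from Guarino and Welty, and it assumes the same reading of the quantifier as the definition.

```latex
\begin{align*}
\text{rigid:} \quad & \forall x\,(\Phi x \rightarrow \Box\,\Phi x) \\
\text{not rigid:} \quad & \exists x\,(\Phi x \wedge \Diamond\,\neg\Phi x)
\end{align*}
```

On this reading, a single individual that has the property but could have lacked it is enough to show that the property is not rigid.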
Guarino and Welty give the following example.
“…we normally think of PERSON as rigid; if x is an instance of PERSON it must be an instance of PERSON in every possible world.2 The STUDENT property on the other hand, is normally not rigid; we can easily imagine an entity moving in and out of the STUDENT property while being the same individual.” [guarinowelty00:lncs]
For Guarino and Welty person is a rigid property because it is a fundamental ontological type, whereas student is a role that individual things (each of which is of some type or other) have in certain contingent circumstances. The “ideal structure of a clean taxonomy”, they believe, will reflect these distinctions and will have types (and other rigid properties) in the “backbone” and roles (and other non-rigid properties) “hanging off” the backbone.
According to the Guarino and Welty ontology evaluation principles, if FRBR expressions are true types, and appropriately in the “backbone” of an ontology or conceptual model, then the property of being an expression should be rigid. Is the property of being an expression rigid? It does seem so at first glance. After all, it is difficult to conceive of a particular text as possibly not realizing the work that it does in fact realize. Is it possible that the 1851 text of Moby Dick not realize Moby Dick in the past or future? Or even in some other possible world?
On the other hand, what meaning is attached to sentences and texts is certainly a contingent matter, based on the “collective intentionality” expressed through socio-linguistic conventions and institutions [searle95:construction]. Might not those same sentences (i.e., that same expression) have a different meaning, or no meaning at all, in different socio-linguistic circumstances?
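The possible-worlds reading of rigidity can be illustrated with a small finite model. The following is only a sketch under stated assumptions: the three worlds, the individuals `alice` and `bob`, the `EXISTS` bookkeeping key, and the function names are all hypothetical and chosen for illustration; a finite check of this kind illustrates the definition but is not a substitute for modal reasoning.

```python
from typing import Dict, Set

# Hypothetical toy model: each possible world maps a property name to the
# set of individuals that have that property in that world.
World = Dict[str, Set[str]]


def exists_in(world: World, individual: str) -> bool:
    """An individual exists in a world if it is listed under 'EXISTS'."""
    return individual in world.get("EXISTS", set())


def is_rigid(prop: str, worlds: Dict[str, World]) -> bool:
    """Return True if every individual that has `prop` in some world
    has `prop` in every world in which that individual exists."""
    for world in worlds.values():
        for individual in world.get(prop, set()):
            for other in worlds.values():
                if exists_in(other, individual) and individual not in other.get(prop, set()):
                    return False
    return True


# Illustrative worlds only: alice is a student in w1 but not in w2,
# and does not exist at all in w3.
worlds: Dict[str, World] = {
    "w1": {"EXISTS": {"alice", "bob"}, "PERSON": {"alice", "bob"}, "STUDENT": {"alice"}},
    "w2": {"EXISTS": {"alice", "bob"}, "PERSON": {"alice", "bob"}, "STUDENT": set()},
    "w3": {"EXISTS": {"bob"}, "PERSON": {"bob"}, "STUDENT": {"bob"}},
}

print(is_rigid("PERSON", worlds))   # True
print(is_rigid("STUDENT", worlds))  # False
```

In this toy model PERSON comes out rigid and STUDENT does not, because `alice` is a student in one world and not in another world in which she exists. Worlds in which an individual does not exist are skipped, in keeping with the restriction to worlds in which x exists.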
The argument that expressions are not types goes as follows: FRBR defines expression as the realization of a work in the form of notation, sounds, images, movement, etc. The expression which is the 1851 text of Moby Dick might in other circumstances not have realized Moby Dick, but rather realized some other work instead, or no work at all. If this is right then the property of realizing Moby Dick is not a rigid property.
Summarizing: Expressions are symbol sequences that realize works, but the same symbol sequence can realize different works, or no work at all, in different possible worlds. Therefore being an expression is not a rigid property and therefore, according to Guarino and Welty, expression is not a true type and should not occupy a “backbone” position in a conceptual model, as it does in FRBR. Being an expression (and being a particular expression) seems rather to be a role played by symbol sequences in particular socio-linguistic circumstances.
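The summary above can be sketched in the same possible-worlds style. This is a minimal illustration, not a model taken from FRBR or from Guarino and Welty: the world names, the identifier `seq_1851`, the placeholder work title, and the function names are all hypothetical. Each world records which work, if any, a symbol sequence realizes under that world's socio-linguistic conventions.

```python
from typing import Dict, Optional

# Hypothetical toy model: in each possible world, conventions determine which
# work (if any) a given symbol sequence realizes. None means "no work at all".
Realizations = Dict[str, Optional[str]]

worlds: Dict[str, Realizations] = {
    "actual": {"seq_1851": "Moby Dick"},
    "other_conventions": {"seq_1851": "Some Other Work"},
    "no_conventions": {"seq_1851": None},
}


def is_expression(world: Realizations, sequence: str) -> bool:
    """A symbol sequence is an expression in a world if it realizes some work there."""
    return world.get(sequence) is not None


def realizes(world: Realizations, sequence: str, work: str) -> bool:
    """True if the sequence realizes the named work in this world."""
    return world.get(sequence) == work


def expression_is_rigid(all_worlds: Dict[str, Realizations]) -> bool:
    """Being an expression is rigid if every sequence that is an expression
    in some world is an expression in every world in which it exists."""
    for world in all_worlds.values():
        for sequence in world:
            if not is_expression(world, sequence):
                continue
            for other in all_worlds.values():
                # The sequence exists in `other` when it appears as a key there.
                if sequence in other and not is_expression(other, sequence):
                    return False
    return True


print(realizes(worlds["actual"], "seq_1851", "Moby Dick"))             # True
print(realizes(worlds["other_conventions"], "seq_1851", "Moby Dick"))  # False
print(expression_is_rigid(worlds))                                     # False
```

The same symbol sequence exists in all three worlds, but whether it is an expression, and of which work, varies with the conventions in force. That variation is what makes being an expression behave like a role in this sketch.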
So, returning to our original problem, we first note that one should not ask whether an XML Document, a symbol sequence, is this expression or that expression, or an expression at all, except vis-à-vis a particular set of contingent social circumstances. Now let's assume that the FRBR entity type manifestation can be similarly deconstructed (as it can be). It then makes no sense to ask whether an XML Document, a symbol sequence, is an expression or a manifestation except vis-à-vis a particular set of contingent social circumstances. In some circumstances it is a manifestation, in others an expression.
Of course it may be argued that symbol sequences themselves are not types either, but also roles, and so the argument is flawed. But for our purposes here that possibility is of little consequence. Whether we locate the contingency as lying between a geometric or aural pattern and a corresponding symbol sequence, or a symbol sequence and sentence, or a sentence and its meaning, or the meaning and the work, or any other possible decomposition of relevant layers of abstraction, changes the detail, but not the general strategy of the argument. All we require for the conclusion to go through is that some such distinction somewhere is contingent.
First, we conclude that an XML Document per se is neither a manifestation nor an expression. It is rather a symbol sequence (or something similar) that can play different roles in different contexts of socio-linguistic institutions and conventions. It is a manifestation in some contexts, and an expression in others.
Second, we see that FRBR may be flawed and need to be refactored.
Third, we see that we were more or less on the right track all along with the double aspect theory. The Guarino/Welty ontology evaluation scheme simply gives us a more general context for understanding the origin of the puzzle and our original resolution.
Finally, we might wonder whether the components of texts: paragraphs, lists, extracts, and the like, should be examined in a similar fashion and understood as roles of entities rather than types of entities. It looks to us like there are some rather interesting possibilities here.
This paper reports results achieved collaboratively with members of the University of Illinois GSLIS Electronic Publishing Research Group and the GSLIS Seminar on Topics in Knowledge Representation: Yunseon Choi, David Dubin (EPRG lead), Ingbert Floyd, Jin Ha Lee, Karen Medina, Allen Renear (seminar instructor), Sara Schmidt, Richard Urban, Xin Xiang, and Oksana Zavalina. However, not all participants necessarily agree with all claims made in this report.
[Note 2] We note that, to be strictly accurate, the first sentence of the quotation from Guarino and Welty should end “in every possible world in which x exists”.
[dubin03:extreme] Dubin, D. Object mapping for markup semantics. In Proceedings of Extreme Markup Languages 2003 (Montreal, Quebec, August 2003), B. T. Usdin, Ed.
[dubin03:llc] Dubin, D., Sperberg-McQueen, C. M., Renear, A., and Huitfeldt, C. A logic programming environment for document semantics and inference. Literary and Linguistic Computing 18, 2 (2003), 225–233. (This is a corrected version of an article that appeared in 18:1 pp. 39–47).
[guarinowelty00:lncs] Guarino, N. and Welty, C. A. A formal ontology of properties. In Proceedings of the 12th European Workshop on Knowledge Acquisition, Modeling and Management 2000, R. Dieng and O. Corby, Eds. Lecture Notes in Computer Science, vol. 1937. (Springer-Verlag, London, 2000), 97–112.
[ifla98-frbr] IFLA Study Group on the Functional Requirements for Bibliographic Records. Functional Requirements for Bibliographic Records: Final Report. (FRBR) (München: K. G. Saur, 1998).
[renear00a:eml] Renear, A. H. The descriptive/procedural distinction is flawed. Markup Languages: Theory and Practice 2, 4 (2000), 411–420.
[renear02:doceng] Renear, A., Dubin, D., Sperberg-McQueen, C. M., and Huitfeldt, C. Towards a semantics for XML markup. In Proceedings of the 2002 ACM Symposium on Document Engineering (McLean, VA, November 2002), R. Furuta, J. I. Maletic, and E. Munson, Eds., Association for Computing Machinery, 2002, 119–126.
[renear03:clip] Renear, A. H. Text from several different perspectives, the role of context in markup semantics. In Atti della conferenza internazionale CLiP 2003, Computer Literacy and Philology (Firenze, 4–5 December 2003). C. Nicolas and M. Moneglia, Eds. (Florence: University of Florence Press 2005).
[renear03a:eml] Renear, A. H., Phillippe, C., Lawton, P., and Dubin, D. An XML document corresponds to which FRBR group 1 entity? In Proceedings of Extreme Markup Languages 2003, B. T. Usdin and S. R. Newcomb, Eds. (Montreal, Canada, August 2003).
[searle95:construction] Searle, J. R. The Construction of Social Reality. New York: The Free Press, 1995.
[sperberg00:mltp] Sperberg-McQueen, C. M., Huitfeldt, C., and Renear, A. Meaning and Interpretation of Markup. Markup Languages: Theory and Practice 2, 3 (2000), 215–234.
[sperberg02:eml] Sperberg-McQueen, C. M., Dubin, D., Huitfeldt, C., and Renear, A. Drawing inferences on the basis of markup. In Proceedings of Extreme Markup Languages 2002, B. T. Usdin and S. R. Newcomb, Eds. (Montreal, Canada, August 2002).
[xml1999:recommendation] W3C. Extensible Markup Language (XML) 1.0 (second edition) Recommendation. Published by the W3C (November 1999).