A formalism for representing milestone and hierarchical annotations together

Sylvain Loiseau

Abstract

Milestone annotation extends the expressivity of XML (it allows units that are not hierarchically nested to be included in the tree) but is hard to process. Using a representation that includes both kinds of annotation allows easy conversion of any part of the milestone annotation into hierarchical annotation, while converting the conflicting hierarchical annotation back into milestone annotation. This may be useful for the many tasks where only a subset of the annotation needs to be expressed hierarchically: it offers both the expressivity of milestones and the processability of XML. The common representation proposed is the annotation graph formalism (a labelled, ordered and indexed graph); it privileges the precedence relation, which both annotations express, over the dominance relation, which only the hierarchical annotation expresses. This presentation explains the formal properties of an annotation graph, investigates the rules to be expressed for such conversions, and shows a simple algorithm for dealing with the milestone patterns defined in the Text Encoding Initiative Guidelines.
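To make the conversion idea concrete, here is a minimal sketch in Python. It is not the annotation graph formalism or the algorithm of the paper: it assumes a much simpler stand-in, a flat list of labelled spans indexed by token position, which records only precedence. The names `Span`, `crosses`, `serialize` and `preferred`, as well as the `<label-start/>` and `<label-end/>` empty elements, are hypothetical and are not TEI milestone elements. Spans whose labels are in `preferred` are written as ordinary XML elements; any span that crosses an already accepted element is written as a pair of milestones instead.

```python
from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class Span:
    """Hypothetical annotation unit: a label over a range of tokens."""
    label: str  # assumed to be a valid XML name
    start: int  # index of the first token covered
    end: int    # index one past the last token covered (start < end)


def crosses(a, b):
    """True when a and b overlap without one containing the other."""
    return a.start < b.start < a.end < b.end or b.start < a.start < b.end < a.end


def serialize(tokens, spans, preferred):
    """Render spans as XML, keeping spans with a `preferred` label hierarchical.

    A span that crosses an already accepted element is demoted to a pair of
    empty milestone elements (hypothetical <label-start/> and <label-end/>).
    """
    # Preferred labels are considered first, then outer spans before inner ones.
    ordered = sorted(spans, key=lambda s: (s.label not in preferred, s.start, -s.end))
    elements, milestones = [], []
    for span in ordered:
        if any(crosses(span, e) for e in elements):
            milestones.append(span)
        else:
            elements.append(span)

    indexed = list(enumerate(elements))
    out = []
    for pos in range(len(tokens) + 1):
        # Close inner elements first (latest start, then latest accepted).
        closing = sorted((p for p in indexed if p[1].end == pos),
                         key=lambda p: (-p[1].start, -p[0]))
        out += [f"</{e.label}>" for _, e in closing]
        out += [f"<{m.label}-end/>" for m in milestones if m.end == pos]
        out += [f"<{m.label}-start/>" for m in milestones if m.start == pos]
        # Open outer elements first (latest end, then earliest accepted).
        opening = sorted((p for p in indexed if p[1].start == pos),
                         key=lambda p: (-p[1].end, p[0]))
        out += [f"<{e.label}>" for _, e in opening]
        if pos < len(tokens):
            out.append(escape(tokens[pos]))
    return "".join(out)


tokens = ["a", "b", "c", "d"]
spans = [Span("s", 0, 3), Span("q", 2, 4)]  # s and q overlap

print(serialize(tokens, spans, preferred={"s"}))  # <s>ab<q-start/>c</s>d<q-end/>
print(serialize(tokens, spans, preferred={"q"}))  # <s-start/>ab<q>c<s-end/>d</q>
```

The same span list yields either serialization, depending on which subset of the annotation is requested as hierarchical. This is possible because the stored positions carry the precedence information, while dominance is only derived at serialization time.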

Keywords: Concurrent Markup/Overlap

Sylvain Loiseau [LIMSI / University Paris-Sud]

Extreme Markup Languages 2007® (Montréal, Québec)

This paper is not represented in the conference proceedings.

But see the author package.