Regular Fragmentation: Treating complex textual content as markup

Simon St.Laurent

Abstract

Regular fragmentations are an approach to processing textual content as if it had been represented as more finely grained markup. The XML Schema Datatypes specification, for instance, offers a number of lexically compound types among its primitive types, requiring developers to rely on extension functions or XML Schema processing to manipulate them with XSLT. Regular fragmentations allow developers to specify the application of regular expressions to element content (attribute content coming soon!) using an XML-based rules syntax. An open source SAX filter implementation allows the use of regular fragmentations in a wide variety of XML processing environments.
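As a rough illustration of the idea, and not the implementation described in this paper, the sketch below uses a SAX filter from Python's standard library to split the text of a date element into child elements. The rule table (`RULES`), the class name `FragmentFilter`, the helper `fragment`, and the element names `date`, `year`, `month`, and `day` are all assumptions made for this example; the paper's XML-based rules syntax is not reproduced here, so a plain dictionary stands in for it. The sketch also assumes that elements with a rule contain only text.

```python
import io
import re
from xml.sax import make_parser
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# Hypothetical rule table standing in for an XML rules document:
# element name -> (pattern for the whole text, names for the captured groups).
RULES = {
    "date": (re.compile(r"(\d{4})-(\d{2})-(\d{2})"), ["year", "month", "day"]),
}


class FragmentFilter(XMLFilterBase):
    """SAX filter that re-emits matching element text as child elements."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._rule = None   # rule for the element currently open, if any
        self._text = []     # buffered character data for that element

    def startElement(self, name, attrs):
        self._flush()
        super().startElement(name, attrs)
        self._rule = RULES.get(name)

    def characters(self, content):
        if self._rule is None:
            super().characters(content)
        else:
            # Parsers may deliver text in several chunks, so buffer it.
            self._text.append(content)

    def endElement(self, name):
        self._flush()
        self._rule = None
        super().endElement(name)

    def _flush(self):
        if self._rule is None or not self._text:
            return
        text = "".join(self._text)
        self._text = []
        pattern, names = self._rule
        match = pattern.fullmatch(text)
        if match is None:
            # Text that does not fit the rule passes through unchanged.
            super().characters(text)
            return
        for child, value in zip(names, match.groups()):
            super().startElement(child, AttributesImpl({}))
            super().characters(value)
            super().endElement(child)


def fragment(xml_bytes):
    """Run the filter over a document and return the serialized result."""
    out = io.StringIO()
    filt = FragmentFilter(make_parser())
    filt.setContentHandler(XMLGenerator(out))
    filt.parse(io.BytesIO(xml_bytes))
    return out.getvalue()


if __name__ == "__main__":
    print(fragment(b"<order><date>2001-08-14</date></order>"))
```

Because the fragmentation happens in the SAX event stream, a downstream consumer such as an XSLT processor would see the date's parts as ordinary elements and would need no extension functions to take the value apart.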

Keywords: SAX; Processing; Modeling

Simon St.Laurent [O'Reilly and Associates]

Extreme Markup Languages 2001® (Montréal, Québec)

This paper is not represented in the conference proceedings.