Natural language processing in Topic Maps

Eric Freese

Abstract

The creation of topic maps can be quite labor-intensive, and those who create topic maps are therefore powerfully motivated to apply techniques that increase their productivity.

The forthcoming version of the author's open-source software package, SemanText 0.73, breaks new ground in enhancing the productivity of those who create topic maps for textual corpora. The software examines the natural language content of such corpora, identifies materials relevant to subjects, and creates topic map constructs that make these materials findable. In this way, large, complex topic maps can be constructed with relatively little human effort. Like all other topic maps, the resulting topic maps can be subsetted and/or merged with other topic maps. The SemanText 0.73 methodology for examining information and defining the rules for building topic map constructs, including occurrence constructs, is the subject of this late-breaking news presentation.
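The general idea of scanning a corpus and emitting topics with occurrences can be illustrated with a minimal Python sketch. It is not SemanText's actual methodology or API, which this abstract does not describe; the function names `build_topic_map` and `merge_topic_maps`, the dictionary layout, and the simple term-matching rule are all assumptions made for illustration only.

```python
import re
from collections import defaultdict


def build_topic_map(corpus, subjects):
    """Build a toy topic map from a corpus (hypothetical helper).

    corpus:   dict mapping a document id to its text
    subjects: dict mapping a subject name to a list of single-word indicator terms
    Returns a dict mapping each matched subject to a topic with its occurrences.
    """
    occurrences = defaultdict(list)
    for doc_id, text in corpus.items():
        # Crude tokenization: lowercase alphabetic words only (an assumption).
        words = set(re.findall(r"[a-z]+", text.lower()))
        for subject, terms in subjects.items():
            # A document is treated as relevant if any indicator term appears.
            if any(term.lower() in words for term in terms):
                occurrences[subject].append(doc_id)
    return {
        subject: {"name": subject, "occurrences": sorted(docs)}
        for subject, docs in occurrences.items()
    }


def merge_topic_maps(first, second):
    """Merge two toy topic maps, unifying topics that share a subject name."""
    merged = {}
    for topic_map in (first, second):
        for subject, topic in topic_map.items():
            entry = merged.setdefault(
                subject, {"name": topic["name"], "occurrences": []}
            )
            entry["occurrences"] = sorted(
                set(entry["occurrences"]) | set(topic["occurrences"])
            )
    return merged
```

In this sketch each topic is keyed by its subject name, so merging reduces to a union of occurrence lists; real topic map merging relies on subject identity rules that are richer than a name comparison.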

Keywords: Topic Maps; Natural Language Processing; Python

Extreme Markup Languages 2001® (Montréal, Québec)

This paper is not represented in the conference proceedings.