So why aren't Topic Maps ruling the world?

Eric Freese
efreese@datafoundry.com

Abstract

This paper raises some issues and questions intended to spark debate, raise some hackles and spur action. It may upset some people in the process, but hopefully the debate will enrich the topic map community as a whole. The hope of the author is that by bringing these items to light, smart people with good ideas and lots of energy may step forward in order to solve these problems.

Keywords: Topic Maps; Technology Adoption

Eric Freese

Eric Freese has 15 years of experience in the areas of document, information, and knowledge management with specific expertise in the development and implementation of XML technologies. His experience includes research, analysis, specification, design, development, testing, implementation, integration and management of information systems in a wide range of environments. He has significant research experience in human interface design, graphics interface development and artificial intelligence. Freese is a founding member of TopicMaps.Org, the organization that developed the XTM specification, and currently serves as the chairman of this group. He is also the chief architect and developer of SemanText, an open source application that uses topic maps to harvest and manage knowledge.

Eric Freese [Electronic Data Foundry]

Extreme Markup Languages 2002® (Montréal, Québec)

Copyright © 2002 Eric Freese. Reproduced with permission.

Introduction

This paper raises some issues and questions intended to spark debate, raise some hackles and spur action. It may upset some people in the process, but hopefully the debate will enrich the topic map community as a whole. The hope of the author is that by bringing these items to light, smart people with good ideas and lots of energy may step forward in order to solve these problems. The opinions expressed herein are not necessarily those of the author's employer, TopicMaps.Org, OASIS, ISO or any other organization which may have a vested interest in promoting the topic map model.

10 Things the Topic Map Community Has Done Wrong

Ok, "has done wrong" may not necessarily be the best phrase to use. Perhaps "could improve upon" might be more constructive. In any case, there are several things the topic map community could certainly be doing better. The following sections will introduce these items, discuss them and, where possible, suggest courses of action to remedy what the author perceives as the main issue.

The Topic Map community has lost sight of its roots.

Topic navigation maps, as the ISO standard calls them, were originally intended to model back-of-book indexes. The main goal was to be able to interchange this information and provide a model where indexes could also be merged and navigated in an intelligent manner. The originators had a relatively simple goal in mind that seemed rather doable. Of course, there was also the secondary goal of actually coming up with a widespread application of HyTime. However, now you hear many of the same people bandying terms such as "global knowledge federation" around, while talking about the same subject. How did the change in focus occur and why?

Much of the topic map community has been caught up in the "knowledge management" hype that grips much of the industry right now. They seem to think that in order to gain any real acceptance, topic maps must be pushed into that space in any way possible. The author is just as guilty as anyone on this account, but has heard on numerous occasions that one way for topic maps to gain traction is to show success in their original space. Building on a successful topic navigation system to demonstrate real knowledge management in action would then make it much easier to get the desired attention.

One well-known and seemingly successful application was the one done for the French encyclopedia, Quid (www.quid.fr). This application showed the simplicity of the model and how it could be applied to a large set of information. It was also very much based on the original view of topic maps: an intelligent, navigable index.

The success stories have either been few or invisible. The problems have been too easy to see.

Early on at conferences there were presentations about how topic maps had been used in specific applications. However, lately very little has been said. Is this due to the fact that topic maps are so ubiquitous that individual projects no longer warrant attention? I think not. A review of the XML Cover Pages, and the web sites of the main topic map tool vendors yielded very little information about new, exciting topic map implementations. One has to assume that there are some, since these companies continue to be in business. Why don't we hear more about the new cool things being done?

Kal Ahmed presented a few successes in his talk at XML Europe in Barcelona [AHMED]. He highlighted the following:

  • the United States Internal Revenue Service uses topic maps in its 2001 Tax Products CD-ROM;
  • Creuna and Ontopia collaborated to create an open source toolkit, called OpenZTM, that allows topic map information to be stored persistently in a Zope database;
  • South Africa's Council for Scientific and Industrial Research created a collaborative web application where all research and communication efforts could be aggregated;
  • Patrimoine and Mondeca manage financial information using topic maps in conjunction with a content management system and display it to users through a portal;
  • Starbase Corporation, with the help of Ontopia, uses topic maps to integrate several disparate information systems into a single collaborative solution;
  • empolis has reengineered its content management product to use topic maps to manage most, if not all, of the internal metadata within the system.

As far as the author can tell, Kal had a role in only one of these efforts. Why is it that the topic map vendors and consultants who did these projects haven't been showcasing them?

Initially, topic navigation maps that were distributed at conferences such as this were a good introduction to the topic map model. However, in some cases, the response was "Is this it?" In other cases, the topic maps that were auto-generated from the data provided by speakers contained inconsistent (or dirty) data. The major thing they showed is that dirty data makes dirty topic maps. Or perhaps, that bad data floats to the top where it is easy to see. In any case, they were not only not very rich or useful, they were somewhat embarrassing.

A classic example of the effects of dirty data can be found on the Infoloom site containing topic maps from several past conferences (www.infoloom.com/gcaconfs/WEB/index.htm). If one looks in the author index for "Freese, Eric", an employment association is displayed with ISOGEN International. However, when the company index is searched, several different variations on "ISOGEN International" appear, but none of them is the one employing "Freese, Eric". When trying to explain the power of topic map merging to a potential user, this can be problematic.
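To make the problem concrete, here is a minimal sketch of what happens when merging relies on exact name strings. The company-name variants, the topic identifiers and the merge_by_name function are all invented for illustration; they are not taken from the Infoloom topic maps or from any topic map tool.

```python
from collections import defaultdict

def merge_by_name(topics):
    """Group topic ids whose name strings match exactly (hypothetical helper)."""
    merged = defaultdict(list)
    for topic_id, name in topics:
        merged[name].append(topic_id)
    return dict(merged)

# Invented variants of one company name, as dirty source data might supply them.
company_topics = [
    ("c1", "ISOGEN International"),
    ("c2", "ISOGEN International Corp."),
    ("c3", "Isogen International"),
    ("c4", "ISOGEN Intl."),
]

merged_companies = merge_by_name(company_topics)

# No two strings are identical, so nothing merges: one real-world
# company is still represented by four separate topics.
print(len(merged_companies))
```

Under these assumptions, the merge does exactly what it was told to do and the result is still wrong, which is the uncomfortable part to explain to a potential user.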

There hasn't been a "killer app" for topic maps yet.

RDF has Dublin Core, Open Directories, RSS and DAML+OIL to show that it has the potential for widespread applicability. It is debatable whether these are "killer apps". However, they are widely used and known applications of the RDF architecture. Unfortunately, the closest thing topic maps have to any of these is NewsML. The problem with NewsML is that it doesn't even use topic maps as published in the ISO standard, but uses some structures that very vaguely resemble topic map constructs.

Topic maps need a good PR effort.

The topic map community is showing all too well its roots in SGML. SGML was a valid concept, burdened with a reputation for being too hard. We all know that XML is merely SGML with better PR behind it. Topic maps seem to be suffering the same fate as SGML: a lack of good PR. The splashy XTM announcement was a good start, but any momentum has fizzled into nothingness. If topic maps are good for indexes, why doesn't the indexing community embrace them? If they are good for knowledge interchange, why don't we see topic maps featured at knowledge management conferences other than those run by IDEAlliance?

Another place where a good PR effort is really needed is when the debates (or fights, or wars) between the members of the topic map community explode into public view. This perceived instability has frightened many early adopters away from using topic maps.

The topic map conceptual "package" is not complete.

It has long been understood that the topic map model needs to be completed by the creation of some companion standards. These include a conceptual model, an application model, a query language, and a constraint language. On more than one occasion, the author has been told by organizations considering the use of topic maps that they are delaying adoption until the concept has been completely defined in standards.

Earlier this year, a decision was made to give standards development work back to ISO while OASIS went on with community building. While this separation of labor seems logical, it raises some questions. For example, TopicMaps.Org was created as a sort of "tiger team" to develop an XML encoding scheme for topic maps more quickly than could be done in ISO or W3C. Overall, this was a successful, and sometimes painful, experiment.

More than 18 months have passed since the announcement of the XTM specification. Just this summer, the XTM DTD was added to the ISO standard. Why have none of the anticipated companion specifications been delivered? Work has been done on them, but nothing has been completed. The reason for the delay is the all too familiar problem of limited numbers of people with limited bandwidth donating a great deal of time to the work of completing these standards. This is a prime opportunity for a few more smart people to contribute in order to speed the progress in this area.

The delay in developing and releasing companion standards has led to another common problem. Vendors are beginning to implement interim capabilities in order to fill the void. This creates the perception that interoperability of information has become secondary in importance. Even if the interim solution is based on ongoing standards development, it can introduce a fear of proprietary solutions into the community.

The topic map/RDF convergence demanded at Extreme Markup Languages 2000 has not been delivered.

Extreme Markup Languages 2000 featured a matchup between representatives from the topic map community and the RDF community. The goal was to determine which model was the better model. At the end, the consensus was that the models were very similar. The closing keynote of that conference called on both communities to end the confusion and determine how they should co-exist and let the world know.

To the knowledge of the author, this hasn't happened. Certainly there have been meetings and discussions, both formally and informally. At the Knowledge Technologies 2002 conference there was an evening session that focussed on the beginning of what could become an RDF Schema for topic maps. This appeared to be progress. However, at the same conference several papers talked about "RDF topic maps", but discussed vastly different things [RDF-TM2][RDF-TM1]. This led to even more confusion.

A great deal of confusion still exists concerning the differences and similarities between topic maps and RDF. This confusion serves neither community. It is the belief of the author that there will not be a unification of the models, as had been hoped early on. If this is, in fact, the case, then clear definitions of the key differences between the models and the selling points of each model should be developed in order for prospective users to make informed decisions.

Few, if any, tools exist to create topic maps from scratch.

A common reaction from people introduced to topic maps is "Wow! That's cool!!" However, when it comes time for them to try to use them, their response ranges from "What do we do with it?" to "How the heck do we get one?"

Although at least three companies have announced topic map or topic map-based products, none of them has yet announced an editor that enables users to create topic maps from scratch. The lack of an easy-to-use development interface keeps many of the possible early adopters away from tinkering with topic maps. If such an interface existed, these early adopters could become some of the best champions for the possibilities of topic maps. Granted, hand editing is not the most effective method of topic map generation, but there needs to be a way to create even the most basic constructs in a topic map. However, hope is not lost. There are rumors that product announcements in this direction are forthcoming.

Also curious is that none of the XML editor vendors have announced modules to support XTM or RDF authoring. Why is that? They apparently haven't been caught up in any groundswell of demand for such tools.

This isn't simply an issue of which came first: the chicken or the egg? Editing tools haven't been created due to a perceived lack of market. The market has hesitated to adopt topic maps due to a lack of tools. It is the simple fact that one of them has to come first. An editing tool needs to be available in order to allow people to take advantage of the topic map management capabilities offered commercially.

Does anyone really know how scope is supposed to work? If so, why haven't they told us?

Steve Pepper raised this issue at Extreme Markup Languages 2001 [SCOPE]. It is an important issue that has received very little consideration in public forums. Scope is one of the things that differentiates topic maps and RDF. Even the RDF people acknowledge that they are missing something when they look at scope. The problem is that there is not a consensus about what scope is supposed to be and how it is supposed to work in an interchangeable fashion between topic map applications.

Some people will espouse the theory that the scope functionality is best left up to the application. As a user I cringe every time I hear that statement because it tells me that the functionality is probably underspecified. It also tells me that proprietary implementations of a so-called standard are quite possible, maybe even likely. There are several notable "standards" that, when left up to implementations, caused the creation of information that cannot be interchanged between systems, even though those systems are supposedly in compliance with the same standard. Doesn't this defeat the purpose of the standard in the first place?

The ability to embed XTM syntax is limited.

One of the most powerful things about RDF is that it can be embedded inside other documents, even at a small scale. The decision by the XTM committee to make topic maps a standalone data set seems to limit its applicability to merely interchange. Papers have been given with interesting possibilities for mixing XTM and SVG, for example. The problem is that the XTM definition states that a topic map within an SVG graphic must be an entire unit, rather than several smaller atomic units that can be interspersed throughout, as is possible with RDF.

The topic naming constraint is a bad idea.

The topic naming constraint is a rule within the topic map model which says that if two topics have the same name in the same scope, they really represent the same topic and should therefore be merged. I have disliked this idea from the start. In SGML and XML, a unique identification scheme exists to guarantee uniqueness of elements within a document. It's called an ID attribute. Why in God's name would you place some sort of limitation on the content within an information set, especially in a standard that touts the ability to merge as one of its strengths? Anyone who has looked under "Smith" in a large city phone book knows that names are not unique. Why burden a topic map developer with trying to develop some sort of bogus scope in order to prevent a merge from happening?

The authors of the XTM specification had the opportunity to rectify this situation and didn't. Published subject indicators are available to provide a URI to uniquely identify the subject about which a topic speaks. This makes infinitely more sense than using names.
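The difference can be shown with a small sketch. It assumes a deliberately simplified topic record (an id, one name, a scope and one subject indicator URI); the field names, the example.org URIs and the merge function are hypothetical and are not drawn from the ISO standard or the XTM specification. An empty scope stands in for the unconstrained scope.

```python
def key_by_name_and_scope(topic):
    """Merge key in the spirit of the topic naming constraint: same name, same scope."""
    return (topic["name"], frozenset(topic["scope"]))

def key_by_subject(topic):
    """Merge key based on a subject indicator URI instead of a name."""
    return topic["subject_indicator"]

def merge(topics, key):
    """Group topic ids that share a merge key (simplified, hypothetical merge)."""
    merged = {}
    for topic in topics:
        merged.setdefault(key(topic), []).append(topic["id"])
    return list(merged.values())

# Two different people from the phone book, both in the unconstrained scope.
topics = [
    {"id": "t1", "name": "John Smith", "scope": [],
     "subject_indicator": "http://example.org/people/john-smith-plumber"},
    {"id": "t2", "name": "John Smith", "scope": [],
     "subject_indicator": "http://example.org/people/john-smith-dentist"},
]

by_name = merge(topics, key_by_name_and_scope)  # both topics collapse into one group
by_subject = merge(topics, key_by_subject)      # the two topics stay separate

print(by_name)
print(by_subject)
```

In this sketch, the only way to keep the two Smiths apart under name-based merging is to invent a distinguishing scope for each of them, which is exactly the burden complained about above.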

Conclusion

This paper has probably done one or more of the following things:

  • Made you think topic maps are a load of bull
    If so, please take another look, they really aren't. There is room for improvement, but the concepts are essentially good ones.
  • Upset you to the point of rushing the stage or throwing rotten food
    If so, please harness that energy and refocus it in nonviolent pursuits that make it impossible for me to make these statements at future conferences.
  • Made you curious about some of these statements
    Please take the time to learn more and argue the points. That's what this conference is best at.
  • Caused you to consider how you might be able to solve the issues
    There's always room in the community. Let's get to work.

If it has done one or more of the above things, I've accomplished my purpose for writing this paper.


Bibliography

[AHMED] Topic Maps - A Practical Introduction With Case Studies, Kal Ahmed, XML Europe 2002, Barcelona, Spain. http://www.idealliance.org/papers/xmle02/dx_xmle02/papers/03-05-01/03-05-01.html

[RDF-TM1] DAML and RDF Topic Maps, Nikita Ogievetsky, Knowledge Technologies 2002, Seattle, Washington, USA. http://www.cogx.com/kt2002

[RDF-TM2] Strategies for Subject Navigation of Linked Web Sites Using RDF Topic Maps, Carol Jean Godby, Knowledge Technologies 2002, Seattle, Washington, USA.

[SCOPE] Towards a General Theory of Scope, Steve Pepper and Geir Ove Gronmo, Extreme Markup Languages 2001, Montreal, Canada. http://www.ontopia.net/topicmaps/materials/scope.htm


