A metadata network for bridging people and places

Mary Nishikawa
mnishikawa@email.com

Abstract

Multinational corporations often face the challenge of finding subject-matter experts many continents away. Conventional metadata in a corporate directory provide a starting point. The Topic Maps model enhances these directories in two ways: it allows metadata from diverse sources, such as external corporate directories, to be merged in by means of PSIs [Published Subject Indicators], which are now being standardized by OASIS [Organization for the Advancement of Structured Information Standards] committees, and it provides a method for creating associations between the pieces of information. In this paper, employee and regional office metadata of parent and service companies are bridged using associations and the ISO 3166-1 country PSIs created by the OASIS GeoLang Technical Committee.

Keywords: Topic Maps; Metadata

Mary Nishikawa

Mary provides technical support for the EDMS [Electronic Document Management System] at Schlumberger K. K. Japan, an Oilfield Services Technology Center. She previously worked in the pharmaceutical industry as a technical editor and biochemist, contributing to research on a currently marketed AIDS drug, and was a chemistry high school teacher before that. Her studies of the Philosophy of Aesthetic Realism (http://www.aestheticrealism.org/) have been instrumental in the modeling choices she makes for building information systems.

This paper is based on the work of the Topic Maps Published Subjects and GeoLang Technical Committees at OASIS [Organization for the Advancement of Structured Information Standards]. She is a Japan Delegate for ISO/IEC JTC1/SC34 WG3 and has recently worked on gathering requirements for the Topic Map Constraint Language.

She is also a licensed Sogetsu Teacher of Japanese Flower Arrangement and enjoys life with two active grade-school children.

A metadata network for bridging people and places [note 1]

Mary Nishikawa [Schlumberger K. K.]

Extreme Markup Languages 2003® (Montréal, Québec)

Copyright © 2003 Mary Nishikawa. Reproduced with permission.

The challenge of finding subject matter experts

Business scenario

Modern multinational corporations realize the need to extend their search worldwide to find the service providers and experts they need to complete complex tasks. Work requires expertise from many fields: a pharmaceutical house would concentrate its efforts in biomedical areas and outsource all others; a telecommunications giant may want to outsource its customer support call centers and information technology help desk services; or a brokerage house may call upon an IT services firm for help with short term E-commerce projects. These service companies become critical to the success of the operations of the larger enterprise. Once expertise is outsourced to a services company, it is a challenge for the enterprise to find the people they need and for the service company to find projects they are best suited for.

System scope

Work collaboration system

Let us consider a work collaboration system for metadata about employees, regional offices of a large corporation, and service areas of a service company. It is a system for finding experts to staff short-term projects and for experts to find projects. Applied here, the Topic Maps paradigm provides an indexing layer that bridges business offices, and the people working there, with experts in other locations who work for the service company.

Subjects for the model

The subjects for the model include company-controlled metadata about employees (name, country assignment), metadata describing business offices of the large corporation and service company, employee-controlled metadata (subjects in a Curriculum Vitae), and public metadata (ISO 3166-1 country codes). Only a minimum number of subjects are described here. The enterprise needs to perform a cost-benefit analysis to determine which subjects are absolutely required and will provide the greatest benefits. We can envision this work collaboration system providing expanded views on the relationships between people working in separate companies and living in different countries. These relationships can be established by using standard, public metadata to describe country assignments of employees and locations of regional offices and service areas.

Introduction to Topic Maps

Basic concepts

Topic Maps, put simply, is a paradigm standardized by ISO/IEC JTC1/SC34 to represent subjects in a computer system [ISO.13250]. It comprises TAO [Topics, Associations, Occurrences]. The symbols that stand for subjects in a computer are called “topics,” and the “map” is an electronic index of terms. A subject represented as a topic is given names, occurrences, and associations. In a topic map, a subject can also be given a unique identifier in the form of a URI [Uniform Resource Identifier]. Occurrences are information resources about the subject. Associations establish relationships between subjects and define the roles the subjects play in those relationships.

Subjects in a Topic Map: PSIs [Published Subject Indicators]

In the Topic Maps standard, ISO 13250, a subject is defined as anything whatsoever, and a subject indicator is a resource that provides a positive, unambiguous indication of the identity of the subject. When that subject indicator is published and maintained at an advertised address, it becomes a “Published Subject Indicator.” Its corresponding URI string can be read by applications, and if two topics carry the same URI, their subjects are taken to be the same and the topics can be merged together with their resources.
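The merging rule can be illustrated with a small sketch. It is not the processing model of ISO 13250, only a simplified illustration under stated assumptions: the `Topic` class and `merge_topics` function are hypothetical names, the occurrence URIs under example.org are invented, and topics are merged whenever they share at least one subject indicator URI.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A topic: its names, its occurrences, and the subject indicator URIs that identify it."""
    names: set[str] = field(default_factory=set)
    occurrences: set[str] = field(default_factory=set)
    subject_indicators: set[str] = field(default_factory=set)


def merge_topics(topics: list[Topic]) -> list[Topic]:
    """Merge topics that share at least one subject indicator URI (simplified rule)."""
    merged: list[Topic] = []
    for topic in topics:
        # Find already-merged topics that share an indicator with this one.
        matches = [m for m in merged if m.subject_indicators & topic.subject_indicators]
        combined = Topic(set(topic.names), set(topic.occurrences), set(topic.subject_indicators))
        for m in matches:
            combined.names |= m.names
            combined.occurrences |= m.occurrences
            combined.subject_indicators |= m.subject_indicators
        # Drop the absorbed topics (by identity) and keep the combined one.
        merged = [m for m in merged if all(m is not x for x in matches)]
        merged.append(combined)
    return merged


INDIA = "http://psi.oasis-open.org/geolang/iso3166/#356"
JAPAN = "http://psi.oasis-open.org/geolang/iso3166/#392"

corporate = Topic({"India"}, {"http://example.org/corporate/india-office"}, {INDIA})
external = Topic({"Republic of India"}, {"http://example.org/factbook/india"}, {INDIA})
japan = Topic({"Japan"}, set(), {JAPAN})

result = merge_topics([corporate, external, japan])
print(len(result))  # 2
```

The two topics for India collapse into one topic that carries both names and both occurrences, while the topic for Japan is left untouched because its URI differs.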

Association and occurrence

An association and occurrence can be thought about in real-world terms. In a simple example, we have the sentence, “CompanyXYZ assigns Jane Doe to the location, India.” The association is “assignment” and can be thought of as “assigned by” or “assigns” depending on what the subject and object are. The association roles are assignee, assigner, and location. Jane is the subject playing the role of assignee, CompanyXYZ is the subject playing the role of assigner; India is the subject playing the role of location. Her CV [Curriculum Vitae] is an information resource (occurrence) which gives further information about the subject “Jane Doe.”
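The sentence above can be written down as a small data structure. This is a minimal sketch rather than a Topic Maps API: the `Association` class and its `players` method are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """An association of a given type, with each member listed as (role, player)."""
    type: str
    members: tuple[tuple[str, str], ...]

    def players(self, role: str) -> list[str]:
        """Return every subject playing the given role."""
        return [player for r, player in self.members if r == role]


assignment = Association(
    type="assignment",
    members=(
        ("assigner", "CompanyXYZ"),
        ("assignee", "Jane Doe"),
        ("location", "India"),
    ),
)

# Occurrences are kept per subject: here, the CV is a resource about Jane Doe.
occurrences = {"Jane Doe": ["http://psi.companyxyz.com/cv?id=jd43-1234"]}

print(assignment.players("assignee"))  # ['Jane Doe']
```

Because the roles are explicit, the same association can be read from either side ("assigns" or "assigned by") without storing it twice.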

Essential features of Topic Maps

There are two essential features of Topic Maps that distinguish it from other knowledge representations:

  • Abstract concepts and physical information resources (electronic files, books on a library shelf, card catalogs) can both be treated as “subjects” in their own right. There are separate methods for indicating the identity of abstract concepts and of tangible resources.
  • Information about the same subject can be merged. Merging happens when a Topic Map engine processes the topic map representation according to the rules of ISO 13250. In other words, the so-called “magic” is taken care of by the programmer who has designed a software module to process the data in the syntax according to the explicit rules of the ISO 13250 Topic Maps Data Model [Garshol.Moore.03].

Standard and corporate metadata, registries, and information resources

Before we can begin bridging information in Topic Maps, we need a centralized and standard method of defining and managing metadata throughout the enterprise. This would require a standard registry of PSIs to identify subjects.

Standard codes as PSIs

The GeoLang Technical Committee is now publishing draft PSIs based on ISO 3166-1 for countries and ISO 639 for languages, and they are now available at http://psi.oasis-open.org. For the details of how to publish these indicators, please refer to the OASIS Committee Standard on the basic requirements for published subjects [Pubsubj.03], and see the above location for new publications. An enterprise could publish all geography and regions subject indicators in one centralized registry. These indicators can be used as “identity points” in various systems in the enterprise. These PSIs are used extensively as metadata throughout the model described here.

Enterprise PSIs

Company-controlled metadata about employees

This would be the primary record for an employee, and a fixed URI for this record would play a critical role in the identity of the employee in information systems.

Here is an example for one employee:

  • Given and Family Name: Jane Doe
  • unique identifier: jd43-1234
  • stable URI that includes the identifier: http://psi.companyxyz.com/employee?id=jd43-1234
  • Name of country of assignment: India
  • stable URI for ISO 3166 country code for the country assignment of India: http://psi.oasis-open.org/geolang/iso3166/#356 [note 2]

Employee-controlled metadata of subjects

This is the personal record, such as a CV, of all of the subjects the employee wanted to publish about himself or herself within the enterprise. First of all, we would need a unique identifier for the resource itself, such as http://psi.companyxyz.com/cv?id=jd43-1234, and this CV could include any of these subjects with their corresponding URI references for the published subject indicators:

  • Work History: http://psi.companyxyz.com/cv?id=jd43-1234#work
  • Expertise: http://psi.companyxyz.com/cv?id=jd43-1234#expertise
  • Education: http://psi.companyxyz.com/cv?id=jd43-1234#education
  • Professional Affiliations: http://psi.companyxyz.com/cv?id=jd43-1234#affiliations
  • Publications and Presentations: http://psi.companyxyz.com/cv?id=jd43-1234#papers-talks
  • External Contacts: http://psi.companyxyz.com/cv?id=jd43-1234#external-contacts
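The URI patterns in the list above can be derived mechanically from the employee identifier. The following is a minimal sketch; the function names `employee_psi` and `cv_subject_psi` are hypothetical, and the URI layout is simply the one used in the examples of this paper.

```python
from urllib.parse import urldefrag, urlencode

BASE = "http://psi.companyxyz.com"


def employee_psi(employee_id: str) -> str:
    """Stable URI for the company-controlled employee record."""
    return f"{BASE}/employee?{urlencode({'id': employee_id})}"


def cv_subject_psi(employee_id: str, section: str) -> str:
    """URI for one subject (section) within the employee-controlled CV."""
    return f"{BASE}/cv?{urlencode({'id': employee_id})}#{section}"


expertise = cv_subject_psi("jd43-1234", "expertise")
print(expertise)  # http://psi.companyxyz.com/cv?id=jd43-1234#expertise

# Dropping the fragment recovers the identifier of the CV resource itself.
print(urldefrag(expertise).url)  # http://psi.companyxyz.com/cv?id=jd43-1234
```

Keeping the section in the fragment means every CV subject shares the URI of its parent resource, so a system can always get from a subject back to the CV it belongs to.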

Company-controlled metadata of corporate offices or service areas

The metadata can follow what was defined for employees described above:

  • Name: India-Pakistan Corporate Office
  • Unique identifier: bu5-1234
  • Stable URI that includes the identifier: http://psi.companyxyz.com/office?id=bu5-1234
  • ISO 3166-1 Country PSIs of the regional office, which are India: http://psi.oasis-open.org/geolang/iso3166/#356 and Pakistan: http://psi.oasis-open.org/geolang/iso3166/#586

External PSIs

Besides metadata used in companies, other subjects will need to be standardized as PSIs if they are to be used for merging information in Topic Maps. Some are standardized subjects such as the Dublin Core Metadata Elements. Other subjects may be people we know who are not our colleagues in the workplace.

Dublin Core metadata as PSIs

There are registries of subjects on the Web that could possibly be used in topic maps. All that is required is a stable URI with a description or definition. The Dublin Core Metadata Registry (http://dublincore.org/dcregistry/index.html) contains stable URIs that could be used as PSIs; one example is the URI for the Dublin Core Metadata element “language” defined as “A language of the intellectual content of the resource.” (http://purl.org/dc/elements/1.1/language) [DCMI.02]. This PSI could be used to type the language of the CV described above.

FOAF [Friend of a Friend] RDF [Resource Description Framework] Project

Standard metadata elements are one category of subjects, and people are another. In the CV example mentioned earlier, the PSI for the subject, “external contacts,” was mentioned, but PSIs for identifying people were not. It is not easy to reach agreement on the URI that could identify a person. A personal ID, such as a social security number, cannot be used in a URI, since it raises all kinds of security and safety issues.

Let us say that one of the external contacts described in the CV is Libby Miller. She is working with Dan Brickley on how to express the identity of people and other metadata about them on the Web through their FOAF vocabulary [FOAF.03]. A mailbox could perhaps be used to identify a person.

<foaf:mbox rdf:resource="mailto:libby.miller@bristol.ac.uk"/>
However, various security issues have been encountered when working with a personal mailbox. In discussions with Libby Miller, she mentioned that it may be better to use a one-way hash of the mailbox
<foaf:mbox_sha1sum>289d4d44325...</foaf:mbox_sha1sum>
instead of using the mailbox itself. Note, however, that FOAF identifies people indirectly: it relies on properties declared as owl:InverseFunctionalProperty (or daml:UnambiguousProperty in DAML+OIL). An RDF property defined this way points to something that can be used as a unique identifier for the person [Brickley.01].
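A one-way hash of a mailbox can be computed with a standard library. The sketch below assumes, as the FOAF vocabulary describes for mbox_sha1sum, that the SHA-1 digest is taken over the full mailbox URI including the mailto: prefix; the address used is a made-up example, and the function name `mbox_sha1sum` is chosen here only to mirror the property name.

```python
import hashlib


def mbox_sha1sum(mailbox_uri: str) -> str:
    """Return the hexadecimal SHA-1 digest of a mailbox URI such as 'mailto:someone@example.org'."""
    return hashlib.sha1(mailbox_uri.encode("utf-8")).hexdigest()


# Hypothetical address, used only for illustration.
digest = mbox_sha1sum("mailto:jane.doe@example.org")
print(f"<foaf:mbox_sha1sum>{digest}</foaf:mbox_sha1sum>")
```

The digest is always 40 hexadecimal characters and is the same every time for the same mailbox, so two descriptions of one person can still be matched without either of them exposing the address.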

This method, while useful within the context of distributed semantic networks, is not the best solution for Topic Maps, since the latter requires a direct method for identifying a subject. The differences can be better understood by reading a paper on how to convert FOAF data into Topic Maps [Garshol.03] and another on Subject Indicators for RDF [Pepper.Schwab.03]. The idea of having a public URI to identify a person may be best left for further discussion.

A corporate PSI registry

After defining these PSIs, there needs to be a central location within the company to publish them. Here is a use case for a corporate PSI registry:

Figure 1: PSI Registry

This registry will be a central web location within the intranet containing standard Published Subject Indicators with their accompanying URI references, which can be used for subjects in Topic Maps. The registry must be accessible by people and by systems that can retrieve the PSI URIs. Especially in the case of PSIs used to establish the identity of people, we need to have proper security in place. Designated creators and validators will be required, too, since any update to a URI will change important metadata about a person.

The registry will also include a way for systems to validate the URIs used in Topic Maps, to ensure that they are being used within the appropriate security level. This will ensure that a PSI for confidential information is not inadvertently placed on the public website of the company.

Once this registry is in place, all kinds of merging can be done in countless ways, while remaining within the proper security levels of access.
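The validation step might look like the following sketch. Everything here is an assumption made for illustration: the three security levels, the `REGISTRY` contents, and the `violations` function are hypothetical, since the paper does not prescribe how a registry stores or checks security levels.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical security levels, ordered from least to most restricted."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2


# Hypothetical registry contents: PSI URI -> security level of the subject.
REGISTRY = {
    "http://psi.oasis-open.org/geolang/iso3166/#356": Level.PUBLIC,
    "http://psi.companyxyz.com/office?id=bu5-1234": Level.INTERNAL,
    "http://psi.companyxyz.com/employee?id=jd43-1234": Level.CONFIDENTIAL,
}


def violations(psi_uris: list[str], target_level: Level) -> list[str]:
    """Return the URIs that are unregistered or too restricted for the publishing target."""
    problems = []
    for uri in psi_uris:
        level = REGISTRY.get(uri)
        if level is None or level > target_level:
            problems.append(uri)
    return problems


used_in_topic_map = list(REGISTRY)
print(violations(used_in_topic_map, Level.PUBLIC))
# ['http://psi.companyxyz.com/office?id=bu5-1234', 'http://psi.companyxyz.com/employee?id=jd43-1234']
```

A topic map destined for the public website would be rejected until the two restricted PSIs are removed, while the same map passes for a confidential intranet target.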

Building an integrated network of people and places

Employee metadata in a Topic Map

With Published Subject Indicators defined in our corporate PSI registry, we can now use their corresponding URI references to confer identity to our subjects in an employee topic map. Since the subjects (people) are to be related to other subjects (places) in a computer system, a data representation and a way to process the information are needed. Here, the data is represented in XTM [XML Topic Maps] syntax [Pepper.Moore.01] and viewed with a topic map browser.

As mentioned, every employee of the multinational or the service company must have a unique identifier and an assigned country of work in the system, and possibly a CV with relevant information about education, past work, expertise, and a list of external contacts.

Here is an example of how an employee could be represented in a topic map. The given and family names together form the base name. The CV of the employee is an occurrence.
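In XTM 1.0 syntax, such a topic could be produced as in the sketch below. It uses only the Python standard library and the XTM 1.0 element names (topic, subjectIdentity, subjectIndicatorRef, baseName, baseNameString, occurrence, resourceRef); the topic id "jane-doe" and the helper names are assumptions made for illustration, not part of the system described here.

```python
import xml.etree.ElementTree as ET

XTM = "http://www.topicmaps.org/xtm/1.0/"
XLINK = "http://www.w3.org/1999/xlink"
ET.register_namespace("", XTM)
ET.register_namespace("xlink", XLINK)

HREF = f"{{{XLINK}}}href"


def q(tag: str) -> str:
    """Qualify an element name with the XTM 1.0 namespace."""
    return f"{{{XTM}}}{tag}"


def employee_topic(topic_id: str, name: str, psi: str, cv_uri: str) -> ET.Element:
    """Build a topic with a subject indicator, one base name, and the CV as an occurrence."""
    topic = ET.Element(q("topic"), {"id": topic_id})
    identity = ET.SubElement(topic, q("subjectIdentity"))
    ET.SubElement(identity, q("subjectIndicatorRef"), {HREF: psi})
    base_name = ET.SubElement(topic, q("baseName"))
    ET.SubElement(base_name, q("baseNameString")).text = name
    occurrence = ET.SubElement(topic, q("occurrence"))
    ET.SubElement(occurrence, q("resourceRef"), {HREF: cv_uri})
    return topic


topic_map = ET.Element(q("topicMap"))
topic_map.append(
    employee_topic(
        "jane-doe",  # hypothetical topic id
        "Jane Doe",
        "http://psi.companyxyz.com/employee?id=jd43-1234",
        "http://psi.companyxyz.com/cv?id=jd43-1234",
    )
)
xtm_text = ET.tostring(topic_map, encoding="unicode")
print(xtm_text)
```

The subject indicator reference carries the identity used for merging, while the occurrence merely points at the CV as a resource about Jane.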

Figure 2: Jane Doe as subject in a Topic Map

More information, such as country location assignment, can be added as an association.

Figure 3: Jane Doe with country assignment

Figure 4 shows one employee working for the parent company and another working for the service company. The two have different country assignments.

In a topic map browser, we can view the employees who are assigned to a particular country. Personnel can locate people within a country and browse their CVs. Topic maps are not limited to searching through a browser. A Topic Map query language is to be standardized by ISO/IEC JTC1/SC34 WG3 and will enable a standardized way of performing search queries. There are already vendor implementations of a Topic Map Query Language [tolog.03].

The information in these two topic maps has not been merged yet. Jane and Taro remain with their fellow employees who have the same country assignments. The next step is to define the corporate office and regional service areas, and include countries in these areas.

Figure 4: Jane and Taro with country assignments

Scenario

The corporate office has a large project and is looking for experts; however, they have not been able to find the right people. Company XYZ has many local regional offices, whereas the service company has fewer employees with unique skills spread over large service areas. Consider the creation of a topic map containing the parent company employees merged with employees of the service company in a particular service area.

ISO 3166-1 country added to the service area

Returning to the two employees in separate companies and living in different countries, if it is decided that the range of work in one area would include the other, then these employees may in fact have a chance to work with each other.

For example, Jane, working for the parent company, is assigned to India, and her sphere of work is the India-Pakistan corporate office. On the other hand, we have Taro, working for the service company, who is assigned to Japan, and his sphere of work is the Pacific Rim region. Since both companies signed a contract to work on projects together, it has been decided that the closest geographic region to the India-Pakistan office is the Pacific Rim region of the service company. Hence, the countries of India and Pakistan are included in the Pacific Rim sphere of service, even though they do not geographically belong there.

Countries added to work regions can be a convenient way for companies to re-categorize groups of employees, since all the information about the employees is also merged into this new region. Both companies could temporarily rename the new workplace region, if they want to, to better represent the area.

Figure 5: Jane and Taro bridged

Looking at this model for the topic map, Jane is now included in the Asia Pacific Region, since India is now included in the region through the association “contained in.” The topic map now places Taro in the same regional workplace as Jane. Taro, for his part, may want to browse for job openings in projects in the new countries that have been added to his regional workplace.
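The effect of the “contained in” association can be sketched as a simple lookup. The dictionaries and the function name `people_in_region` are hypothetical, the region names follow the scenario text above, and the country URIs use the ISO 3166-1 numeric codes for India (356), Pakistan (586), and Japan (392) in the draft GeoLang URI pattern.

```python
INDIA = "http://psi.oasis-open.org/geolang/iso3166/#356"
PAKISTAN = "http://psi.oasis-open.org/geolang/iso3166/#586"
JAPAN = "http://psi.oasis-open.org/geolang/iso3166/#392"

# "contained in" associations: country PSI -> work regions that contain it.
# India and Pakistan have been added to the service company's Pacific Rim region.
CONTAINED_IN = {
    JAPAN: {"Pacific Rim"},
    INDIA: {"India-Pakistan", "Pacific Rim"},
    PAKISTAN: {"India-Pakistan", "Pacific Rim"},
}

# "assignment" associations: employee -> country PSI of assignment.
ASSIGNED_TO = {"Jane Doe": INDIA, "Taro": JAPAN}


def people_in_region(region: str) -> list[str]:
    """Return employees whose assigned country is contained in the given region."""
    return sorted(
        person
        for person, country in ASSIGNED_TO.items()
        if region in CONTAINED_IN.get(country, set())
    )


print(people_in_region("Pacific Rim"))     # ['Jane Doe', 'Taro']
print(people_in_region("India-Pakistan"))  # ['Jane Doe']
```

No employee record changes when the region is widened: adding one “contained in” association for a country is enough to bring everyone assigned to that country into the region.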

Merging in of other Topic Map resources

More and more information is being represented in XTM. Such topic maps, if they use standard PSIs, can be merged with our own topic maps that contain the same PSIs. Mondial.XTM is such a topic map, converted from the Mondial database by Lars Marius Garshol. The Mondial database is a case study for information extraction and integration by Wolfgang May. It has been compiled from the geographical Web data sources listed below:

  • CIA World Factbook
  • Global Statistics collected by Johan van der Heijden
  • Additional textual sources for coordinates
  • International Atlas by Kummerly & Frey, Rand McNally, and Westermann
  • Some geographical data of the Karlsruhe TERRA database
[mondial.03]

Since countries in the XTM file are also identified by the ISO 3166-1 PSIs, the resources attached to them will be merged with the information assigned the same PSIs in the corporate XTM files. In other words, when the countries are browsed, they will include all of the information listed above in addition to all of the employee, parent office, and service area information in the other maps.

Summary

Multinational corporations need to find a way to bridge information about their own employees with those of service companies they do business with. If a bridge can be created, then the parent companies will have an easier time finding the experts they need, and the service companies will have an easier time finding the projects that need their skills. Topic Map technologies could provide the bridge between corporate metadata and information that we publish about ourselves. Published subject indicators for geography can be used to selectively merge the content in corporate web sites with information resources on the Web. The technologies described in this paper may provide the specialty indexing, view, and ease of search for all classified content in any enterprise. Organizing, displaying, and finding information within our own companies and on the Web now may just be one of the biggest challenges facing the IT industry in the 21st century.

Notes

1.

The views and opinions of the author expressed in this paper do not necessarily state or reflect those of the corporation.

2.

This is based on a proposal of the OASIS GeoLang Technical Committee, and the identifier URIs are not finalized yet.


Acknowledgments

I would like to thank fellow members of the OASIS committees and ISO/JTC1/SC34 WG3, Lars Marius Garshol, Robert Barta, and Motomu Naito for discussions and review of the paper, and members of the OASIS GeoLang and Published Subjects Technical Committee for many discussions on how to implement Published Subjects in a corporate environment. I would also like to thank Libby Miller for a discussion about subject identity and FOAF. Finally, I would like to thank Neill A. Kipp for providing a timely editorial review.


Bibliography

[Brickley.01] Brickley, Dan. Practical RDF puzzles. 2001. http://rdfweb.org/people/danbri/2001/12/puzzle/unicorny.html.

[DCMI.02] Wagner, Harry, and Heery, Rachel. DCMI Roadmap for Development of Vocabulary Management and Schema Registry Systems. DCMI Working Draft. 18 Feb 2002. http://dublincore.org/groups/registry/DCMI-reg-roadmapv4.html.

[FOAF.03] Brickley, Dan, Miller, Libby, and rdfweb-dev listmembers. FOAF: the friend of a friend vocabulary. May 2003. http://xmlns.com/foaf/0.1/.

[Garshol.03] Garshol, Lars Marius. Living with topic maps and RDF. In XML Europe 2003 Conference Proceedings. May 2003. http://www.ontopia.net/topicmaps/materials/tmrdf.html.

[Garshol.Moore.03] Garshol, Lars Marius, and Moore, Graham. The Standard Application Model for Topic Maps. Working Draft. April 3, 2003. http://www.isotopicmaps.org/sam/sam-model/.

[ISO.13250] International Organization for Standardization. ISO/IEC 13250 Topic Maps. Second Edition. 19 May 2002. http://www.y12.doe.gov/sgml/sc34/document/0322_files/iso13250-2nd-ed-v2.pdf.

[mondial.03] May, Wolfgang. Information Extraction and Integration with Florid: The Mondial Case Study. University at Freiburg TechReport Number 131. 1999. http://www.informatik.uni-freiburg.de/~may/Mondial/.

[Pepper.Moore.01] Pepper, Steve, and Moore, Graham. XML Topic Maps (XTM) 1.0. First Edition. 6 August 2001. http://www.topicmaps.org/xtm/1.0/.

[Pepper.Schwab.03] Pepper, Steve, and Schwab, Silvia. Curing the Web’s Identity Crisis: Subject Indicators for RDF. In XML Europe 2003 Conference Proceedings. May 2003. http://www.ontopia.net/topicmaps/materials/identitycrisis.html.

[Pubsubj.03] Pepper, Steve. Published Subjects: Introduction and Basic Requirements. OASIS Published Subjects Technical Committee Recommendation, 2003-06-26. http://www.oasis-open.org/committees/tc_home.php?wg_abbrev=tm-pubsubj.

[tolog.03] Garshol, Lars Marius. tolog 0.1. Ontopia Technical Report, 10 March 2003. http://www.ontopia.net/topicmaps/materials/tolog-spec.html.


