MYCAREVENT: OWL and the automotive repair information supply chain: In Praise of the Noble OWL

Martin Bryan
martin.bryan@csw.co.uk
Jay Cousins
jay.cousins@csw.co.uk

Abstract

As part of its Information Society Technology program, the European Commission funded the MYCAREVENT [Mobility and Collaborative Work in European Vehicle Emergency Networks] project from October 2004 to September 2007. Involving research institutes, vehicle manufacturers, device manufacturers, roadside assistance organizations, IT suppliers and others, the project seeks to generate new business models based on innovative services for automobile drivers. MYCAREVENT's ontology is expressed in OWL (W3C Web Ontology Language). It is derived from a Generic and Integrated Information Reference Model and terminology for repair information, symptoms, and faults. This ontology is described, along with problems encountered in its development, and the ways in which OWL answers problems encountered when integrating information from the diverse participants in MYCAREVENT.

Keywords: Semantic Web; Modeling

Martin Bryan

Martin learnt the term "markup" at school more than 40 years ago, working in hot-metal in the school printing club. Since then he has followed the evolution of markup languages from the ISO standardization of proofreaders' marks to the latest batch of specialized XML-based markup languages.

Whilst not the longest serving member of the documentation preparation standards committees, Martin is one of the old timers, having put in 20 years of service for ISO and related standards bodies. He currently chairs the ISO working group creating a new generation of Document Schema Definition Languages, which includes RELAX NG, Schematron and his own "baby", the Document Schema Renaming Language (DSRL).

Recent research at CSW has led Martin to study the many possible roles of ontologies. After many years of studying various languages for modelling information, he was not expecting to find that yet another language was about to emerge. The power of OWL for modelling and configuring data using markup languages came as a pleasant surprise.

Jay Cousins

Jay is a Senior Consultant at CSW, a company specialising in helping businesses adopt XML technologies for the creation, management, and distribution of information. Jay works in business information analysis and modelling, specializing in the development of XML-based architectures such as those for NewsML and AdsML.

Jay has an M.Sc. in Analysis, Design, and Management of Information Systems from the London School of Economics, and a BA (Hons) in English with Comparative Literature from the University of East Anglia. He also studied at the Universität Salzburg, Austria under the ERASMUS exchange programme.

Martin Bryan [CSW Informatics]
Jay Cousins [CSW Informatics]

Extreme Markup Languages 2007® (Montréal, Québec)

Copyright © 2007 Martin Bryan and Jay Cousins. Reproduced with permission.

Introduction

Vehicle repair organizations, especially those involved in providing roadside assistance, have to be able to cope with servicing a wide range of vehicles produced by different manufacturers. Each manufacturer has its own vocabulary for describing components, faults, symptoms, etc., which is maintained in multiple languages. To search the available repair information for material relevant to repairing broken-down vehicles anywhere within the European Single Market, the different vocabularies used to describe different makes and models of vehicles need to be integrated.

The MYCAREVENT [Mobility and Collaborative Work in European Vehicle Emergency Networks] project has brought together European vehicle manufacturers, vehicle repair organisations, diagnostic tool manufacturers and IT specialists, including semantic web technologists, to study how to link together the wide range of information sets used to identify faults and repair vehicles. MYCAREVENT research has shown that it is possible to integrate and access information sets through a single service portal using a vocabulary to which the source terminologies of the different organisations participating in the project have been mapped. This vocabulary is recorded using the W3C Web Ontology Language (OWL). The project illustrates how OWL can be used to provide a method for integrating data from multiple systems and representing that data in such a way as to support querying.

This paper:

  • briefly introduces the main features of OWL
  • outlines some key benefits of OWL
  • explains the goals of the MYCAREVENT service portal
  • explains how OWL helped us to achieve the project goals
  • summarizes the conclusions reached on the suitability of OWL as a data management tool.

Introducing the high-flying OWL

Figure 1

"for Owl, wise though he was, able to read and write and spell his own name WOL…"

A.A. Milne, Winnie-the-Pooh

The W3C Web Ontology Language [W3C OWL] has been assigned, by a dyslexic lover of Winnie-the-Pooh's high-flying friend Wol, the acronym OWL. OWL is promoted by many as "the answer to the Semantic Web". Unfortunately nobody has yet adequately defined what question Semantic Web Technology is meant to answer! This paper suggests some ways in which OWL can provide answers to problems encountered by those of us trying to integrate the vast amounts of information that are available through the World Wide Web.

What's inside OWL?

OWL allows users to define:

  1. Class hierarchies
  2. Datatype properties
  3. Object properties
  4. Individual members of classes which have specific values for properties
  5. Rules for identifying class membership.

OWL also contains facilities for restricting the cardinality of properties, identifying where properties and individuals are equivalent to or the same as one another, identifying resources that define or provide additional information about classes or individual members of classes, joining classes to create class sets, etc. Many of these facilities are based on RDF/RDFS functionality [W3C RDF/XML][W3C RDFS]. OWL is, in practice, an RDFS application, OWL definitions being wrapped in an rdf:RDF wrapper.
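
As a minimal sketch of such a wrapper, the skeleton below shows the namespace declarations that fragments like those in this paper rely on. The http://example.org/mycarevent base URI is a placeholder assumed purely for illustration; it is not the project's actual namespace.

   <rdf:RDF
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:owl="http://www.w3.org/2002/07/owl#"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     xmlns="http://example.org/mycarevent#"
     xml:base="http://example.org/mycarevent">
    <!-- Placeholder base URI: an assumption, not the project's namespace -->
    <owl:Ontology rdf:about=""/>
    <!-- Class, property and individual definitions go here -->
   </rdf:RDF>

With xml:base set in this way, an rdf:ID such as "RepairProcedure" resolves to http://example.org/mycarevent#RepairProcedure, and rdf:about="#RepairProcedure" refers to the same resource.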

Class hierarchies in OWL

Classes in OWL can form polyhierarchical trees whose members inherit the properties of all parent classes. This "is-a" type of relationship is similar to that used in programming languages for defining classes of programming objects. Members of subclasses are seen as members of all their parent classes. In addition they can be assigned properties and restrictions that are specific to themselves and any subclasses they may be an ancestor to.

OWL classes can be assigned labels in multiple languages, and can be declared to be disjoint with sibling classes, so that an individual cannot be asserted to be a member of two sibling classes. Interestingly, OWL class relationships are defined bottom up rather than top down: each class names its parent, as the following example illustrates:

   <owl:Class rdf:ID="RepairProcedure">
    <rdfs:label xml:lang="de">Reparaturverfahren</rdfs:label>
    <rdfs:label xml:lang="en">repair procedure</rdfs:label>
    <rdfs:label xml:lang="es">procedimiento de reparación</rdfs:label>
    <rdfs:comment
     rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
      Procedures for replacing or repairing faulty components
    </rdfs:comment>
    <rdfs:subClassOf>
      <owl:Class rdf:about="#RepairInformation"/>
    </rdfs:subClassOf>
    <owl:disjointWith>
      <owl:Class rdf:about="#Maintenance"/>
    </owl:disjointWith>
   </owl:Class>

The ability to add multilingual labels to classes and individuals is a feature of OWL that is key for any globally oriented project, such as those funded by the European Commission and car manufacturers, which have to present concepts to users from different language communities.

OWL Datatype Properties

OWL datatype properties are properties in which data can be stored. Property values can be typed using any XML Schema datatype, and an enumerated list can be used to constrain the set of permitted values. Each datatype property can be applied to one or more OWL classes, known as the domain of the property. Datatype properties can be declared to be functional (at most one value per individual). They can additionally be constrained, for a specific class and all its subclasses, to a specific minimum and/or maximum number of occurrences. A typical datatype property definition has the form:

   <owl:DatatypeProperty rdf:about="#transmissionType">
    <rdfs:domain rdf:resource="#BuildSpecification"/>
    <rdfs:range
     rdf:resource="http://www.w3.org/2001/XMLSchema#string"/>
    <rdfs:comment
     rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
      Records the identifier for the transmission system
      (i.e. the gear type) of the vehicle as assigned by the
      vehicle manufacturer.
    </rdfs:comment>
    <rdfs:label xml:lang="en">transmission type</rdfs:label>
    <rdfs:label xml:lang="es">tipo de transmisión</rdfs:label>
    <rdfs:label xml:lang="de">Getriebeart</rdfs:label>
   </owl:DatatypeProperty>

Datatype properties can be used to store information that is derived from a spreadsheet table, database field or metadata description of a resource within an ontology.
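
The enumeration, functional and cardinality constraints mentioned in this section could be sketched as follows. The fuelType property and its two values are hypothetical, chosen only to illustrate the syntax; the cardinality restriction is applied to the transmissionType property shown above.

   <!-- Hypothetical functional property with an enumerated range -->
   <owl:DatatypeProperty rdf:ID="fuelType">
    <rdf:type
     rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
    <rdfs:domain rdf:resource="#BuildSpecification"/>
    <rdfs:range>
     <owl:DataRange>
      <owl:oneOf>
       <rdf:List>
        <rdf:first
         rdf:datatype="http://www.w3.org/2001/XMLSchema#string">petrol</rdf:first>
        <rdf:rest>
         <rdf:List>
          <rdf:first
           rdf:datatype="http://www.w3.org/2001/XMLSchema#string">diesel</rdf:first>
          <rdf:rest
           rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#nil"/>
         </rdf:List>
        </rdf:rest>
       </rdf:List>
      </owl:oneOf>
     </owl:DataRange>
    </rdfs:range>
   </owl:DatatypeProperty>

   <!-- At most one transmissionType per BuildSpecification -->
   <owl:Class rdf:about="#BuildSpecification">
    <rdfs:subClassOf>
     <owl:Restriction>
      <owl:onProperty rdf:resource="#transmissionType"/>
      <owl:maxCardinality
       rdf:datatype="http://www.w3.org/2001/XMLSchema#nonNegativeInteger">1</owl:maxCardinality>
     </owl:Restriction>
    </rdfs:subClassOf>
   </owl:Class>

Because the restriction is attached to the class rather than to the property, it is inherited by every subclass of BuildSpecification but leaves other classes that use the property unconstrained.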

OWL Object Properties

OWL object properties record relationships between individuals. Each object property can be used by one or more OWL classes, known as the domain of the property, and refers to previously defined individuals of one or more OWL classes, known as the range of the property. Object properties can be declared to be functional (at most one value per individual), and can be constrained to a specific minimum and/or maximum number of occurrences. They may additionally be declared to be symmetric, if the relationship holds in both directions between members of the same class; transitive, if they can be traversed as part of a sequence of relationships; or inverse functional, if each value identifies at most one individual, in the manner of a unique key. Each object property can have an inverse relationship associated with it, so that relationships can be navigated in both directions. An object property with an inverse relationship could be defined as:

   <owl:ObjectProperty rdf:ID="describes">
    <rdfs:domain rdf:resource="#Term"/>
    <rdfs:range>
     <owl:Class>
       <owl:unionOf rdf:parseType="Collection">
         <owl:Class rdf:about="#BuildSpecification"/>
         <owl:Class rdf:about="#Condition"/>
         <owl:Class rdf:about="#Fault"/>
         <owl:Class rdf:about="#Symptom"/>
         <owl:Class rdf:about="#Vehicle"/>
         <owl:Class rdf:about="#VehicleSystem"/>
       </owl:unionOf>
     </owl:Class>
    </rdfs:range>
    <owl:inverseOf rdf:resource="#describedBy"/>
    <rdfs:label xml:lang="en">describes</rdfs:label>
    <rdfs:label xml:lang="de">beschreibt</rdfs:label>
    <rdfs:label xml:lang="es">describe</rdfs:label>
   </owl:ObjectProperty>

Object properties allow many different types of relationships to be recorded within an ontology, so that users do not have to be constrained to the narrow range of relationships provided by taxonomies based on relationships such as broader term, narrower term or related term.
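
The describedBy property referenced by owl:inverseOf is not shown in the fragment above. A plausible definition, together with a symmetric property, might look like the sketch below. The relatedTerm property is hypothetical and is included only to illustrate the syntax.

   <!-- Inverse of "describes": navigates from the described item to its Term -->
   <owl:ObjectProperty rdf:ID="describedBy">
    <rdfs:range rdf:resource="#Term"/>
    <owl:inverseOf rdf:resource="#describes"/>
   </owl:ObjectProperty>

   <!-- Hypothetical symmetric property: if A relatedTerm B then B relatedTerm A -->
   <owl:SymmetricProperty rdf:ID="relatedTerm">
    <rdfs:domain rdf:resource="#Term"/>
    <rdfs:range rdf:resource="#Term"/>
   </owl:SymmetricProperty>

Given the inverse declaration, asserting that a Term describes a Fault is enough for a reasoner to conclude that the Fault is describedBy that Term; the second statement does not need to be stored.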

Individual Class Members

OWL individuals are members of a class which have had valid values assigned to all compulsory properties. Each individual is uniquely identified by an rdf:ID attribute; further statements about it can be attached elsewhere by referring to that identifier in an rdf:about attribute. Each individual can also be assigned annotation properties that provide multilingual labels, cross-references and version information. A simple individual could be defined as:

   <TestResult rdf:ID="TestResult_1">
    <testIdentifier  rdf:datatype="http://www.w3.org/2001/XMLSchema#string">
      VW EET 1
    </testIdentifier>
    <producedBy rdf:resource="#DiagnosticTest_2"/>
    <rdfs:label xml:lang="en">Exhaust Emissions Test 1</rdfs:label>
    <rdfs:label xml:lang="de">Abgastest 1</rdfs:label>
    <rdfs:label xml:lang="es">prueba de emisiones de extractor 1</rdfs:label>
    <owl:versionInfo>V2 – 2007-07-26</owl:versionInfo>
    <rdfs:seeAlso rdf:resource="#DocumentedTestProcedure_23"/>
   </TestResult>

Individuals can be used to record the type of data stored in a spreadsheet row or database record within an ontology.

Rules for Identifying Class Membership

Two types of rules are used within OWL: necessary rules, which restrict class properties, and necessary and sufficient rules, which identify a set of property values that every member of a class must have and that are, in turn, enough to make an individual a member of that class. Rules are used to identify which inferred classes an individual should be assigned to. A typical restriction could be defined as:

    <owl:Class rdf:ID="VW">
     <owl:equivalentClass>
      <owl:Class>
       <owl:intersectionOf rdf:parseType="Collection">
        <owl:Restriction>
         <owl:onProperty rdf:resource="#make"/>
         <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Volkswagen</owl:hasValue>
        </owl:Restriction>
        <owl:Class rdf:about="#Vehicle"/>
       </owl:intersectionOf>
      </owl:Class>
     </owl:equivalentClass>
     <rdfs:label xml:lang="es">Volkswagen</rdfs:label>
     <rdfs:label xml:lang="de">Volkswagen</rdfs:label>
     <rdfs:label xml:lang="en">Volkswagen</rdfs:label>
    </owl:Class>

Necessary and sufficient rules allow classes to be subdivided into subsets of individuals that can be used by specific applications.
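
To see the rule at work, consider an individual that is asserted only to be a Vehicle. The identifier Vehicle_42 is hypothetical; the make property and its value are those used in the restriction above.

   <!-- Asserted as a Vehicle only; nothing here mentions the VW class -->
   <Vehicle rdf:ID="Vehicle_42">
    <make rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Volkswagen</make>
   </Vehicle>

Because this individual is a Vehicle and has the value Volkswagen for make, it satisfies the necessary and sufficient conditions of the VW class, so a DL reasoner would be expected to classify it as a member of VW without any explicit assertion.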

Why deploy OWL?

OWL is a natural format for expressing object oriented programmable information in XML. I am not going to attempt to explain the benefits of XML here. If you chose to come to Extreme Markup Languages you will already know them as well as I do. But what are the benefits of being able to express programmable information in XML?

Programmable information differs from the sort of sequentially ordered information we normally find in XML documents in that it consists of a set of discrete pieces of information that are related to each other without having a predefined order in which they should be viewed.

When object oriented techniques are used, programmable information consists of a set of object classes, each class having a set of datatype properties associated with it, and a set of relationships between classes. This is exactly what OWL provides. OWL does not attempt to record any methods for handling objects: that is the domain of programming languages and is not a significant requirement when exchanging information. OWL provides a platform independent method of moving information from one programming environment to another.

OWL also allows rules to be defined that allow subclass membership to be inferred from the values of specific properties. This means that large sets of resources can be sub-classified into searchable sets of information that can be conveniently displayed to users.

Because OWL is defined as an RDFS application, it can make use of the standard RDFS functionality for providing multilingual labels for objects, and for identifying related individuals (rdfs:seeAlso). When combined with the owl:sameAs extension, these functions made it possible to use OWL to develop a multilingual terminology that can link together problem information recorded in a wide range of languages by European vehicle manufacturers as part of the EC-funded MYCAREVENT IST project (http://www.mycarevent.com). The rest of this paper illustrates how OWL was deployed within MYCAREVENT.
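
A sketch of how such a link between terms might be recorded is given below. The namespace URIs, identifiers and labels are all invented for illustration and are not taken from the MYCAREVENT terminology.

   <!-- Hypothetical term supplied by one manufacturer -->
   <Term rdf:about="http://example.org/vw#Term_17">
    <rdfs:label xml:lang="de">Lichtmaschine</rdfs:label>
    <rdfs:label xml:lang="en">alternator</rdfs:label>
    <owl:sameAs rdf:resource="http://example.org/fiat#Term_203"/>
    <rdfs:seeAlso rdf:resource="http://example.org/vw/docs/alternator.xml"/>
   </Term>

   <!-- Hypothetical term supplied by another manufacturer -->
   <Term rdf:about="http://example.org/fiat#Term_203">
    <rdfs:label xml:lang="it">alternatore</rdfs:label>
   </Term>

Since owl:sameAs states that the two identifiers denote the same individual, a query that starts from either term can make use of the labels and cross-references recorded against both.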

What is MYCAREVENT?

The IST Mobility and Collaborative Work in European Vehicle Emergency Networks (MYCAREVENT) project started in October 2004 and will run until September 2007. It brings together 5 research institutes and universities from different research areas (FIR, AU, UH-EDM, IKAT, ETHZ), 4 car manufacturers (BMW, VW, DC, Fiat), 1 on-board diagnostic device manufacturer (Omitec), 2 independent roadside assistance organizations (RAC, RACC), 1 mobile solutions provider (Care2Wear), 1 telecommunication solutions provider (Telefónica), 2 e-business consulting companies (MUL Services, CSW Group), 1 standards body (DIN) and 3 IT suppliers (European Microsoft Innovation Centre, EURO IT&C, ESG).

The automotive market has become one of the most important and complex industries in the EU, due to the rapid development and change in electronics, electrics, software and hardware used to create modern vehicles.

The EU automotive aftercare market generates a turnover of about €84 billion a year; automotive replacement parts account for around half of this figure, some 45% of which is supplied by independent aftermarket (IAM) suppliers. The 210 million motorists in the EU spend on average approximately €5000 during the average vehicle's lifetime on repair and maintenance.

Major service providers in this sector are franchised dealers (120,000 dealers employing 1.5 million people) and independent repair shops (160,000 garages employing about 600,000 people). In addition, 18,000 roadside service vehicles fulfil 14 million missions a year.

With the introduction of the EU Block Exemption regulation in 2002, service providers have a legal right to access different kinds of repair information, training material and tools from all manufacturers supplying vehicles to Europe. Under the new regulations repairers cannot be forced to use original spare parts. Only if repair costs arise which are covered by the vehicle manufacturer, for example warranty work, free servicing and vehicle recall work, can the vehicle manufacturer insist on the use of original spare parts. Other than that, matching quality spare parts of the manufacturers or of independent suppliers can be used.

MYCAREVENT has brought together a wide range of partners to establish a European model of excellence that leverages innovative applications and state-of-the-art technologies to make the repair information supply chain more transparent, competitive and lucrative. It is developing and implementing new applications and services which can be seamlessly and securely accessed by mobile devices. These provide manufacturer-specific car repair information that can be used to repair problems identified by Off/On Board Diagnostic (OBD) systems, mechanics or vehicle owners.

The breakdown information is presented in different languages to enable vehicle repairers to interact with service portals of independent service suppliers as well as those of car manufacturers in the language they best understand.

Using these solutions and features, the MYCAREVENT service portal is providing new business opportunities to service suppliers.

Three pilot scenarios have been designed as a proof-of-concept for the innovative techniques adopted for the MYCAREVENT portal:

  1. Pilot I shows solutions for OEM workshops and OEM roadside technicians; for roadside technicians, remote access to the MYCAREVENT service portal is provided so that they can retrieve repair instructions for specific repair cases from a specific manufacturer.
  2. Pilot II also offers the concept of remote access to car repair information from multiple manufacturers through an integrated MYCAREVENT service portal. This pilot focuses on providing independent workshops and roadside assistance services with information from a range of manufacturers.
  3. Pilot III demonstrates the concept of Driver Self-help Services. The driver is provided with access to the MYCAREVENT service portal to enable him to help himself in situations where a breakdown can easily be solved with just a little advice.

The role of ontologies in MYCAREVENT

The MYCAREVENT Ontologies workpackage was responsible for developing the theoretical models, data structures and terminology sets used by other workpackages to create a Service Portal for accessing repair information. The workpackage brought together data modelling specialists, implementers and content providers (including motor manufacturers and roadside assistance organisations) to create a set of ‘information artefacts’ which can be used throughout MYCAREVENT.

The MYCAREVENT Ontologies workpackage has produced:

  1. A Generic and Integrated Information Reference Model [GIIRM], providing a high-level common conceptual model of the MYCAREVENT mobile service world, expressed in UML
  2. A set of W3C XML Schemas [Schemas] derived from the GIIRM, used for the representation of data in messages, metadata and interfaces
  3. Terminology for populating the GIIRM, enabling repair information, symptoms and faults to be described in an agreed way [Terminology]
  4. A W3C Web Ontology Language (OWL) ontology [Ontology], derived from the GIIRM and the terminology, in which data sources can be registered, and accessed, by the MYCAREVENT applications.

Detailed descriptions of all four components can be obtained from http://www.mycarevent.com/deliverables.aspx.

Taken together, these structures support a model whereby the user establishes a search query when their client accesses the project’s presentation tier, the presentation tier passes the query to the portal and the portal routes the queries to the appropriate back-end services. The information structures are used for representation of information and messaging between architectural components, providing standard interfaces for project developers.

To explain how this works, we begin with the GIIRM and then show how other artefacts are derived from and relate to this core model.

Figure 2

Conceptual model of the information items and relationships in the automotive repair information domain

Figure 2 shows the logical model used to describe the top-level information domain, identifying the principal objects and identifying the key relationships that "connect" repair information and fault descriptions. The main information object types are: Product (a Vehicle within MYCAREVENT), BuildSpecification, System (a VehicleSystem within MYCAREVENT), Condition, Symptom, DiagnosticTest, TestResult, Fault, RepairInformation, Term, Terminology and HumanLanguage. These object types define architectural forms from which more specific classes of information can be defined.

The information used to populate this model is externalized as ISO/IEC 14662 Open-edi Reference Model conformant Information Bundles [1], each bundle representing a specific unit of business information required to support the functionality of the MYCAREVENT portal. An Information Bundle can itself be decomposed into the specific units of business information that make up the bundle; each of these units is a Semantic Component [2] of the Information Bundle. Conceptually, an Information Bundle is an object and Semantic Components are its characteristics (properties); characteristics can be attributes, composite attributes, and relationships with other objects.

In MYCAREVENT Information Bundles represent the concept of a set of information exchanged between a user and the MYCAREVENT portal during interactions with the portal. Each Information Bundle contains the business information required to support a defined interaction between a user and the MYCAREVENT portal. Information Bundles enable the vehicle with the fault to be identified, the known fault or problem symptom to be described, and solutions to the described faults and problems to be returned as a list from which the user can choose the solutions they require.

The information model specifies how the values of an attribute are to be recorded in an ISO/IEC 11179-5:2005 conformant "representation form" that specifies the set of valid values that an attribute is permitted to take. Depending on the complexity of the instance data to be recorded, attribute values can be represented as atomic values using datatypes or as non-atomic values using composite datatypes. Composite datatypes are used in contexts where an attribute value cannot be recorded as a single atomic value but needs to be recorded as a set of attribute values. Conceptually, a representation form can be either a composite datatype or a datatype.

An OWL ontology is used to formally represent the relationships between the concepts of the GIIRM, and to support the sharing of information across the portal. The relationships expressed between the GIIRM objects as specified in the GIIRM conceptual model form the core set of relationships defined in the ontology. The composite datatype attributes in the GIIRM model are represented as classes in the ontology, with the relationships between classes represented as object properties; simple datatype attributes are represented as datatype properties of classes. Figure 3 illustrates how the model is reflected in the ontology once the terminology-related components have been removed from the diagram.

Figure 3

Representation of the conceptual model within an ontology

In the ontology each of the conceptual components provided in the GIIRM is subclassed to provide a working set of classes. These subclasses can be further subclassed as required to create a working class hierarchy. For example, the TestResult class in the GIIRM has two subclasses in the ontology, SystemReadinessTest and DiagnosticTroubleCode, the second of which is further subdivided into ISO15031DTC and ManufacturerSpecificDTC subclasses.
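
Expressed in the bottom-up style used earlier for RepairProcedure, part of this hierarchy might be declared as in the following sketch. Labels, comments and disjointness statements are omitted, and the exact form of the project's own definitions is not reproduced here.

   <!-- Two direct subclasses of the GIIRM TestResult class -->
   <owl:Class rdf:ID="SystemReadinessTest">
    <rdfs:subClassOf rdf:resource="#TestResult"/>
   </owl:Class>
   <owl:Class rdf:ID="DiagnosticTroubleCode">
    <rdfs:subClassOf rdf:resource="#TestResult"/>
   </owl:Class>

   <!-- A further specialization of DiagnosticTroubleCode -->
   <owl:Class rdf:ID="ISO15031DTC">
    <rdfs:subClassOf rdf:resource="#DiagnosticTroubleCode"/>
   </owl:Class>

Any individual asserted to be an ISO15031DTC is thereby also a DiagnosticTroubleCode and a TestResult, and inherits the properties defined for both.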

In some cases classes can be inferred from the properties of individuals. For example, the individuals of the BuildSpecification class record the build specifications for vehicles (e.g. the VW Polo 1.8 GTI). Although there is no asserted hierarchy of build specifications, it is possible to infer hierarchical classes of them based on the properties of the vehicle. For example, since the VW Polo 1.8 GTI has properties identifying its manufacturer as Volkswagen (the make property) and its model name as Polo (the model property), it could be automatically inferred to be a member of both the 'PoloBuildSpecification' and 'VWBuildSpecification' virtual classes.
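
One of these virtual classes could be defined along the lines of the sketch below, which follows the pattern of the VW class shown earlier. It assumes that make and model are string-valued datatype properties of BuildSpecification; the definition is illustrative rather than a copy of the project's ontology.

   <owl:Class rdf:ID="PoloBuildSpecification">
    <owl:equivalentClass>
     <owl:Class>
      <owl:intersectionOf rdf:parseType="Collection">
       <owl:Class rdf:about="#BuildSpecification"/>
       <!-- make must have the value Volkswagen -->
       <owl:Restriction>
        <owl:onProperty rdf:resource="#make"/>
        <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Volkswagen</owl:hasValue>
       </owl:Restriction>
       <!-- model must have the value Polo -->
       <owl:Restriction>
        <owl:onProperty rdf:resource="#model"/>
        <owl:hasValue rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Polo</owl:hasValue>
       </owl:Restriction>
      </owl:intersectionOf>
     </owl:Class>
    </owl:equivalentClass>
   </owl:Class>

If a VWBuildSpecification class is defined in the same way using only the make restriction, a reasoner can also work out that PoloBuildSpecification is a subclass of it, which yields the inferred hierarchy without any asserted subclass statements.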

Figure 4

Representation of classes and their properties within Protégé

Figure 4 shows how information relating to classes and properties used in the MYCAREVENT ontology is displayed in the Protégé OWL editor. The classes defined in the GIIRM are shown in the class hierarchy in the left-hand window. Classes that have been subclassed are indicated by an arrow next to their name. In the diagram the VehicleSystem class has been specialized into 8 subclasses, Body, Cooling and Heating, Electrical Equipment, Running Gear, Security, Transmission, Vehicle Braking Systems and Vehicle Engine, each of which has been further specialized. In the case of the Cooling and Heating subclass a further level of classification has been applied to differentiate components used for Air Conditioning, Coolant Temperature Sensors, Heating, Radiation and Ventilation.

In the top right pane the name of the class has been defined in three languages using rdfs:label elements with xml:lang attributes to differentiate their application. In the central pane the relationships associated with the currently selected class are identified as being object properties associated with class members. It can be seen from this that the model used for the ontology is directly derived from the GIIRM but extends it as required to provide user-friendly navigation of available options through the displayed class hierarchy.

Managing Ontologies

Representing the GIIRM model within an OWL ontology proved to be fairly easy once it was decided that we would not try to map the part-whole relationships required for a vehicle component mereology within OWL. The management of individuals within OWL repositories proved to be much more problematic.

The first problem was how to obtain data that could be used to populate the ontology. Manufacturers do not have OWL files they could supply: they have databases that contain some of the information, which could sometimes be joined to provide details of all the relationships that could be recorded in the ontology. But each manufacturer has a different set of databases requiring a different set of joins. By introducing a set of standardized Information Bundles that could be used to populate specific classes within the ontology we were able to decouple the processes of information generation from those of information dissemination.

The second problem we had was that the same identifiers could be allocated to different Information Bundles by different manufacturers. This could be simply overcome by using XML namespaces to qualify identifiers supplied by different manufacturers. The trick here was to use rdf:about attributes in place of rdf:ID identifiers. These allow the identifiers assigned to individuals to be namespace qualified, e.g. VW:BuildSpecification1234, using a namespace assigned to each manufacturer, rather than having to be unique across the whole ontology.
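
In RDF/XML the value of rdf:about is a URI reference rather than a qualified name, so one common way of writing manufacturer-qualified identifiers compactly is to declare an entity for each manufacturer's namespace. The sketch below assumes this idiom; the namespace URIs are placeholders and are not those used by the project.

   <!DOCTYPE rdf:RDF [
     <!-- Placeholder namespaces, assumed for illustration only -->
     <!ENTITY vw  "http://example.org/vw#">
     <!ENTITY mce "http://example.org/mycarevent#">
   ]>
   <rdf:RDF
     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
     xmlns:mce="&mce;">
    <!-- Corresponds to VW:BuildSpecification1234 in the text -->
    <mce:BuildSpecification rdf:about="&vw;BuildSpecification1234">
     <mce:make rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Volkswagen</mce:make>
    </mce:BuildSpecification>
   </rdf:RDF>

A second manufacturer can then supply its own BuildSpecification1234 under a different entity without the two identifiers colliding.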

A third problem was how to manage the constant stream of updates to the ontology caused by revisions to vehicle model specifications, component availability, etc. Here the concept of a distributed virtual ontology came into play. Individuals supplied by a given manufacturer are stored in separate repositories and linked to the ontology model by use of nested sets of owl:imports statements. Using this technique the latest updates for a particular class of information could be maintained in a separate repository, which would be referenced from the ontology.
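
A top-level ontology of this kind might contain little more than a header such as the one sketched below. The document URLs are hypothetical; each imported document can in turn carry its own owl:imports statements, which is what produces the nesting.

   <owl:Ontology rdf:about="">
    <!-- Shared class and property definitions (hypothetical location) -->
    <owl:imports rdf:resource="http://example.org/mycarevent/model.owl"/>
    <!-- Individuals maintained separately for each manufacturer -->
    <owl:imports rdf:resource="http://example.org/vw/build-specifications.owl"/>
    <owl:imports rdf:resource="http://example.org/fiat/build-specifications.owl"/>
   </owl:Ontology>

Updating the individuals for one manufacturer then means replacing a single imported document, leaving the model and the other manufacturers' repositories untouched.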

The use of nested sets of ontologies has additional advantages for the project. By identifying the make and model of car for which a fault has been reported and, where possible, the vehicle subsystem that was faulty, it becomes possible to restrict searches to specific parts of the repository, rather than having to search a single ontology containing details of hundreds of models made by dozens of manufacturers. Not only could this significantly increase the speed of searches, it could also greatly improve the accuracy of the results, ensuring that users were provided only with links that were relevant to their problem.

A problem still remains, however, whenever information available for fault repair is generic in nature. For example, roadside repair organisations have prepared fault finding documents that allow mechanics to identify faults across a wide range of vehicles when conditions such as 'Engine Won't Start' or 'All Lights Failed' occur. Rather than trying to maintain the system in such a way that each new document has to be linked manually to every individual build specification, we need a way to link such information to all members of the class.

While OWL allows relationships between individuals and classes, use of this option makes it impossible to apply reasoning based on the use of Description Logics (DL) to such relationships. To be able to infer class membership you need to restrict your applications to the OWL DL subset of OWL, which means that you cannot include relationships between individuals and classes. As we needed to be able to infer class membership within our large ontology to provide navigable subsets from large classes, it was important that we did not require individuals to link to classes.

To overcome this problem we were able to create a set of privileged individuals, each identifying a generic instance of a particular class. Each class or subclass can be assigned a single generic member, A:ClassName, in a reserved namespace. When the reserved namespace is encountered in an individual name, the Advanced Query System (AQS) could be programmed to understand that the relationship in question applies to all members of the class, not just to a single individual, although this means that some individuals have to be treated differently from others. By adopting this simple-to-apply rule we would be able to retain the power of reasoning with DL-aware reasoning engines while still being able to apply relationships to all members of a class.
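
The convention could be sketched as follows. The reserved namespace URI, the roadside namespace and the appliesTo property are all hypothetical names introduced for this illustration; the source specifies only the A:ClassName pattern.

   <!-- Generic member standing for every BuildSpecification
        (reserved namespace URI is an assumption) -->
   <BuildSpecification
     rdf:about="http://example.org/generic#BuildSpecification"/>

   <!-- Generic fault finding document linked to the generic member;
        "appliesTo" is a hypothetical object property -->
   <RepairInformation
     rdf:about="http://example.org/roadside#EngineWontStartGuide">
    <rdfs:label xml:lang="en">Engine Won't Start</rdfs:label>
    <appliesTo rdf:resource="http://example.org/generic#BuildSpecification"/>
   </RepairInformation>

To a DL reasoner this is an ordinary relationship between two individuals, so the ontology stays within OWL DL; it is the query system that treats the reserved namespace as meaning "all members of BuildSpecification".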

MYCAREVENT Information Bundles provide a consistent format for recording information relating to individuals. By storing information from different manufacturers in different files, and separately importing these into the ontology that was being used to query the data, we were able to avoid one of the most annoying features of Protégé: its propensity to restructure the information presented to it when rewriting files. Protégé makes the greatest possible use of OWL's flexibility by nesting definitions whenever it first encounters a reference to them. This can lead to definitions being nested many levels deep, which makes searching for information within the XML file extremely difficult. We found early on that converting the OWL file to a canonical format made it much easier to process. Frustratingly, Protégé has a canonical form of the file but uses it only to display the XML: it provides no way of saving this representation other than copying and pasting it into another file.
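The kind of flattening we have in mind can be sketched as follows. This is an illustrative assumption about what a canonical form might look like, not Protégé's own algorithm: it hoists named nested definitions to the top level and leaves an rdf:resource reference in their place, while anonymous nodes stay where they are. The class names in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ID, ABOUT, RESOURCE = (f"{{{RDF}}}{name}" for name in ("ID", "about", "resource"))

SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:Class rdf:ID="DieselEngine">
    <rdfs:subClassOf>
      <owl:Class rdf:ID="Engine">
        <rdfs:subClassOf>
          <owl:Class rdf:ID="VehicleSubsystem"/>
        </rdfs:subClassOf>
      </owl:Class>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>"""


def flatten(root):
    """Hoist named nested definitions to the top level of the rdf:RDF element."""
    queue = list(root)
    while queue:
        definition = queue.pop(0)
        for prop in definition:
            for nested in list(prop):
                ident = nested.get(ID)
                ref = "#" + ident if ident else nested.get(ABOUT)
                if not ref:
                    continue  # anonymous nodes (e.g. restrictions) stay nested
                prop.remove(nested)
                prop.set(RESOURCE, ref)   # leave a reference behind
                root.append(nested)       # definition becomes top-level
                queue.append(nested)      # it may contain further nesting
    return root


flat = flatten(ET.fromstring(SAMPLE))
```

After flattening, every named class is a direct child of the root element, so a single-level XPath or XSLT match is enough to find any definition.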

Conclusions

OWL provides a simple way of representing data about programming objects in XML. It enables the recording of both the attributes of an object and the relationships between individual members of a class in the form of RDF triples without having to adopt the extremely basic subject, predicate, object representation defined in the RDF standard. Further, OWL allows us to move to an ontology language available in three flavours – Lite, DL, and Full. An arbitrary RDF ontology falls into OWL Full, for which reasoning carries no computational guarantees. With OWL we have the power to limit our expressiveness so as to stay within computational bounds and so take advantage of DL tools.

OWL provides a simple yet efficient representation of information that is easily understood by both programmers and data users. It clearly identifies the datatype of each piece of information it stores. It clearly identifies each information object, and each reference to an information object that forms a relationship between two objects. Where required, OWL enables the automatic generation of bi-directional relationships.

OWL, unfortunately, allows nested constructs for representing data models, and provides multiple ways of describing the same thing. For example, to declare a property as being functional (at most one value allowed per individual) you can either declare it using the owl:FunctionalProperty element or use an rdf:type property to declare that a datatype or object property is a functional property. In the latter case the type declaration can be made in a separate statement about the property, rather than in its main definition. This multiplicity of formats makes the processing of OWL data models problematic when using XSLT. It becomes much easier to process OWL data models if they are stored in a canonical format without any nested definitions.
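The following sketch shows why a processor has to look in several places. It collects functional properties from a small RDF/XML fragment that uses all three spellings: the dedicated element, an rdf:type inside the main definition, and an rdf:type in a separate statement. The property names are hypothetical, and the code inspects top-level definitions only, so it would miss properties that a tool had nested inside other definitions.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"

# Hypothetical fragment; in a full OWL DL ontology hasVIN would also be
# declared as a datatype or object property.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:FunctionalProperty rdf:ID="hasVIN"/>
  <owl:DatatypeProperty rdf:ID="hasBuildDate">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
  </owl:DatatypeProperty>
  <owl:ObjectProperty rdf:ID="hasEngine"/>
  <owl:ObjectProperty rdf:about="#hasEngine">
    <rdf:type rdf:resource="http://www.w3.org/2002/07/owl#FunctionalProperty"/>
  </owl:ObjectProperty>
  <owl:ObjectProperty rdf:ID="hasSymptom"/>
</rdf:RDF>"""


def local_name(element):
    """Return the local identifier from rdf:ID or from an rdf:about fragment."""
    ident = element.get(f"{{{RDF}}}ID")
    if ident is not None:
        return ident
    return element.get(f"{{{RDF}}}about", "").rsplit("#", 1)[-1]


def functional_properties(rdf_xml):
    """Collect the names of properties declared functional in either style."""
    found = set()
    for element in ET.fromstring(rdf_xml):
        # Style 1: the dedicated owl:FunctionalProperty element.
        if element.tag == f"{{{OWL}}}FunctionalProperty":
            found.add(local_name(element))
        # Style 2: an rdf:type child, in the main or a separate statement.
        for type_el in element.findall(f"{{{RDF}}}type"):
            if type_el.get(f"{{{RDF}}}resource") == f"{OWL}FunctionalProperty":
                found.add(local_name(element))
    return found
```

An XSLT stylesheet faces the same branching, which is why a single canonical spelling simplifies processing so much.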

OWL provides facilities for importing files from multiple sources, allowing virtual ontologies to be created using data from a number of suppliers. Namespaces can be used to qualify identifiers and references to identifiers to ensure that clashes do not occur when the same identifier is used by different information suppliers.

OWL provides the owl:sameAs construct to identify equivalence of individuals and an owl:equivalentClass construct that can be used to identify equivalence between classes. This means that when manufacturers assign different identifiers to the same products, or different names to the same type of information, the relationships between the two sets of entries can be recorded in a file that is outside of either of the information sets maintained by their suppliers. This file can be maintained as part of the virtual ontology while being version managed in a content management system.
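As a rough illustration of what such an external mapping file makes possible, the sketch below groups identifiers that are linked, directly or transitively, by equivalence statements. The identifier pairs are hypothetical stand-ins for statements read from the mapping file, and a DL reasoner would normally derive these groups itself.

```python
def equivalence_groups(same_as_pairs):
    """Group identifiers connected by sameAs statements (union-find)."""
    parent = {}

    def find(ident):
        parent.setdefault(ident, ident)
        while parent[ident] != ident:
            parent[ident] = parent[parent[ident]]  # path compression
            ident = parent[ident]
        return ident

    for left, right in same_as_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    groups = {}
    for ident in list(parent):
        groups.setdefault(find(ident), set()).add(ident)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical statements from a mapping file held outside both suppliers' data.
PAIRS = [
    ("makeA:Part123", "supplier:P-9"),
    ("supplier:P-9", "makeB:7781"),
    ("makeA:Part500", "makeB:9000"),
]
```

Neither supplier's file needs to change when a new equivalence is discovered; only the mapping file is revised and re-versioned.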

Further areas of research have been identified as a result of the project. Tooling, and the abbreviation of OWL RDF/XML that it introduces, has added unwanted complexity, as noted above with regard to XSLT; a canonical exchange format needs further research. Modularising the ontology to allow subsetting has proved beneficial for data management, but further research is needed to fully identify the boundaries between DL and Full, so that ontology subsetting can be aligned with computational boundaries and we can reason on sets of data with clearly defined limits. The problems of dealing with other types of relationships, such as part-of and adjacent-to, which do not fit neatly into is-a hierarchies, must also be addressed before we can provide all the reasoning required for the project within an ontology-controlled environment.

Notes

1. Information Bundles are defined in ISO/IEC 14662 Open-edi Reference Model as "The formal description of the semantics of the recorded information to be exchanged by parties in the scenario of a business transaction. The Information Bundle models the semantic aspects of the business information. Information bundles are constructed using Semantic Components."

2. Semantic Components are defined in ISO/IEC 14662 Open-edi Reference Model as "A unit of information unambiguously defined in the context of the business goal of the business transaction. A Semantic Component may be atomic or composed of other Semantic Components."


Acknowledgments

The Advanced Query Service (AQS) was developed at CSW by Inigo Surguy. Integration of the schemas, ontology and AQS into the MYCAREVENT service portal was undertaken at MUL under the guidance of Dr. Frank Schönherr. Development of the ontology, information bundles and MYCAREVENT services was made possible thanks to the collaboration of too many of our co-workers to make specific reference to here, under the overall technical management of Anthony Scott of CSW.


Bibliography

[GIIRM] MYCAREVENT Deliverable 3.2 Generic and Integrated Reference Model, http://www.mycarevent.com/deliverables.aspx

[Ontology] MYCAREVENT Deliverable 3.5, Section 6: Ontology data structures for use by MYCAREVENT services, Data Structure and XML Structure Definitions for MYCAREVENT, http://www.mycarevent.com/deliverables.aspx

[Schemas] MYCAREVENT Deliverable 3.5, Section 5: XML Schemas for Information Bundles, Data Structure and XML Structure Definitions for MYCAREVENT, http://www.mycarevent.com/deliverables.aspx

[Terminology] MYCAREVENT Deliverable 3.4 Terminology and Methodology for Populating the Generic Information Model, http://www.mycarevent.com/deliverables.aspx

[W3C OWL] OWL Web Ontology Language Reference, http://www.w3.org/TR/owl-ref/

[W3C RDF/XML] RDF/XML Syntax Specification (Revised), http://www.w3.org/TR/rdf-syntax-grammar/

[W3C RDFS] RDF Vocabulary Description Language 1.0: RDF Schema, http://www.w3.org/TR/rdf-schema/


