Definition of the
CIDOC
Conceptual Reference Model

Version 5.0.1

March 2009


 

 Table of Contents

  1. Initial Page
  2. Introduction
    1. Objectives of the CIDOC CRM
    2. Scope of the CIDOC CRM
    3. Compatibility with the CRM
      1. Utility of CRM compatibility
      2. The Information Integration Environment
      3. CRM-Compatible Form
      4. CRM Compatibility of Data Structure
      5. CRM Compatibility of Information Systems
      6. Compatibility claim declaration
    4. Applied Form
      1. Terminology
      2. Property Quantifiers
      3. Naming Conventions
    5. Modelling principles
      1. Monotonicity
      2. Minimality
      3. Shortcuts
      4. Disjointness
      5. About Types
      6. Extensions
      7. Coverage
    6. Examples
  3. The Entity and Property List
  4. APPENDIX


Introduction

 

This document is the formal definition of the CIDOC Conceptual Reference Model (“CRM”), a formal ontology intended to facilitate the integration, mediation and interchange of heterogeneous cultural heritage information. The CRM is the culmination of more than a decade of standards development work by the International Committee for Documentation (CIDOC) of the International Council of Museums (ICOM). Work on the CRM itself began in 1996 under the auspices of the ICOM-CIDOC Documentation Standards Working Group. Since 2000, development of the CRM has been officially delegated by ICOM-CIDOC to the CIDOC CRM Special Interest Group, which collaborates with the ISO working group ISO/TC46/SC4/WG9 to bring the CRM to the form and status of an International Standard.

Objectives of the CIDOC CRM

The primary role of the CRM is to enable information exchange and integration between heterogeneous sources of cultural heritage information. It aims at providing the semantic definitions and clarifications needed to transform disparate, localised information sources into a coherent global resource, be it within a larger institution, in intranets or on the Internet.

 

Its perspective is supra-institutional and abstracted from any specific local context. This goal determines the constructs and level of detail of the CRM.

 

More specifically, it defines and is restricted to the underlying semantics of database schemata and document structures used in cultural heritage and museum documentation in terms of a formal ontology. It does not define any of the terminology appearing typically as data in the respective data structures; however it foresees the characteristic relationships for its use. It does not aim at proposing what cultural institutions should document. Rather it explains the logic of what they actually currently document, and thereby enables semantic interoperability.

 

It intends to provide an optimal analysis of the intellectual structure of cultural documentation in logical terms. As such, it is not optimised to implementation-specific storage and processing aspects. Rather, it provides the means to understand the effects of such optimisations to the semantic accessibility of the respective contents.

 

The CRM aims to support the following specific functionalities:

  • Inform developers of information systems as a guide to good practice in conceptual modelling, in order to effectively structure and relate information assets of cultural documentation.
  • Serve as a common language for domain experts and IT developers to formulate requirements and to agree on system functionalities with respect to the correct handling of cultural contents.
  • Serve as a formal language for the identification of common information contents in different data formats; in particular, support the implementation of automatic data transformation algorithms from local to global data structures without loss of meaning. Such algorithms are useful for data exchange, data migration from legacy systems, information integration and mediation of heterogeneous sources.
  • Support associative queries against integrated resources by providing a global model of the basic classes and their associations to formulate such queries.
  • It is further believed that advanced natural language algorithms and case-specific heuristics can take significant advantage of the CRM to resolve free text information into a formal logical form, if that is regarded as beneficial. The CRM is, however, not thought to be a means to replace scholarly text, rich in meaning, by logical forms, but only a means to identify related data.

 

Users of the CRM should be aware that the definition of data entry systems requires support of community-specific terminology, guidance to what should be documented and in which sequence, and application-specific consistency controls. The CRM does not provide such notions.

 

By its very structure and formalism, the CRM is extensible and users are encouraged to create extensions for the needs of more specialized communities and applications.

Scope of the CIDOC CRM

The overall scope of the CIDOC CRM can be summarised in simple terms as the curated knowledge of museums.

 

However, a more detailed and useful definition can be articulated by defining both the Intended Scope, a broad and maximally-inclusive definition of general application principles, and the Practical Scope, which is expressed by the overall scope of a reference set of specific identifiable museum documentation standards and practices that the CRM aims to encompass, however restricted in its details to the limitations of the Intended Scope.

 

The Intended Scope of the CRM may be defined as all information required for the exchange and integration of heterogeneous scientific documentation of museum collections. This definition requires further elaboration:

 

  • The term “scientific documentation” is intended to convey the requirement that the depth and quality of descriptive information that can be handled by the CRM should be sufficient for serious academic research. This does not mean that information intended for presentation to members of the general public is excluded, but rather that the CRM is intended to provide the level of detail and precision expected and required by museum professionals and researchers in the field.
  • The term “museum collections” is intended to cover all types of material collected and displayed by museums and related institutions, as defined by ICOM[1]. This includes collections, sites and monuments relating to fields such as social history, ethnography, archaeology, fine and applied arts, natural history, history of sciences and technology.
  • The documentation of collections includes the detailed description of individual items within collections, groups of items and collections as a whole. The CRM is specifically intended to cover contextual information: the historical, geographical and theoretical background that gives museum collections much of their cultural significance and value.
  • The exchange of relevant information with libraries and archives, and the harmonisation of the CRM with their models, falls within the Intended Scope of the CRM.
  • Information required solely for the administration and management of cultural institutions, such as information relating to personnel, accounting, and visitor statistics, falls outside the Intended Scope of the CRM.

 

The Practical Scope[2] of the CRM is expressed in terms of the current reference standards for museum documentation that have been used to guide and validate the CRM’s development. The CRM covers the same domain of discourse as the union of these reference standards; this means that for data correctly encoded according to these museum documentation standards there can be a CRM-compatible expression that conveys the same meaning.

 

Compatibility with the CRM

Utility of CRM compatibility

The goal of the CRM is to enable the integration of the largest number of information resources. Therefore it aims to provide the greatest flexibility of systems to become compatible, rather than imposing one particular solution.

Users intending to take advantage of the semantic interoperability offered by the CRM may want to make parts of their data structures compatible with the CRM. Compatibility may pertain either to the associations by which users would like their data to be accessible in an integrated environment, or to the contents intended for transport to other environments, allowing encoded meaning to be preserved in a target system.

The CRM does not require complete matching of all user documentation structures with the CRM, nor that systems should always implement all CRM concepts and associations; instead it leaves room both for extensions, needed to capture the full richness of cultural information, and for simplifications, required for reasons of economy.

Furthermore, the CRM provides a means of interpreting structured information so that large amounts of data can be transformed or mediated automatically. It does not require unstructured or semi-structured free text information to be analysed into a formal logical representation. In other words, it does not aim to provide more structure than users have previously provided. The interpretation of information in the form of free text falls outside the scope of compatibility considerations. The CRM does, however, allow free text information to be integrated with structured information.

The Information Integration Environment

 

The notion of CRM compatibility is based on interoperability. Interoperability is best defined on the basis of specific communication practices between information systems. Following current practice, we distinguish the following types of information integration environments pertaining to information systems:

    1. Local information systems. These are either collection management systems or content management systems that constitute institutional memories and are maintained by an institution. They are used for primary data entry, i.e. a relevant part of the information, be it data or metadata, is primary information in digital form that fulfils institutional needs.
    2. Integrated access systems. These provide a homogeneous access layer to multiple local systems. The information they manage resides primarily on local systems. We distinguish between:

a.    Materialized access systems, which physically import data provided by local systems, using a data warehouse approach. Such systems may employ so-called metadata harvesting techniques or rely on data submission. Data may be transformed to respect the schema of the access system before being merged. 

b.    Mediation systems [Gio Wiederhold], which send out queries, formulated according to a virtual global schema, to multiple local systems and then collect and integrate the answers. The queries may be transformed to a local schema either by the mediation system or by the receiving local system itself.

Local systems may also import data from other systems, in order to complement collections, or to merge information from other systems. An information system may export information for migration and preservation.

Compatibility with the CRM pertains to one or more of the following data communication capabilities or use cases:

    1. data falling within the scope of the CRM can be exported from an information system into an encoded form without loss of meaning with respect to CRM concepts;
    2. data falling within the scope of the CRM can be transformed into another encoded form without loss of meaning with respect to CRM concepts;
    3. data falling within the scope of the CRM can be imported from an encoded form into an information system without loss of meaning with respect to CRM concepts;
    4. data falling within the scope of the CRM that is contained in an information system can be queried and retrieved exhaustively in terms of CRM concepts, subject to the expressive power of a particular query language.

 

Any declaration of CRM compatibility must specify one or more of the above use cases. System and data structure providers shall not declare their products as “CRM compatible” without specifying the appropriate use cases as detailed below.

In the context of this chapter, the expression “without loss of meaning with respect to the CRM concepts” means the following: The CRM concepts are used to classify items of discourse and their relationships. By virtue of this classification, data can be understood as propositions of a kind declared by the CRM about real world facts, such as “Object x. forms part of: Object y”. In case the encoding, i.e. the language used to describe a fact, is changed, only an expert conversant with both languages can assess if the two propositions do indeed describe the same fact. If this is the case, then there is no loss of meaning with respect to CRM concepts. Communities of practice requiring fewer concepts than the CRM declares may restrict CRM compatibility with respect to an explicitly declared subset of the CRM.

Users of this standard may communicate CRM compatible data, as detailed below, with data structures and systems that are either more detailed and specialized than the CRM or whose scope extends beyond  that of the CRM. In such cases, the standard guarantees only the preservation of meaning with respect to CRM concepts. However, additional information that can be regarded as extending CRM concepts may be communicated and preserved in CRM compatible systems through the appropriate use of controlled terminology. The specification of the latter techniques does not fall under the scope of this standard. Communities of practice requiring extensions to the CRM are encouraged to declare their extensions as CRM-compatible standards.

CRM-Compatible Form

The CRM is a formal ontology which can be expressed in terms of logic or a suitable knowledge representation language. Its concepts can be instantiated as sets of statements that provide a model of reality. We call any encoding of such CRM instances in a formal language that preserves the relations between the CRM classes, properties and inheritance rules  a “CRM-compatible form”. Hence data expressed in any CRM-compatible form can be automatically transformed into any other CRM-compatible form without loss of meaning. Classes and properties of the CRM are identified by their initial codes, such as “E55” or “P12”. The names of classes and properties of a CRM-compatible form may be translated into any local language, but the identifying codes must be preserved. A CRM-compatible form should not implement the quantifiers of CRM properties as cardinality constraints for the encoded instances. Quantifiers may be implemented in an informative way, or not at all. Statements that violate quantifiers should be treated as alternative knowledge.
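The rule that codes identify concepts while names may be localised can be illustrated with a minimal Python sketch. The `Statement` tuple, the `LABELS` table and the German label are assumptions made for this illustration only; they are not part of the CRM definition, and any encoding that preserves the codes would serve equally well.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One CRM proposition: subject --property--> object (hypothetical encoding)."""
    subject: str
    property_code: str  # identifying code, e.g. "P46"; never translated
    obj: str


# Display names per language. Only the code identifies the property.
# The German label is an illustrative translation, not an official one.
LABELS = {
    "P46": {"en": "is composed of", "de": "besteht aus"},
}


def render(stmt: Statement, lang: str) -> str:
    """Render a statement with a localised property name, keeping the code."""
    name = LABELS[stmt.property_code][lang]
    return f"{stmt.subject}. {stmt.property_code} {name}: {stmt.obj}"


# "Object x forms part of Object y", stated in the forward direction of P46.
stmt = Statement("Object y", "P46", "Object x")
```

Rendering `stmt` in either language yields a different name but the same code, so two systems using different display languages can still recognise that they hold the same proposition.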

Any encoding of CRM instances in a formal language that preserves the relations within a consistent subset of CRM classes, properties and inheritance rules is regarded as a “reduced CRM-compatible form”, if:

    • all the conditions applicable to a CRM-compatible form are respected;
    • the subset does not violate the rules of subsumption and inheritance;
    • any instance of the reduced CRM-compatible form is also a valid instance of a (full) CRM-compatible form;
    • the subset contains at least the following concepts:

 

E1    CRM Entity
E2    -   Temporal Entity
E4    -   -   Period
E5    -   -   -   Event
E7    -   -   -   -   Activity
E11   -   -   -   -   -   Modification
E12   -   -   -   -   -   -   Production
E13   -   -   -   -   -   Attribute Assignment
E65   -   -   -   -   -   Creation
E63   -   -   -   -   Beginning of Existence
E12   -   -   -   -   -   Production
E65   -   -   -   -   -   Creation
E64   -   -   -   -   End of Existence
E77   -   Persistent Item
E70   -   -   Thing
E72   -   -   -   Legal Object
E18   -   -   -   -   Physical Thing
E24   -   -   -   -   -   Physical Man-Made Thing
E90   -   -   -   -   Symbolic Object
E71   -   -   -   Man-Made Thing
E24   -   -   -   -   Physical Man-Made Thing
E28   -   -   -   -   Conceptual Object
E89   -   -   -   -   -   Propositional Object
E30   -   -   -   -   -   -   Right
E73   -   -   -   -   -   -   Information Object
E90   -   -   -   -   -   Symbolic Object
E41   -   -   -   -   -   -   Appellation
E73   -   -   -   -   -   -   Information Object
E55   -   -   -   -   -   Type
E39   -   -   Actor
E74   -   -   -   Group
E52   -   Time-Span
E53   -   Place
E54   -   Dimension
E59   Primitive Value
E61   -   Time Primitive
E62   -   String

 

Property id | Property Name | Entity - Domain | Entity - Range
P1 | is identified by (identifies) | E1 CRM Entity | E41 Appellation
P2 | has type (is type of) | E1 CRM Entity | E55 Type
P3 | has note | E1 CRM Entity | E62 String
P4 | has time-span (is time-span of) | E2 Temporal Entity | E52 Time-Span
P7 | took place at (witnessed) | E4 Period | E53 Place
P10 | falls within (contains) | E4 Period | E4 Period
P12 | occurred in the presence of (was present at) | E5 Event | E77 Persistent Item
P11 | -   had participant (participated in) | E5 Event | E39 Actor
P14 | -   -   carried out by (performed) | E7 Activity | E39 Actor
P16 | -   used specific object (was used for) | E7 Activity | E70 Thing
P31 | -   has modified (was modified by) | E11 Modification | E24 Physical Man-Made Thing
P108 | -   -   has produced (was produced by) | E12 Production | E24 Physical Man-Made Thing
P92 | -   brought into existence (was brought into existence by) | E63 Beginning of Existence | E77 Persistent Item
P108 | -   -   has produced (was produced by) | E12 Production | E24 Physical Man-Made Thing
P94 | -   -   has created (was created by) | E65 Creation | E28 Conceptual Object
P93 | -   took out of existence (was taken out of existence by) | E64 End of Existence | E77 Persistent Item
P15 | was influenced by (influenced) | E7 Activity | E1 CRM Entity
P16 | -   used specific object (was used for) | E7 Activity | E70 Thing
P20 | had specific purpose (was purpose of) | E7 Activity | E7 Activity
P43 | has dimension (is dimension of) | E70 Thing | E54 Dimension
P46 | is composed of (forms part of) | E18 Physical Thing | E18 Physical Thing
P59 | has section (is located on or within) | E18 Physical Thing | E53 Place
P67 | refers to (is referred to by) | E89 Propositional Object | E1 CRM Entity
P75 | possesses (is possessed by) | E39 Actor | E30 Right
P81 | ongoing throughout | E52 Time-Span | E61 Time Primitive
P82 | at some time within | E52 Time-Span | E61 Time Primitive
P89 | falls within (contains) | E53 Place | E53 Place
P104 | is subject to (applies to) | E72 Legal Object | E30 Right
P106 | is composed of (forms part of) | E90 Symbolic Object | E90 Symbolic Object
P107 | has current or former member (is current or former member of) | E74 Group | E39 Actor
P127 | has broader term (has narrower term) | E55 Type | E55 Type
P128 | carries (is carried by) | E24 Physical Man-Made Thing | E73 Information Object
P130 | shows features of (features are also found on) | E70 Thing | E70 Thing
P140 | assigned attribute to (was attributed by) | E13 Attribute Assignment | E1 CRM Entity
P141 | assigned (was assigned by) | E13 Attribute Assignment | E1 CRM Entity
P148 | has component (is component of) | E89 Propositional Object | E89 Propositional Object
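
The last condition for a reduced CRM-compatible form, that the subset contains at least the classes and properties listed above, lends itself to a mechanical check. The following Python sketch is one possible way to do it; the function name `missing_core` and the representation of a subset as two sets of codes are assumptions of this illustration. It does not check the other conditions (subsumption, inheritance, validity of instances).

```python
# Codes taken from the two tables above (duplicated rows counted once).
REQUIRED_CLASSES = frozenset({
    "E1", "E2", "E4", "E5", "E7", "E11", "E12", "E13", "E18", "E24",
    "E28", "E30", "E39", "E41", "E52", "E53", "E54", "E55", "E59", "E61",
    "E62", "E63", "E64", "E65", "E70", "E71", "E72", "E73", "E74", "E77",
    "E89", "E90",
})
REQUIRED_PROPERTIES = frozenset({
    "P1", "P2", "P3", "P4", "P7", "P10", "P11", "P12", "P14", "P15",
    "P16", "P20", "P31", "P43", "P46", "P59", "P67", "P75", "P81", "P82",
    "P89", "P92", "P93", "P94", "P104", "P106", "P107", "P108", "P127",
    "P128", "P130", "P140", "P141", "P148",
})


def _by_number(code: str) -> int:
    """Sort key: the numeric part of a code such as 'E55' or 'P127'."""
    return int(code[1:])


def missing_core(classes, properties):
    """Return (missing classes, missing properties) for a candidate subset."""
    missing_classes = sorted(REQUIRED_CLASSES - set(classes), key=_by_number)
    missing_props = sorted(REQUIRED_PROPERTIES - set(properties), key=_by_number)
    return missing_classes, missing_props
```

A candidate subset that returns two empty lists satisfies the minimum-content condition; any codes returned name what would have to be added.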

 

CRM Compatibility of Data Structure

 

A data structure is export-compatible with the CRM if it is possible to transform any data from this data structure into a CRM-compatible form without loss of meaning. Implicit concepts may be present in elements of the data structure that are not supported by the CRM. As long as these concepts can be encoded as instances of E55 Type (i.e. as terminology) and attached unambiguously to their respective data items with suitable properties, the data structure is still regarded as export compatible.

Note that not all CRM concepts may be represented by elements of an export-compatible data structure. All data from export-compatible data structures can be transported in a CRM-compatible form. In particular any CRM compatible form or reduced CRM-compatible form is export-compatible with the CRM.

A data structure is import-compatible with the CRM if it is possible to automatically transform any data from a CRM-compatible form into this data structure without loss of meaning, simply on the basis of knowledge about the data structure elements being used. This implies that a data record transformed into this data structure from a CRM-compatible form can be transformed back into the CRM-compatible form without loss of meaning. Note that the back-transformation into a CRM-compatible form may result in a data record that is semantically equivalent but not identical with the original.

Any CRM-compatible form is automatically import-compatible with the CRM. Note that an import-compatible data structure may be semantically richer than the CRM. It may contain elements that, through the use of a transformation algorithm, can be made to correspond to CRM concepts or specializations thereof, as well as elements with meanings that fall outside the scope of the CRM. However, it must not contain elements that overlap in meaning with CRM concepts and which cannot be subsumed via transformation by a CRM concept other than E1 CRM Entity and E77 Persistent Item.

Import-compatible data structures may be used to transport data for applications that require concepts that lie beyond the scope of the CRM, as well as data from any export-compatible data structure. Note that, in general, applications may make use of data from a CRM import-compatible data structure that has been exported into a CRM compatible form by semantic reduction to CRM concepts, i.e. by generalizing all subsumed concepts to the most specific CRM concept applicable, and by discarding elements that fall outside the scope of the CRM.
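
Semantic reduction as described here can be sketched in a few lines of Python. The sketch assumes that the extended data structure declares its classes as specializations of CRM classes; the `X`-prefixed codes and their names are invented for the example and do not belong to any real extension. With multiple inheritance the function may return more than one CRM class, and it makes no attempt to filter out a class that is itself a superclass of another result.

```python
# Hypothetical extension hierarchy: the X-codes are invented for this sketch.
SUPERCLASSES = {
    "X1": ["X0"],   # X1 Inscription Fragment IsA X0 Inscribed Object
    "X0": ["E24"],  # X0 Inscribed Object IsA E24 Physical Man-Made Thing
    "X9": [],       # X9 Visitor Count: administrative, outside the CRM scope
}
# Excerpt of CRM class codes; a real check would list all of them.
CRM_CLASSES = {"E1", "E18", "E24", "E70", "E71", "E77"}


def reduce_to_crm(cls, superclasses=SUPERCLASSES, crm_classes=CRM_CLASSES):
    """Return the nearest CRM classes that subsume `cls`.

    An empty set means the element falls outside the scope of the CRM
    and would be discarded by the reduction.
    """
    if cls in crm_classes:
        return {cls}
    nearest = set()
    for parent in superclasses.get(cls, []):
        nearest |= reduce_to_crm(parent, superclasses, crm_classes)
    return nearest
```

In this example an instance of the hypothetical `X1` is generalized to E24 Physical Man-Made Thing, while an instance of `X9` has no CRM counterpart and is dropped.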

A data structure is partially import-compatible with the CRM if the above holds for a reduced CRM-compatible form.

 

CRM Compatibility of Information Systems

An information system is export-compatible with the CRM if it is possible to export all user data from this information system into an import-compatible data structure. This capability is the recommended kind of CRM-compatibility for local information systems.

An information system is partially export compatible if it is possible to export all user data from this information system into a partially import-compatible data structure. This is not the recommended kind of CRM-compatibility, but it may not be feasible for legacy systems to acquire a higher level of CRM compatibility without unreasonable effort. This reduced level of CRM compatibility is nonetheless highly useful.

Note that there is no minimum requirement for the classes and properties that must be present in the exported user data. Therefore it is possible that the data may pertain to instances of just a single property, such as E21 Person. P131 is identified by: E82 Actor Appellation.

An information system is import-compatible with the CRM if it is possible to import data encoded in a CRM-compatible form and to access the data in a manner equivalent to and homogeneous with all generic data of this system that fall under the same concepts. This capability is considered as the normal kind of CRM compatibility for integrated access systems that physically copy source data in a data warehouse style (materialized access systems).

An information system is partially import-compatible with the CRM if it is possible to import data encoded in a reduced CRM-compatible form and to access the data in a manner equivalent to and homogeneous with all generic data of this system that fall under the same concepts. Depending on the functional requirements, it makes sense for integrated access systems to offer access services of reduced complexity by being only partially import-compatible with the CRM.

Note that it makes sense for integrated access systems to import data from extended data structures by semantic reduction to CRM defined concepts.

Note that local information system providers may choose to make their systems import-compatible with the CRM in order to exchange data, for example in the case of museum object loans or for system migration purposes. Communities of practice may choose to agree on import compatibility for extended data structures.

Some local information systems are likely to focus on specialized subject areas, such as inscriptions. For these specialized systems, the ability to import a specific data structure is recommended. This should be export-compatible with the CRM, and encompass the concepts that are required by the subject matter (“dedicated import compatibility”).

An information system is access-compatible with the CRM if it is possible to access the user data in the information system by querying with CRM classes and properties so that the meaning of the answers to the queries corresponds to the query terms used. It is not regarded as a reduction of compatibility if access is limited to data deemed to be exchanged.

An information system is partially access-compatible with the CRM if it is possible to access the user data in the information system by querying with a consistent subset of CRM classes and properties, corresponding to a reduced CRM-compatible form, so that the meaning of the answers to the queries corresponds to the query terms used.

An access-compatible system may be export-compatible with respect to the query answers. Note that it may make sense for an access-compatible content management system to return only content items in response to queries rather than being export compatible.

 

 

 

Fig. 1: Possible data flow between different kinds of CRM-compatible systems and data structures

 

Fig. 1 shows a symbolic representation of some of the data flow patterns defined above between different kinds of CRM-compatible systems and data structures. In this figure it is assumed that the Local System B exports data into a CRM export-compatible data structure, which implies that it can be exported into a CRM-compatible form or any other CRM import-compatible data structure. Therefore Local System B is export-compatible with the CRM. For Local System A, the figure symbolizes the case where the exported data contain elements that correspond to specializations of the CRM or fall out of its scope.

Compatibility claim declaration

 

A provider of a data structure or information system claiming compatibility with the CRM has to provide a declaration that describes the kind of compatibility and, depending on the kind, the following additional information: 

    • For export-compatible data structures:

The subset of CRM concepts directly instantiated by any possible data in this data structure after transformation into a CRM-compatible form.

    • For export-compatible systems:
      1. A declaration of configurable user data elements, if any, that are not semantically restricted to a CRM concept (other than E1 CRM Entity or E77 Persistent Item).
      2. User data elements or units that are not exported.
      3. The subset of CRM concepts directly instantiated by any possible data exported from the system after transformation into a CRM-compatible form.
    • For partially or dedicated import-compatible systems:

The subset of CRM concepts under which data can be imported into the system.

    • For access-compatible systems:
      1. The query language by which the system can be queried.
      2. The subset of CRM concepts directly instantiated by any possible query answers exported from the system after transformation into a CRM-compatible form.
      3. For partially access-compatible systems, the subset of CRM concepts by which the system can be queried.
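
As an illustration only, a provider could record such a declaration in a small structure and check it for omissions before publishing. The Python sketch below is an assumption of this example: the class `CompatibilityClaim`, the `kind` strings and the function `missing_items` are invented here, and the items specific to export-compatible systems (configurable elements, elements not exported) are left out for brevity.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class CompatibilityClaim:
    """A provider's declaration (hypothetical structure, not defined by the CRM)."""
    kind: str                               # e.g. "access-compatible system"
    concepts: FrozenSet[str] = frozenset()  # declared subset of CRM codes
    query_language: Optional[str] = None    # only relevant for access compatibility


# Kinds for which the list above asks for a declared subset of CRM concepts.
NEEDS_CONCEPTS = {
    "export-compatible data structure",
    "export-compatible system",
    "partially import-compatible system",
    "dedicated import-compatible system",
    "access-compatible system",
    "partially access-compatible system",
}
# Kinds for which the query language must be named.
NEEDS_QUERY_LANGUAGE = {
    "access-compatible system",
    "partially access-compatible system",
}


def missing_items(claim: CompatibilityClaim) -> List[str]:
    """List the required items that the declaration does not yet provide."""
    missing = []
    if claim.kind in NEEDS_CONCEPTS and not claim.concepts:
        missing.append("subset of CRM concepts")
    if claim.kind in NEEDS_QUERY_LANGUAGE and not claim.query_language:
        missing.append("query language")
    return missing
```

An empty result means only that the listed items are present; it says nothing about whether the claim itself holds, which still has to be demonstrated with test data.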

 

The provider should be able to demonstrate the claim with suitable test data, according to the procedures included in any applicable certificate practice related statement.

The provider should either make evidence of these procedures publicly available on the Internet on a site nominated by the ISO community of use, so that any third party is able to verify the claim with suitable test data, or acquire a certificate by a certification authority (CA).

A trusted third party, recognised and authorised by a competent regulatory authority to act as a CA in this practice area, should be able to verify the credentials of the provider applying for such a certificate, and thus its claim, with suitable test data before issuing the certificate, so that users can trust the information in the CA certificates.

The CA will grant the provider of the certified system the right to use the “CRM compatible” logo.

 

Applied Form

The CRM is an ontology in the sense used in computer science. It has been expressed as an object-oriented semantic model, in the hope that this formulation will be comprehensible to both documentation experts and information scientists alike, while at the same time being readily converted to machine-readable formats such as RDF Schema, KIF, DAML+OIL, OWL, STEP, etc. It can be implemented in any relational or object-oriented schema. CRM instances can also be encoded in RDF, XML, DAML+OIL, OWL and others.

 

Although the definition of the CRM provided here is complete, it is an intentionally compact and concise presentation of the CRM’s 86 classes and 137 unique properties. It does not attempt to articulate the inheritance of properties by subclasses throughout the class hierarchy (this would require the declaration of several thousand properties, as opposed to 137). However, this definition does contain all of the information necessary to infer and automatically generate a full declaration of all properties, including inherited properties.
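
How a full declaration can be generated from the compact one may be sketched as follows. The Python example uses only a small excerpt of the hierarchy (E1 down to E7) and the properties whose domains fall in that excerpt, as listed in the tables of the section on CRM-compatible form; the dictionary layout and the function name `all_properties` are assumptions of this illustration.

```python
# Excerpt of the class hierarchy: class code -> direct superclasses.
SUPERCLASSES = {
    "E1": [],
    "E2": ["E1"],  # Temporal Entity
    "E4": ["E2"],  # Period
    "E5": ["E4"],  # Event
    "E7": ["E5"],  # Activity
}
# Properties declared directly on each class (class is the property's domain).
DECLARED = {
    "E1": ["P1", "P2", "P3"],
    "E2": ["P4"],
    "E4": ["P7", "P10"],
    "E5": ["P11", "P12"],
    "E7": ["P14", "P15", "P16", "P20"],
}


def all_properties(cls):
    """Properties applicable to `cls`: its own plus those of all superclasses."""
    props = set(DECLARED.get(cls, []))
    for sup in SUPERCLASSES.get(cls, []):
        props |= all_properties(sup)
    return props
```

In this excerpt E7 Activity declares four properties of its own but has twelve applicable properties once inheritance is expanded, which is why the expanded declaration is so much larger than the compact one.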

Terminology

The following definitions of key terminology used in this document are provided both as an aid to readers unfamiliar with object-oriented modelling terminology, and to specify, for the purpose of this document, the precise usage of terms that are sometimes applied inconsistently across the object-oriented modelling community. Where applicable, the editors have tried to consistently use terminology that is compatible with that of the Resource Description Framework (RDF)[3], a recommendation of the World Wide Web Consortium. The editors have tried to find a language which is comprehensible to the non-computer expert and precise enough for the computer expert so that both understand the intended meaning.

 

Class

A class is a category of items that share one or more common traits serving as criteria to identify the items belonging to the class. These properties need not be explicitly formulated in logical terms, but may be described in a text (here called a scope note) that refers to a common conceptualisation of domain experts. The sum of these traits is called the intension of the class. A class may be the domain or range of none, one or more properties formally defined in a model. The formally defined properties need not be part of the intension of their domains or ranges: such properties are optional. An item that belongs to a class is called an instance of this class. A class is associated with an open set of real life instances, known as the extension of the class. Here “open” is used in the sense that it is generally beyond our capabilities to know all instances of a class in the world and indeed that the future may bring new instances about at any time (Open World). Therefore a class cannot be defined by enumerating its instances. A class plays a role analogous to a grammatical noun, and can be completely defined without reference to any other construct (unlike properties, which must have an unambiguously defined domain and range). In some contexts, the terms individual class, entity or node are used synonymously with class.

 

For example:

Person is a class. To be a Person may actually be determined by DNA characteristics, but we all know what a Person is. A Person may have the property of being a member of a Group, but it is not necessary to be member of a Group in order to be a Person. We shall never know all Persons of the past. There will be more Persons in the future.

 

subclass

A subclass is a class that is a specialization of another class (its superclass). Specialization or the IsA relationship means that:

  1. all instances of the subclass are also instances of its superclass,
  2. the intension of the subclass extends the intension of its superclass, i.e. its traits are more restrictive than those of its superclass, and
  3. the subclass inherits the definition of all of the properties declared for its superclass without exceptions (strict inheritance), in addition to having none, one or more properties of its own.

 

A subclass can have more than one immediate superclass and consequently inherits the properties of all of its superclasses (multiple inheritance). The IsA relationship or specialization between two or more classes gives rise to a structure known as a class hierarchy. The IsA relationship is transitive and may not be cyclic. In some contexts (e.g. the programming language C++) the term derived class is used synonymously with subclass.

 

For example:

Every Person IsA Biological Object, or Person is a subclass of Biological Object.

Also, every Person IsA Actor. A Person may die. However, other kinds of Actors, such as companies, don’t die (cf. 2).

Every Biological Object IsA Physical Object. A Physical Object can be moved. Hence a Person can be moved also (cf. 3).
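
The IsA statements in the examples above can be sketched as a small directed graph. The following is a minimal illustration only, not part of the CRM definition; the dictionary layout and the function names are assumptions made for this sketch. It shows that IsA is transitive (a Person is also a Physical Object) and how the requirement that IsA may not be cyclic could be checked.

```python
# Direct IsA declarations taken from the examples above.
# The data layout and function names are illustrative assumptions.
DIRECT_SUPERCLASSES = {
    "Person": {"Biological Object", "Actor"},
    "Biological Object": {"Physical Object"},
    "Actor": set(),
    "Physical Object": set(),
}


def all_superclasses(cls):
    """Return every direct and indirect superclass of cls (IsA is transitive)."""
    found = set()
    stack = list(DIRECT_SUPERCLASSES.get(cls, ()))
    while stack:
        parent = stack.pop()
        if parent not in found:
            found.add(parent)
            stack.extend(DIRECT_SUPERCLASSES.get(parent, ()))
    return found


def is_a(sub, sup):
    """True if sub is the same class as sup or a specialization of it."""
    return sub == sup or sup in all_superclasses(sub)


def has_cycle():
    """IsA may not be cyclic: no class may occur among its own superclasses."""
    return any(cls in all_superclasses(cls) for cls in DIRECT_SUPERCLASSES)
```

With these declarations, `is_a("Person", "Physical Object")` holds through Biological Object, while `is_a("Actor", "Person")` does not.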

 

superclass

A superclass is a class that is a generalization of one or more other classes (its subclasses), which means that it subsumes all instances of its subclasses, and that it can also have additional instances that do not belong to any of its subclasses. The intension of the superclass is less restrictive than any of its subclasses. This subsumption relationship or generalization is the inverse of the IsA relationship or specialization.

In some contexts (e.g. the programming language C++) the term parent class is used synonymously with superclass.

 

For example:

“Biological Object subsumes Person” is synonymous with “Biological Object is a superclass of Person”. It needs fewer traits to identify an item as a Biological Object than to identify it as a Person.

 

intension

The intension of a class or property is its intended meaning. It consists of one or more common traits shared by all instances of the class or property. These traits need not be explicitly formulated in logical terms, but may just be described in a text (here called a scope note) that refers to a conceptualisation common to domain experts. In particular the so-called primitive concepts, which make up most of the CRM, cannot be further reduced to other concepts by logical terms.

 

extension

The extension of a class is the set of all real life instances belonging to the class that fulfil the criteria of its intension. This set is “open” in the sense that it is generally beyond our capabilities to know all instances of a class in the world and indeed that the future may bring new instances about at any time (Open World). An information system may at any point in time refer to some instances of a class, which form a subset of its extension.

 

scope note

A scope note is a textual description of the intension of a class or property.

Scope notes are not formal modelling constructs, but are provided to help explain the intended meaning and application of the CRM’s classes and properties. Basically, they refer to a conceptualisation common to domain experts and disambiguate between different possible interpretations. Illustrative example instances of classes and properties are also regularly provided in the scope notes for explanatory purposes.

 

instance

An instance of a class is a real world item that fulfils the criteria of the intension of the class. Note that the number of instances declared for a class in an information system is typically smaller than the total in the real world. For example, you are an instance of Person, but you are not mentioned in all information systems describing Persons.

For example:

The painting known as “The Mona Lisa” is an instance of the class Man Made Object.

 

An instance of a property is a factual relation between an instance of the domain and an instance of the range of the property that matches the criteria of the intension of the property.

 

For example:

“The Louvre is current owner of The Mona Lisa” is an instance of the property “is current owner of”.

 

property

A property serves to define a relationship of a specific kind between two classes. The property is characterized by an intension, which is conveyed by a scope note. A property plays a role analogous to a grammatical verb, in that it must be defined with reference to both its domain and range, which are analogous to the subject and object in grammar (unlike classes, which can be defined independently). It is arbitrary which class is selected as the domain, just as the choice between active and passive voice in grammar is arbitrary. In other words, a property can be interpreted in both directions, with two distinct but related interpretations. Properties may themselves have properties that relate to other classes (this feature is used in this model only in order to describe dynamic subtyping of properties). Properties can also be specialized in the same manner as classes, resulting in IsA relationships between subproperties and their superproperties.

In some contexts, the terms attribute, reference, link, role or slot are used synonymously with property.

 

For example:

“Physical Man-Made Thing depicts CRM Entity” is equivalent to “CRM Entity is depicted by Physical Man-Made Thing”.

 

subproperty

 

A subproperty is a property that is a specialization of another property (its superproperty). Specialization or IsA relationship means that:

  1. all instances of the subproperty are also instances of its superproperty,
  2. the intension of the subproperty extends the intension of the superproperty, i.e. its traits are more restrictive than those of its superproperty,
  3. the domain of the subproperty is the same as the domain of its superproperty or a subclass of that domain,
  4. the range of the subproperty is the same as the range of its superproperty or a subclass of that range,
  5. the subproperty inherits the definition of all of the properties declared for its superproperty without exceptions (strict inheritance), in addition to having none, one or more properties of its own.

 

A subproperty can have more than one immediate superproperty and consequently inherits the properties of all of its superproperties (multiple inheritance). The IsA relationship or specialization between two or more properties gives rise to the structure we call a property hierarchy. The IsA relationship is transitive and may not be cyclic.

Some object-oriented languages, such as C++, have no equivalent to the specialization of properties.
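
Conditions 3 and 4 of the subproperty definition above (domain and range may only stay the same or narrow) can be checked mechanically. The sketch below is a hedged illustration under stated assumptions: the three properties are hypothetical and invented for this example only, and the class hierarchy is the one used in the examples of this section.

```python
from dataclasses import dataclass

# Class hierarchy from the examples in this section.
SUPERCLASSES = {
    "Person": {"Actor", "Biological Object"},
    "Biological Object": {"Physical Object"},
    "Actor": set(),
    "Physical Object": set(),
}


def is_same_or_subclass(sub, sup):
    """True if sub equals sup or is a direct or indirect subclass of it."""
    if sub == sup:
        return True
    return any(is_same_or_subclass(p, sup) for p in SUPERCLASSES.get(sub, ()))


@dataclass(frozen=True)
class Property:
    name: str
    domain_class: str
    range_class: str


def may_specialize(sub, sup):
    """Check conditions 3 and 4: domain and range may only stay or narrow."""
    return (is_same_or_subclass(sub.domain_class, sup.domain_class)
            and is_same_or_subclass(sub.range_class, sup.range_class))


# Hypothetical properties, invented for this sketch only.
uses = Property("uses", "Actor", "Physical Object")
person_uses = Property("uses personally", "Person", "Physical Object")
widened = Property("is used with", "Physical Object", "Physical Object")
```

Here `person_uses` narrows the domain from Actor to Person and could therefore be declared a subproperty of `uses`, whereas `widened` could not, because Physical Object is not a subclass of Actor.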

 

superproperty

 

A superproperty is a property that is a generalization of one or more other properties (its subproperties), which means that it subsumes all instances of its subproperties, and that it can also have additional instances that do not belong to any of its subproperties. The intension of the superproperty is less restrictive than any of its subproperties. The subsumption relationship or generalization is the inverse of the IsA relationship or specialization.

 

domain

The domain is the class for which a property is formally defined. This means that instances of the property are applicable to instances of its domain class. A property must have exactly one domain, although the domain class may always contain instances for which the property is not instantiated. The domain class is analogous to the grammatical subject of the phrase for which the property is analogous to the verb. It is arbitrary which class is selected as the domain and which as the range, just as the choice between active and passive voice in grammar is arbitrary. Property names in the CRM are designed to be semantically meaningful and grammatically correct when read from domain to range. In addition, the inverse property name, normally given in parentheses, is also designed to be semantically meaningful and grammatically correct when read from range to domain.

 

range

The range is the class that comprises all potential values of a property. That means that instances of the property can link only to instances of its range class. A property must have exactly one range, although the range class may always contain instances that are not the value of the property. The range class is analogous to the grammatical object of a phrase for which the property is analogous to the verb. It is arbitrary which class is selected as domain and which as range, just as the choice between active and passive voice in grammar is arbitrary. Property names in the CRM are designed to be semantically meaningful and grammatically correct when read from domain to range. In addition, the inverse property name, normally given in parentheses, is also designed to be semantically meaningful and grammatically correct when read from range to domain.

 

inheritance

Inheritance of properties from superclasses to subclasses means that if an item x is an instance of a class A, then

  1. all properties that must hold for the instances of any of the superclasses of A must also hold for item x, and
  2. all optional properties that may hold for the instances of any of the superclasses of A may also hold for item x.

 

strict inheritance

Strict inheritance means that there are no exceptions to the inheritance of properties from superclasses to subclasses. For instance, some systems may declare that elephants are grey, and regard a white elephant as an exception. Under strict inheritance it would hold that: if all elephants were grey, then a white elephant could not be an elephant. Obviously not all elephants are grey. To be grey is not part of the intension of the concept elephant but an optional property. The CRM applies strict inheritance as a normalization principle.

 

multiple inheritance

Multiple inheritance means that a class A may have more than one immediate superclass. The extension of a class with multiple immediate superclasses is a subset of the intersection of all extensions of its superclasses. The intension of a class with multiple immediate superclasses extends the intensions of all its superclasses, i.e. its traits are more restrictive than any of its superclasses. If multiple inheritance is used, the resulting “class hierarchy” is a directed graph and not a tree structure. If it is represented as an indented list, there are necessarily repetitions of the same class at different positions in the list.

For example, Person is both an Actor and a Biological Object.
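
The remark about indented lists can be made concrete with a short sketch. It is an illustration only; the data layout and function names are assumptions. Because Person has two immediate superclasses, it is printed once under each of them.

```python
# Direct subclasses, following the examples in this section.
SUBCLASSES = {
    "Physical Object": ["Biological Object"],
    "Biological Object": ["Person"],
    "Actor": ["Person"],
    "Person": [],
}


def indented_list(root, depth=0):
    """Return the hierarchy below root as indented lines, one class per line."""
    lines = ["  " * depth + root]
    for child in SUBCLASSES.get(root, []):
        lines.extend(indented_list(child, depth + 1))
    return lines


def render(roots):
    """Render several top-level classes one after another."""
    lines = []
    for root in roots:
        lines.extend(indented_list(root))
    return lines


listing = render(["Physical Object", "Actor"])
print("\n".join(listing))
```

The class hierarchy is a directed graph, so the listing necessarily repeats Person at two different positions.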

 

endurant, perdurant

“The difference between enduring and perduring entities (which we shall also call endurants and perdurants) is related to their behaviour in time. Endurants are wholly present (i.e., all their proper parts are present) at any time they are present. Perdurants, on the other hand, just extend in time by accumulating different temporal parts, so that, at any time they are present, they are only partially present, in the sense that some of their proper temporal parts (e.g., their previous or future phases) may be not present. E.g., the piece of paper you are reading now is wholly present, while some temporal parts of your reading are not present any more. Philosophers say that endurants are entities that are in time, while lacking however temporal parts (so to speak, all their parts flow with them in time). Perdurants, on the other hand, are entities that happen in time, and can have temporal parts (all their parts are fixed in time).” (Gangemi et al. 2002, pp. 166-181).

 

shortcut

A shortcut is a formally defined single property that represents a deduction or join of a data path in the CRM. The scope notes of all properties characterized as shortcuts describe in words the equivalent deduction. Shortcuts are introduced for the cases where common documentation practice refers only to the deduction rather than to the fully developed path. For example, museums often only record the dimension of an object without documenting the Measurement that observed it. The CRM allows shortcuts as cases of less detailed knowledge, while preserving in its schema the relationship to the full information.
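
The relationship between a fully developed path and its shortcut can be sketched as a simple join over statements. This is a minimal illustration under stated assumptions: statements are modelled as (subject, property, object) triples, and the property labels "measured", "observed dimension" and "has dimension" as well as all identifiers and values are chosen for this sketch rather than taken from this section.

```python
# Statements are (subject, property, object) triples.
# Property labels and identifiers are assumptions chosen for this sketch.
full_path = {
    ("measurement-1", "measured", "vase-7"),
    ("measurement-1", "observed dimension", "height 30 cm"),
}
recorded_shortcuts = {
    ("bowl-2", "has dimension", "diameter 12 cm"),
}


def derive_shortcuts(statements):
    """Join 'measured' and 'observed dimension' into 'has dimension'."""
    measured = {}
    observed = {}
    for subject, prop, obj in statements:
        if prop == "measured":
            measured.setdefault(subject, set()).add(obj)
        elif prop == "observed dimension":
            observed.setdefault(subject, set()).add(obj)
    return {
        (thing, "has dimension", dimension)
        for measurement, things in measured.items()
        for thing in things
        for dimension in observed.get(measurement, ())
    }


derived = derive_shortcuts(full_path)
known = derived | recorded_shortcuts
```

The deduction runs in one direction only: the full path yields the shortcut statement for `vase-7`, but nothing about a Measurement can be recovered from the shortcut recorded for `bowl-2`.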

 

monotonic reasoning

Monotonic reasoning is a term from knowledge representation. A reasoning form is monotonic if an addition to the set of propositions making up the knowledge base never determines a decrement in the set of conclusions that may be derived from the knowledge base via inference rules. In practical terms, if experts successively enter correct statements into an information system, the system should not regard any results derived from earlier statements as invalid when a new one is entered. The CRM is designed for monotonic reasoning and so enables conflict-free merging of huge stores of knowledge.

 

disjoint

Classes are disjoint if the intersection of their extensions is an empty set. In other words, they have no common instances in any possible world.

 

primitive

The term primitive as used in knowledge representation characterizes a concept that is declared and its meaning is agreed upon, but that is not defined by a logical deduction from other concepts. For example, mother may be described as a female human with child. Then mother is not a primitive concept. Event however is a primitive concept.

Most of the CRM is made up of primitive concepts.

 

Open World

The “Open World Assumption” is a term from knowledge base systems. It characterizes knowledge base systems that assume the information stored is incomplete relative to the universe of discourse they intend to describe. This incompleteness may be due to the inability of the maintainer to provide sufficient information or due to more fundamental problems of cognition in the system’s domain. Such problems are characteristic of cultural information systems. Our records about the past are necessarily incomplete. In addition, there may be items that cannot be clearly assigned to a given class.

In particular, absence of a certain property for an item described in the system does not mean that this item does not have this property. For example, if one item is described as Biological Object and another as Physical Object, this does not imply that the latter may not be a Biological Object as well. Therefore complements of a class with respect to a superclass cannot be concluded in general from an information system using the Open World Assumption. For example, one cannot list “all Physical Objects known to the system that are not Biological Objects in the real world”, but one may of course list “all items known to the system as Physical Objects but that are not known to the system as Biological Objects”.
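
The difference between the two listings mentioned above can be sketched as follows. This is an illustration only; the item identifiers, the data layout and the function names are assumptions. The key design choice is that a membership test returns either "known" or "unknown", and never "false".

```python
# Class memberships recorded so far; the item identifiers are invented.
recorded = {
    "item-a": {"Physical Object", "Biological Object"},
    "item-b": {"Physical Object"},
}


def is_known_instance(item, cls):
    """Return True if membership is recorded, otherwise None (unknown).

    The function never returns False: under the Open World Assumption a
    missing record is not evidence that the item lacks the class.
    """
    return True if cls in recorded.get(item, set()) else None


def known_as_but_not_known_as(cls, other):
    """Items recorded as cls that are not recorded as other."""
    return {item for item, classes in recorded.items()
            if cls in classes and other not in classes}


candidates = known_as_but_not_known_as("Physical Object", "Biological Object")
```

The result contains `item-b`, which only says that the system does not know it to be a Biological Object; it may still be one in the real world.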

 

complement

The complement of a class A with respect to one of its superclasses B is the set of all instances of B that are not instances of A. Formally, it is the set-theoretic difference of the extension of B minus the extension of A. Compatible extensions of the CRM should not declare any class whose intension is to be the complement of one or more other classes. To do so will normally violate the desire to describe an Open World. For example, for all possible cases of human gender, male should not be declared as the complement of female or vice versa. What if someone is both or even of another kind?

 

query containment

Query containment is a problem from database theory: A query X contains another query Y, if for each possible population of a database the answer set to query X contains also the answer set to query Y. If query X and Y were classes, then X would be superclass of Y.
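
The definition can be illustrated with queries modelled as functions from a database population to an answer set. The sketch below is hedged: testing a handful of sample populations can only refute containment, not prove it for every possible population, and the class names, item identifiers and function names are assumptions for this example.

```python
# Assumed class hierarchy used by the queries.
SUPERCLASSES = {
    "Person": {"Biological Object"},
    "Biological Object": {"Physical Object"},
    "Physical Object": set(),
}


def with_superclasses(cls):
    """Return cls together with all of its direct and indirect superclasses."""
    result = {cls}
    for parent in SUPERCLASSES.get(cls, ()):
        result |= with_superclasses(parent)
    return result


def instances_of(cls):
    """Build a query that returns all items recorded as cls or a subclass."""
    def query(db):
        return {item for item, recorded in db.items()
                if any(cls in with_superclasses(c) for c in recorded)}
    return query


def contains_on(populations, query_x, query_y):
    """True if the answers to query_y are within those of query_x on each sample."""
    return all(query_y(db) <= query_x(db) for db in populations)


samples = [
    {},
    {"a": {"Person"}},
    {"a": {"Person"}, "b": {"Physical Object"}},
]
```

On these samples the query for Physical Object contains the query for Person, mirroring the superclass relationship, while the reverse containment is refuted by the third sample.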

 

interoperability

Interoperability means the capability of different information systems to communicate some of their contents. In particular, it may mean that

  1.  two systems can exchange information, and/or
  2.  multiple systems can be accessed with a single method.

 

Generally, syntactic interoperability is distinguished from semantic interoperability. Syntactic interoperability means that the information encoding of the involved systems and the access protocols are compatible, so that information can be processed as described above without error. However, this does not mean that each system processes the data in a manner consistent with the intended meaning. For example, one system may use a table called “Actor” and another one called “Agent”. With syntactic interoperability, data from both tables may only be retrieved as distinct, even though they may have exactly the same meaning. To overcome this situation, semantic interoperability has to be added. The CRM relies on existing syntactic interoperability and is concerned only with adding semantic interoperability.
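
The “Actor” and “Agent” example can be sketched as a mapping step. This is a simplified illustration, not a prescribed procedure: the two source systems, their rows, and the choice of E39 Actor as the shared concept for both tables are assumptions made for this sketch.

```python
# Two hypothetical source systems with differently named tables.
system_one = {"Actor": ["actor-1", "actor-2"]}
system_two = {"Agent": ["agent-9"]}

# Assumed mapping of each local table name to one shared concept.
TABLE_TO_CONCEPT = {"Actor": "E39 Actor", "Agent": "E39 Actor"}


def merge_by_concept(*systems):
    """Merge rows from tables that map to the same shared concept."""
    merged = {}
    for system in systems:
        for table, rows in system.items():
            concept = TABLE_TO_CONCEPT[table]
            merged.setdefault(concept, []).extend(rows)
    return merged


merged = merge_by_concept(system_one, system_two)
```

Without the mapping, the two tables can only be retrieved as distinct; with it, their rows are merged under one concept.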

 

semantic interoperability

Semantic interoperability means the capability of different information systems to communicate information consistent with the intended meaning. In more detail, the intended meaning encompasses

  1. the data structure elements involved,
  2. the terminology appearing as data and
  3. the identifiers used in the data for factual items such as places, people, objects etc.

 

Obviously communication about data structure must be resolved first. In this case consistent communication means that data can be transferred between data structure elements with the same intended meaning or that data from elements with the same intended meaning can be merged. In practice, the different levels of generalization in different systems do not allow the achievement of this ideal. Therefore semantic interoperability is regarded as achieved if elements can be found that provide a reasonably close generalization for the transfer or merge. This problem is being studied theoretically as the query containment problem. The CRM is only concerned with semantic interoperability on the level of data structure elements.

 

property quantifiers

We use the term property quantifiers for the declaration of the allowed number of instances of a certain property that an instance of its range or domain may have. These declarations are ontological, i.e. they refer to the nature of the real world described and not to our current knowledge. For example, each person has exactly one father, but collected knowledge may refer to none, one or many.

 

universal

The fundamental ontological distinction between universals and particulars can be informally understood by considering their relationship with instantiation: particulars are entities that have no instances in any possible world; universals are entities that do have instances. Classes and properties (corresponding to predicates in a logical language) are usually considered to be universals. (after Gangemi et al. 2002, pp. 166-181).

 

Property Quantifiers

Quantifiers for properties are provided for the purpose of semantic clarification only, and should not be treated as implementation recommendations. The CRM has been designed to accommodate alternative opinions and incomplete information, and therefore all properties should be implemented as optional and repeatable for their domain and range (“many to many (0,n:0,n)”). Therefore the term “cardinality constraints” is avoided here, as it typically pertains to implementations.

 

The following table lists all possible property quantifiers occurring in this document by their notation, together with an explanation in plain words. In order to provide optimal clarity, two widely accepted notations are used redundantly in this document, a verbal and a numeric one. The verbal notation uses phrases such as “one to many”, and the numeric one, expressions such as “(0,n:0,1)”. While the terms “one”, “many” and “necessary” are quite intuitive, the term “dependent” denotes a situation where a range instance cannot exist without an instance of the respective property. In other words, the property is “necessary” for its range.

 

many to many (0,n:0,n)

Unconstrained: An individual domain instance and range instance of this property can have zero, one or more instances of this property. In other words, this property is optional and repeatable for its domain and range.

 

one to many (0,n:0,1)

 

An individual domain instance of this property can have zero, one or more instances of this property, but an individual range instance cannot be referenced by more than one instance of this property. In other words, this property is optional for its domain and range, but repeatable for its domain only. In some contexts this situation is called a “fan-out”.

 

many to one (0,1:0,n)

An individual domain instance of this property can have zero or one instance of this property, but an individual range instance can be referenced by zero, one or more instances of this property. In other words, this property is optional for its domain and range, but repeatable for its range only. In some contexts this situation is called a “fan-in”.

 

many to many, necessary (1,n:0,n)

An individual domain instance of this property can have one or more instances of this property, but an individual range instance can have zero, one or more instances of this property. In other words, this property is necessary and repeatable for its domain, and optional and repeatable for its range.

 

one to many, necessary (1,n:0,1)

 

An individual domain instance of this property can have one or more instances of this property, but an individual range instance cannot be referenced by more than one instance of this property. In other words, this property is necessary and repeatable for its domain, and optional but not repeatable for its range. In some contexts this situation is called a “fan-out”.

 

many to one, necessary (1,1:0,n)

An individual domain instance of this property must have exactly one instance of this property, but an individual range instance can be referenced by zero, one or more instances of this property. In other words, this property is necessary and not repeatable for its domain, and optional and repeatable for its range. In some contexts this situation is called a “fan-in”.

 

one to many, dependent (0,n:1,1)

 

An individual domain instance of this property can have zero, one or more instances of this property, but an individual range instance must be referenced by exactly one instance of this property. In other words, this property is optional and repeatable for its domain, but necessary and not repeatable for its range. In some contexts this situation is called a “fan-out”.

 

one to many, necessary, dependent (1,n:1,1)

An individual domain instance of this property can have one or more instances of this property, but an individual range instance must be referenced by exactly one instance of this property. In other words, this property is necessary and repeatable for its domain, and necessary but not repeatable for its range. In some contexts this situation is called a “fan-out”.

 

many to one, necessary, dependent (1,1:1,n)

An individual domain instance of this property must have exactly one instance of this property, but an individual range instance can be referenced by one or more instances of this property. In other words, this property is necessary and not repeatable for its domain, and necessary and repeatable for its range. In some contexts this situation is called a “fan-in”.

 

one to one (1,1:1,1)

An individual domain instance and range instance of this property must have exactly one instance of this property. In other words, this property is necessary and not repeatable for its domain and for its range.

 

The CRM defines some properties as being necessary for their domain or as being dependent with respect to their range, following the definitions in the table above. Note that if such a property is not specified for an instance of the respective domain or range, it means that the property exists, but the value on one side of the property is unknown. In the case of optional properties, the methodology proposed by the CRM does not distinguish between a value being unknown and the property not being applicable at all. For example, one may know that an object has an owner, but the owner is unknown. In a CRM instance this case cannot be distinguished from the fact that the object has no owner at all. Of course, such details can always be specified by a textual note.
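
The numeric notation used in the table above can be read mechanically: the pair before the colon gives the minimum and maximum for the domain, the pair after it those for the range. The following sketch translates a numeric quantifier into the plain-word terms used in the explanations. It is an illustration only; the function names and the returned dictionary layout are assumptions.

```python
import re

# "(min,max:min,max)" where max is a number or the letter n.
_PATTERN = re.compile(r"^\((\d+),(\d+|n):(\d+),(\d+|n)\)$")


def describe_side(minimum, maximum):
    """Describe one side (domain or range) of a quantifier in plain words."""
    necessity = "necessary" if minimum >= 1 else "optional"
    repeatable = maximum == "n" or int(maximum) > 1
    repetition = "repeatable" if repeatable else "not repeatable"
    return f"{necessity} and {repetition}"


def describe_quantifier(notation):
    """Translate e.g. '(1,n:0,1)' into descriptions for domain and range."""
    match = _PATTERN.match(notation.replace(" ", ""))
    if match is None:
        raise ValueError(f"unrecognised quantifier: {notation!r}")
    dmin, dmax, rmin, rmax = match.groups()
    return {
        "domain": describe_side(int(dmin), dmax),
        "range": describe_side(int(rmin), rmax),
    }
```

For example, `describe_quantifier("(1,n:0,1)")` describes the domain as necessary and repeatable and the range as optional and not repeatable, which corresponds to the entry “one to many, necessary”.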

Naming Conventions

The following naming conventions have been applied throughout the CRM:

  • Classes are identified by numbers preceded by the letter “E” (historically classes were sometimes referred to as “Entities”), and are named using noun phrases (nominal groups) using title case (initial capitals). For example, E63 Beginning of Existence.
  • Properties are identified by numbers preceded by the letter “P,” and are named in both directions using verbal phrases in lower case. Properties with the character of states are named in the present tense, such as “has type”, whereas properties related to events are named in past tense, such as “carried out.” For example, P126 employed (was employed in).
  • Property names should be read in their non-parenthetical form for the domain-to-range direction, and in parenthetical form for the range-to-domain direction.
  • Properties with a range that is a subclass of E59 Primitive Value (for example, E1 CRM Entity. P3 has note: E62 String) have no parenthetical name form, because reading the property name in the range-to-domain direction is not regarded as meaningful.
  • Properties that have identical domain and range are either symmetric or transitive. Instantiating a symmetric property implies that the same relation holds for both the domain-to-range and the range-to-domain directions. An example of this is E53 Place. P122 borders with: E53 Place. The names of symmetric properties have no parenthetical form, because reading in the range-to-domain direction is the same as the domain-to-range reading. Transitive asymmetric properties, such as E4 Period. P9 consists of (forms part of): E4 Period, have a parenthetical form that relates to the meaning of the inverse direction.
  • The choice of the domain of properties, and hence the order of their names, are established in accordance with the following priority list:
    • Temporal Entity and its subclasses
    • Thing and its subclasses
    • Actor and its subclasses
    • Other
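
The identifier and naming conventions listed above are regular enough to be parsed. The sketch below splits a class or property label into its identifier, its name, and, where present, its parenthetical (range-to-domain) name. The regular expressions and function names are assumptions made for this illustration and cover only the label forms quoted in this section.

```python
import re

# "E63 Beginning of Existence": identifier, then the class name.
CLASS_LABEL = re.compile(r"^(E\d+) (.+)$")
# "P126 employed (was employed in)": the parenthetical form is optional.
PROPERTY_LABEL = re.compile(r"^(P\d+) ([^()]+?)(?: \(([^()]+)\))?$")


def parse_class(label):
    """Split a class label into identifier and name."""
    match = CLASS_LABEL.match(label)
    if match is None:
        raise ValueError(f"not a class label: {label!r}")
    return {"id": match.group(1), "name": match.group(2)}


def parse_property(label):
    """Split a property label into identifier, name and inverse name."""
    match = PROPERTY_LABEL.match(label)
    if match is None:
        raise ValueError(f"not a property label: {label!r}")
    return {
        "id": match.group(1),
        "name": match.group(2),
        "inverse_name": match.group(3),  # None if there is no parenthetical form
    }
```

A property without a parenthetical form, such as P3 has note or the symmetric P122 borders with, yields `None` as its inverse name.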

Modelling principles

 

The following modelling principles have guided and informed the development of the CIDOC CRM.

Monotonicity

Because the CRM’s primary role is the meaningful integration of information in an Open World, it aims to be monotonic in the sense of Domain Theory. That is, the existing CRM constructs and the deductions made from them must always remain valid and well-formed, even as new constructs are added by extensions to the CRM.

 

For example:

One may add a subclass of E7 Activity to describe the practice of an instance of Group of using a certain name for a place over a certain time-span. By this extension, no existing IsA relationships or property inheritances are compromised.

 

In addition, the CRM aims to enable the formal preservation of monotonicity when augmenting a particular CRM compatible system. That is, existing CRM instances, their properties and deductions made from them, should always remain valid and well-formed, even as new instances, regarded as consistent by the domain expert, are added to the system.

 

For example:

If someone describes correctly that an item is an instance of E19 Physical Object, and later it is correctly characterized as an instance of E20 Biological Object, the system should not stop treating it as an instance of E19 Physical Object.
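
This example can be sketched with an append-only store of class assertions. It is a minimal illustration under stated assumptions: the class `KnowledgeBase`, its methods and the item identifier are invented for this sketch. Statements are only ever added, so the set of derivable class memberships can only grow.

```python
# E20 Biological Object IsA E19 Physical Object.
SUPERCLASSES = {
    "E20 Biological Object": {"E19 Physical Object"},
    "E19 Physical Object": set(),
}


def closure(cls):
    """Return cls together with all of its superclasses."""
    result = {cls}
    for parent in SUPERCLASSES.get(cls, ()):
        result |= closure(parent)
    return result


class KnowledgeBase:
    """Append-only store: assertions are added and never retracted."""

    def __init__(self):
        self._asserted = set()  # (item, class) pairs

    def assert_instance(self, item, cls):
        self._asserted.add((item, cls))

    def classes_of(self, item):
        """All classes derivable for item from the assertions made so far."""
        derived = set()
        for asserted_item, cls in self._asserted:
            if asserted_item == item:
                derived |= closure(cls)
        return derived


kb = KnowledgeBase()
kb.assert_instance("item-1", "E19 Physical Object")
before = kb.classes_of("item-1")
kb.assert_instance("item-1", "E20 Biological Object")
after = kb.classes_of("item-1")
```

Every conclusion in `before` is still present in `after`; the later, more specific characterization adds to the earlier one rather than replacing it.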

 

In order to formally preserve monotonicity for the frequent cases of alternative opinions, all formally defined properties should be implemented as unconstrained (many to many) so that conflicting instances of properties are merely accumulated. Thus knowledge integrated following the CRM serves as a research base, accumulating relevant alternative opinions around well-defined entities, whereas conclusions about the truth are the task of open-ended scientific or scholarly hypothesis building.

 

For example:

El Greco and even King Arthur should always remain instances of E21 Person and be dealt with as existing within the sense of our discourse, once they are entered into our knowledge base. Alternative opinions about properties, such as their birthplaces and their living places, should be accumulated without validity decisions being made during data compilation.
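
Accumulating alternative opinions amounts to treating every property as optional and repeatable. The sketch below is an illustration only: the class `OpinionStore`, the property label "was born in", the place identifiers and the source names are all hypothetical. Conflicting values are stored side by side, each with the source that stated it, and none overwrites another.

```python
from collections import defaultdict


class OpinionStore:
    """Keeps every stated value for a property instead of a single one."""

    def __init__(self):
        self._values = defaultdict(set)

    def add(self, subject, prop, value, source):
        """Record one opinion; earlier opinions are never overwritten."""
        self._values[(subject, prop)].add((value, source))

    def values(self, subject, prop):
        """All values stated so far, without deciding which one is true."""
        return sorted({value for value, _ in self._values.get((subject, prop), ())})


store = OpinionStore()
# Hypothetical opinions from two hypothetical sources.
store.add("El Greco", "was born in", "place-A", "source-1")
store.add("El Greco", "was born in", "place-B", "source-2")
birthplaces = store.values("El Greco", "was born in")
```

Both candidate birthplaces remain available for later scholarly evaluation; the store itself makes no validity decision.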

Minimality

Although the scope of the CRM is very broad, the model itself is constructed as economically as possible.

 

  • A class is not declared unless it is required as the domain or range of a property not appropriate to its superclass, or it is a key concept in the practical scope.
  • CRM classes and properties that share a superclass are non-exclusive by default. For example, an object may be both an instance of E20 Biological Object and E22 Man-made Object.
  • CRM classes and properties are either primitive, or they are key concepts in the practical scope.
  • Complements of CRM classes are not declared.

Shortcuts

Some properties are declared as shortcuts of longer, more comprehensively articulated paths that connect the same domain and range classes as the shortcut property via one or more intermediate classes. For example, the property E18 Physical Thing. P52 has current owner (is current owner of): E39 Actor, is a shortcut for a fully articulated path from E18 Physical Thing through E8 Acquisition to E39 Actor. An instance of the fully-articulated path always implies an instance of the shortcut property. However, the inverse may not be true; an instance of the fully-articulated path cannot always be inferred from an instance of the shortcut property.

 

The class E13 Attribute Assignment allows for the documentation of how the assignment of any property came about, and whose opinion it was, even in cases of properties not explicitly characterized as “shortcuts”.

Disjointness

Classes are disjoint if they share no common instances in any possible world. There are many examples of disjoint classes in the CRM.

 

A comprehensive declaration of all possible disjoint class combinations afforded by the CRM has not been provided here; it would be of questionable practical utility, and may easily become inconsistent with the goal of providing a concise definition. However, there are two key examples of disjoint class pairs that are fundamental to effective comprehension of the CRM:

 

  • E2 Temporal Entity is disjoint from E77 Persistent Item. Instances of the class E2 Temporal Entity are perdurants, whereas instances of the class E77 Persistent Item are endurants. Even though instances of E77 Persistent Item have a limited existence in time, they are fundamentally different in nature from instances of E2 Temporal Entity, because they preserve their identity between events. Declaring endurants and perdurants as disjoint classes is consistent with the distinctions made in data structures that fall within the CRM’s practical scope.
  • E18 Physical Thing is disjoint from E28 Conceptual Object. The distinction is between material and immaterial items, the latter being exclusively man-made. Instances of E18 Physical Thing and E28 Conceptual Object differ in many fundamental ways; for example, the production of instances of E18 Physical Thing implies the incorporation of physical material, whereas the production of instances of E28 Conceptual Object does not. Similarly, instances of E18 Physical Thing cease to exist when destroyed, whereas an instance of E28 Conceptual Object perishes when it is forgotten or its last physical carrier is destroyed.

About Types

Virtually all structured descriptions of museum objects begin with a unique object identifier and information about the "type" of the object, often in a set of fields with names like "Classification", "Category", "Object Type", "Object Name", etc. All these fields are used for terms that declare that the object belongs to a particular category of items. In the CRM the class E55 Type comprises such terms from thesauri and controlled vocabularies used to characterize and classify instances of CRM classes.  Instances of E55 Type represent concepts (universals) in contrast to instances of E41 Appellation which are used to name instances of CRM classes.

 

E55 Type is the CRM’s interface to domain specific ontologies and thesauri. These can be represented in the CRM as subclasses of E55 Type, forming hierarchies of terms, i.e. instances of E55 Type linked via P127 has broader term (has narrower term). Such hierarchies may be extended with additional properties.

 

For this purpose the CRM provides two basic properties that describe classification with terminology, corresponding to what is the current practice in the majority of information systems. The class E1 CRM Entity is the domain of the property P2 has type (is type of), which has the range E55 Type. Consequently, every class in the CRM, with the exception of E59 Primitive Value, inherits the property P2 has type (is type of).  This provides a general mechanism for simulating a specialization of the classification of CRM instances to any level of detail, by linking to external vocabulary sources, thesauri, classification schema or ontologies.
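
The interplay of P2 has type (is type of) with a term hierarchy linked by P127 has broader term (has narrower term) can be sketched as follows. The dictionaries, the function names, and every term and object name are hypothetical; the sketch assumes each term has at most one broader term, which real thesauri need not respect.

```python
# P127 has broader term: narrower term -> broader term (hypothetical terms).
BROADER = {
    "oil painting": "painting",
    "painting": "artwork",
}

# P2 has type: instance -> term (hypothetical instances).
HAS_TYPE = {
    "object_1": "oil painting",
    "object_2": "painting",
    "object_3": "coin",
}


def with_broader_terms(term):
    """Return the term followed by all of its broader terms."""
    terms = []
    while term is not None:
        terms.append(term)
        term = BROADER.get(term)
    return terms


def instances_of_type(term):
    """Instances typed with the term or with any narrower term."""
    return sorted(
        name for name, t in HAS_TYPE.items() if term in with_broader_terms(t)
    )


paintings = instances_of_type("painting")
```

A query for the broader term thus also retrieves objects classified with a narrower one, which is the sense in which typing simulates specialization without new classes.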

 

Analogous to the function of the P2 has type (is type of) property, some properties in the CRM are associated with an additional property. These are numbered in the CRM documentation with a ‘.1’ extension. The range of these properties of properties always falls under E55 Type. Their purpose is to simulate a specialization of their parent property through the use of property subtypes declared as instances of E55 Type. They do not appear in the property hierarchy list but are included as part of the property declarations and referred to in the class declarations. For example, P62.1 mode of depiction: E55 Type is associated with E24 Physical Man-made Thing. P62 depicts (is depicted by): E1 CRM Entity.

 

The class E55 Type also serves as the range of properties that relate to categorical knowledge commonly found in cultural documentation. For example, the property P125 used object of type (was type of object used in) enables the CRM to express statements such as “this casting was produced using a mould”, meaning that there has been an unknown or unmentioned object, a mould, that was actually used. This enables the specific instance of the casting to be associated with the entire type of manufacturing devices known as moulds. Further, the objects of type “mould” would be related via P2 has type (is type of) to this term. This indirect relationship may actually help in detecting the unknown object in an integrated environment. On the other hand, some castings may refer directly to a known mould via P16 used specific object (was used for). So a statistical question as to how many objects in a certain collection are made with moulds could be answered correctly by following both paths: P16 used specific object (was used for) followed by P2 has type (is type of), and P125 used object of type (was type of object used in). This consistent treatment of categorical knowledge enhances the CRM’s ability to integrate cultural knowledge.
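
A minimal sketch of the statistical question, assuming statements are stored as plain triples: the query collects the activities that used a mould either by type (P125) or through a specific, typed object (P16 followed by P2). For brevity it counts production activities rather than the objects they produced; the function name and all instance names are hypothetical.

```python
# Hypothetical statements as (subject, property, object) triples.
triples = [
    ("production_1", "P125 used object of type", "mould"),
    ("production_2", "P16 used specific object", "mould_42"),
    ("mould_42", "P2 has type", "mould"),
    ("production_3", "P16 used specific object", "hammer_7"),
    ("hammer_7", "P2 has type", "hammer"),
    # Documented both ways; must be counted only once.
    ("production_4", "P125 used object of type", "mould"),
    ("production_4", "P16 used specific object", "mould_42"),
]


def activities_using_type(statements, type_term):
    """Activities that used an object of the given type, via either path."""
    typed_objects = {
        s for s, p, o in statements if p == "P2 has type" and o == type_term
    }
    by_type = {
        s
        for s, p, o in statements
        if p == "P125 used object of type" and o == type_term
    }
    by_object = {
        s
        for s, p, o in statements
        if p == "P16 used specific object" and o in typed_objects
    }
    # Set union avoids double counting activities documented on both paths.
    return by_type | by_object


made_with_moulds = activities_using_type(triples, "mould")
```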

 

In addition to being an interface to external thesauri and classification systems, E55 Type is an ordinary class in the CRM and a subclass of E28 Conceptual Object. E55 Type and its subclasses inherit all properties from this superclass. Thus, together with the CRM class E83 Type Creation, the rigorous scholarly or scientific process that ensures a type is exhaustively described and appropriately named can be modelled inside the CRM. In some cases, particularly in archaeology and the life sciences, E83 Type Creation requires the identification of an exemplary specimen and the publication of the type definition in an appropriate scholarly forum. This is central to research in the life sciences, where a type would be referred to as a “taxon,” the type description as a “protologue,” and the exemplary specimens as “original element” or “holotype”.

 

Finally, types, that is, instances of E55 Type and its subclasses, are used to characterize the instances of a CRM class and hence refine the meaning of the class. A type ‘artist’ can be used to characterize persons through P2 has type (is type of). On the other hand, in an art history application of the CRM it can be adequate to extend the CRM class E21 Person with a subclass E21.xx Artist. What is the difference between the type ‘artist’ and the class Artist? From an everyday conceptual point of view there is no difference. Both denote the concept ‘artist’ and identify the same set of persons. Thus in this setting a type could be seen as a class and the class of types may be seen as a metaclass. Since current systems do not provide adequate control of user-defined metaclasses, the CRM prefers to model instances of E55 Type as if they were particulars, with the relationships described in the previous paragraphs.

 

Users may decide to implement a concept either as a subclass extending the CRM class system or as an instance of E55 Type. A new subclass should only be created if the concept is sufficiently stable and associated with additional explicitly modelled properties specific to it. Otherwise, an instance of E55 Type provides more flexibility of use. Users who want to describe a discourse not only using a concept extending the CRM but also describing the history of this concept itself may choose to model the same concept both as a subclass and as an instance of E55 Type with the same name. Similarly, it should be regarded as good practice to provide, for each term hierarchy refining a CRM class, a term equivalent to that class as its top term. For instance, a term hierarchy for instances of E21 Person may begin with “Person”.

 

Extensions

Since the intended scope of the CRM is a subset of the “real” world and is therefore potentially infinite, the model has been designed to be extensible through the linkage of compatible external type hierarchies.

 

Compatibility of extensions with the CRM means that data structured according to an extension must also remain valid as a CRM instance. In practical terms, this implies query containment: any queries based on CRM concepts should retrieve a result set that is correct according to the CRM’s semantics, regardless of whether the knowledge base is structured according to the CRM’s semantics alone, or according to the CRM plus compatible extensions. For example, a query such as “list all events” should recall 100% of the instances deemed to be events by the CRM, regardless of how they are classified by the extension.

 

A sufficient condition for the compatibility of an extension with the CRM is that CRM classes subsume all classes of the extension, and all properties of the extension are either subsumed by CRM properties, or are part of a path for which a CRM property is a shortcut. Obviously, such a condition can only be tested intellectually.
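
Query containment under class subsumption can be illustrated with a small sketch. The extension class X1 Exhibition Opening is a hypothetical example invented here, as are the function names and instance names; the only point is that a query phrased in CRM terms still recalls instances classified by the extension.

```python
# Subclass links; the last entry is a hypothetical extension class.
SUBCLASS_OF = {
    "E5 Event": "E4 Period",
    "E7 Activity": "E5 Event",
    "X1 Exhibition Opening": "E7 Activity",
}

# Hypothetical instances with their most specific class.
INSTANCES = {
    "opening_2009": "X1 Exhibition Opening",
    "battle_1": "E5 Event",
    "vase_1": "E22 Man-Made Object",
}


def is_a(cls, target):
    """True if cls equals target or is subsumed by it."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def list_instances(target):
    """Answer a query such as 'list all events' using subsumption."""
    return sorted(name for name, cls in INSTANCES.items() if is_a(cls, target))


events = list_instances("E5 Event")
```

Because the extension class is subsumed by a CRM class, the result set for “list all events” is the same whether or not the querying application knows about the extension.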

Coverage

Of necessity, some concepts covered by the CRM are less thoroughly elaborated than others: E39 Actor and E30 Right, for example. This is a natural consequence of staying within the CRM’s clearly articulated practical scope in an intrinsically unlimited domain of discourse. These ‘underdeveloped’ concepts can be considered as hooks for compatible extensions.

 

The CRM provides a number of mechanisms to ensure that coverage of the intended scope is complete:

  1. Existing high level classes can be extended, either structurally as subclasses or dynamically using the type hierarchy.
  2. Existing high level properties can be extended, either structurally as subproperties, or in some cases, dynamically, using properties of properties which allow subtyping.
  3. Additional information that falls outside the semantics formally defined by the CRM can be recorded as unstructured data using E1 CRM Entity. P3 has note: E62 String.

 

In mechanisms 1 and 2 the CRM concepts subsume and thereby cover the extensions.

 

In mechanism 3, the information is accessible at the appropriate point in the respective knowledge base. This approach is preferable when detailed, targeted queries are not expected; in general, only those concepts used for formal querying need to be explicitly modelled.

 

Examples

fig. 2 reasoning about spatial information

 

The diagram above shows a partial view of the CRM, representing reasoning about spatial information. Five of the main hierarchy branches are included in this view: E39 Actor, E51 Contact Point, E41 Appellation, E53 Place, and E70 Thing. The relationships between these main classes and their subclasses are shown as arrows. Properties between classes are shown as green rectangles. A ‘shortcut’ property is included in this view: P59 has section (is located on or within) between E53 Place and E18 Physical Thing is a shortcut of the path through E46 Section Definition. In some cases the order of priority for property names has been modified in order to facilitate reading the diagram from left to right.

 

As can be seen, an instance of E53 Place is identified by an instance of E44 Place Appellation, which may be an instance of E45 Address, E47 Spatial Coordinates, E48 Place Name, or E46 Section Definition such as ‘basement’, ‘prow’, or ‘lower left-hand corner.’ An instance of E53 Place may consist of or form part of another instance of E53 Place, thereby allowing a hierarchy of physical ‘containers’ to be constructed.

 

An instance of E45 Address can be considered both as an E44 Place Appellation (a way of referring to an E53 Place) and as an E51 Contact Point for an E39 Actor. An E39 Actor may have any number of instances of E51 Contact Point. Instances of E18 Physical Thing are found at locations as a consequence of being created there or being moved there. Therefore the properties P53 has former or current location (is former or current location of) and P55 has current location (currently holds) are regarded as shortcuts of the fully articulated paths through the respective events. P55 has current location (currently holds) is a subproperty of P53 has former or current location (is former or current location of). The latter is a container for location information in the absence of knowledge about time of validity and related events.
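
The subproperty relationship between P55 and P53 has a practical consequence for querying, sketched below under the assumption that statements are stored as triples: a query on the superproperty must also return values recorded with the subproperty. The function name and the instance names are hypothetical.

```python
# Subproperty link stated in the text.
SUBPROPERTY_OF = {
    "P55 has current location": "P53 has former or current location",
}

# Hypothetical statements as (subject, property, object) triples.
statements = [
    ("statue_1", "P55 has current location", "gallery_3"),
    ("statue_1", "P53 has former or current location", "workshop_1"),
]


def values(subject, prop, triples):
    """Objects linked to subject by prop or by any of its subproperties."""
    result = set()
    for s, p, o in triples:
        if s != subject:
            continue
        q = p
        # Walk up the subproperty chain until prop is found or the chain ends.
        while q is not None:
            if q == prop:
                result.add(o)
                break
            q = SUBPROPERTY_OF.get(q)
    return result


all_locations = values("statue_1", "P53 has former or current location", statements)
current_only = values("statue_1", "P55 has current location", statements)
```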

 

An interesting aspect of the model is the P58 has section definition (defines section) property between E46 Section Definition and E18 Physical Thing (and the corresponding shortcut from E53 Place to E19 Physical Object). This allows an instance of E53 Place to be defined as a section of an instance of E19 Physical Object. For example, we may know that Nelson fell at a particular spot on the deck of H.M.S. Victory, without knowing the exact position of the vessel in geospatial terms at the time of the fatal shooting of Nelson. Similarly, a signature or inscription can be located “in the lower right corner of” a painting, regardless of where the painting is hanging.

 

fig. 3 reasoning about temporal information

 

This second example shows how the CRM handles reasoning about temporal information. Four of the main hierarchy branches are included in this view: E2 Temporal Entity, E52 Time-Span, E77 Persistent Item and E53 Place.

 

The E2 Temporal Entity class is an abstract class (i.e. it has no direct instances) that serves to group together all classes with a temporal component, such as E4 Period, E5 Event and E3 Condition State.

 

An instance of E52 Time-Span is simply a temporal interval that does not make any reference to cultural or geographical contexts (unlike instances of E4 Period, which took place at a particular instance of E53 Place). Instances of E52 Time-Span are sometimes identified by instances of E49 Time Appellation, often in the form of E50 Date.

 

Both E52 Time-Span and E4 Period have transitive properties. E52 Time-Span has the transitive property P86 falls within (contains), denoting a purely incidental inclusion, whereas E4 Period has the transitive property P9 consists of (forms part of) that supports the decomposition of instances of E4 Period into their constituent parts. For example, the E52 Time-Span during which a building is constructed might fall within the E52 Time-Span of a particular government, although there is no causal or contextual connection between the two instances of E52 Time-Span; conversely, the E4 Period of the Chinese Song Dynasty consists of the Northern Song Period and the Southern Song Period.
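
Transitivity means that statements not explicitly recorded can be derived by chaining recorded ones. The following sketch computes the transitive closure of a set of (subject, object) pairs for a single property; the function name and the time-span names are hypothetical, and the simple fixed-point loop is chosen for readability rather than efficiency.

```python
def transitive_closure(pairs):
    """Return all pairs derivable by chaining the given pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# P86 falls within, between hypothetical instances of E52 Time-Span.
falls_within = {
    ("construction_span", "government_span"),
    ("government_span", "century_span"),
}

# P9 consists of, using the example from the text.
consists_of = {
    ("Song Dynasty", "Northern Song Period"),
    ("Song Dynasty", "Southern Song Period"),
}

derived_falls_within = transitive_closure(falls_within)
derived_consists_of = transitive_closure(consists_of)
```

In the first case the closure adds the statement that the construction time-span falls within the century; in the second nothing new is derived, because the two constituent periods are not further decomposed.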

 

Instances of E52 Time-Span are related to their outer bounds (i.e. their indeterminacy interval) by the property P82 at some time within, and to their inner bounds via the property P81 ongoing throughout. The range of these properties is the E61 Time Primitive class, instances of which are treated by the CRM as application or system specific date intervals that are not further analysed.
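
The relationship between the two bounds can be sketched as a small data structure. This is an illustration only: integer years stand in for instances of E61 Time Primitive, which the CRM leaves application specific, and the class name, attribute names, method names and example dates are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Interval = Tuple[int, int]  # (begin year, end year); a stand-in for E61 Time Primitive


@dataclass
class TimeSpan:
    at_some_time_within: Interval                   # P82: outer bounds
    ongoing_throughout: Optional[Interval] = None   # P81: inner bounds

    def is_consistent(self) -> bool:
        """The inner bounds, if known, must lie inside the outer bounds."""
        if self.ongoing_throughout is None:
            return True
        outer_begin, outer_end = self.at_some_time_within
        inner_begin, inner_end = self.ongoing_throughout
        return outer_begin <= inner_begin <= inner_end <= outer_end

    def certainly_ongoing(self, year: int) -> bool:
        """True only if the year lies within the inner bounds."""
        if self.ongoing_throughout is None:
            return False
        begin, end = self.ongoing_throughout
        return begin <= year <= end

    def possibly_ongoing(self, year: int) -> bool:
        """True if the year lies within the outer bounds."""
        begin, end = self.at_some_time_within
        return begin <= year <= end


# Hypothetical construction period: some time within 1860-1875,
# known to have been ongoing throughout 1865-1870.
construction = TimeSpan((1860, 1875), (1865, 1870))
```

For the year 1862 such a time-span is possibly, but not certainly, ongoing, which is the kind of indeterminacy the two properties are meant to capture.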

 


 

 



[1] The ICOM Statutes provide a definition of the term “museum” at http://icom.museum/statutes.html#2

[2] The Practical Scope of the CIDOC CRM, including a list of the relevant museum documentation standards, is discussed in more detail on the CIDOC CRM website at http://cidoc.ics.forth.gr/scope.html

[3] Information about the Resource Description Framework (RDF) can be found at http://www.w3.org/RDF/