
This page is obsolete.

The official documentation is at: http://docs.alfresco.com




Introduction


This page describes the Repository Data Dictionary Service.  It's a service for managing metadata about Content Types, Aspects, and other Repository concepts.


Levels of Meta Data


Discussing metadata can become confusing because there are many levels of it.  The following diagram provides the terms that will be used from now on.

[Diagram: Levels of Meta Data]

The Data Dictionary service is associated with levels M1 and M2.  It provides a metamodel (M2) that allows Content Modellers to define their own Content Models (M1). 

Note: The NodeService is associated with level M0.

Note: Level M3 is unlikely to be supported directly by the Repository.  This level is usually associated with meta-model interchange, e.g. converting a UML model to a JSR-170 model.


Requirements


What needs to be described?


  • Content Model
  • Classifications?
  • Services
  • Security
  • ...

Administration


  • Definition
    • Indirectly through Client Application User Interface (for end-users)
    • Tools (for administrators, developers)
    • Import / Bootstrap from XML
    • Metadata Package (for install/de-install)
  • Scope
    • Space (e.g. Folder, Category?)
    • Workspace??
    • Repository wide
    • Multiple Repositories
  • Storage
    • Repository Server vs. Separate Data Dictionary Server
    • Export / Import between Repositories

Standards


  • Interchange
    • XMI??
    • UML??
  • Interoperability with other content management systems

M2 Content Meta-model


TODO: Meta-data for content oriented nodes (e.g. mime-type etc)

The Content Meta-model is described as follows:

[Diagram: M2 Content Meta-model]

Notes:


  • NAME property type is used throughout to provide Namespace qualification of definitions.

Property Types


Property Types are the set of primitive Types upon which all other higher-level Types are built.  The M2 Content Meta-model (above) is itself constructed with them.

The supported Property Types are:


  • Any - Undefined
  • Name - Name qualified by Namespace
  • Guid - Globally unique ID
  • Text -
  • Content - unlimited length (text or binary)
  • Date -
  • DateTime -
  • Boolean -
  • Int -
  • Long -
  • Float -
  • Double -
  • Category - Location within Classification

The Repository will support a fixed set of Property Types.  In the future, the Repository will support defining a new property type (based on an existing definition) with specific value constraints, e.g. a positive number.
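
As a rough illustration of the fixed set listed above, the property types could be captured in an enum. This is only a sketch: the enum name, the `fromLabel` lookup and the `isNumeric` helper are assumptions made for this example, not part of the Repository API.

```java
import java.util.Locale;

/** Hypothetical enum mirroring the property types listed above. */
enum PropertyType {
    ANY, NAME, GUID, TEXT, CONTENT, DATE, DATETIME, BOOLEAN, INT, LONG, FLOAT, DOUBLE, CATEGORY;

    /** Case-insensitive lookup, so the label "DateTime" resolves to DATETIME. */
    static PropertyType fromLabel(String label) {
        return valueOf(label.trim().toUpperCase(Locale.ROOT));
    }

    /** True for the numeric types, i.e. candidates for a future "positive number" constraint. */
    boolean isNumeric() {
        return this == INT || this == LONG || this == FLOAT || this == DOUBLE;
    }
}
```

A closed enum fits the statement that the set is fixed; user-defined constrained types would need a different representation.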

Future Types may include:


  • Char
  • Duration

Value Constraints


TBD


End-User Modelling


Note: This section represents the current understanding of how to support end-user notions of defining structure, as defined here by PHH, LB, JN.  As such, changes are expected over the next week or so.

Although a Content Model is constructed of Types and Aspects, it's still possible to present a different view for end-users who may wish to define structures in a looser fashion.  This could be accomplished by separating a Node definition into three distinct parts:


  • Space (Folder etc) - End User / Content Modeller Definition

Each Space defines a list of Aspects (primarily pre-defined) to be inherited by each instance that is created or linked into the Space.  Some Aspects may be explicitly defined by the Space (and thus private to that Space and its sub-spaces).  Within the User Interface, the user is picking from a pre-defined list of Aspects such as Translatable & Dublin Core, or defining a new simple Aspect such as a custom property list.


  • Type/Aspect Model - Content Modeller / Advanced Administrator Definition

Content definitions including properties, children, relationships and behaviour.  Definitions are named and scoped by namespace.  A Space may itself act as a namespace (or have an associated namespace).  Advanced definition may only be possible through a dedicated User Interface (restricted to the modeller role?).


  • Node Instances - Repository Maintained

Each Node is aware of its Type and applied Aspects.  The Type is specified on creation.  Aspects are either inherited on creation or manually assigned by the user at any time (choose from a pick list...).

Nodes inherit Aspects from their container(s) (in the following order) on Node creation:


  1. Type
  2. Workspace
  3. Folder / Category / Space
  4. Role??
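
The inheritance order above can be sketched as an ordered merge. This is a minimal illustration, assuming aspects are identified by plain strings; the class and method names are hypothetical, and the open "Role??" item is simply treated as one more optional source.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper: merges aspects from each container in inheritance order. */
final class AspectInheritanceSketch {
    private AspectInheritanceSketch() {}

    /**
     * Collects aspect names from the given sources in the order supplied
     * (type first, then workspace, then space). A duplicate keeps the
     * position of its first occurrence.
     */
    @SafeVarargs
    static Set<String> inheritedAspects(List<String>... sourcesInOrder) {
        Set<String> result = new LinkedHashSet<>();
        for (List<String> source : sourcesInOrder) {
            result.addAll(source);
        }
        return result;
    }
}
```

For example, passing the type's aspects, then the workspace's, then the space's, yields one ordered set that a node would receive on creation.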

Out-of-the-box Content Model (M1)


Using constructs defined in the M2 level Model, the Repository will provide out-of-the-box Types and Aspects for:


M2 Classification Meta-model


TBD


M2 Service Meta-model


TBD


M2 Security Meta-model


TBD


Data Dictionary APIs


There are several public access points (for a Repository client) to meta-data:


  1. Dedicated Data Dictionary Service API
  2. Node Service API (navigational access to meta-data in the form of Nodes)
  3. Search Service API (query access to meta-data)

Access points 2 & 3 are provided using the Node Handles and Store Protocols pattern.

Note: The plan is to provide access point 1 in the first four-month deliverable.


Data Dictionary Service


Dictionary Reference


A Node Reference is a handle onto an item within the Repository.  The Data Dictionary provides a specialised Node Reference known as a DDRef which has specific knowledge of Data Dictionary store protocols and provides convenience constructors for creating references to Data Dictionary items.

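public class DDRef extends NodeRef
{
    public DDRef(QName metaName)
    {
        // Address the item in the datadictionary://local store,
        // using the qualified name as its identifier.
        super(new StoreRef("datadictionary", "local"), metaName.toString());
    }

    public QName getQName()
    {
        ...
    }
}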

Example references (human readable):

datadictionary://local/{namespaceURI}file       (Type)
datadictionary://local/{namespaceURI}version    (Aspect)
datadictionary://local/{namespaceURI}file/name  (Property)

Services that accept a Node Reference can also accept a DDRef.  This is particularly useful for the Node and Search Service where we can register Data Dictionary implementations against the Data Dictionary Store protocol.
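
To make the human-readable reference format concrete, here is a small sketch that splits such a reference into its namespace URI, class name and optional property name. It works on plain strings rather than the real NodeRef/StoreRef classes, and the class name, field names and the `http://example.org/model` namespace are illustrative assumptions.

```java
/** Hypothetical parser for references such as datadictionary://local/{ns}file/name. */
final class DictionaryRefSketch {
    static final String PREFIX = "datadictionary://local/";

    final String namespaceUri;
    final String className;    // type or aspect local name, e.g. "file"
    final String memberName;   // property name, or null for a type/aspect reference

    DictionaryRefSketch(String namespaceUri, String className, String memberName) {
        this.namespaceUri = namespaceUri;
        this.className = className;
        this.memberName = memberName;
    }

    static DictionaryRefSketch parse(String ref) {
        if (!ref.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a dictionary reference: " + ref);
        }
        String rest = ref.substring(PREFIX.length());
        int close = rest.indexOf('}');
        if (!rest.startsWith("{") || close < 0) {
            throw new IllegalArgumentException("Missing {namespaceURI}: " + ref);
        }
        // The namespace URI may itself contain '/', so split on the braces first.
        String ns = rest.substring(1, close);
        String path = rest.substring(close + 1);
        int slash = path.indexOf('/');
        String cls = slash < 0 ? path : path.substring(0, slash);
        String member = slash < 0 ? null : path.substring(slash + 1);
        return new DictionaryRefSketch(ns, cls, member);
    }

    @Override
    public String toString() {
        return PREFIX + "{" + namespaceUri + "}" + className
                + (memberName == null ? "" : "/" + memberName);
    }
}
```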


Data Dictionary 'Read' API


DDRef[] getTypes();
DDRef[] getTypes(String namespace);
DDRef[] getAspects();
DDRef[] getAspects(String namespace);
ClassDefinition getClass(DDRef className);
TypeDefinition getType(DDRef typeName);
AspectDefinition getAspect(DDRef aspectName);
TypeDefinition getAnonymousType(DDRef type, DDRef[] aspects);
PropertyDefinition getProperty(DDRef property);
PropertyDefinition getProperty(DDRef className, String propertyName);
AssociationDefinition getAssociation(DDRef association);
AssociationDefinition getAssociation(DDRef className, String associationName);
BehaviourDefinition[] getBehaviours();
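
As an illustration of how the two getTypes variants relate, the following in-memory sketch filters registered type names by namespace. It is not the proposed implementation: qualified names are held as `{namespaceURI}localName` strings instead of DDRef objects, and `registerType` is a hypothetical helper used only to populate the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the type listing part of the 'Read' API. */
final class InMemoryDictionarySketch {
    private final List<String> typeNames = new ArrayList<>();

    /** Illustrative helper; the 'Write' API is not yet defined. */
    void registerType(String namespaceUri, String localName) {
        typeNames.add("{" + namespaceUri + "}" + localName);
    }

    /** Counterpart of getTypes(): every registered type. */
    List<String> getTypes() {
        return new ArrayList<>(typeNames);
    }

    /** Counterpart of getTypes(String namespace): only types in that namespace. */
    List<String> getTypes(String namespaceUri) {
        String prefix = "{" + namespaceUri + "}";
        List<String> matches = new ArrayList<>();
        for (String name : typeNames) {
            if (name.startsWith(prefix)) {
                matches.add(name);
            }
        }
        return matches;
    }
}
```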

Data Dictionary 'Write' API


TODO: Define


Packaging (Install/De-Install)


TBD


Data Dictionary Implementation


Possible Approaches


The Data Dictionary may be implemented:


  1. As a set of content nodes utilising the NodeService for persistence
  2. As a custom schema utilising Hibernate for persistence

Proposed Approach


The proposal is to go with Option 2.  The main reason is that it lets us develop the Data Dictionary as a core building block, which avoids chicken-and-egg bootstrap issues in the future.  It also allows for custom performance tuning, as this service is expected to be heavily used within Repository services.

To start with, the Dictionary will support:


  1. Bootstrap of XML model definitions held in files

Followed by:


  1. The persistence of model definitions in the Repository, allowing run-time creation (and possibly run-time modification)

Architecture


TODO: Provide diagram (service -> dao etc)




Dictionary Issues & Resolutions


Property / Association Naming


Issue: How do we identify a Property or Association?

Property and Association names are qualified names in their own right.  An organisation may wish to use any number of namespaces to organise property and association names within their model.  However, it's likely that only one namespace will be used, as an organisation will have complete control over their model naming, therefore avoiding clashes with appropriate local naming.

As a rule, a qualified name should not be re-used when it has been assigned to a property/association on an Aspect.  This prohibits a name clash with a type definition when an aspect is introduced to a node.  We can enforce this at design-time (first release - should).

When working with a property or association on a node, the QName is specified.  The QName may or may not be provided in the context of the node type or node aspect.  Ignoring residual definitions (see below), either approach will resolve to a single definition in the dictionary.


Refining Property / Association Definitions in a Class Hierarchy


Issue: We cannot override property or association definitions in a class hierarchy.

This will be supported.  A sub-class can override the definition of a super-class property or association by providing a definition of the same qualified name.  There will be restrictions on overridden values e.g. mandatory property value cannot be relaxed, target class of association can only be sub-type of overridden target class.  Some values may not be overridden at all.  As such, I'm considering having an explicit PropertyOverride/AssociationOverride construct in the meta-model.
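
The two restrictions mentioned above could be checked along these lines. This is a sketch under stated assumptions: `PropertyDefSketch` and `OverrideRulesSketch` are hypothetical names, and Java `Class` objects stand in for dictionary classes so that "sub-type" can be tested with `isAssignableFrom`.

```java
/** Hypothetical, minimal property definition: qualified name plus mandatory flag. */
final class PropertyDefSketch {
    final String qname;
    final boolean mandatory;

    PropertyDefSketch(String qname, boolean mandatory) {
        this.qname = qname;
        this.mandatory = mandatory;
    }
}

/** Hypothetical validation of sub-class overrides. */
final class OverrideRulesSketch {
    private OverrideRulesSketch() {}

    /** An override must reuse the qualified name and must not relax 'mandatory'. */
    static boolean isValidPropertyOverride(PropertyDefSketch base, PropertyDefSketch override) {
        if (!base.qname.equals(override.qname)) {
            return false;
        }
        // Optional -> mandatory tightens the rule; mandatory -> optional relaxes it.
        return !base.mandatory || override.mandatory;
    }

    /** The overriding target must be the same class as, or a sub-type of, the base target. */
    static boolean isValidTargetOverride(Class<?> baseTarget, Class<?> overrideTarget) {
        return baseTarget.isAssignableFrom(overrideTarget);
    }
}
```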


Clarification of Association / Child Definitions


Issue: Is that a type name or instance name?

Unlike JSR-170, all associations in the data dictionary are typed.  Child associations are a special kind (stereotype) of association which represent a parent/child relationship i.e. an aggregate.  Multiple child association types may exist for a given type/aspect.  The node service allows for access to all children (regardless of child association type) or the children of a specific child association type.

Each association in the meta-model is named; the name represents the association type.  For example, a folder may have the following associations: sub-folders (aggregate, 1-many-folder), files (aggregate, 1-many-file), visible-to (non-aggregate, many-many-users).  Following typical OO constructs, the dictionary will allow the definition of source and target cardinality.  We could also support a list of target classes, any of which would be valid.

Child associations will also support the following additional meta-data to describe constraints on the parent/child instances themselves:


  1. requiredChildName (string) - optional
  2. allowDuplicateChildName (boolean) - default F   (Q: should this be duplicate within association type, or the parent node as a whole? Undefined in JSR-170)

Note: Non aggregate associations do not have a child instance name in our Node model and so the above meta-data is not required.

Note: Child name only refers to the name in the node path - not a property (called name) that may or may not exist on the node.
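
The two child association constraints could be enforced roughly as follows. The sketch assumes duplicates are checked within a single association type, which is one of the two readings left open in the question above; the class and method names are hypothetical.

```java
import java.util.List;

/** Hypothetical holder for the two child association constraints. */
final class ChildAssocRuleSketch {
    final String requiredChildName;        // null when any child name is allowed
    final boolean allowDuplicateChildName; // defaults to false in the proposal

    ChildAssocRuleSketch(String requiredChildName, boolean allowDuplicateChildName) {
        this.requiredChildName = requiredChildName;
        this.allowDuplicateChildName = allowDuplicateChildName;
    }

    /**
     * Checks a proposed child name (the name in the node path) against the
     * names already present under the same association type.
     */
    boolean accepts(String childName, List<String> existingNames) {
        if (requiredChildName != null && !requiredChildName.equals(childName)) {
            return false;
        }
        return allowDuplicateChildName || !existingNames.contains(childName);
    }
}
```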


Residual Definitions (available later)


It's not clear how residual definitions map into our meta-model.  One approach is as follows...

First, I don't think we need the notion of a residual child (association) definition.  If a bag of ad-hoc child associations is to be made available (to whom, I don't know: the app. developer is unlikely, so perhaps the end-user?), then a child association type called whatever (e.g. residual and possibly even residual2) can be defined.

Second, I think that residual properties are really only useful on types.  Why?  App. developers are unlikely to need them, they're working with a stricter model - however, an end-user may want to attach ad-hoc properties to their content and so this feature is made available at the type level, not aspect.  Of course, an ad-hoc property bag can be implemented other ways.

So, I propose we support residual properties by adding the following to Type in our meta-model:


  1. allowResidualProperties  boolean default F
  2. residualPropertyTypes  PropertyType (1..*)  (list of one or more allowed property types)
  3. residualPropertiesIndexed  boolean default F

This approach means there are no '*' (or nameless) property or association definitions in the meta-model.  Residual property names will not find an associated property definition.
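
A sketch of how the first two proposed attributes might gate an ad-hoc property is given below. Property types are represented by their names as strings, and the class and method names are assumptions for illustration; the indexing flag is left out because its behaviour is not described here.

```java
import java.util.Set;

/** Hypothetical holder for the residual property settings proposed for Type. */
final class ResidualPolicySketch {
    final boolean allowResidualProperties;   // default false in the proposal
    final Set<String> residualPropertyTypes; // allowed property type names

    ResidualPolicySketch(boolean allowResidualProperties, Set<String> residualPropertyTypes) {
        this.allowResidualProperties = allowResidualProperties;
        this.residualPropertyTypes = residualPropertyTypes;
    }

    /** An ad-hoc property is accepted only if residuals are enabled and its type is listed. */
    boolean acceptsAdHoc(String propertyTypeName) {
        return allowResidualProperties && residualPropertyTypes.contains(propertyTypeName);
    }
}
```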


Node support for Property Type


Issue: Is there a quick and definitive answer to what type is this property?

If we're not doing so already, I think we should consider storing the property type with the property value in our node schema.  The type is often required, and it may also differ from the property type defined in the dictionary: the cases are the ANY type and residual properties, which have no definition at all.
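
One way to picture this is a value object that carries its own type name, with the stored type used whenever the dictionary cannot give a definite answer. The names below are hypothetical and the node schema itself is not shown.

```java
/** Hypothetical stored value that records its property type alongside the value. */
final class StoredPropertyValueSketch {
    final String typeName; // property type recorded at write time, e.g. "Int"
    final Object value;

    StoredPropertyValueSketch(String typeName, Object value) {
        this.typeName = typeName;
        this.value = value;
    }

    /**
     * Returns the dictionary's declared type when it is specific; falls back to
     * the stored type when there is no definition (residual property) or the
     * declared type is "Any".
     */
    static String effectiveType(String declaredTypeOrNull, StoredPropertyValueSketch stored) {
        if (declaredTypeOrNull == null || declaredTypeOrNull.equals("Any")) {
            return stored.typeName;
        }
        return declaredTypeOrNull;
    }
}
```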


Dictionary API


The direct PropertyDefinition lookup will be removed.  Lookups must go indirectly via a Class or Aspect definition.


Formalising Content Type (idea)


Issue: Should we sub-type or use a property discriminator to describe the various 'types' of content e.g. SOP, Rule, Query, Order

Idea:

We extend our meta-model to include the notion of Content Type (which itself is derived from Type).  Additional meta-data e.g. required mime-type, xml schema... is specified on the Content Type.

Actual types of content are modelled using the dictionary (as if they were types), with each type supporting specific meta-data.  Sub-typing may be used (as with types); a root content-type may exist in the domain model.

The list of available content types is provided by the dictionary.

So, effectively, we use sub-typing (type qname) to discriminate between content types, but formalise the notion of a content type in our meta-model.


Proposed Updated Meta-Model


[Diagram: Proposed Updated Meta-Model]


Appendix 1: M1 UML Notation


When developing content domain models (M1) using UML, the following notation is a useful guideline for ensuring that M2 concepts are mapped in a consistent way.

[Diagram: M1 UML Notation]