I've been trying to develop a good solid understanding of the data model used by Alfresco, particularly with respect to the RM component. I'm interested in the opportunity to use Alfresco/RM as the core of a strategy to transition from disparate software and processes for managing our corporate information, in both electronic and physical form, to a single repository - a true content management system. I'm coming at this from a local government perspective, so please excuse my limited perspective and examples.
Looking at the RM model, and the dod5015 example, I'm having trouble understanding why non-electronic records were implemented as a distinct Type, rather than as an optional aspect to rma:record. To recap my understanding:
- A 'Type' is an absolute requirement of every entity in the system. An entity can only have one Type and direct Type-conversion is presumably disallowed (as it would cause information loss). It enforces collection of a specific and non-variable set of information, either key data extracted from the content in a structurally useful, reusable fashion (applicant name, permit number, etc.) or some specific mandatory metadata (secretary who keyed in the document vs. document signatory of record). Presumably in an ideal situation 'Types' would have a direct correlation to documented business processes (bylaw documents, dog license applications, water service connections, employment applications, capital works projects, correspondence etc.).
- An 'Aspect' is an optional addition to potentially any entity in the system. It also allows collection of a specific and non-variable set of information, but presumably from more of a metadata perspective than being based on content extraction (eg. geospatial coordinates). An Aspect has the advantage of not being confined to a single Type (eg. as a form of optional metadata); instead it allows metadata to be applied laterally throughout the content repository. Eg. dog license applications, water service connections, and capital works projects are all natural candidates for a geospatial aspect, employment applications are not, and bylaw documents may or may not be, depending on the subject of the bylaw. This enables managing disparate information based entirely on the aspect (eg. tell me everything about this neighbourhood).
- 'Tags' are an entirely fluid addition to any entity that provides a mechanism for 'clustering' conceptually related items. Tags have the advantage that zero-to-many can be applied to an entity. This allows a single piece of content to be associated with multiple concepts (eg. a single piece of correspondence could potentially be tagged as associated with an instance of a bylaw as well as an instance of a water service connection and an instance of a capital works project). This enables clearly identifying single instances of information that 'belong to' or have relation to multiple business processes.
- 'Content' is the actual bit-stream that is 'attached' to an instance of a Type in the database that manifests the entity being referred to by that instance.
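To make the recap above concrete, here is a toy Python sketch of the four concepts as I understand them. This is not Alfresco's API: the `Node` class, its fields, the `find_by_aspect` helper, and names such as `lgma:geospatial` and `lgma:doglicense` are all hypothetical, used only to illustrate one Type per entity, zero-to-many Aspects and Tags, and an optional content stream.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Toy stand-in for a repository entity (hypothetical, not an Alfresco class)."""
    type_name: str                                  # exactly one Type per entity
    properties: dict = field(default_factory=dict)  # data the Type requires
    aspects: set = field(default_factory=set)       # zero-to-many optional Aspects
    tags: set = field(default_factory=set)          # zero-to-many free-form Tags
    content: Optional[bytes] = None                 # the bit-stream, if any


def find_by_aspect(nodes, aspect):
    """Lateral query: every node carrying an aspect, regardless of its Type."""
    return [n for n in nodes if aspect in n.aspects]


permit = Node("lgma:buildingpermit", {"permitNumber": "BP-001"},
              aspects={"lgma:geospatial"}, content=b"%PDF-...")
dog = Node("lgma:doglicense", {"applicant": "J. Smith"},
           aspects={"lgma:geospatial"})
job = Node("lgma:employmentapplication", {"applicant": "A. Jones"})

located = find_by_aspect([permit, dog, job], "lgma:geospatial")
print([n.type_name for n in located])
```

The query returns the building permit and the dog license but not the employment application, which is the "tell me everything about this neighbourhood" case: selection by aspect cuts across Types.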
So, assuming all of the above are consistent with the design intent for Alfresco/RM, consider the process of applying Alfresco to an existing mixed document and records environment (using the fictional namespace 'lgma').
Likely even within a single entity Type there is a fairly deep collection of documents, potentially spanning multiple software versions and back all the way into stacks of paper (eg. cm:content->rma:record->lgma:buildingpermit). The variety of electronic content is already nicely handled by generating PDF renditions of the original content to facilitate indexing and retrieval, but at the moment the rma content model implies that the physical instances of that Type are somehow informationally distinct (eg. cm:content->rma:nonElectronicDocument->lgma:physicalbuildingpermit ?). Presumably it would be more useful to have all like entities in one Type and simply have null content (or a placeholder) attached for physical items that have no electronic rendition available.
What happens then when the organisation finally finds money to scan a few years of the back catalog of purely paper content? Presumably the relevant instances of lgma:physicalbuildingpermit in the database would be destroyed and re-instanced as lgma:buildingpermit… but why? It's the same content, only the rendition has changed and the 'physical' aspect has been shed.
Similarly, it's extremely unlikely that an organisation would ever be able to convert 100% of its paper holdings - what about all those lovely signed historical Bylaws with gold-leaf seals affixed? Alfresco can clearly add value by allowing a scanned rendition of the item to be attached as content and serve as a non-authoritative/convenience reference, effectively eliminating the access overhead associated with the original paper, but again the information in those persistent paper documents is not different from other instances of the same Type just because it's on paper.
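The two scenarios above (scanning the back catalog, and attaching a convenience scan to a bylaw whose paper original stays authoritative) can be sketched with 'physicality' as an aspect. Again this is a toy Python model and not Alfresco code: `Record`, `scan`, the `paper_retained` flag, and the aspect name `lgma:physical` are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

PHYSICAL = "lgma:physical"  # hypothetical aspect name, not part of the rma model


@dataclass
class Record:
    """Toy record: one fixed Type, optional aspects, optional content."""
    type_name: str
    aspects: set = field(default_factory=set)
    content: Optional[bytes] = None


def scan(record, scanned_bytes, paper_retained):
    """Attach a scanned rendition. The Type is never touched."""
    record.content = scanned_bytes
    if not paper_retained:
        # Paper is no longer held, so the record sheds the physical aspect.
        record.aspects.discard(PHYSICAL)
    return record


# Back-catalog permit: scanned, paper no longer held.
old_permit = Record("lgma:buildingpermit", {PHYSICAL})
scan(old_permit, b"scanned permit", paper_retained=False)

# Historical bylaw: scan is a convenience copy, paper stays authoritative.
bylaw = Record("lgma:bylaw", {PHYSICAL})
scan(bylaw, b"scanned bylaw", paper_retained=True)

print(old_permit.type_name, PHYSICAL in old_permit.aspects)
print(bylaw.type_name, PHYSICAL in bylaw.aspects)
```

In both cases the record keeps its original Type throughout; only the content and the aspect set change, so nothing has to be destroyed and re-instanced.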
The same arguments can be applied to retention schedule processing, accession procedures etc.
So… what have I misunderstood that adds value to having rma:nonElectronicDocument as a Type that's distinct from rma:record, rather than just implementing 'physicality' as an aspect?
Looking at the ongoing activity in this particular forum I sense I'm in the wrong place. Is there somewhere else I could go to try and explore the above topic?