This page is obsolete.
The official documentation is at: http://docs.alfresco.com
The Transfer Service was introduced in version 3.3 of Alfresco. Its purpose is to provide a means of pushing information out of an Alfresco core repository ('DM') to configured targets. The transfer service is accessible as a bean named 'TransferService' that is defined, along with other related beans, in the transfer-service-context.xml Spring context file.
In version 3.4, the Transfer Service is a subsystem with an API that offers the following features:
As one might expect, the Transfer Service comprises two major parts: the part that is responsible for sending information from the source repository and the part that is responsible for receiving information in the target repository. The source repository pushes information to the target repository over a network transport. In 3.4 there is support for the use of HTTP and HTTPS across the network. Connections needed for a transfer always originate from the source.
Through the Transfer Service it is possible to create and persist information about any number of Transfer Targets. A Transfer Target records sufficient information about the target system to enable the service to establish an authenticated connection to it. Each transfer target record in the source repository is placed in a transfer target group. Currently there is just one transfer target group defined (called 'Default') and there is no means of creating new ones. The service is likely to be extended in the future to allow the management of transfer target groups.
Each transfer target is named, and the name must be unique within the transfer target group that contains it. Some operations on the TransferService interface allow for a transfer target name to be supplied but not the name of a transfer target group. In these cases the default transfer target group is assumed.
In order to transfer information to one of the configured transfer targets you simply create a Transfer Definition and pass it to the Transfer Service along with the name of the transfer target that you want to transfer to. A transfer definition identifies what should be transferred and has the potential to include some directives about how it should be transferred. In 3.3 a transfer definition comprises simply a collection of NodeRef objects. Note that it is acceptable for this collection to include NodeRefs of nodes that are in the archive store. When the target repository receives such a NodeRef during a transfer the corresponding node will be deleted.
When the transfer service receives a request for a transfer to be made, the first thing it does is export a snapshot of the nodes that are included in the transfer. This snapshot contains all the nodes' properties, but not the content of any properties of type d:content; instead, the relevant content URLs for the content files are included in the snapshot. This makes the snapshot relatively lightweight and quick to generate. Once the snapshot has been created, the transfer service makes contact with the specified target and initiates a transfer. Any given target can currently receive just one transfer at a time, as this ensures that conflicts can't occur. The target puts in place a lock (a node named '.lock' beneath the Data Dictionary/Transfers folder) and returns a unique identifier for the transfer, and the source then starts transmitting the snapshot.
As the target system receives the snapshot it streams it into a local staging area on disk. After having sent the snapshot, the transfer service then works out which content items are required and sends the necessary content files over. These are batched up in groups - one of the goals of the design was to minimize the 'chattiness' of the transfer protocol - and staged on the target system's local disk.
Once the snapshot and associated content files have been transmitted, the transfer service asks the target system to commit the data to its repository. At this point, the receiver on the target system parses the received snapshot and reproduces the contained information in its local repository. This is done in three stages by default - the first writing the nodes and their properties, the second dealing with associations and the third dealing with sync mode delete. It is possible to add new stages into this process if desired. When receiving a node, the receiver tries to resolve a corresponding node in the target repository based first on the node ref and then on the node's path. When a node is transferred that does not have a corresponding node in the target repository (either by path or by node ref) then a new node is created that has the same node ref as the transferred node.
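The lookup order described above (node ref first, then path, then create) can be sketched as follows. This is only an illustration under stated assumptions, not the receiver's actual code: the class NodeResolutionSketch, its two maps, and the use of plain strings for node refs and paths are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the receiver's lookup logic; node refs and paths are plain strings.
class NodeResolutionSketch {
    // The target repository, indexed by node ref and by path (assumed data structures).
    private final Map<String, String> pathByNodeRef = new HashMap<>();
    private final Map<String, String> nodeRefByPath = new HashMap<>();

    void addExisting(String nodeRef, String path) {
        pathByNodeRef.put(nodeRef, path);
        nodeRefByPath.put(path, nodeRef);
    }

    // Returns the node ref of the corresponding target node, creating one if none exists.
    String resolveOrCreate(String inboundNodeRef, String inboundPath) {
        // 1. Try to match on the node ref.
        if (pathByNodeRef.containsKey(inboundNodeRef)) {
            return inboundNodeRef;
        }
        // 2. Fall back to matching on the path.
        String matchedByPath = nodeRefByPath.get(inboundPath);
        if (matchedByPath != null) {
            return matchedByPath;
        }
        // 3. No corresponding node: create one that reuses the inbound node ref.
        addExisting(inboundNodeRef, inboundPath);
        return inboundNodeRef;
    }
}
```

Checking the node ref before the path means that a node that has been moved or renamed on the source still resolves to the same target node.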
Throughout the commit process, a record is written to a node in the target repository that lists what is being done. This node, stored in the 'Inbound Transfer Records' folder beneath 'Data Dictionary/Transfers', also has a few properties on it that record the transfer status. The name of this transfer record node is the date/time stamp at which the transfer started. After the transfer has completed, this transfer report is pulled back to the source system and written as the 'destination transfer report', which is a sibling of the 'client transfer report' placed below the transfer target.
On the source end of the transfer, the caller may choose whether the transfer should be carried out synchronously (transfer) or asynchronously (transferAsync). Whichever version is used, the caller may optionally provide one or more callback objects (implementing the TransferCallback interface). As the transfer proceeds these objects are notified of progress by events being passed to their processEvent operation. One of these events (TransferEventBegin) contains the transfer identifier, and, once received, this can be used by the caller to cancel an 'in-flight' transfer.
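A callback that remembers the transfer identifier might look like the sketch below. The nested interfaces are simplified stand-ins declared so that the example compiles on its own. They are not the real types from org.alfresco.service.cmr.transfer, and the accessor name getTransferId is an assumption.

```java
import java.util.concurrent.atomic.AtomicReference;

class TransferCancelSketch {
    // Simplified stand-ins for the types named in the text; the real interfaces have more operations.
    interface TransferEvent { }

    interface TransferEventBegin extends TransferEvent {
        String getTransferId(); // accessor name is an assumption
    }

    interface TransferCallback {
        void processEvent(TransferEvent event);
    }

    // Callback that records the transfer identifier as soon as the begin event arrives.
    static class IdCapturingCallback implements TransferCallback {
        private final AtomicReference<String> transferId = new AtomicReference<>();

        @Override
        public void processEvent(TransferEvent event) {
            if (event instanceof TransferEventBegin) {
                transferId.set(((TransferEventBegin) event).getTransferId());
            }
        }

        // Returns null until the begin event has been received.
        String getTransferId() {
            return transferId.get();
        }
    }
}
```

The captured identifier is held in an AtomicReference because an asynchronous transfer is likely to raise its events on a different thread from the one that later decides to cancel.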
As well as the interfaces and mechanisms needed to actually carry out the transfer, there are also a few classes intended to help build the set of nodes that the caller wants to transfer. The relevant interfaces are NodeCrawlerFactory, NodeCrawler, NodeFinder, and NodeFilter (all in the package org.alfresco.service.cmr.transfer). There is one implementation of each of the NodeCrawlerFactory and NodeCrawler interfaces (the standard NodeCrawlerFactory is a bean named 'NodeCrawlerFactory'). There are a couple of NodeFinder implementations that enable associations to be traversed (child and peer), and one NodeFilter implementation that enables content of given classes (types and aspects) to be included and excluded from the node crawl. It's simple to add new finders and filters to provide custom behaviour that meets a particular need.
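To illustrate how finders and filters could cooperate during a crawl, here is a self-contained sketch. It is one plausible reading of the design rather than the standard crawler's implementation: nodes are plain strings, the NodeFinder and NodeFilter interfaces are simplified stand-ins, and their method names (findFrom, accept) are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class NodeCrawlSketch {
    // Simplified stand-ins: nodes are strings, and the method names are assumptions.
    interface NodeFinder { Set<String> findFrom(String node); }
    interface NodeFilter { boolean accept(String node); }

    // Collects every node reachable from the start node that all filters accept.
    static Set<String> crawl(String start, List<NodeFinder> finders, List<NodeFilter> filters) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> toVisit = new ArrayDeque<>();
        toVisit.add(start);
        while (!toVisit.isEmpty()) {
            String node = toVisit.remove();
            // Skip nodes that are already collected or that any filter rejects.
            if (result.contains(node) || !acceptedByAll(node, filters)) {
                continue;
            }
            result.add(node);
            // Ask every finder for further nodes to visit from this one.
            for (NodeFinder finder : finders) {
                toVisit.addAll(finder.findFrom(node));
            }
        }
        return result;
    }

    private static boolean acceptedByAll(String node, List<NodeFilter> filters) {
        for (NodeFilter filter : filters) {
            if (!filter.accept(node)) {
                return false;
            }
        }
        return true;
    }
}
```

In this sketch a rejected node is neither collected nor traversed, so a filter that excludes a folder also excludes everything reachable only through it.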
Note that the interface exposed by the target repository (the receiver) should be considered an internal interface. It is liable to change over time, and no effort will be made to retain backwards compatibility.
As mentioned above, when requesting a transfer it is possible to supply a collection of TransferCallback objects. The TransferCallback interface defines one operation:
void processEvent(TransferEvent event);
As the transfer proceeds, events are raised and passed to each of the callback objects. The transfer states that these events can report are: START, SENDING_SNAPSHOT, SENDING_CONTENT, PREPARING, COMMITTING, SUCCESS, and ERROR. The state of the transfer is always available from any event via its getTransferState operation. Events that report progress also offer getPosition, which indicates where the process is up to at the moment, and getRange, which indicates where the process has to get to before it is complete. Note that the value of the range can change as the process proceeds. Events that report an error offer getException, which can be used to help determine the cause of the problem.
Alfresco 3.4 contains an aspect, trx:transferred, that indicates that a node has been transferred via the transfer subsystem.
It contains two fields: the repository id of the 'originating' system, which is the repository in which the node was first created, and the 'from' repository id, which is the repository id of the system that transferred the node to the local repository.
The basic property sheet for this aspect is included with the configuration of Alfresco Explorer.
A UI feature of Share presents the option to edit a transferred node on the originating instance of Share rather than the local repository.
This is an implementation detail that may change in future versions of Alfresco.
Alfresco 3.4 contains an aspect, trx:alien, that holds a multi-valued property listing which repositories have 'invaded' the node. See below for more information.
Alfresco 3.4 adds a new 'mode' of transfer called 'sync mode'. There is a boolean flag on the transfer definition to specify whether transfer is sync mode or not.
Sync mode adds extra processing to infer by the absence of an association between the parent node and child node that a child node should be deleted.
Sync Mode Transfer Slide 1.GIF
In the example in the screenshot above, when node A1 is transferred there is an association between A1 and A2, so A2 remains; however, there is no association between A1 and A3, so A3 is deleted.
Although the requirement above sounds simple, what happens if there are associations to content that was not transferred, or that was transferred from a different repository? For example, an 'images' folder might be transferred and then have content added to it from the local repository. If the transfer service is not careful, a sync mode transfer will incorrectly delete content that does not exist on the transferring repository.
The first part of the solution is to mark all transferred nodes with an aspect (trx:transferred) that records which repository the transferred node is from. The transfer service can then determine whether the nodes it is considering for deletion are from the sending system. It will not delete nodes that are not from the transferring repository.
In the example of the screenshot above the node B3 is a local node. So transfer of A1 must not delete B3.
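The deletion rule described so far can be sketched for a single parent node as follows. This is an illustration under stated assumptions, not the service's implementation: child nodes are identified by name, each is mapped to the repository id recorded on its trx:transferred aspect (null for a local node), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SyncModeDeleteSketch {
    /**
     * Works out which children of one parent a sync mode transfer may delete.
     *
     * @param targetChildren      child name -> repository id the child came from (null for a local node)
     * @param transferredChildren children still associated with the parent in the incoming transfer
     * @param sendingRepoId       repository id of the transferring system
     */
    static List<String> childrenToDelete(Map<String, String> targetChildren,
                                         Set<String> transferredChildren,
                                         String sendingRepoId) {
        List<String> toDelete = new ArrayList<>();
        for (Map.Entry<String, String> child : targetChildren.entrySet()) {
            boolean stillAssociated = transferredChildren.contains(child.getKey());
            boolean fromSender = sendingRepoId.equals(child.getValue());
            // Delete only children that the sender no longer has AND that came from the sender.
            if (!stillAssociated && fromSender) {
                toDelete.add(child.getKey());
            }
        }
        return toDelete;
    }
}
```

With children A2 and A3 from the sending repository, a local child B3, and only A2 present in the transfer, this returns just A3: B3 is also absent from the transfer but is not from the sender, so it is left alone.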
The implementation of sync mode introduces the concept of 'Alien' nodes which have been 'invaded' by another repository. In general, Alien nodes cannot be deleted by the transfer service. There is an aspect trx:alien that tracks which repositories have invaded a node. In the screenshot above nodes A1 and B3 are marked as aliens since B3 is an 'invader' even though it is a local node.
With multiple repositories transferring content in a hub and spoke system you can end up with more complex scenarios.
Transfer Service Multi invasion.gif
In the example above B1 and B6 are local nodes. However, B6 is an invader since it is a child of a transferred node, C2, that has come from repository C. This example is also complicated by the fact that C2 has a node transferred from repository A. So node C2 is invaded by both repository A and repository B.
If sync mode transfer determines that a folder should be deleted but can't delete the folder since it contains alien content then what happens?
The behaviour is that transfer service has to leave this folder in place but 'prune' all content that is 'from' the transferring system. The other content is left alone.
So in the example above, if the node A1 is transferred after A2 has been deleted, then node A2 should be deleted and all its children (A4, A5, A6, A7, A8) cascade deleted. However, the presence of alien node B10 means that A2 must remain, since it is the parent of B10, which must not be deleted. The children A4, A5, A6, A7 and A8 are pruned instead.
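The pruning behaviour can be sketched as a recursive walk over the subtree that sync mode wants to delete. This is a simplified model, not the service's code: the Node class, its fromRepoId field (null for a local node), and the deleteOrPrune method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class PruneSketch {
    static class Node {
        final String name;
        final String fromRepoId; // repository the node was transferred from; null marks a local node
        final List<Node> children = new ArrayList<>();

        Node(String name, String fromRepoId) {
            this.name = name;
            this.fromRepoId = fromRepoId;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Returns true if the whole subtree may be deleted.
    // Otherwise removes the deletable children, keeps the node itself, and returns false.
    static boolean deleteOrPrune(Node node, String sendingRepoId) {
        boolean allChildrenRemoved = true;
        for (Iterator<Node> it = node.children.iterator(); it.hasNext(); ) {
            if (deleteOrPrune(it.next(), sendingRepoId)) {
                it.remove();
            } else {
                allChildrenRemoved = false;
            }
        }
        // A node goes only if it came from the sender and nothing beneath it has to stay.
        return allChildrenRemoved && sendingRepoId.equals(node.fromRepoId);
    }
}
```

Applied to A2 with children A4 to A8 from the sending repository and the local node B10, the call returns false and leaves A2 holding only B10.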
The classes and interfaces that comprise the public API to the Transfer Service are located in the org.alfresco.service.cmr.transfer package. The core of the implementation is in the org.alfresco.repo.transfer package and its sub-packages manifest, report, and script. Log levels can be adjusted for these packages if more or less log information is desired from the transfer mechanisms.
The transfer service stores files that control and monitor its operation in the 'Transfers' space in the Data Dictionary.
The Transfer Target Groups folder contains the transfer target definitions that specify where transfers go to. There is a 'group' level below the Transfer Target Groups folder which is, or will be, used for classifying different sets of transfer targets.
At the moment (3.4) there is only a single group called 'default group'. Add your transfer targets to the 'default group' either through the TransferService API or by creating a 'folder' using Alfresco Explorer or Alfresco Share. There is a rule defined on the transfer groups folder to specialize the type of any folder created within it.
Space used during processing of a transfer.
On the client side of transfer, transfer reports are created as children of the transfer target. Use Alfresco Share or Alfresco Explorer to view them.
The 'Inbound Transfer Records' folder stores the transfer reports for transfers that have come into this system.
Not used and removed from future versions of Alfresco.
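// Create a transfer target named "The Other Repo" and describe how to reach it.
TransferTarget target = transferService.create("The Other Repo");
target.setEndpointProtocol("https");
target.setEndpointHost("other.repo.example.com");
target.setEndpointPath("/alfresco/service/api/transfer");
// Credentials used to authenticate against the target repository.
target.setUsername("remoteperson");
target.setPassword("password".toCharArray());
// Persist the target so that it can be used for transfers.
transferService.saveTransferTarget(target);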
Note that a transfer target must be committed into the repository before it can be used for a transfer.
You can also create a transfer target through Alfresco Explorer or Alfresco Share. Simply create a folder in the Company_Home/Data_Dictionary/Transfer/Transfer Targets/Default Group. A rule will run to specialize the node type to trx:transferTarget. The new node contains the properties you can fill in through the user interface to set up your target.
//This example walks a tree of nodes starting at a given root node (assumed to be known already). It traverses
//only associations of type 'cm:contains' (therefore, presumably, the root node is of type cm:folder (or subtype))
NodeCrawler crawler = nodeCrawlerFactory.getNodeCrawler();
crawler.setNodeFinders(new ChildAssociatedNodeFinder(ContentModel.ASSOC_CONTAINS));
Set<NodeRef> nodesInTree = crawler.crawl(rootNode);
//This snippet uses the target name and set of nodes used in the previous examples.
TransferDefinition transferDef = new TransferDefinition();
transferDef.setNodes(nodesInTree);
NodeRef transferReportNode = transferService.transfer("The Other Repo", transferDef);
When a node is transferred, a package of information about it is sent from the source repository to the target repository. Among other things, that information includes:
This information is used by the transfer receiver in the target repository to work out where the transferred node should be placed and whether a 'corresponding node' already exists in that location. This is done in the following way:
In the case where the inbound node is initially determined to be an 'orphan', this status is continuously checked during the course of that transfer. If its parent node appears later on in the same transfer then the orphan is re-parented. Note that orphans are not permitted to remain following a transfer. If an orphan's parent node does not appear during the same transfer then the transfer will fail.
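The orphan handling described above can be modelled with a small sketch. It is illustrative only: nodes and parents are plain strings, and the class OrphanSketch and its methods are hypothetical names rather than part of the transfer receiver.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class OrphanSketch {
    private final Set<String> knownNodes = new HashSet<>();
    // Maps each orphan to the parent it is still waiting for.
    private final Map<String, String> orphans = new HashMap<>();

    OrphanSketch(Set<String> existingNodes) {
        knownNodes.addAll(existingNodes);
    }

    // Called for each inbound node as it arrives during the transfer.
    void receive(String node, String parent) {
        knownNodes.add(node);
        if (!knownNodes.contains(parent)) {
            orphans.put(node, parent);
        }
        // The new arrival may be the parent that earlier orphans were waiting for.
        orphans.values().removeIf(waitingFor -> waitingFor.equals(node));
    }

    int orphanCount() {
        return orphans.size();
    }

    // Called at the end of the transfer: orphans are not permitted to remain.
    void finish() {
        if (!orphans.isEmpty()) {
            throw new IllegalStateException("Transfer failed, orphans remain: " + orphans.keySet());
        }
    }
}
```

Deferring the failure to the end of the transfer is what lets a child arrive before its parent within the same transfer.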
This section identifies features that aren't in the transfer service yet, but are known about as potential future enhancements. If you have suggestions then please do add them here.