Moving large datasets around Alfresco

bengrah
Champ on-the-rise
Hi all,

Got a question about moving large datasets around Alfresco. We're in the middle of migrating large datasets into our on-premise installation; these are around 500 GB each on average. We migrate the data in using the Bulk FS tool that comes with Alfresco, and it works quite well for the most part.

Occasionally we import files into the wrong area, and we need to move the folders into the correct area after the data has been imported into Alfresco.

This is where we get stuck. For example, right now we've got an 18 GB folder we need to move. This is our experience through the various interfaces:

<ul>Explorer: Cutting and pasting the folder from one place to another appears to work (the page loads as normal while it's executing the command) but gives no indication of how long it will take to complete. After half an hour there was no evidence that the move had taken place (we had a different window open on the target directory).</ul>
<ul>Share: Using the Move to… command on the folder starts the move process, along with the Moving files… prompt. This prompt disappears after 1-2 minutes. There's no indication of a file being moved. As above, no data appeared in the target directory</ul>
<ul>CIFS: Far too slow, and occasionally we're getting timeouts here</ul>
<ul>WebDAV: Same as CIFS</ul>
<ul>FTP: Appears to make no progress after half an hour of waiting</ul>

We'll most likely just have to re-import the data into Alfresco, only this time in the correct location, but data migrations are not the only context in which we encounter the above problems.

Does anyone else have similar experiences? Or better yet, has anyone seen or used any solutions?

I do appreciate I am talking about relatively big data sets here, so there's a certain degree of expectation.

4 REPLIES

mrogers
Star Contributor
You should be able to move nodes around the repo without affecting the content properties.  That will be much, much better than re-importing all the content again.

I'm not sure how that surfaces in the user interface(s) but you can copy a content property from one node to another which effectively links more than one node to the same content.

Equally, a rename over FTP or CIFS should be quick, since it's just a rename, not a reload.
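To illustrate why a move should not depend on the size of the content, here is a toy sketch in Python. It is not Alfresco code: the `Node` class, the `move` function, and the `store://...` URL are all hypothetical stand-ins, and it only assumes that each node holds a pointer to its content rather than the bytes themselves. Under that assumption, moving a folder just changes one parent link.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(eq=False)
class Node:
    """Hypothetical stand-in for a repository node (not an Alfresco class)."""
    name: str
    parent: Optional["Node"] = None
    content_url: Optional[str] = None  # pointer into the content store, not the bytes
    children: Dict[str, "Node"] = field(default_factory=dict)

    def add_child(self, child: "Node") -> "Node":
        """Attach child under this node and return the child."""
        child.parent = self
        self.children[child.name] = child
        return child

    def path(self) -> str:
        """Build the display path by walking the parent links up to the root."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/" + "/".join(reversed(parts))


def move(node: Node, new_parent: Node) -> None:
    """Re-parent a node: only the parent link changes, no content is read or copied."""
    if node.parent is not None:
        del node.parent.children[node.name]
    new_parent.add_child(node)


# Illustrative tree; all names and the content URL are made up.
root = Node("Company Home")
wrong = root.add_child(Node("Wrong Area"))
right = root.add_child(Node("Correct Area"))
dataset = wrong.add_child(Node("Dataset"))
doc = dataset.add_child(Node("scan.tif", content_url="store://example/abc.bin"))

url_before = doc.content_url
move(dataset, right)  # one link changes, however much data sits under "Dataset"

print(doc.path())
print(doc.content_url == url_before)
```

In this model the files under the folder are never touched by the move, which is why re-importing the content should not be necessary just to change its location.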

bengrah
Champ on-the-rise
The problem isn't really content properties: we've got an 18 GB folder that isn't moving when we try to move it through the various interfaces. It has nothing to do with content properties or renaming anything.

mdutoo
Champ on-the-rise
Hi

As mrogers said, folders don't "move", they get renamed (except when you use a remote interface such as CIFS, WebDAV etc. through Windows Explorer). So use the Alfresco UI to rename your folder; it should take no time, and the folder should appear to have moved when you go back to your Windows Explorer / CIFS / WebDAV etc. window.

But in case you want to re-import it all, the most flexible (it can set per-file metadata) and efficient (one transaction per file) way is the Alfresco ETL Connector for Talend:

http://knowledge.openwide.fr/Main/AlfrescoETLConnector

Regards

kaushik_joshi
Champ in-the-making
Hi, we have set up Alfresco and created user names that match each user's email address. The company has now decided to integrate with AD, and this has been done successfully. For example, there is a local user "test1@adtest.local", and our AD has the same name, "test1@adtest.local", but the two passwords are different. We can now access the profile with either password. We need to stop local database user logins and move all data to the AD logins, and we need the correct process for moving the data. We have more than 100 users whose data we need to migrate, so kindly help us complete this task.