Batch import of metadata / aspects on Alfresco?

trelofysikos
Champ in-the-making
Hi,

I would like to batch import documents into the Alfresco repository.
I am able to do this with CIFS/FTP/WebDAV.
These documents have certain metadata, which I store in a database.
How can I easily apply properties/aspects to my files in batch?

Is it possible to batch import metadata or aspects onto files already in Alfresco?
30 REPLIES

groutal
Champ in-the-making
Hey,

Absolutely! Check out the Talend tutorials available on the website: http://www.talendforge.org/tutorials/menu.php

bwakkie
Champ in-the-making
P.S. As far as I understand, Talend only supports Alfresco version 2.x, if I read this correctly.

groutal
Champ in-the-making
You might want to check the component page on the Talendforge website. The latest update on the component was made on Talend version 3.1.

Component page:
http://www.talendforge.org/components/index.php

freebsdboy
Champ in-the-making
This is rather strange: the process and hoops you have to go through to migrate data. Anyone who deploys Alfresco needs to do this.

Is there no prebuilt ACP file creator? Or has no one shared theirs?

I'm walking through the steps to create the ACP or metadata extract using Talend, a very cool product.
Using tutorial http://www.talendforge.org/tutorials/tutorial.php?idTuto=3#step53

But the tutorial is generic. So what creates the import file on a Linux server?
A good old ls -l > /tmp/importmetadata.file
or
find . -print | xargs ls -gilds > /tmp/importmetadata.file
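
For anyone who would rather not parse ls output, here is a minimal Python sketch that walks a directory tree and writes one CSV row per file. The function name write_manifest and the column names are made up for illustration; neither Talend nor Alfresco prescribes this layout, so adjust the columns to whatever your import job expects.

```python
import csv
import os


def write_manifest(root_dir, manifest_path):
    """Walk root_dir and write one CSV row per file.

    Columns (an assumed layout, not an Alfresco or Talend format):
    path relative to root_dir, size in bytes, modification time (epoch seconds).
    Returns the number of files listed.
    """
    count = 0
    with open(manifest_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["path", "size_bytes", "modified_epoch"])
        for folder, dirs, files in os.walk(root_dir):
            dirs.sort()  # visit subfolders in a stable order
            for name in sorted(files):
                full_path = os.path.join(folder, name)
                info = os.stat(full_path)
                writer.writerow([
                    os.path.relpath(full_path, root_dir),
                    info.st_size,
                    int(info.st_mtime),
                ])
                count += 1
    return count
```

Call it as write_manifest("/path/to/documents", "/tmp/importmetadata.file"). Keep the manifest outside the directory being walked, otherwise the manifest lists itself.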

rliu
Champ in-the-making
freebsdboy,

You are absolutely correct. Most customers using Alfresco will likely be migrating documents, metadata, content, etc. from an existing CMS or some form of document management system. My approach was to reverse-engineer the exported Alfresco Content Package (ACP):

1. Define the content model as needed.
2. Deploy to environment where you can create some test content or upload a document of some kind that would be representative of the migrated content.
3. Do an export of your test data.
4. Review the schema from the XML that was generated from the export.
5. Create a program that can generate an XML that reflects the XML file that was exported.
6. Package into an ACP.
7. Import into Alfresco.

This was not the easiest of solutions, but I think it was a good approach. I wrote a program that generated the XML and the other content items. Build a stable, repeatable process along the way, as it will require a lot of patience and testing. The ACP file I used for the import worked on various Alfresco Labs 3.0 stacks. I was able to package 600 documents, each in its respective space. Although 600 was all I had, I believe this approach can be repeated for larger volumes.
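
To give a rough idea of steps 5 and 6, here is a minimal Python sketch. It is not the program described above. The XML layout (a view:view root, cm:content nodes, the cm:titled aspect, and the contentUrl=...|mimetype=... property string) is an assumption based on what an export commonly looks like; check every element and namespace against your own exported XML from step 4 before relying on it. The function name build_acp and the keys of the document dicts are hypothetical.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces as assumed from a typical export; verify against your own export.
VIEW_NS = "http://www.alfresco.org/view/repository/1.0"
CM_NS = "http://www.alfresco.org/model/content/1.0"
ET.register_namespace("view", VIEW_NS)
ET.register_namespace("cm", CM_NS)


def q(namespace, tag):
    """Build a namespace-qualified name in ElementTree's {uri}tag notation."""
    return "{%s}%s" % (namespace, tag)


def build_acp(docs, package_name="import"):
    """Return the bytes of a zip holding <package_name>.xml plus the content files.

    docs is a list of dicts with the (hypothetical) keys:
    name, title, mimetype, data (bytes).
    """
    root = ET.Element(q(VIEW_NS, "view"))
    for doc in docs:
        node = ET.SubElement(root, q(CM_NS, "content"))
        node.set(q(VIEW_NS, "childName"), "cm:" + doc["name"])
        # Aspects go in their own block; cm:titled is used here as an example.
        aspects = ET.SubElement(node, q(VIEW_NS, "aspects"))
        ET.SubElement(aspects, q(CM_NS, "titled"))
        # Properties, including the pointer to the file inside the package.
        props = ET.SubElement(node, q(VIEW_NS, "properties"))
        ET.SubElement(props, q(CM_NS, "name")).text = doc["name"]
        ET.SubElement(props, q(CM_NS, "title")).text = doc["title"]
        content_path = "%s/%s" % (package_name, doc["name"])
        ET.SubElement(props, q(CM_NS, "content")).text = (
            "contentUrl=%s|mimetype=%s|size=%d|encoding=UTF-8"
            % (content_path, doc["mimetype"], len(doc["data"]))
        )

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as acp:
        acp.writestr(package_name + ".xml", ET.tostring(root, encoding="utf-8"))
        for doc in docs:
            acp.writestr("%s/%s" % (package_name, doc["name"]), doc["data"])
    return buffer.getvalue()
```

Write the returned bytes to a file with an .acp extension and try the import on a test space first. Custom properties from your own content model would need their own namespace registered in the same way as cm.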

Best of luck!

freebsdboy
Champ in-the-making
Care to share the work you've done?

Hey, this is open source. We shouldn't all have to redo the same thing.

pmonks
Star Contributor
It's not an all-singing, all-dancing content migration tool (OpenMigrate is a better choice if that's the requirement), but I've put a simple filesystem import process into Google Code at http://code.google.com/p/alfresco-bulk-filesystem-import/.  This package is intended to simply and efficiently handle the basic use case of importing a large number or size of files & folders from local disk into Alfresco.

Please take the time to read the readme file [1] before using it.  The current version is still under heavy development and has a number of major functional gaps (e.g. no support for loading metadata yet, although that work is in progress) that will rule it out for specific use cases, but for the basic case of quickly loading in a large number of assets it already works well.

Cheers,
Peter

[1] http://code.google.com/p/alfresco-bulk-filesystem-import/source/browse/trunk/README.txt

souhaieb
Champ in-the-making
Is there any support for metadata loading in the latest version of BFSIT?

norgan
Champ in-the-making
Hi Rich Liu,
What were the major problems you had to solve? Could this be written generically enough to be of public use?

I'm thinking of something like an XSD transformation, or at least an XML definition of the data format to be imported (worst case, from an Excel or CSV file). Maybe this could go with Talend or become an Alfresco Forge project?

Regards, Norgan

rliu
Champ in-the-making
It'll take a little time to generify the code and make it extensible. As soon as I have something I feel worthy of publishing, I will provide it to the community. In the meantime, the actions taken as I described should allow you to migrate data into Alfresco (inclusive of custom properties).