Document batch-upload

mikechenosky
Champ in-the-making
Hi, does Alfresco support batch upload of documents to its document library?
If so, how?
If not, can someone suggest a workaround?
Thanks,
mike
15 REPLIES

mikeh
Star Contributor
Hi,

FTP, CIFS, and WebDAV can all be used for batch upload. Also, a user posted code in these forums for a ZIP file import action: you upload a single ZIP containing all the files you want, and it gets exploded out into the Alfresco repository.

Please see the wiki for CIFS and/or FTP configuration.
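
As a rough illustration of the FTP route, the sketch below walks a local folder and mirrors it onto an FTP server using Python's standard ftplib. It assumes the Alfresco FTP server is already enabled; the host name, port, credentials, and paths in the usage comment are placeholders, not values from this thread.

```python
import os
from ftplib import FTP, error_perm


def upload_tree(ftp, local_root, remote_root):
    """Mirror local_root under remote_root on an already logged-in FTP session.

    Returns the number of files uploaded.
    """
    uploaded = 0
    # os.walk is top-down, so a parent folder is always created before its children.
    for dirpath, _dirnames, filenames in os.walk(local_root):
        rel = os.path.relpath(dirpath, local_root)
        if rel == ".":
            remote_dir = remote_root
        else:
            remote_dir = remote_root + "/" + rel.replace(os.sep, "/")
        try:
            ftp.mkd(remote_dir)
        except error_perm:
            # Most often "folder already exists"; a real permission problem
            # will surface on the STOR below.
            pass
        for name in sorted(filenames):
            with open(os.path.join(dirpath, name), "rb") as fh:
                ftp.storbinary("STOR " + remote_dir + "/" + name, fh)
            uploaded += 1
    return uploaded


# Usage (hypothetical host, port, credentials, and paths):
#   ftp = FTP()
#   ftp.connect("alfresco.example.com", 21)
#   ftp.login("username", "password")
#   print(upload_tree(ftp, "./docs-to-import", "/Alfresco/Imported"))
#   ftp.quit()
```

This only shows the mechanics of a scripted upload; it says nothing about how fast the server will accept the files.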

Thanks,
Mike

kevinr
Star Contributor
Batch import via ZIP file upload has been added to the web-client UI for 2.1.

Thanks,

Kevin

gerald_quimpo
Champ in-the-making
Hi Kevin, or anyone else,

"Batch import via ZIP file upload has been added to the web-client UI for 2.1."

How do I do that (operationally, in the UI)?  I've been looking around and it's not clear to me how to do it.

Gerald

tommorris
Champ in-the-making
Off the top of my head, you can do this for WCM projects.
When you browse your sandbox for a particular web-project, the 'Create' icon drop-down list has a 'Bulk Import' option.
I think kevinr might mean that. Did you mean bulk import for regular DM spaces?

I did notice that the code for the regular ACP import action can also import ZIP files…
So you could try a rule that kicks off that action for ZIP file types.
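
Whichever ZIP import route is used, the archive has to be built first. A minimal sketch with Python's standard zipfile module, assuming the files sit in one local folder tree; the function name is just for illustration, and the archive should be written outside the folder being packed so it does not include itself.

```python
import os
import zipfile


def zip_folder(src_dir, zip_path):
    """Pack every file under src_dir into zip_path, keeping relative paths.

    Returns the number of files added. Keep zip_path outside src_dir.
    """
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(src_dir):
            for name in sorted(filenames):
                full = os.path.join(dirpath, name)
                # Store paths relative to src_dir so the folder layout survives.
                zf.write(full, os.path.relpath(full, src_dir))
                count += 1
    return count
```

The resulting single file can then be uploaded and handed to the import action; how the server unpacks it is up to that action, not this script.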

Tom

gerald_quimpo
Champ in-the-making
"Did you mean bulk import for regular DM spaces?"

Yes, what I meant was bulk uploading many documents. In my prototype/test,
it's around 10,000 small XML files (less than 512 bytes each). I don't need aspects,
title, description, etc. to be set. I just need to push many files into the repository.

I find that writing files to Alfresco via the CIFS or NFS share is very slow. In my
prototype/test, writing to a raw disk filesystem takes 7 seconds or so. Writing the
same files to CIFS takes 34 minutes or so. Deleting those files (or, actually,
deleting the folder in which they reside, along with all files and folders inside it)
is also very slow (I'm not timing it, but it's been around 3 minutes now,
while I've been writing/editing this post).

This is just my desktop development server: one drive and 2 GB of RAM, of which Tomcat and
Alfresco have 1 GB. But the overhead is very high. I'd be happy with writes,
reads and deletes being 5 times slower than to the raw filesystem. Even
10 times slower would be OK (my 10,000 files would be written in 70 sec).
But almost 300 times slower is a problem. Is this normal? Or are there well-known
performance tuning tips that I might have missed?
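
For anyone wanting to reproduce this kind of comparison, here is a minimal timing sketch. It assumes the repository is mounted as an ordinary directory (for example a CIFS share); the function names and file naming pattern are made up for illustration. The last lines redo the arithmetic from the figures quoted above.

```python
import os
import time


def time_small_writes(target_dir, count=10000, size=512):
    """Write `count` files of `size` bytes into target_dir; return elapsed seconds."""
    os.makedirs(target_dir, exist_ok=True)
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(target_dir, "doc%05d.xml" % i), "wb") as fh:
            fh.write(payload)
    return time.perf_counter() - start


def slowdown(slow_seconds, fast_seconds):
    """How many times slower the slow run was than the fast one."""
    return slow_seconds / fast_seconds


# Figures quoted in the post: about 34 minutes via CIFS, about 7 seconds to local disk.
ratio = slowdown(34 * 60, 7)  # roughly 291, i.e. "almost 300 times slower"
```

Running time_small_writes once against a local directory and once against the mounted share gives the two numbers to feed into slowdown.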

"I did notice that the code for the regular ACP import action can also import ZIP files…
So you can try a rule that kick starts that action for ZIP file-types."

OK. I'll try to look into that. In fact, though, I'm eventually going to have to do this
in a program (users will upload a single large file, and the program, or rule+action, will
take that large file and split it into many individual records/XML documents). So
I'm looking at the web interface mainly as a proof of concept that high-speed
bulk uploading is possible at all. Then I'd wrap whatever the right technique is
in code.

Thanks for your reply!

Gerald

ttsherpa
Champ in-the-making
Hi everyone,

I'm quite new to Alfresco, but I have a test installation that seems to work. While doing a batch upload of several thousand very small text files into Alfresco Enterprise on one of our Linux servers, I noticed the extremely slow speed you mention (100 times slower than a raw file write!). Is this normal? Did you find any solution or explanation for the problem? I use FTP, but I guess that similarly bad results were obtained with ZIP files in the web interface, or with CIFS.

What do you think?

TIA

mdutoo
Champ on-the-rise
Hi

Don't be so sure. I didn't do a comparison on Alfresco 3, but on 2.x CIFS was noticeably faster than FTP…

lfcohen
Champ in-the-making
We are also facing severe problems. We use WebDAV and need to migrate a lot of files and folders, basically a full structure, into our Alfresco.

The transfer is painfully slow, both on the local network where the server is located and remotely.

A major problem is that when we try to send a structure with folders nested several levels deep, regardless of whether they contain files, we get "internal server error 500" and major problems!
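
One way to take the client out of the equation is to create the folder structure with explicit WebDAV MKCOL requests, parents first, and look at the status code returned for each folder. In standard WebDAV, MKCOL on a folder whose parent does not exist yet is answered with 409 Conflict, so order matters. The sketch below is not a diagnosis of the 500 error; the helper names are illustrative, and the host, base path, and credentials in the usage comment are assumptions.

```python
import http.client
from urllib.parse import quote


def collection_paths(folder_paths):
    """Return every collection that must exist, parents before children."""
    needed = set()
    for path in folder_paths:
        parts = [p for p in path.split("/") if p]
        for depth in range(1, len(parts) + 1):
            needed.add("/".join(parts[:depth]))
    # Sorting by depth first guarantees a parent precedes its children.
    return sorted(needed, key=lambda p: (p.count("/"), p))


def make_collections(conn, base_path, folder_paths, headers=None):
    """Issue one MKCOL per collection; return (path, status) pairs."""
    results = []
    for rel in collection_paths(folder_paths):
        url = base_path.rstrip("/") + "/" + quote(rel)
        conn.request("MKCOL", url, headers=headers or {})
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection can be reused
        results.append((rel, resp.status))
    return results


# Usage (hypothetical host and base path; add an Authorization header as needed):
#   conn = http.client.HTTPConnection("alfresco.example.com", 8080)
#   for path, status in make_collections(conn, "/alfresco/webdav", ["Projects/2009/Q1"]):
#       print(status, path)
```

A 201 means the folder was created and a 405 usually means it already exists; the first folder that returns 500 narrows down where the server is failing.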

We are using the native WebDAV clients on Mac and Windows, as well as CrossFTP (using the WebDAV protocol) on both platforms.

Any clue what might be causing this?

Thanks for any feedback!

Leonardo

mmurphy
Champ in-the-making
I'm giving Alfresco a trial run and was hoping to recommend it for our company's internal document management and collaboration. To say my experience so far has been "frustrating" would be an understatement. Anyway, I'm trucking along, hoping it will all come together.

My problem as it relates to this thread is almost exactly the same as described above. FTP and CIFS are both very slow at copying files. 200 MB took almost 30 minutes on a Gigabit Ethernet network. I also tried copying the files to the server in Windows, then connecting to the localhost share and copying the files into the repository that way. Copying from the local drive into the repository took about the same amount of time as over the network, close to 30 minutes.

This really isn't acceptable for a production environment. I have almost 2 TB of data to potentially upload into this repository, a task that would take roughly seven months of continuous copying at this rate.
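
As a rough check on what a measured rate implies for a full load, the estimate can be worked out from the figures quoted above (200 MB in about 30 minutes, about 2 TB in total). This assumes the rate stays constant, the copy runs around the clock, and 1 TB is taken as 1,000,000 MB; the function name is illustrative.

```python
def transfer_days(total_mb, sample_mb, sample_minutes):
    """Days needed to move total_mb if sample_mb took sample_minutes."""
    minutes = total_mb / sample_mb * sample_minutes
    return minutes / (60 * 24)


# 2 TB at 200 MB per 30 minutes: 10,000 batches of 30 minutes each.
days = transfer_days(2_000_000, 200, 30)  # a little over 208 days
```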