Alfresco 3.2 is very slow with operations on many small files

dirkvdzee
Champ in-the-making
Hi all,

We are currently evaluating Alfresco for our organisation and some of our customers.

Everything works well and is robust, except for one thing:
When copying more than 1000 small files to the Alfresco server via CIFS, the average speed drops below 58 KB/s over a local LAN.
Eventually, connection errors sometimes appear as well.
Afterwards, deleting the files from the server is also very slow, although somewhat faster than copying.
Copying a single 300 MB file runs at 9.9 MB/s, which is really fast.
We thought about missing indexes on tables (meta-data), but monitoring the database server showed very short running queries.
Another factor can be the priority of the Lucene indexing process.

The performance is so bad that it is a real show stopper.
We have tried everything we could think of.
Does anyone have any clues for us?

Thanks in advance,
Dirk


Below is an extended description of the things we tested:

Configuration:
The test machine has an Intel Pentium Dual-Core E2180 processor (2 GHz, 800 MHz FSB), 2 GB of memory and a 160 GB SATA hard disk.
It is running Ubuntu server 9.04 including all fixes.
Alfresco 3.2 is an Out-Of-The-Box (alfresco-community package) installation with CIFS and SharePoint enabled.

Tests:
There is one simple test: we copied many (mostly small) files via CIFS to a newly created folder on Alfresco.
* A bandwidth monitor showed a large increase in speed while a big file was being copied, and an almost flat line with smaller files.
* Deleting the files from Alfresco is somewhat faster, but still very slow.
* The performance is better (but not great) when copying the files from Alfresco to a local machine.
* As stated before, one big 300 MB file copies at an average speed of 9.9 MB/s.
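For anyone who wants to repeat this kind of test in a reproducible way, here is a minimal Python sketch. It is not what we actually ran (we copied by hand); the file count, file size and helper names are made up for illustration, and the target directory is a temporary local folder that you would replace with the path where the CIFS share is mounted.

```python
import os
import shutil
import tempfile
import time


def make_small_files(directory, count=100, size_bytes=4096):
    """Create `count` files of `size_bytes` random bytes each in `directory`."""
    os.makedirs(directory, exist_ok=True)
    for i in range(count):
        with open(os.path.join(directory, f"file_{i:05d}.bin"), "wb") as fh:
            fh.write(os.urandom(size_bytes))


def time_copy(src_dir, dst_dir):
    """Copy every regular file in src_dir to dst_dir.

    Returns (number of files, total bytes, elapsed seconds).
    """
    os.makedirs(dst_dir, exist_ok=True)
    files = 0
    total_bytes = 0
    start = time.perf_counter()
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if os.path.isfile(src):
            shutil.copyfile(src, os.path.join(dst_dir, name))
            files += 1
            total_bytes += os.path.getsize(src)
    return files, total_bytes, time.perf_counter() - start


if __name__ == "__main__":
    # Assumption: `dst` stands in for the mounted share; point it at the
    # real mount point (a hypothetical path such as /mnt/alfresco/test).
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        make_small_files(src, count=100, size_bytes=4096)
        files, nbytes, seconds = time_copy(src, dst)
        print(f"{files} files, {nbytes} bytes, {seconds:.2f} s, "
              f"{seconds / files * 1000:.1f} ms per file")
```

Running the same script against a Samba share and against the Alfresco share should give directly comparable per-file timings.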

We installed Samba, tested the speed and got the following (normal) figures.
The 300 MB file is copied at 10.3 MB/s and a few hundred smaller files give an average of 7 MB/s.
As you can see, when streaming one big file there is very little difference between Alfresco and Samba.
Of course file access under Alfresco will always be slower than under Samba, but not to that extent.

Other things we tried and found no (positive) differences:
* Relocating the database to external database server
* Different (new) content stores
* Local access (mount) versus network access
* The FTP protocol was slower

Finally, we compared a completely different software configuration. We booted the same machine with Windows XP and installed Alfresco 3.2.
Copying the 300 MB file ran at 6.9 MB/s.
This is 30% slower than on Linux, but this is probably due to the speed difference between Linux and Windows.
Although performance was slower overall, the other tests gave the same picture as the Ubuntu configuration.
10 REPLIES

dirkvdzee
Champ in-the-making
It is surprising to see that there is no answer, because since my previous post I have found several posts from people having exactly the same problem, also without a solution.
On the other hand it does seem to work for (most?) others.

Quick recap:
We have an out of the box Alfresco 3.2 Community installation on Ubuntu 9.04 (and Windows).
The hardware is more than adequate and we used optimized parameters for JVM memory.

The problem is the very, very slow performance when handling multiple files in a single CIFS copy.
Normally, I think, "out of the box" should (and probably will) give adequate performance, so there must be a (small?) problem.
We searched further, ran some new tests and came up with new information.

Tests
We started a test to see if the problem was definitely related to the number of files and not the size.
Here is a comparison between copying small and large files:
-   4 files of 5.5MB (total 22MB) were copied in 17 seconds
-   18 small files with a combined size of 134KB were copied in 32 seconds
This confirms that the problem lies in the number of files being added, not their size.

After this we tried the Alfresco JLAN server. This is the open source Java CIFS server that is used in Alfresco.
We installed JLAN (just unzip), made a few changes in the XML configuration file and started the process on the command line (so no Tomcat).

We did the same performance test:
-   4 files of 5.5MB (total 22MB) were copied in 12 seconds
-   18 small files with a combined size of 134KB were copied in less than 2 seconds
This works fine, so the problem is not in the JLAN server itself.
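To make the comparison easier to read, the four measurements can be turned into seconds per file and KB/s. The snippet below is just arithmetic on the numbers above; it assumes 1 MB = 1024 KB and treats "less than 2 seconds" as 2 seconds, so the JLAN small-file figures are a worst case.

```python
# (label, number of files, total size in KB, seconds) as reported above.
# Assumption: 1 MB = 1024 KB.
results = [
    ("Alfresco, large files", 4, 22 * 1024, 17),
    ("Alfresco, small files", 18, 134, 32),
    ("JLAN, large files", 4, 22 * 1024, 12),
    ("JLAN, small files", 18, 134, 2),  # reported as "less than 2 seconds"
]


def summarize(files, size_kb, seconds):
    """Return (seconds per file, throughput in KB/s)."""
    return seconds / files, size_kb / seconds


for label, files, size_kb, seconds in results:
    per_file, kb_per_s = summarize(files, size_kb, seconds)
    print(f"{label:24s} {per_file:5.2f} s/file {kb_per_s:8.1f} KB/s")
```

For the small files this works out to roughly 1.8 seconds per file under Alfresco against at most about 0.11 seconds per file with standalone JLAN.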

Remaining factors are either Tomcat (jsvc service) or something in Alfresco (but not the JLAN part).

We really want to use Alfresco.
Does anyone have ideas about what we can do or test next?
Can we rule out Tomcat as the source of the problem somehow?

loftux
Star Contributor
Just throwing in some ideas:
- Try a different database. Compared to just using JLAN, you have lots of DB inserts. For Community you have the option to try PostgreSQL instead of MySQL, available as a separate download.
- Check I/O. Is your index located on a really fast disk? Are you running on virtual machines, and are they properly configured for I/O performance?

troton
Champ in-the-making
Same problem with the Ubuntu installation out of the box. Any news?

dirkvdzee
Champ in-the-making
No virtualization is used and the disk is a 7200 rpm SATA drive with a capacity of 500GB.
We did try installing on a completely different (pretty fast) machine, again with the same problem.

Also, we relocated the database to an external (fast) database server, but without any improvement. During the copying operation, the utilization of the database server was very low.
This was expected because monitoring of the database server revealed few queries (there are no users on the system during the copying operation).

Because of the lack of reaction from official Alfresco team members (and others) in this and several similar threads, I think it is likely that there is no real solution (other than perhaps buying very expensive hardware).
So the status is that we are really stuck with Alfresco and, unfortunately, probably need to look at other options. :cry:

troton
Champ in-the-making
I have tested all the Alfresco 3.x Community versions and all of them have several issues with CIFS. We currently have version 2.1 installed and we can't upgrade to the 3.x branch because of this issue!

jpbarba
Champ in-the-making
Yes, I have tested several Alfresco Community versions on Windows, and there are always problems with CIFS.
Normally, the system halts for a minute during a copy. I haven't detected this problem with the Enterprise version.
The Community version 3.2 on Linux works perfectly (CIFS functionality).

Greetings

mroessler
Champ in-the-making
We experience the same poor performance in tests of 3.2r CE, using MySQL as the database and running on a high-end machine (I/O spread across multiple 15k SAS drives, 24GB RAM, 8 processors). Copying a few hundred small text files (2MB total) into Alfresco over a CIFS share accessed on a gigabit LAN takes approximately 15 minutes. Copying one large (5GB) file does not show the same poor performance. This is a big concern.

rogier_oudshoor
Champ in-the-making
Actually, this difference is not limited to CIFS; it also happens with WebDAV. Alfresco has a per-file overhead because the file isn't just copied to the local hard drive: DB nodes are also created and search index entries added. And yes, all of it happens in the same transaction before commit. There are also stability issues when importing large batches of content at once using Windows clients.

So yes, the number of files is more important than the average file size. And yes, it's core to Alfresco.
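A rough way to picture this is a fixed cost per file plus a streaming cost per byte. The sketch below is only an illustrative model, not how Alfresco accounts for time internally; the function names are made up, and the input figures (18 files, 134 KB, 32 seconds, 9.9 MB/s streaming) are the ones reported earlier in this thread.

```python
def estimate_per_file_overhead(files, total_mb, seconds, stream_mb_per_s):
    """Fixed cost per file in seconds, after subtracting pure transfer time."""
    transfer_seconds = total_mb / stream_mb_per_s
    return max(seconds - transfer_seconds, 0.0) / files


def predict_copy_time(files, total_mb, overhead_s, stream_mb_per_s):
    """Predicted copy time in seconds: per-file overhead plus streaming time."""
    return files * overhead_s + total_mb / stream_mb_per_s


# Figures from earlier in the thread (assuming 1 MB = 1024 KB).
overhead = estimate_per_file_overhead(
    files=18, total_mb=134 / 1024, seconds=32, stream_mb_per_s=9.9
)
print(f"estimated overhead: {overhead:.2f} s per file")

# Hypothetical batch: 1000 small files totalling 5 MB.
predicted = predict_copy_time(
    files=1000, total_mb=5, overhead_s=overhead, stream_mb_per_s=9.9
)
print(f"predicted time for 1000 files: {predicted / 60:.1f} minutes")
```

With those inputs the per-file term dominates completely, which is why a big single file looks fast and a batch of small files does not.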

loftux
Star Contributor
This issue in Jira might be related: https://issues.alfresco.com/jira/browse/ETHREEOH-1553
Maybe try the suggestion there to disable change notifications in CIFS.
Could you try adding the following line to file-servers.xml, or your custom config, where the Alfresco filesystem is defined:

   <config evaluator="string-compare" condition="Filesystems">
      <filesystems>
         
         <!-- Alfresco repository access shared filesystem -->
         <filesystem name="${filesystem.name}">
            <store>workspace://SpacesStore</store>
            <rootPath>/app:company_home</rootPath>

            <disableChangeNotification/> <!-- Add this line -->

            ..
         </filesystem>
      </filesystems>
   </config>