Performance problems when there are too many files in one directory

04-29-2010 08:29 AM
I have a real performance problem when a user tries to open a directory (either through CIFS or through the web interface) with more than 1000 files in it: it takes about 20 seconds. I am running Alfresco 3.2 with a MySQL DB.
I did some monitoring tests, and Tomcat and the JVM seem to be running OK, so I think the weak point is the MySQL DB. Maybe I have to rebuild the table indexes. I searched the forum but did not find anything.
Does anybody have ideas about what to optimize or what to do?
Thanks.

05-26-2010 08:45 AM
(15 minutes and it is still working… until now). The delete is still running…

05-26-2010 09:52 AM
(15 minutes and it is still working… until now). The delete is still running…
By mistake, I killed the process.

All the folders are still there after about 1 hour… Now I don't know what to do; I will probably re-create the DB and ALF_IND.
So do I have to wait for a new version of the Community edition? My version (3.2.r2) is difficult to use in a production environment with this performance.
The Explorer view seems to work up to 1000 folders on MySQL (250 on Oracle), but the delete does not work.
Thanks in advance
05-26-2010 12:00 PM

05-26-2010 03:06 PM
Hi,
I have a similar problem: a space (named root) contains 40000 spaces (created by a Java main program external to the Alfresco application).
Let me explain. I have two environments (named A and B).
Env. A:
- ALFRESCO: Community Current version 3.2.0 (r2 2440) schema 3300
- OS: Windows XP 32 bit, RAM 2.5 GB, Intel Core Duo 2.1 GHz
- AS: Tomcat 6.0.18
- JAVA: JDK 6.0_11, heap 986.125 MB
- DB: MySQL 5.1.35 Community
Env. B:
- ALFRESCO: Community Current version 3.2.0 (r2 2440) schema 3300
- OS: Linux Oracle 64 bit (RedHat 5.4), RAM 6 GB
- AS: JBoss 4.2.2 GA
- JAVA: JDK 6.0_18, heap 4 GB
- DB: Oracle 11g
Problems:
Env. A: when I try to display the root folder I wait for 5 to 6 minutes, and afterwards Alfresco Explorer displays sometimes 3500 items, sometimes 3800 items, but never 40000 (note that 3800 has one zero fewer than 40000).
I don't know what is happening; the behaviour is always strange.
Env. B: see http://forums.alfresco.com/en/viewtopic.php?f=14&t=26355
Bye
I have updated Env. B to:
- ALFRESCO: Community Current version 3.2.0 (r2 2440) schema 3300 (on Oracle DB)
- OS: Linux Oracle 64 bit (RedHat 5.4), RAM 18 GB, 2x quad-core CPU
- AS: Tomcat 6.0.18
- JAVA: JRockit 4 (64 bit), heap 8 GB
- DB: Oracle 10g Rel 2 (10.2.0.3) on a different server
I have news (I have applied changes 1 and 2, but not derek's last changes; it is not simple to run tests with a lot of data).
Quote from derek (26 May 2010, 12:23):
Hi,
1: I'm not convinced about the behaviour of 1 (it says a distinct entity per row, if I remember) and I don't know if it's rolling the related properties and aspects up correctly. We don't have that in our tested code.
2: Yes. That eliminates the duplicates and prevents blow-outs of the subsequent queries.
3: Yes. That prevents blow-out of the resultset from left joins to nodes AND aspects in the same query. We have a Criteria query for nodes-and-properties and a Criteria query for nodes-and-aspects.
Env A (Development):
- Explorer view of a folder with 1000 subfolders (each subfolder has 5 further sub-subfolders): OK
- Explorer view of a folder with 10000 subfolders (each subfolder has 5 further sub-subfolders): OK
- Explorer delete of a folder with 1000 subfolders (each subfolder has 5 further sub-subfolders): KO
Env B (Test):
- Explorer view of a folder with 250 subfolders: OK
I also have an Env C (Production), but I have not executed tests on this environment.
This is an HA Alfresco installation with the standard configuration; see: http://wiki.alfresco.com/wiki/Cluster_Configuration_V2.1.3_and_Later
[img]http://wiki.alfresco.com/w/images/9/91/Alfresco_LB_Diagram.png[/img]
Web Server A: digit1 and Web Server: digit2
- ALFRESCO: Community Current version 3.2.0 (r2 2440) schema 3300 (on Oracle DB)
- OS: Enterprise Linux 5.4 x86_64, RAM 18 GB, 2x quad-core CPU
- Software Cluster: Oracle Grid Infrastructure
- AS: Tomcat 6 (no session replication)
- JAVA: JRockit 4 (64 bit), heap 8 GB
- DB: Oracle 10g Rel 2 (10.2.0.3) on a different server
I have changed (after a lot of tests) alfresco-global.properties and ehcache-custom.xml to make the cluster work.
Regards

05-28-2010 06:14 AM
I see this in SVN, so 4 classes/XML files must be updated.
Revision: 20226
Author: derekh
Date: 17:59:22, Thursday 13 May 2010
Message:
Merged BRANCHES/V3.3 to HEAD:
   20192: Merged PATCHES/V3.1.2 to BRANCHES/V3.3:
      20182: Fixed ALF-2712: Performance degradation from 3.1.0 to 3.1.2
   20207: Merged PATCHES/V3.1.2 to BRANCHES/V3.3:
      20203: Fix fallout from ALF-2712 … move back to no results rather than AccessDeniedException
   20222: Merged PATCHES/V3.2.1 to BRANCHES/V3.3:
      20212: Fix ALF-2719: 'patch.convertContentUrls' can result in "No ContentData value exists for ID" errors
/alfresco/HEAD
/alfresco/HEAD/root/projects/repository/source/java/org/alfresco/repo/admin/patch/impl/ContentUrlConverterPatch.java
/alfresco/HEAD/root/projects/repository/source/java/org/alfresco/repo/domain/hibernate/Node.hbm.xml
/alfresco/HEAD/root/projects/repository/source/java/org/alfresco/repo/domain/patch/AbstractPatchDAOImpl.java
/alfresco/HEAD/root/projects/repository/source/java/org/alfresco/repo/jscript/ScriptNode.java
/alfresco/HEAD/root/projects/repository/source/java/org/alfresco/repo/node/db/hibernate/HibernateNodeDaoServiceImpl.java
Can you tell me the best way to work on my version (3.2.r2, SVN revision 17458) to solve this problem?
A possible solution is to upgrade to 3.3.
Thanks in advance
05-28-2010 08:11 AM
ALF-2839: Node pre-loading generates needless resultset rows (a blocker for 3.3)
Your svn logs are mainly related to:
ALF-2712: Performance degradation from 3.1.0 to 3.1.2 (a critical for 3.3)
Both have been solved for 3.3 Enterprise but only 2712 made it into 3.3g before cut-off.
You should proceed by applying the fixes for ALF-2839 if you are affected by it (it depends on the product of aspect count and property count). At the simplest level just remove the call to cacheNodes, but you will need to decide if you can live with ETHREEOH-2657.
You can look at the bug discussions and diffs for ALF-2712 to decide if you want to patch your version. Once again, at the simplest you can remove the call to cacheNodes.
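As a rough illustration of why the impact depends on the product of aspect count and property count, here is a back-of-the-envelope sketch. It assumes a single query that left-joins each node to both its properties and its aspects, so that every property row is paired with every aspect row. The class and method names are made up for this example and are not Alfresco code.

```java
// Hypothetical illustration only: none of these names exist in Alfresco.
class JoinBlowOut {

    // Rows returned when ONE query left-joins nodes to properties AND aspects:
    // each property row is repeated once per aspect row (a cross product per node).
    static long combinedJoinRows(long nodes, long propsPerNode, long aspectsPerNode) {
        // A left join still yields one row when the joined side is empty.
        long props = Math.max(1, propsPerNode);
        long aspects = Math.max(1, aspectsPerNode);
        return nodes * props * aspects;
    }

    // Rows returned when properties and aspects are fetched by TWO separate queries.
    static long separateQueryRows(long nodes, long propsPerNode, long aspectsPerNode) {
        return nodes * Math.max(1, propsPerNode) + nodes * Math.max(1, aspectsPerNode);
    }
}
```

Under these assumptions, 1000 child nodes with 10 properties and 4 aspects each give 40000 rows from the combined join against 14000 rows from two separate queries, and the gap widens as either count grows.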
Regards

05-28-2010 02:09 PM
I have also found a possible solution for a related problem:
http://forums.alfresco.com/en/viewtopic.php?f=14&t=26355&p=88594#p88594
Thanks, derek

08-04-2010 04:12 AM
The algorithms have a performance problem (time and CPU leak) that appears when a parent node contains thousands of child nodes.
A possible (really sad, I know…) workaround that I have implemented is to use a Lucene search to find the parent node while avoiding the use of PATH: (use an attribute search on name, type, etc. instead), and then to use the method getChildAssociations to find the child nodes (this method uses the DB).
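The two steps of this workaround could be sketched roughly as follows. This is only an outline under assumptions: the two interfaces are simplified stand-ins invented for the example (the real Alfresco SearchService and NodeService work with NodeRef, StoreRef and ChildAssociationRef objects), and the query string is just one possible attribute search on name and type.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-ins for illustration; they are NOT the real Alfresco interfaces.
interface AttributeSearch {
    // Hypothetical: returns the ids of nodes matching a Lucene query without PATH.
    List<String> findNodeIds(String luceneQuery);
}

interface ChildLookup {
    // Hypothetical analogue of NodeService.getChildAssociations (database-backed).
    List<String> getChildIds(String parentId);
}

class ChildListingWorkaround {

    static List<String> listChildren(AttributeSearch search, ChildLookup nodes,
                                     String parentName) {
        // Step 1: locate the parent by attributes (name and type), not by PATH.
        // Note: parentName is not escaped here; real code would need to do that.
        String query = "@cm\\:name:\"" + parentName + "\" AND TYPE:\"cm:folder\"";
        List<String> parents = search.findNodeIds(query);
        if (parents.isEmpty()) {
            return new ArrayList<>();
        }
        // Step 2: read the children through the database-backed lookup,
        // using the first match as the parent.
        return new ArrayList<>(nodes.getChildIds(parents.get(0)));
    }
}
```

The point of the split is that the index is only asked for a single node, while the potentially huge child list comes straight from the database.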
Regards.
Alberto Ferrini

08-05-2010 09:18 AM
We had the same problem in several areas of Alfresco. For example, in a previous version, when we had too many completed tasks, the "My completed tasks" view killed the server with an OutOfMemoryError.
Because of that I submitted a patch, which is located at https://issues.alfresco.com/jira/browse/ALFCOM-3405
This is the same problem as with "too many files". I had added functions to the NodeService to support a PagedListDataModel. In Alfresco, when you open a directory, the metadata of all of its subdirectories and files is normally loaded into memory. This is not good, because more files and folders need more memory and more time to render. The limit is high enough if there is plenty of memory; however, when there were around 10000 files in one directory and somebody opened the page (and pressed refresh several times, because it took minutes to load), the server was killed even with 2 GB of memory for the JVM.
So the solution is to add, for every listing function in NodeService, a version like listNodes(…, int firstResult, int maxResults). With that it would be possible to load only a small number of records into memory at a time. The managed beans have to make sure that only one page is loaded when one page is shown.
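A minimal sketch of that idea, assuming a listing call that takes firstResult and maxResults, might look like this. The types are hypothetical and are neither the actual NodeService API nor the code of the ALFCOM-3405 patch; the in-memory source only stands in for the repository, where a real implementation would apply the limits in the database query itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical types for illustration; not the NodeService API or the patch.
interface PagedNodeSource {
    // Returns at most maxResults names, starting at the 0-based firstResult.
    List<String> listNodes(int firstResult, int maxResults);
}

// Stand-in for the repository; a real source would apply the limits in the query.
class InMemoryNodeSource implements PagedNodeSource {
    private final List<String> names;

    InMemoryNodeSource(List<String> names) {
        this.names = names;
    }

    @Override
    public List<String> listNodes(int firstResult, int maxResults) {
        if (firstResult < 0 || maxResults < 0) {
            throw new IllegalArgumentException("arguments must be >= 0");
        }
        if (firstResult >= names.size()) {
            return Collections.emptyList();
        }
        int end = (int) Math.min((long) firstResult + maxResults, names.size());
        return new ArrayList<>(names.subList(firstResult, end));
    }
}

// What a managed bean could do: ask only for the page being displayed.
class NodePager {
    private final PagedNodeSource source;
    private final int pageSize;

    NodePager(PagedNodeSource source, int pageSize) {
        this.source = source;
        this.pageSize = pageSize;
    }

    // Loads a single page; pageIndex is 0-based.
    List<String> page(int pageIndex) {
        return source.listNodes(pageIndex * pageSize, pageSize);
    }
}
```

With a page size of 10 and 25 entries, page 0 holds 10 names, page 2 holds the last 5, and page 3 is empty, so memory use is bounded by the page size rather than by the directory size.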
The left tree view also has to be modified for the case where we look at subdirectories, because if we open a branch that contains 10000 subdirectories it will freeze as well. The tree should be modified to show only ten subdirectories at a time, with, for example, "…" as the first entry in the subtree and "…" as the last. With this, a pager could be shown in the tree view.
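The windowed branch described above could be sketched like this. It is a hypothetical helper, not Alfresco Explorer code: it takes the full child list for simplicity, whereas a real tree would fetch only the visible window from the repository.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a windowed tree branch; not Alfresco Explorer code.
class TreeWindow {
    // The ellipsis marker shown when more children are hidden.
    static final String MORE = "\u2026";

    // Returns the visible labels for a branch: up to windowSize children starting
    // at offset, with a marker before and/or after when children are hidden.
    static List<String> visibleChildren(List<String> children, int offset, int windowSize) {
        int start = Math.max(0, Math.min(offset, children.size()));
        int end = (int) Math.min((long) children.size(),
                                 (long) start + Math.max(0, windowSize));
        List<String> visible = new ArrayList<>();
        if (start > 0) {
            visible.add(MORE); // children hidden before the window
        }
        visible.addAll(children.subList(start, end));
        if (end < children.size()) {
            visible.add(MORE); // children hidden after the window
        }
        return visible;
    }
}
```

Clicking a marker would then move the offset by one window, which is effectively the pager in the tree view.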
There is an article in the MyFaces wiki that may describe what I wanted to express better: http://wiki.apache.org/myfaces/WorkingWithLargeTables
Without these modifications Alfresco cannot be used to handle directories with many items. We had to hack this logic into Alfresco in some places, as in the JIRA issue mentioned at the beginning of my post. After that it worked great; however, as new versions came out I did not find the time to port it to the newest versions as well (based on an SVN checkout).
Regards,
Balazs
08-08-2011 05:27 AM
