Hyland Connect

gyro_gearless · ‎04-14-2010

Hi folks,

currently i am facing some serious performance problems when working with Explorer - e.g. it takes ages (some 30+ seconds) to display the first page of an folder containing some 600 documents.

Digging in the log files revealed that oviously Hibernate emits some less than optimal queries. Here is what i have found in PostgreSQL's log (Query formatted to make it more readable):

select 
  this_.id as id4_1_, 
  this_.version as version4_1_, 
  this_.store_id as store3_4_1_, 
  this_.uuid as uuid4_1_, 
  this_.transaction_id as transact5_4_1_, 
  this_.node_deleted as node6_4_1_, 
  this_.type_qname_id as type7_4_1_, 
  this_.acl_id as acl8_4_1_, 
  this_.audit_creator as audit9_4_1_, 
  this_.audit_created as audit10_4_1_, 
  this_.audit_modifier as audit11_4_1_, 
  this_.audit_modified as audit12_4_1_, 
  this_.audit_accessed as audit13_4_1_, 
  dbaccessco2_.id as id14_0_, 
  dbaccessco2_.version as version14_0_, 
  dbaccessco2_.acl_id as acl3_14_0_, 
  dbaccessco2_.latest as latest14_0_, 
  dbaccessco2_.acl_version as acl5_14_0_, 
  dbaccessco2_.inherits as inherits14_0_, 
  dbaccessco2_.inherits_from as inherits7_14_0_, 
  dbaccessco2_.type as type14_0_, 
  dbaccessco2_.inherited_acl as inherited9_14_0_, 
  dbaccessco2_.is_versioned as is10_14_0_, 
  dbaccessco2_.requires_version as requires11_14_0_, 
  dbaccessco2_.acl_change_set as acl12_14_0_, 
  properties3_.node_id as node1_3_, 
  properties3_.actual_type_n as actual2_3_, 
  properties3_.persisted_type_n as persisted3_3_, 
  properties3_.boolean_value as boolean4_3_, 
  properties3_.long_value as long5_3_, 
  properties3_.float_value as float6_3_, 
  properties3_.double_value as double7_3_, 
  properties3_.string_value as string8_3_, 
  properties3_.serializable_value as serializ9_3_, 
  properties3_.qname_id as qname10_3_, 
  properties3_.list_index as list11_3_, 
  properties3_.locale_id as locale12_3_, 
  aspects4_.node_id as node1_4_, 
  aspects4_.qname_id as qname2_4_ 
from 
  alf_node this_ 
  left outer join alf_access_control_list dbaccessco2_ on this_.acl_id=dbaccessco2_.id 
  left outer join alf_node_properties properties3_ on this_.id=properties3_.node_id 
  left outer join alf_node_aspects aspects4_ on this_.id=aspects4_.node_id 
where 
  this_.store_id=79 
  and this_.uuid in ('19390aa7-726d-4feb-ba3f-01cef29a6165',
 '6a325159-3cee-416d-8b53-4e974dabf9b3', 
 '9e8e75aa-d38f-4340-a8cd-210fbc496536', 
 '01c0e78e-9afa-4ddb-9c48-325e899d3196', 
 'c54bfec7-ba56-483b-84e4-7473842d4799', 
 '51384f79-f8ca-420b-8265-902acda7be41', 
 'e1099907-9d72-409f-a2f2-5e2c57eab936', 
 '867ee750-188a-42cf-b375-6069511959d4', 
 '2270c5a2-f575-4716-9593-694c70c002d9', 
 '96b5de04-ea21-4313-ab2f-ac004eadc785', 
 '7ec3fa46-9d0a-43dc-af4d-7876704aa506', 
 'b1150256-c748-40ab-9061-f7afaabc1afa', 
 'efdfad4e-b4b9-4b37-a2d0-45af623ebab0', 
 '9fe53976-7fe1-47c7-ba6b-b54416a05703', 
 'c680969d-54a6-42d6-9a08-3b137137956e', 
 '47d681b1-1c73-4c92-92f0-7d7aceaa24a5', 
 '84d03f41-3f0c-408c-9560-cfcd4b85a1b3', 
 'b9a8861e-b65c-4593-960f-b8bfdb76a21d', 
 '7e71ee2f-d6f0-4c62-9f1f-829c7f73b7a1', 
 '49c4e426-fa7a-48b7-93a2-0fe9935563b4', 
 'c6b28361-da5f-42d4-ab6f-e6d1fcb76942', 
 'a229d01e-5e95-404e-9a0d-bdefb8024aab', 
 '4a6919a8-5057-431f-830d-153179a9a24c'
);
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

Well, this looks reasonable, perhaps Alfresco fetches 23 nodes from the database. When i execute that command in PgAdmin, the query yields 2776 rows.

Next, some child associations are fetched, and here things go terribly wrong:

Execute <unnamed>: select this_.id as id7_0_, this_.version as version7_0_, this_.parent_node_id as parent3_7_0_, this_.type_qname_id as type4_7_0_, this_.child_node_name_crc as child5_7_0_, this_.child_node_name as child6_7_0_, this_.child_node_id as child7_7_0_, this_.qname_ns_id as qname8_7_0_, this_.qname_localname as qname9_7_0_, this_.qname_crc as qname10_7_0_, this_.is_primary as is11_7_0_, this_.assoc_index as assoc12_7_0_ from alf_child_assoc this_ where this_.child_node_id in ($1, …$2776)
‍‍

Yes, this query gets the 2776 id's from the previous query fed as parameters! :mrgreen: Where obviously most id's have the same value (educated guess: 23…). And i have seen the same query with more than 30000 (!) parameters passed, which explains the performance problem….

For handwritten SQL i would say that someone forgot a "distinct" somewhere, but as this all seems to be generated by Hibernate, i have no idea where to fix this.
Anyone here who has the same problem? Or an hint where to look for in the sources?

Cheers
Gyro

gyro_gearless · ‎04-23-2010

Update:
Out of despair, i tried the same with MySQL instead of PostgreSQL: same result.

Displaying the first page in a space containing some 500+ Documents takes about 35 seconds. Ouch, even slower than the legacy DMS Alfresco is supposed to replace :mrgreen:

Cheers
GYro

ssaravanan · ‎04-23-2010

The performance depends on many things, not just the DB alone.
Whats your memory, CPU speed, JVM tuning ,etc are some of the main factors
which probably need to be investigated. Chrome & Firefox seems to be much
faster in executing rich Ajax apps

marco_altieri · ‎05-05-2010

In my opinion, 500 documents in a single folder are too much for Alfresco.
With a better hardware, It could be faster, but it would be slow anyway.

mabayona · ‎05-05-2010

I´ve seen same sharp performance degradation in Alfresco Explorer after migrating from 3.0 to 3.2r2.

It happened in two different servers, both with linux and 2 and 4 Gbytes of RAM + MySQL.

It takes now about 10 seconds to display the first page of a space with 240 subspaces (paged in chunks of 20 as per default configuration). Before with 3.0 (same configuration, same data, same all, just upgraded from 3.0 to 3.2r2) it took less than a second to display the very same space.

Definitly something has changed in this area between 3.0 and 3.2r2.

It has to do with caching, but only partially, because if same space is reloaded, it takes less time (about 4 seconds) but still much more that in 3.0.

Any idea why?

mabayona · ‎05-07-2010

Probably this problem is explained with detail in:

http://forums.alfresco.com/en/viewtopic.php?f=6&t=26544

luca · ‎06-16-2010

I have similar problems, but I want to add some considerations.

Looking at the DB, PostgreSQl in my case, I noticed some difference from 3.0 schema and 3.2 that might cause some problems.
In particular adding some indexes I open the web client in 1.68 second instead of 10.38 seconds without indexes!

I'm still testing it, does anyone could say me if I'm running in others trouble or it could be a good solution?

Here's the SQL script to add indexes:


CREATE INDEX fk_alf_ace_auth ON alf_access_control_entry USING btree (authority_id);
CREATE INDEX fk_alf_ace_ctx ON alf_access_control_entry USING btree (context_id);
CREATE INDEX fk_alf_ace_perm ON alf_access_control_entry USING btree (permission_id);
CREATE INDEX fk_alf_acl_acs ON alf_access_control_list USING btree (acl_change_set);
CREATE INDEX fk_alf_aclm_ace ON alf_acl_member USING btree (ace_id);
CREATE INDEX fk_alf_aclm_acl ON alf_acl_member USING btree (acl_id);
CREATE INDEX fk_alf_adtf_conf ON alf_audit_fact USING btree (audit_conf_id);
CREATE INDEX fk_alf_adtf_date ON alf_audit_fact USING btree (audit_date_id);
CREATE INDEX fk_alf_adtf_src ON alf_audit_fact USING btree (audit_source_id);
CREATE INDEX fk_alf_attr_acl ON alf_attributes USING btree (acl_id);
CREATE INDEX fk_alf_autha_ali ON alf_authority_alias USING btree (alias_id);
CREATE INDEX fk_alf_autha_aut ON alf_authority_alias USING btree (auth_id);
CREATE INDEX fk_alf_cass_cnode ON alf_child_assoc USING btree (child_node_id);
CREATE INDEX fk_alf_cass_pnode ON alf_child_assoc USING btree (parent_node_id);
CREATE INDEX fk_alf_gatt_att ON alf_global_attributes USING btree (attribute);
CREATE INDEX fk_alf_lent_att ON alf_list_attribute_entries USING btree (attribute_id);
CREATE INDEX fk_alf_lent_latt ON alf_list_attribute_entries USING btree (list_id);
CREATE INDEX fk_alf_matt_att ON alf_map_attribute_entries USING btree (attribute_id);
CREATE INDEX fk_alf_matt_matt ON alf_map_attribute_entries USING btree (map_id);
CREATE INDEX fk_alf_nasp_n ON alf_node_aspects USING btree (node_id);
CREATE INDEX fk_alf_nass_snode ON alf_node_assoc USING btree (source_node_id);
CREATE INDEX fk_alf_nass_tnode ON alf_node_assoc USING btree (target_node_id);
CREATE INDEX fk_alf_node_acl ON alf_node USING btree (acl_id);
CREATE INDEX fk_alf_node_store ON alf_node USING btree (store_id);
CREATE INDEX fk_alf_node_txn ON alf_node USING btree (transaction_id);
CREATE INDEX fk_alf_nprop_n ON alf_node_properties USING btree (node_id);
CREATE INDEX fk_alf_qname_ns ON alf_qname USING btree (ns_id);
CREATE INDEX fk_alf_store_root ON alf_store USING btree (root_node_id);
CREATE INDEX fk_alf_txn_svr ON alf_transaction USING btree (server_id);
CREATE INDEX fk_alf_usaged_n ON alf_usage_delta USING btree (node_id);
CREATE INDEX fk_avm_s_acl ON avm_stores USING btree (acl_id);
‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

gyro_gearless · ‎06-16-2010

Hi,

i just tried the missing indexes on some lame system - cool, this turbocharged ALfresco to show some folder which took 40+ seconds before to show up in 10 seconds. Of course, i had to do a VACUUM ANALYZE first to get the speed-up :mrgreen:

Together with an increase of shared_buffers in postgresql.conf (1000 -> 32768) this yielded some 700% performance boost….

Thanks a lot, this saved our day!

Cheers
Gyro

luca · ‎06-17-2010

Hi gyro, glad to help you!

I also applied some patch on HibernateNodeDaoServiceImpl from Alfresco 3.3, in particular cacheNodesNoBatch method I changed the beginning like this:

    private void cacheNodesNoBatch(Store store, List<String> uuids)
    {
        Criteria criteria = getSession().createCriteria(NodeImpl.class, "node");
        criteria.setResultTransformer(Criteria.ROOT_ENTITY);
        criteria.add(Restrictions.eq("store.id", store.getId()));
        criteria.add(Restrictions.in("uuid", uuids));
        criteria.setCacheMode(CacheMode.PUT);
        criteria.setFlushMode(FlushMode.MANUAL);

        List<Node> nodeList = criteria.list();
        Set<Long> nodeIds = new HashSet<Long>(nodeList.size()*2);
        for (Node node : nodeList)
        {
            // We have duplicate nodes, so make sure we only process each node once
            Long nodeId = node.getId();
            if (!nodeIds.add(nodeId))
            {
                // Already processed
                continue;
            }
            storeAndNodeIdCache.put(node.getNodeRef(), nodeId);
            nodeIds.add(nodeId);
        } 
…‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍‍

I found also some problems related to the search that is very slow although the patches.
Are You experiencing this problem like me? Did you find a solution?

Hyland Connect

Performance issue with 3.2r2