Solr4 and ACLs

butterman
Champ in-the-making
We have an Alfresco Community 5.0.c environment. We previously ran 3.4.d and upgraded through several versions, then built a new Solr4 index from scratch after the final upgrade.

We currently have about 1500 sites and 200 users assigned to 15 groups. We assign groups to sites as their involvement with each site requires.

When adding a group to a site's membership (or making any other ACL change), I notice that Solr consistently consumes a lot of CPU and memory for about 15 minutes while indexing the change. I can't figure out why, since this happens even when the site contains no documents. It's almost as though Solr has to re-index every node the group has access to. Can anybody explain this behavior? What exactly is Solr doing when notified of an ACL change like this?

This is not hindering system performance yet, but I am worried that it will become unsustainable as our repo grows.

In the Alfresco core we currently have:
300k nodes
307k transactions
56k ACLs
30k ACL transactions

Server: CentOS 5.4 VM, 16 GB RAM, 4 cores, 4 GB JVM, MySQL 5.6, Tomcat 7


Thanks

1 REPLY

butterman
Champ in-the-making
When re-indexing an ACL, Solr4 pulls a JSON document from Alfresco that includes a list of the node's parent paths. In our case each group could be a member of over 1000 sites, so the group node had that many parents, all of which were re-indexed every time a group was added to or removed from a site.
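To make the scaling concrete, here is a rough sketch of why the work grows with the number of sites a group belongs to. It is only a toy model, not Alfresco code: the class name, the method name, and the path layout are all made up for illustration, and it assumes one parent path per site membership.

```java
import java.util.ArrayList;
import java.util.List;

public class ParentPathSketch {

    // Hypothetical model: a group that is a member of N sites has one parent
    // association per site, and therefore at least N parent paths.
    // The path format below is invented for illustration only.
    static List<String> parentPaths(String group, int siteCount) {
        List<String> paths = new ArrayList<>();
        for (int i = 1; i <= siteCount; i++) {
            paths.add("/sites/site-" + i + "/members/" + group);
        }
        return paths;
    }

    public static void main(String[] args) {
        // A group assigned to 1000 sites carries 1000 paths in this model,
        // all of which are sent and indexed whenever the group node changes.
        List<String> paths = parentPaths("GROUP_engineering", 1000);
        System.out.println(paths.size() + " parent paths for one group node");
    }
}
```

In this model the cost of one membership change is proportional to the total number of sites the group already belongs to, not to the contents of the site being changed, which would match what we saw with empty sites.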

The fix was to stop Solr indexing parent paths by adding the following two lines of code:
nmdp.setIncludePaths(false);
nmdp.setIncludeParentAssociations(false);
to:
src5.0.c\projects\solr4\source\java\org\alfresco\solr\SolrInformationServer.java at line 1691.
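For anyone who wants to see the shape of the change without opening the source, below is a minimal standalone sketch. The `MetaDataParams` class is a hypothetical stand-in for whatever type `nmdp` has in SolrInformationServer.java; only the two setter names come from the patch, and the defaults of `true` and the getters are assumptions made so the example runs on its own.

```java
public class PatchSketch {

    // Hypothetical stand-in for the parameter object called "nmdp" in the patch.
    // Only the two setters named in the post are modelled here; the default
    // values of true and the getters are assumptions for illustration.
    static class MetaDataParams {
        private boolean includePaths = true;
        private boolean includeParentAssociations = true;

        void setIncludePaths(boolean include) { this.includePaths = include; }
        void setIncludeParentAssociations(boolean include) { this.includeParentAssociations = include; }

        boolean isIncludePaths() { return includePaths; }
        boolean isIncludeParentAssociations() { return includeParentAssociations; }
    }

    // Builds the request parameters with the two flags switched off,
    // mirroring the two lines added by the patch.
    static MetaDataParams buildPatchedParams() {
        MetaDataParams nmdp = new MetaDataParams();
        nmdp.setIncludePaths(false);
        nmdp.setIncludeParentAssociations(false);
        return nmdp;
    }

    public static void main(String[] args) {
        MetaDataParams nmdp = buildPatchedParams();
        System.out.println("includePaths=" + nmdp.isIncludePaths()
                + ", includeParentAssociations=" + nmdp.isIncludeParentAssociations());
    }
}
```

The idea is simply that the metadata request no longer asks for paths or parent associations, so the response for a group node stays small no matter how many sites the group belongs to.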

Link to updated source: https://drive.google.com/file/d/0B4zqRLvm-_xIT2Zna1lYZFZraUk/view?usp=sharing
Link to patched jar: https://drive.google.com/file/d/0B4zqRLvm-_xIdTd0VHpOek11NUk/view?usp=sharing