Performance issue with alfresco 5.0d and solr4

ecarbenay
Star Contributor

Hello,

We're having a strange issue with Alfresco 5.0d and SOLR4.

The server has been running this Alfresco version for a year without any problem.

Alfresco was installed manually, by deploying the individual WAR files after installing Tomcat and MySQL.
Alfresco runs behind a reverse proxy (an Apache server that simply forwards to the AJP port).

Last month we reorganized the way Alfresco is installed on that server so that we can manage multiple Tomcat instances.
During that work, the SOLR4 “binaries” that were previously located in alf_data were moved to another location.

The server has been under permanent heavy load since then, and we are trying to find out why.
JavaMelody is deployed on that server, and it shows that the load comes from getContent calls made by SOLR.

[Screenshots attached to the original post, not reproduced here: top during a high-load event; active Java threads; CPU activity over one day; HTTP hits per minute; mean age of HTTP sessions; swap evolution over one year; Tomcat connector activity; api/solr/textContent activity reported by JavaMelody.]

When we try to reproduce the call that Alfresco is running, we do not succeed:
https://[servername]:8443/alfresco/service/api/solr/textContent?nodeId=420 responds with HTTP Status 403 - X509 Authentication failure.
Nothing is logged in catalina.out for this call.
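
A 403 "X509 Authentication failure" on this endpoint usually means the caller did not present a client certificate that the repository trusts, which is what happens with a plain browser request. Below is a minimal sketch of how the same GET could be issued with a client certificate, assuming the certificate and key used by SOLR have been exported to PEM files. The function names and the file parameters (cert_pem, key_pem, ca_pem) are hypothetical and not taken from the attached configuration.

```python
import ssl
import urllib.request
from urllib.parse import urlencode


def text_content_url(host: str, node_id: int, port: int = 8443) -> str:
    """Build the textContent URL from the post; host replaces [servername]."""
    query = urlencode({"nodeId": node_id})
    return f"https://{host}:{port}/alfresco/service/api/solr/textContent?{query}"


def fetch_text_content(host: str, node_id: int,
                       cert_pem: str, key_pem: str, ca_pem: str):
    """Issue the GET while presenting a client certificate (mutual TLS).

    cert_pem / key_pem / ca_pem are hypothetical paths to PEM exports of the
    client certificate, its private key, and the CA that signed the server
    certificate. urlopen raises urllib.error.HTTPError on a 403 response.
    """
    context = ssl.create_default_context(cafile=ca_pem)
    context.load_cert_chain(certfile=cert_pem, keyfile=key_pem)
    url = text_content_url(host, node_id)
    with urllib.request.urlopen(url, context=context, timeout=30) as response:
        return response.status, response.read()
```

If the request succeeds with the certificate and fails without it, the 403 seen in the browser is expected behaviour rather than a symptom of the load problem.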

When we run the same request on the test system, we get the correct response.

But we can't find out why this GET is not working!

We have run out of things to check, so any help would be appreciated.


Software versions :
OS --> Red Hat Enterprise Linux Server release 6.4 (Santiago)
Apache --> 2.2.29
Mysql --> 5.6.17
Tomcat --> 7.0.57
Jvm --> 1.8.0 u101
Alfresco --> 5.0.0d
Solr --> 4.9.1 1625909 - mike - 2014-09-18 04:09:04

JVM arguments for Alfresco: [screenshot attached to the original post, not reproduced here]

JVM arguments for SOLR: [screenshot attached to the original post, not reproduced here]

Configuration files and log files are attached.

Thanks in advance for your help

Emmanuel

4 REPLIES

afaust
Legendary Innovator

Looks like SOLR is doing full text indexing. If there is a misconfiguration in the SOLR tier that results in SOLR not finding its old (moved) index files, it will simply start and create a new index from scratch.

The SOLR configuration files seem to be fine. I thought I saw an issue for a moment, but it turned out I had just misread something because of the poor default text reader for zipped content.

ecarbenay
Star Contributor

Thank you for your answer Axel.

I understand that, in your opinion, the SOLR configuration we use is correct.

Despite that, the load on the server remains heavy, and we still have no explanation for it.

Do you have any recommendation on how to track down the cause of this heavy load? We think it is caused by SOLR for two reasons: the number of calls to textContent (/service/api/solr/textContent), which represent 86 % of the activity on the server, and the X509 Authentication failure we get when calling textContent manually. We can't find a way to check why we get this X509 failure when we call it ourselves, while Alfresco is apparently able to call it, since document content is being indexed. Do you have any ideas about that?

afaust
Legendary Innovator

I already mentioned one possible cause: If the /data/solr4 directory did not contain the old index data after the move, then SOLR started a new indexing process causing as many calls to /textContent as there are documents. How many documents do you have stored in that system? How many of those documents may need to be transformed into text first (i.e. Office documents / images to be OCR'ed)?

In the screenshot of top there really isn't that much CPU load on the user side - most of the load is on IO wait times which means that some storage device / filesystem level access is causing the load.

Is /data/solr4 mapped to any kind of special drive / storage area? How is the general performance of that device, i.e. what is its latency / throughput rating?
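
The IO wait share mentioned above can also be measured outside of top. The following is a rough sketch that samples the aggregate "cpu" line of /proc/stat twice on a Linux host and reports how much of the elapsed CPU time was spent in iowait. The function names are illustrative, and the one-second interval is an arbitrary assumption.

```python
import time

# Field order of the aggregate "cpu" line in /proc/stat (see proc(5)).
CPU_FIELDS = ["user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal"]


def parse_cpu_line(line: str) -> dict:
    """Turn 'cpu  1 2 3 ...' into a dict of cumulative tick counters."""
    values = [int(v) for v in line.split()[1:1 + len(CPU_FIELDS)]]
    return dict(zip(CPU_FIELDS, values))


def iowait_percent(before: dict, after: dict) -> float:
    """Share of CPU time spent waiting on IO between two samples."""
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values())
    return 100.0 * deltas["iowait"] / total if total else 0.0


def sample_iowait(interval: float = 1.0) -> float:
    """Read /proc/stat twice, 'interval' seconds apart (Linux only)."""
    def read() -> dict:
        with open("/proc/stat") as stat_file:
            return parse_cpu_line(stat_file.readline())

    before = read()
    time.sleep(interval)
    return iowait_percent(before, read())
```

A persistently high value while user CPU stays low would point at the storage layer or at swapping rather than at indexing work itself.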

tdaget
Champ on-the-rise

Hi all,
I'm working with Emmanuel CARBENAY on this problem. It is closed for now, as it no longer occurs.


We are not absolutely sure we have found the right solution (maybe there is something else to look at), but the slowdown has not appeared at all since the end of January ...

The solution lies in the MySQL configuration. Too many database connections were allowed, which consumed a lot of memory. Because free memory was short, Tomcat tried to free memory too often, the system started to swap, and the whole platform slowed down. This happened for about 15 minutes at a random point in the day (maybe when end users were working a bit more?).
Decreasing the 'max_connections' parameter was the first thing to do. The Alfresco 'db.pool.max' parameter has to be set accordingly ...
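
To illustrate why 'max_connections' matters for memory, here is a rough sketch of the usual back-of-the-envelope upper bound: global buffers plus the per-connection buffers multiplied by 'max_connections'. The variable names are standard MySQL settings, but every size below is a made-up example value, not the configuration of this server, and real usage is normally well below this bound.

```python
MIB = 1024 * 1024


def mysql_worst_case_bytes(global_buffers: dict,
                           per_connection_buffers: dict,
                           max_connections: int) -> int:
    """Rough upper bound: global buffers + max_connections * per-thread buffers."""
    return (sum(global_buffers.values())
            + max_connections * sum(per_connection_buffers.values()))


# Hypothetical example values (assumptions, not taken from the thread).
example_global = {
    "innodb_buffer_pool_size": 1024 * MIB,
    "key_buffer_size": 8 * MIB,
    "innodb_log_buffer_size": 8 * MIB,
}
example_per_connection = {
    "sort_buffer_size": 2 * MIB,
    "read_buffer_size": 1 * MIB,
    "read_rnd_buffer_size": 1 * MIB,
    "join_buffer_size": 1 * MIB,
    "thread_stack": 1 * MIB,
}

for connections in (1000, 150):
    bound = mysql_worst_case_bytes(example_global, example_per_connection, connections)
    print(f"max_connections={connections}: up to {bound // MIB} MiB")
```

With these example numbers the bound drops from 7040 MiB to 1940 MiB when 'max_connections' goes from 1000 to 150. Whatever value is chosen, 'db.pool.max' on the Alfresco side should stay below it so that the pool cannot ask for more connections than MySQL will grant.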

We found this solution by analysing global memory usage and, in detail, the memory consumption of each Unix process ... it was neither fun nor trivial ...
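
For anyone repeating this kind of per-process analysis, the sketch below shows one possible approach on Linux: read /proc/<pid>/status for every process and rank them by resident memory, with swapped-out memory alongside. This is an illustration and not necessarily the method used here; the function names are made up, and the VmSwap field may be missing on older kernels, in which case it is reported as 0.

```python
import os


def parse_status(text: str) -> dict:
    """Extract Name, VmRSS and VmSwap from the text of /proc/<pid>/status."""
    wanted = ("Name", "VmRSS", "VmSwap")
    result = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key in wanted:
            result[key] = value.strip()
    return result


def kib(value: str) -> int:
    """Convert a value such as '204800 kB' to an integer number of KiB."""
    return int(value.split()[0]) if value else 0


def top_memory_consumers(limit: int = 10) -> list:
    """Return (rss_kib, swap_kib, pid, name) tuples, largest RSS first."""
    rows = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue
        try:
            with open(f"/proc/{entry}/status") as status_file:
                info = parse_status(status_file.read())
        except OSError:
            continue  # the process exited while we were scanning
        rows.append((kib(info.get("VmRSS", "")), kib(info.get("VmSwap", "")),
                     int(entry), info.get("Name", "?")))
    return sorted(rows, reverse=True)[:limit]
```

Running top_memory_consumers() during a slowdown and comparing the mysqld and java entries against total RAM gives a quick view of which process is pushing the system into swap.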

Thanks for your answer
Thomas