Performance problems after a period of inactivity

npasquetto
Champ in-the-making
Hi,
I'm seeing strange performance behavior in a webscript that runs a Lucene query with several conditions (by path, by date, by aspect and sometimes by IMAP aspect metadata).

The webscript is very slow when it is called after it has not been used for a while, but very fast when it is called many times within a short period. It does not seem to be related to the cache, because if I completely change the conditions (so that the results are completely different) it remains extremely fast.

We are on an old Alfresco 3.3 version. The webscript is written in JavaScript and is always executed with the same user (who is not an admin).

I can't figure out which components could cause this behavior (e.g. is there a loading phase for a webscript?).

Does anyone have any ideas?
13 REPLIES 13

afaust
Legendary Innovator
Hello,

a couple of reasons can influence the performance and there are multiple levels of caches to consider here, not just the Alfresco node and properties caches. E.g. Lucene can and will cache some index data necessary for performing the query. The database can and will cache some table data that even affects performance of queries with completely different results. And the OS may at a low level cache file system contents that Lucene or the DB use to execute queries when they have not cached data in memory.

What kind of magnitude of performance difference do you observe? How many results are we talking about here? What exactly does the Lucene query look like (keep in mind, PATH queries can be veerryyy expensive)?

Regards
Axel

npasquetto
Champ in-the-making
<blockquote>What kind of magnitude of performance difference do you observe?</blockquote>
The difference is huge: 80 seconds for slow response, 2 seconds for fast response.

<blockquote>How many results are we talking about here?</blockquote>
Another thing that I forgot to mention in my post is that there seems to be no relation between performance and the number of results. In any case, we are talking about a maximum of 2000 results.

<blockquote>What exactly does the Lucene query look like (keep in mind, PATH queries can be veerryyy expensive)?</blockquote>
We have some files with the IMAP aspect, and we filter on metadata such as sentDate, sender and subject. All the files are organized in subspaces under a common space, so we added the PATH condition to limit the search, but in the path we need to use a wildcard to be able to descend into the subspaces. Could this condition affect the performance? If yes, I can remove it.
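
For readers unfamiliar with PATH wildcards, here is a minimal sketch of the two common forms of such a clause. The base path `/app:company_home/cm:Mail` and the helper name `pathClause` are assumptions made for illustration; the thread does not show the real query. In Alfresco's Lucene syntax, `/*` matches only direct children, while `//*` matches descendants at any depth, which is typically the more expensive form.

```javascript
// Build a PATH clause for an Alfresco Lucene query.
// "/*"  matches direct children of the base path only.
// "//*" matches every descendant at any depth (usually more expensive).
function pathClause(basePath, includeDescendants) {
  var suffix = includeDescendants ? "//*" : "/*";
  return 'PATH:"' + basePath + suffix + '"';
}

// Hypothetical base space; the real one is not given in the thread.
var base = "/app:company_home/cm:Mail";
var shallow = pathClause(base, false); // direct children only
var deep = pathClause(base, true);     // all subspaces
```

If the subspaces sit at a fixed depth, a chain of `/*` segments may be enough and avoids the recursive form.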

afaust
Legendary Innovator
Ok - 80 vs 2 seconds is really a difference that is hard to explain away just by warm-up issues with some caches. Even Lucene index queries rarely rack up this kind of difference between cold and hot caches…

Does the webscript do anything else with the results of the Lucene query apart from generating the response via FTL? Can you insert timing measurements into the code of the web script to determine at what point in the execution what amount of time has elapsed?

Regards
Axel

npasquetto
Champ in-the-making
<blockquote>Does the webscript do anything else with the results of the Lucene query apart from generating the response via FTL?</blockquote>
The webscript is very simple: based on the URL parameters it builds a query string through simple string concatenation; then there is a call to the search API
search.query(def);
where def is the object containing the search configuration (query, language, sort…).

The result of the query is assigned to a variable in the model and passed to the FTL template to build the JSON response.

So there is no other processing apart from the query call.
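
A rough sketch of what such a script might look like, for reference. Everything specific here is an assumption: the helper name `buildSearchDef`, the argument names, the base path, the `imap:imapContent` aspect and `imap:messageFrom` / `imap:dateSent` property names, and the sort column. The actual script was not posted. Note that the values are concatenated without escaping, as described in the post.

```javascript
// Build the definition object passed to search.query(def).
// All property and aspect names below are illustrative assumptions.
function buildSearchDef(args) {
  var clauses = [
    'PATH:"' + args.basePath + '//*"',   // all subspaces under the base space
    'ASPECT:"imap:imapContent"'
  ];
  if (args.sender) {
    clauses.push('@imap\\:messageFrom:"' + args.sender + '"');
  }
  if (args.fromDate && args.toDate) {
    clauses.push('@imap\\:dateSent:[' + args.fromDate + ' TO ' + args.toDate + ']');
  }
  return {
    query: clauses.join(" AND "),
    language: "lucene",
    // Hypothetical sort choice; the thread only says a sort is configured.
    sort: [{ column: "@{http://www.alfresco.org/model/content/1.0}modified", ascending: false }],
    page: { maxItems: args.maxItems || 2000 }
  };
}

var def = buildSearchDef({
  basePath: "/app:company_home/cm:Mail", // hypothetical
  sender: "someone@example.com",
  fromDate: "2010-01-01T00:00:00",
  toDate: "2010-12-31T23:59:59"
});
// Inside the webscript, the call would then be:
//   model.results = search.query(def);
```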

npasquetto
Champ in-the-making
<blockquote>Can you insert timing measurements into the code of the web script to determine at what point in the execution what amount of time has elapsed?</blockquote>
I've put some timing measurements in the code (query building, before the call to the search API, after the call to the search API, at the end of the script) and I can confirm that all of the time is spent in the search API. I noticed that sometimes the query itself is faster, a little more than 1 second, but I still get the response only after 30 seconds. When I call the webscript again immediately afterwards, I get an instant response.
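
The measurements described above could be taken with a small helper along these lines. This is only a sketch: `createTimer` and the mark labels are hypothetical names, and the `search.query` and `logger.log` calls are shown as comments because they only exist inside the Alfresco script runtime. Any time that does not show up between the first and last mark is spent outside the controller script, for example in template rendering or elsewhere in the request handling.

```javascript
// Minimal checkpoint timer. The optional "now" argument lets a fake clock
// be injected; by default it uses the wall clock in milliseconds.
function createTimer(now) {
  var clock = now || function () { return new Date().getTime(); };
  var start = clock();
  var last = start;
  var marks = [];
  return {
    // Record the time elapsed since the previous mark and since the start.
    mark: function (label) {
      var t = clock();
      marks.push({ label: label, sinceLast: t - last, sinceStart: t - start });
      last = t;
    },
    marks: marks
  };
}

var timer = createTimer();
// ... build the query string here ...
timer.mark("query built");
// var results = search.query(def);
timer.mark("search.query returned");
// model.results = results;
timer.mark("end of script");
// In the webscript each entry could be written out, e.g.:
//   logger.log(m.label + ": +" + m.sinceLast + " ms");
```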

I also noticed that the MaxItems clause is ignored if I use the admin user: is there no way to avoid this?

kaynezhang
World-Class Innovator
How much memory did you allocate to the JVM? Have you tried increasing the JVM heap size?
Searching can use a lot of memory with a large index. I guess that if your JVM is not running with a large enough heap size, the JVM will pay the price of initializing caches on the first query after a while.

Thank you for your help.
<blockquote>if your JVM is not running with a large enough HEAP size then the JVM will pays the price of initializing caches</blockquote>
The JVM has a very large heap size; the server is very powerful.

But I've noticed that the execution time of the query often seems low while the response time is high. Is there a caching mechanism for the template?

kaynezhang
World-Class Innovator
Yes, the first query causes the index caches to be warmed up (especially for sorting), which is why the first query takes some time.
Where did you place your index? Keeping the index on a local disk will improve performance.
How much memory did you leave for the operating system? It seems the OS also needs some memory to cache index files.

<blockquote>this is why the first query takes some time</blockquote>
OK, but after some investigation, it seems that the query is not the problem.

For example:
<ol>
<li>First request: total time 30 sec, query time 2 sec</li>
<li>Second request (the same query): total time 2.5 sec, query time 2 sec</li>
<li>Third request (completely different result set): total time 2.5 sec, query time 2 sec</li>
</ol>
And in my cluster, I verified this behaviour on every single node.